CJR index of fake-news, clickbait, and hate sites

Step one in evaluating the credibility of a news story is: Consider the source. To help fight fake news, we've compiled an index of untrustworthy sources.

Goals

  1. Compile the most complete, up-to-date list of active fake-news sites.
  2. Make the list dynamic, auto-adding new sites and removing inactive ones.
  3. Build a blacklist for advertisers to keep their ads off bogus sites (and for researchers who study disinformation).

Icons: Distort the Facts, No evidence, False, Misleading, Spins the Facts
Rulings from FactCheck.org

We built the table below by merging the major curated lists of fake-news sites, then purging sites that are no longer active. For each site it gives the domain name, the Alexa rank, and any tags (e.g., fake, clickbait) assigned by these fake-news lists, each of which is linked:

Fake-news, clickbait, and hate sites
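
A minimal sketch of how such a merge could work, assuming each source list is available as a mapping from domain to tags. The function names, list names, and domains here are hypothetical placeholders, not the actual data.

```python
from collections import defaultdict


def normalize(domain):
    """Lowercase a domain and drop a leading 'www.' so duplicates collapse."""
    domain = domain.strip().lower().rstrip("/")
    return domain[4:] if domain.startswith("www.") else domain


def merge_lists(lists):
    """Turn {list_name: {domain: tags}} into {domain: {list_name: tags}}."""
    merged = defaultdict(dict)
    for list_name, entries in lists.items():
        for domain, tags in entries.items():
            merged[normalize(domain)][list_name] = sorted(set(tags))
    return dict(merged)


# Placeholder input: two made-up lists that overlap on one domain.
example = {
    "list_a": {"WWW.Example-A.test": ["fake"]},
    "list_b": {"example-a.test": ["clickbait", "fake"],
               "example-b.test": ["hate"]},
}
table = merge_lists(example)
```

Keying the result by normalized domain means a site that appears on several lists becomes one row that keeps each list's tags separately.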

Auto-updating the list

The next phase is to automate this list by:

  • Dynamically removing inactive sites.
  • Adding sites by following 301 redirects, which often lead to new fake-news schemes.
  • Harvesting related fake-news domains via their shared IDs and IPs.
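
The redirect step in the list above could be sketched as follows. This is an illustration under stated assumptions, not the tooling actually used: `fetch_status` and `redirect_target` are hypothetical helper names, and the sketch inspects only a single HEAD request to the site root over HTTPS.

```python
import http.client
from urllib.parse import urlsplit


def _bare(host):
    """Lowercase a host name and drop a leading 'www.'."""
    host = host.lower()
    return host[4:] if host.startswith("www.") else host


def fetch_status(domain, timeout=10.0):
    """Send one HEAD request without following redirects (needs network).

    Returns (status code, Location header or None).
    """
    conn = http.client.HTTPSConnection(domain, timeout=timeout)
    try:
        conn.request("HEAD", "/")
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def redirect_target(domain, status, location):
    """Return the destination host if a 301 leaves the original domain."""
    if status != 301 or not location:
        return None
    host = urlsplit(location).hostname  # None for relative redirects
    if host is None or _bare(host) == _bare(domain):
        return None
    return _bare(host)
```

A redirect from a bare domain to its own `www.` host is ignored, so only redirects that land on a different domain are reported as candidates for the list.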

For instance, the infamous clickbait and conspiracy disseminator, YourNewsWire.com, now redirects to NewsPunch.com. The SpyOnWeb research tool detects their relationship, along with other domains connected by IP address or Google Analytics and AdSense ID.
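
As a rough sketch of the shared-ID idea (not SpyOnWeb's method, which we do not reproduce here), one could scan each site's page source for Google Analytics and AdSense IDs and group the domains that share one. The regular expressions are assumptions about how these IDs typically appear in HTML, and the page snippets and domains are made up.

```python
import re
from collections import defaultdict

# Assumed ID shapes: "UA-<account>-<property>" and "pub-<16 digits>".
ANALYTICS_ID = re.compile(r"\bUA-(\d{4,10})-\d{1,4}\b")
ADSENSE_ID = re.compile(r"\bpub-(\d{16})\b")


def extract_ids(html):
    """Return the analytics account and AdSense publisher IDs found in a page."""
    ids = {f"UA-{account}" for account in ANALYTICS_ID.findall(html)}
    ids |= {f"pub-{publisher}" for publisher in ADSENSE_ID.findall(html)}
    return ids


def domains_by_shared_id(pages):
    """Map each ID seen on two or more domains to the sorted domains using it."""
    seen = defaultdict(set)
    for domain, html in pages.items():
        for tracking_id in extract_ids(html):
            seen[tracking_id].add(domain)
    return {tid: sorted(doms) for tid, doms in seen.items() if len(doms) > 1}


# Made-up page fragments for illustration only.
pages = {
    "a.test": "ga('create', 'UA-1234567-1')",
    "b.test": "UA-1234567-2 ca-pub-1234567890123456",
    "c.test": "ca-pub-1234567890123456",
    "d.test": "no tracking code here",
}
related = domains_by_shared_id(pages)
```

Grouping on the analytics account number rather than the full property ID is a design choice: sites run by the same operator often use different property suffixes under one account.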

Icons: Fake News!, Fake Poll!, Clickbait, Unproven!
Rulings from Lead Stories

Blacklisting bots, fraud and false-news sites

Fake news is a business. Much of that business is ad-supported. Advertisers don’t want to support shady publishers, but they often have no choice.

Most ad-tech dashboards make it hard for businesses to prevent their ads from appearing on (and funding) sketchy sites. Marketers can enter a blacklist, but those lists have been out of date and incomplete. Until now: A dynamically updated list, such as ours, could generate an accurate, current blacklist, allowing advertisers to stop supporting bogus sites.
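
A minimal sketch of generating such a blacklist from a table of sites, assuming each domain carries an active/inactive flag. The one-domain-per-line output is an assumption, since each ad platform defines its own upload format, and `build_blacklist` and the sample domains are hypothetical.

```python
def build_blacklist(sites):
    """Return the active domains, one per line, sorted and de-duplicated.

    `sites` maps a domain to True (still active) or False (inactive).
    """
    active = {
        domain.strip().lower()
        for domain, is_active in sites.items()
        if is_active
    }
    return "".join(f"{domain}\n" for domain in sorted(active))


# Placeholder data: two active sites and one that has gone offline.
sample = {"b.test": True, "A.test": True, "gone.test": False}
blacklist_text = build_blacklist(sample)
```

Regenerating the file each time the site table is refreshed keeps inactive domains out of the blacklist without manual editing.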

Fake-news icons in Polish and Indonesian
Rulings from Demagog (Polish): fałsz (falsehood), nieweryfikowalne (unverifiable), and manipulacja (manipulation); and from Tempo (Indonesian): keliru (erroneous), tidak terbukti (not proven)

Methodology

Definitions of the above site classifications are at the Fake News Codex, OpenSources, and PolitiFact.

The lists combined had 1,125 unique domain names. Of these, as of November 2018, the 548 above were still active and another 577 (51 percent) were inactive, either no longer online or no longer posting stories. We detected inactive sites programmatically by retrieving HTTP status codes (404s or 301s), using auto-generated screenshots, and, in some cases, by visual inspection.
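
The status-code part of that check, and the arithmetic behind the 51 percent figure, can be sketched as below. This is an illustration only: the function names are hypothetical, treating a failed request (`None`) as inactive is an assumption, and a site that returns 200 may still have stopped posting, which is why screenshots and visual inspection were also needed.

```python
from collections import Counter

INACTIVE_STATUSES = {301, 404}  # the two codes named in the text


def classify(status):
    """Label a probe result; None stands for a request that failed outright."""
    if status is None or status in INACTIVE_STATUSES:
        return "inactive"
    return "active"


def tally(statuses):
    """Count active and inactive sites from a {domain: status} mapping."""
    return Counter(classify(status) for status in statuses.values())


def inactive_share(active, inactive):
    """Fraction of all sites that are inactive."""
    return inactive / (active + inactive)


# Counts reported above: 548 active and 577 inactive of 1,125 domains.
share = inactive_share(548, 577)
```

With the reported counts, `share` is about 0.513, which rounds to the 51 percent stated above.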

We curated the resulting list, trimming it a bit, by removing several sites whose stories, though highly partisan, were mostly not fake: alternet.org, cato.org, heritage.org, nationalreview.com, thedailybeast.com, theintercept.com, thinkprogress.org, and weeklystandard.com. We determined this by checking their stories at PolitiFact and Snopes.

Several sites we reviewed had mostly false fact-check judgments. These stayed on the list: addictinginfo.org, dailycaller.com, dailykos.com, and judicialwatch.org.

Our Google spreadsheet has additional data: the year of domain registration and the number of scripts each site uses for advertising and tracking (thanks to BuiltWith). There's also a sheet of correlations between factors and averages for individual factors.

We thank all the people who compiled and curated the indices from which we pulled data. Special kudos to Professor Melissa Zimdars of Merrimack College, who led the Herculean assembly at OpenSources.

Icons: Incorrect, Unsupported, Flawed Reasoning, Misleading
Rulings from Climate Feedback

Corrections?

If you have additions or corrections, please use this form to notify us. Remember, our list includes only sites whose stories are demonstrably false -- not merely biased or partisan. Send links to fact-checks demonstrating whether the site you’d like us to review publishes fake or fact-based news.

How we’re using the data

Compiling this data taught us lots about the fake-news business. We’re using the list to find out: