Shortly after a devastating earthquake struck Haiti in January, a small team of workers with Ushahidi, a project that enables people to crowdsource and map crisis information, started sifting through information online and mapping reports of damage, security threats, people in need of assistance, and other data.
“From the very initial hours after the earthquake, what we did was deploy the Ushahidi platform and started monitoring any available sources of information that were out there,” Jaroslav Valuch, the project manager for Ushahidi Haiti, told me.
We spoke when he was in Montreal earlier this week to take part in a panel, “Citizen use of new media for the defense of human rights,” at the Citizen Media Rendez-Vous conference. I was on a different panel and managed to grab a few minutes of his time to chat about the challenge of verifying crowdsourced information. We also discussed how this relates to the upcoming beta launch of SwiftRiver, a software project that grew out of Ushahidi and calls itself a “free and open source software platform that uses algorithms and crowdsourced interaction to validate and filter news.” (Mathew Ingram at GigaOm previously wrote a post with some good background on the project.) The SwiftRiver beta launches on Monday.
During the first hours after the earthquake, the Ushahidi Haiti team consisted of just a few people. Valuch focused on analyzing international and Haitian media, Twitter, and Facebook to look for reports of people trapped, violence, collapsed buildings, those in need of medical attention, or other pieces of information that could explain what was happening on the ground. In the end, the Ushahidi Haiti map and information helped Marines and other responders figure out where to go to provide help, especially outside of the capital. To make this happen, the Ushahidi Haiti team had to sift through the river of reports, tweets, and other information and figure out which items deserved to be added to the map, and which should be discarded. They ultimately decided to err on the side of inclusion, and to use tags to highlight the level (or lack) of trust they had in a given piece of information.
“Even though the information from Twitter is not particularly reliable—and things are being retweeted so it’s kind of messy—the basic idea is if you crowdsource the information and put it on one map you can really see the clusters of incidents,” Valuch said. “So even though one particular tweet is not that important, if you have similar reports from the media … you can see where the incidents are clustering.”
The Ushahidi platform enables users to tag reports as “not verified” if they didn’t come from a reliable source. The Ushahidi Haiti team discovered that by mapping the unverified reports, they were able to see if different sources were reporting similar things in similar areas. It was verification by aggregation. They would also attempt to verify tweets by seeing if they were retweeted by trusted sources, checking if the originating Twitter account was followed by people in Haiti, and looking to see if the user had enabled location data in their tweets.
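To make "verification by aggregation" concrete, here is a minimal sketch in Python. It is not Ushahidi's code; the team did this by eye on a map. The `Report` fields, the `corroborated_clusters` function, the grid-cell size, and the two-source threshold are all assumptions made for illustration. The idea is simply to bucket reports by rough location and type, then keep the buckets that more than one independent source has reported.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    """A hypothetical unverified report; field names are illustrative."""
    source: str    # e.g. "twitter:@someone", "radio", "sms"
    category: str  # e.g. "people trapped", "collapsed building"
    lat: float
    lon: float


def corroborated_clusters(reports, cell_size=0.01, min_sources=2):
    """Group reports by (grid cell, category) and return only the groups
    reported by at least `min_sources` distinct sources.

    `cell_size` is in degrees; 0.01 is very roughly one kilometer and is
    an arbitrary choice for this sketch.
    """
    sources_by_key = defaultdict(set)
    for report in reports:
        # Snap each coordinate to a coarse grid cell so nearby reports collide.
        cell = (int(report.lat // cell_size), int(report.lon // cell_size))
        # A set means repeated reports from one source count only once.
        sources_by_key[(cell, report.category)].add(report.source)
    return {
        key: sources
        for key, sources in sources_by_key.items()
        if len(sources) >= min_sources
    }


if __name__ == "__main__":
    sample = [
        Report("twitter:@a", "people trapped", 18.5432, -72.3391),
        Report("radio", "people trapped", 18.5438, -72.3395),
        Report("sms", "people trapped", 18.6051, -72.2912),
    ]
    for (cell, category), sources in corroborated_clusters(sample).items():
        print(category, cell, sorted(sources))
```

A fixed grid is the crudest possible clustering: two reports a few meters apart can land in neighboring cells, and a retweet from a second account would count as a second source. A real deployment would need something smarter on both counts, along with the account-level checks described above.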
The team focused less on monitoring media once they had a short code that anyone in Haiti could use to submit information by cell phone. (To verify those reports, they often called the number back to try to speak with the person who sent the report.) In the end, more than 2,000 reports submitted by cell phone were added to the map.
Valuch admitted the process wasn’t perfect, but it showcases some of the techniques that can be used in crowdsourced verification. It’s interesting to note how the team used a mass of unverified reports to achieve accuracy. Ushahidi is a map-driven project, so it chose to cluster the unverified reports to look for patterns, but there are other ways of collecting, analyzing, and presenting this information. The challenge is to find a way to quickly and accurately sort and evaluate a mass of incoming reports according to your preferences. This is a core element of distributed verification, which I called “the best way to engineer trust in today’s information environment” in a previous column about WikiLeaks’ Afghanistan documents.