One of the simplest ambitions of the Declassification Engine is also one of the toughest to pull off: to show that, in the digital era, the government should keep more documents than it already does. The tools that the Declassification Engine is developing could, in theory, be useful not just to those of us on the receiving end of declassified documents but also to government workers trying to figure out what should go out to the public and what should stay secret. Connelly gets a little bit upset when he talks about how State Department archivists routinely trash documents after a cursory statistical analysis of their usefulness. The discarded material has included migration records, like applications for passports and visas.
“At no point did it seem that they had any sense of the possibility of data mining,” he says. “You can learn things from data mining, even from seemingly mundane materials. You can look at patterns, in visas and passports, about how people move around the world, that you might not see looking at individual records.”
Declassification Engine researchers have already started talking to government archivists to better understand the work they do and to start making the case that their tools could be useful to the government, too. “Especially now that we’re dealing with electronic records and the cost of storage is trivial, at least save it,” says Connelly. “Don’t destroy it. Just wait until we find ways of managing it.”
Disclosure: CJR has received funding from the Motion Picture Association of America (MPAA) to cover intellectual-property issues, but the organization has no influence on the content.