Panama Papers and beyond: Massive investigations with tons of documents
Netzwerk Recherche Jahrestagung 2016

Huge data-driven projects like the Panama Papers and Offshore Leaks investigations would not have been possible without the technical platforms developed specially by the ICIJ. These platforms allow the network to make millions of leaked documents securely accessible to journalists all over the world. This workshop gives insight into the technical backbone of these heavily coordinated investigations: how multiple terabytes of documents were run through OCR within a matter of days, and how they were made securely accessible and searchable. Journalists will also learn what to do with similar data from other sources and document types: how to get the most out of it by finding matches in other data sets, and how to make the data accessible to collaborators through a web app, the same way the ICIJ does.