How machine learning is revolutionizing journalism


The rise of the machine has freed ICIJ members globally to pore over millions of documents in a custom-built search engine.

But even this kind of search has posed substantial challenges: what do you do, for example, when a phrase returns an indigestible 150,000 results? Clearly, the next step in speeding up our research was to filter intelligently for the information relevant to each investigation.

Here’s how we streamlined the previously daunting process, giving us both unprecedented flexibility and the required search success rate.

Step 1: Wrangle the big data

In leaks like the Paradise Papers, we dealt with millions of documents, including PDFs, photos, and emails.

