New/s/leak at WissensWerte 2018

We will present new/s/leak at a panel discussion at WissensWerte 2018, Germany’s most important dialogue forum for science journalists. On November 20, 2018, together with panelists from journalism, IT startups, and other universities, we will discuss how artificial intelligence contributes to journalistic work. In the case of new/s/leak, we employ machine learning to automatically extract relevant information, such as named entities and keywords, from texts. This enables us to create comprehensive interactive visualizations of large text collections, which support fast exploration for investigative purposes.

The session description as well as the full conference program can be found here.

Paper accepted at SocInfo 2018 conference in St. Petersburg

Newsleak will be presented at the Social Informatics conference 2018, which takes place from 25 to 28 September in St. Petersburg, Russia. The conference paper is published in Springer's LNCS series (here). A preprint can be found here.

Abstract: Investigative journalism in recent years is confronted with two major challenges: 1) vast amounts of unstructured data originating from large text collections such as leaks or answers to Freedom of Information requests, and 2) multi-lingual data due to intensified global cooperation and communication in politics, business and civil society. Faced with these challenges, journalists are increasingly cooperating in international networks. To support such collaborations, we present the new version of new/s/leak 2.0, our open-source software for content-based searching of leaks. It includes three novel main features: 1) automatic language detection and language-dependent information extraction for 40 languages, 2) entity and keyword visualization for efficient exploration, and 3) decentral deployment for analysis of confidential data from various formats. We illustrate the new analysis capabilities with an exemplary case study.

Newsleak 2.0 pre-release software demo

Since the first version of Newsleak, a lot has been improved behind the scenes as well as in the front-end of the software. We want to encourage journalists to try out a pre-release of Newsleak 2.0 on their own. For this, we provide a software demonstration. This demo is populated with ca. 26,500 documents collected from Wikipedia in four languages (English, German, Hungarian and Spanish), mostly centered on the topic of World War II. The idea behind this demo is to show how the analysis capabilities let you quickly explore a large, multilingual collection.

For lazy clickers, we provide a YouTube video that walks through an exploratory analysis and filtering process, drilling down to some details of inner-Chinese political tensions during WW2.

Presentation at #EIJC18 & Dataharvest conference

This Saturday, we will present new/s/leak at the European Investigative Journalism Conference (EIJC). Here you can find the slides of our presentation, “Information Extraction and Visualisation for Investigative Journalism”.

If you are interested in trying new/s/leak with your own data, visit the GitHub page containing the Docker setup of our application.

In June, we will publish a detailed blog post on how to set up Hoover and Newsleak to analyze collections on your own machines.

Dataharvest Conference #EIJC18

From Thursday 24 to Sunday 27 May 2018, the European Investigative Journalism Conference (EIJC 2018) will take place in Mechelen, Belgium. We, the new/s/leak project, will participate and discuss the requirements and needs of our target user group. You can find out all about the conference on this website:

Funding extension

We are happy to announce that the new/s/leak project receives some additional funding from the Volkswagen Stiftung. Until summer 2018, new/s/leak will be extended and refactored to achieve the following goals:

  • easy deployment for your own use
  • comprehensive and detailed documentation
  • improved user interface
  • improved information extraction (better keyterm extraction, named entity recognition, support of user dictionaries)
  • support for multiple languages (among others English, German, Spanish, French, Arabic, Chinese)

Follow the updates on this blog to see how far we got 🙂


new/s/leak demo @ SPIEGEL

Now that we’re in the middle of new/s/leak’s home stretch, we had a final demo at SPIEGEL in Hamburg. After some exciting and productive development sprints, we proudly introduced the software to journalists, documentarists and software developers, who gave us the best feedback by playing around with the tool and becoming absorbed in using it. Some evidence:

We also collected some more systematic feedback, which helped us prioritize the remaining tasks. Thanks to everyone who came along, played and gave feedback – we had a blast at the meeting, and we learned a lot!

If you also want to see what has changed in new/s/leak since we showed it to an academic audience at ACL, here is the link to the demo (please use the Chrome browser!)

For a quick introduction, you can also watch a video (from our academic publication @ VIP):

During the upcoming weeks until Christmas, we’ll add some more requested features, fix some bugs, and create an easy-to-deliver software package. Stay tuned for a deployable version!

new/s/leak @ VIP

Last week, new/s/leak had its academic debut in the visualization science community at the Visualization in Practice Workshop, co-located with the IEEE VIS 2016 conference.

Here is the paper documenting the software with a focus on visualization. Needless to say, it’s always fun to present new/s/leak and get more feedback:

Kathrin presenting new/s/leak

Thanks to everyone who came and visited us!


Paper accepted @ VIS 2016

Our paper “new/s/leak – A Tool for Visual Exploration of Large Text Document Collections in the Journalistic Domain” has been accepted for presentation at the poster session of the Visualization in Practice Workshop, which is part of the IEEE VIS 2016 conference. The workshop will take place in Baltimore, Maryland, USA, on October 24-25.

VIS is one of the most important conferences in visualization science. new/s/leak fits perfectly in this year’s VIP workshop, the focus of which is design, development, distribution, and application of open source visualization and visual analytics software.

Meet us at the demo session in Baltimore!

new/s/leak @ ACL 2016

Last week, we presented new/s/leak for the first time in public: we had our demo session at the annual meeting of the Association for Computational Linguistics, which was held in Berlin this year.
If you didn’t have the chance to attend (or you did and would like some references now):

  • Here’s the paper documenting our software (you’ll find a large part of the information from the paper in this blog, too)
  • Here’s our poster (in PDF format)

And, of course, we took some pictures showing how much fun we had:

Alex walking through the new/s/leak poster (with Seid listening)

Seid explaining new/s/leak

Chris, Heiner, Seid and Alex discussing new/s/leak with different people

Seid and many curious new/s/leak fans

The crew busy explaining, demonstrating and (secretly) playing with new/s/leak

Thanks to everyone who stopped by, especially for the great suggestions for improvements! Rest assured that we’re working on them while you’re reading this post. If you have any more ideas or application visions, or simply want to debate information extraction and/or journalism – please get in touch!