Monday, July 4, 2011

Google data mining tools for journalists and information specialists

Google offers free online data mining tools that could have a far-reaching, constructive impact on our news operations. Journalists and information specialists need to be aware of these tools, which can be used to mine information that is already in the public domain.

We had the opportunity to attend a workshop by a Google team for journalists, editors and media workers on 7 June at the SABC.

The pitch of the workshop was as follows:
Google provides journalists with a powerful set of tools, to help find, extract and understand information. The workshop will cover ways to find people, organisations, and events all in real-time, as well as to track trends and opinions. And, to ensure journalists stay on top of issues or beats, Google also offers automated and personalized search agents or 'bots' that independently scour the web for you, issuing alerts the moment new information is found.

Journalism itself is changing though. Audiences are being swamped by the sheer volumes of information available online, especially as citizen journalists and agencies such as the UN, World Bank, and governments begin releasing raw data. Simply reporting the information is no longer the most important role for media. The best journalists are instead beginning to help audiences make sense of all this information, by analysing and organizing the raw data. 

Learn how free tools like Google Fusion Tables, Google Refine and even Public Data Explorer can make it easier for our audiences to understand complicated information by turning the raw numbers and text into animated maps, graphics, and graphs.
These tools also allow newsrooms to disaggregate or deconstruct news stories into geographic or demographic data, which allows us to build customized news products that are automatically tailored for people's location or their socio-economic profiles. This ability to personalize news, for consumption on mobile phones or the new tablet computers, gives us revolutionary opportunities for inventing new ways to tell our stories.
Peter Barron (Executive Director Communications Google) and Julie Taylor (Communications Manager Google South Africa) of Google Communications Africa presented the workshop about some of the online tools that are available to us.
Google’s advanced search capabilities were discussed – they call it a “surgical tool”.
- The search box can be used as a calculator, a currency converter and a unit converter.
- Google Realtime search offers “up-to-the-second social updates about hot topics around the world”:
  • Twitter is a great way to find people to quote.
  • Discussions on blogs are a great way to find information.
- YouTube can be used as a source of user-generated content (citizen journalism).
- The Journalist Toolbox is a compilation of all of the tools available to journalists at the moment.
- Google Timeline view is a great way to visualize a story over a certain period of time.
- Google Books, which has already made over 7000 books available, can be used as well. Journalists and writers are able to publish directly to world libraries, and to get a book out in two months (for example, a journalist with an in-depth story).
Some more tools are available here: Options

Justin Arenstein, an award-winning investigative journalist based in South Africa, spoke more about data journalism, which he has implemented successfully in his own work.

He mentioned the tools that can be used for advanced storytelling. We still need to investigate which of them can be used for our purposes.

- Google Ngram – analyse phrases and concepts
- Google Public Data Explorer – tools to analyse government data
- Google Fusion Tables – turn structured data into graphics
- Google Refine – cleaning up messy data
- Google Maps – plot information and news against geographical location, for example census data
- Google Data Visualization – “a dynamic chart to visualize several indicators over time”
- Google Data Wiki
- Open Data Kit
- Google City Tours – a cache of information where the landscape “speaks” to a person
- Google Goggles – mobile app for searching
- YouTube Feather – videos for low bandwidth (which can be successfully used in Africa with its bandwidth problems)
- Google Moderator – voting tool
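
To give an idea of what "cleaning up messy data" can mean in practice, here is a minimal Python sketch of one common approach: grouping spellings that differ only in case, punctuation, accents or word order. This is not Google Refine's own code, and the workshop did not show it. The function names `fingerprint` and `cluster` and the sample place names are our own assumptions, used purely for illustration.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Reduce a string to a key shared by trivially different spellings.

    Hypothetical helper for illustration only, not a Google Refine API.
    """
    # Decompose accented characters and drop the accent marks.
    text = unicodedata.normalize("NFKD", value)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Ignore case and surrounding whitespace.
    text = text.strip().lower()
    # Treat punctuation as a word separator.
    text = re.sub(r"[^\w\s]", " ", text)
    # Ignore word order and repeated words.
    tokens = sorted(set(text.split()))
    return " ".join(tokens)


def cluster(values):
    """Return groups of distinct spellings that share the same fingerprint."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    # Keep only groups that contain more than one distinct spelling.
    return [group for group in groups.values() if len(set(group)) > 1]


if __name__ == "__main__":
    # Made-up sample column, as it might appear in a hand-typed spreadsheet.
    places = ["Cape Town", "cape town ", "Town, Cape",
              "Johannesburg", "Durban", "durban."]
    for group in cluster(places):
        print(group)
```

Each group that comes back is a set of candidates for merging into one spelling. A person still has to confirm that the entries really refer to the same thing before the data is changed.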

Google’s blog about their work in Africa: it helps us to stay up to date with Google’s latest work on the continent.

All these tools are available for free. Some are still in beta or in testing, and Google sometimes withdraws them. It is up to the users to make their voices heard when they have found a valuable tool.

South Africa is a country rich with raw data, waiting to be mined. The possibility is there for journalists [...and information specialists] to become industry leaders by way of data mining.
                                                                           – Justin Arenstein

I realize that this is just a broad overview of some of the possibilities available to us. The tools need to be investigated and tested to see how we can implement them. It is very difficult for busy journalists, who are battling to get their next story in while there are manpower shortages, to also follow up on tools like these. The same goes for us as information specialists, who need to look after our collections and battle our growing backlogs.

But can we continue to ignore the huge information possibilities available to us through online tools?

Blog post by Karen du Toit, Afrikaans Archivist in the SABC Radio Archives.


  1. Thanks for publishing - I use some - others will be tried. Thanks for putting this list together.



We welcome any feedback and comments!