Tuesday, May 21, 2013

Is there really an open data El Dorado?

I was reading a tweet yesterday from Australia's CTO, John Sheridan, and it raised an interesting question for me.
Is government open data really a new goldmine for innovation?

The Economist's article, A new goldmine, makes a strong case for the value of open data through examples such as GPS, the Global Positioning System, whose satellites are owned by the US government but which has been provided free to organisations around the world since 1983.

I've also seen fantastic studies in the UK and Australia talking about the value in releasing public sector information (PSI) as open data, and great steps have been taken in many jurisdictions around the world, from Australia to Uruguay, to open up government silos and let the (anonymised) data flow.

I agree there's fantastic value in open data: for generating better policy deliberations and decisions, for building trust and respect in institutions, and even for stimulating innovation that leads to new commercial services and solutions.


However, I do not believe in an open data El Dorado - the equivalent of the fabled city of gold - where every new dataset released unveils new nuggets of information and opportunities for innovation.

Indeed, I am beginning to be concerned that we may be approaching a Peak of Inflated Expectations (drawing on Gartner's famous Hype Cycle chart) for open data, expecting it to deliver far more than it actually will - a silver bullet, if you will, for governments seeking to encourage economic growth, increase transparency and end world hunger.

Data is a useful tool for understanding the world and ourselves, and more data may be more beneficial. However, the experience of the internet has been that people struggle when provided with too much data too quickly.

Information overload requires humans to prioritise the information sources they select, potentially reinforcing bias rather than uncovering new approaches. Data can be easily taken out of context, misused, distorted, or used to tell a story exactly the reverse of reality (as anyone closely following the public climate change debate would know).

Why assume that the release of more government data - as the US is doing - will necessarily result in more insights and better decisions, particularly as citizens and organisations come to grips with the new data at their fingertips?

A data flood may result in exactly the reverse, with the sheer volume overwhelming and obscuring the relevant facts, or the tyranny of choice leading to worse or fewer decisions, at least in the short-term.


The analogy of open data as a gold mine may be true in several other respects as well.

The average yield of a gold mine is quite low, with many mines reporting between one and five grams of gold per tonne of extracted material. In fact gold isn't even visible to the naked eye until it reaches 30 grams per tonne.

While several hundred years ago gold was easier to find in high concentrations and therefore easier to extract - leading to many of history's gold rushes - over time people have mined most of the highest gold concentrations.

Extraction has become laborious and costly, averaging US$317 per ounce globally in 2007.

There is definitely gold in open data, value in fresh insights and innovations, opportunities to build trust in institutions and reduce corruption and inefficiency in governance.

However, if open data is at all like gold mining, the likelihood is that the earliest explorers will find the highest yields, exploring new datasets to develop insights and innovations.

By the gold mine comparison we are currently in the open data equivalent of the gold rushes, where every individual who can hoist a line of code can dig for riches as a data miner, while data analysis companies sell spades.

Following the analogy, data miners will shift from open data site to open data site, seeking the easy wins and quick insights.

However, as the amount of open data grows and most of the easy wins have been found, it will get more expensive to sift increasing amounts of data for fewer insights, requiring greater and greater investments in time and effort to extract the few remaining nuggets of 'gold'.

At that point many government open data sites may become virtual ghost towns, dominated by large organisations with the ability to invest in a lower yield of insights.

Alongside these organisations, only a few tenacious data mining individuals will remain, still sifting the tailings and hoping to find their open data El Dorado.
