Monday, February 01, 2010

UK launches (and how Australia could top it)

Just in case you missed this the other week, on 21 January the UK launched the website with 2,500 government datasets available for access and reuse by the public.

This leapfrogged the US's, which now has around 1,000 datasets available.

The UK site also extends the government open data space in several other directions. A wiki and forum support discussion and collaboration between people reusing the site's datasets, and an Ideas tool lets people suggest what data should be released and how it could be combined to provide new and useful insights.

The site also includes a gallery of applications developed to make use of government data, making it a central place to locate these applications.

I believe this is the new world leader for open data websites from government - though I look forward to the day when Australia tops it (in

How could we top it with

Here are some ideas:
  • Build in a data analysis and visualisation module that allows people without technical expertise to combine, model and view datasets, no matter their origin (like IBM's Many Eyes).
  • Then allow people to embed these visualisations into their own sites.
  • Support community submission of data that can then be shared and used by government alongside government datasets to improve insights and understanding - including allowing the appropriate Creative Commons copyright to be embedded into these datasets as part of the submission process.
  • Comments on datasets - let every dataset host a discussion where people can ask questions to clarify what it contains and discuss how it could be presented in a more usable way.
  • Allow tagging of datasets and applications - so that over time there's a bottom-up folksonomy that people can use to find related data or search on, rather than relying on government metadata (which may not match the community's mental models).
  • Support data correction through the site - if someone detects an error in a dataset there should be a clear path to notify the submitter of the data and have it corrected.
  • Vote on applications, allowing the community to provide feedback on how useful and valuable they found them. The voting mechanism should be embeddable with applications in other sites, rather than relying on people returning to the site to vote.
  • Geomapping engine - to convert locations into a form that can be placed on maps, rather than requiring people to build their own tools to transform the data.
  • Collaborative data modelling projects - where the community is invited to work together to model data, assisting the government and community.
  • Data competitions with cash prizes. Similar to the Netflix Prize, provide the tools for government agencies - and even commercial entities - to create competitions to solve tricky data problems through crowdsourcing.
  • Create user profiles that include information on how many applications, data visualisations and other activities each person has undertaken in relation to the data site. People respond to competitive challenge and recognition - like in the National Library of Australia's Manyhands project.
  • Create webinars and run physical events to raise awareness of the site and to show Australians (developers, corporates, not-for-profits, interested parties) how easy it is to reuse government data.
  • Hold annual awards for the best applications, including people's choice awards based on user votes and awards for schools and students to encourage an interest in and innovative uses of data.
If you have other ideas on how could be better than the UK and US efforts, please add them in the comments below.

To finish up - here's a good presentation from Sir Tim Berners-Lee, who has led the work on the UK site, on why we need to make government data available to the public in raw, reusable form.


  1. Good thinking here, Craig. The UK has done a bang up job on its data under Sir Tim's steady hand. I dearly hope the Australian effort regains the momentum it appears to have lost.

    In fact, apart from the appointment of Tom Burton (a singularly notable event in any case), the whole Australian Government 2.0 scene appears to have gone off the boil. I noted as much on my blog a couple of weeks ago.

    I hope there's much happening in the open data and Government 2.0 movement that we're not seeing here, as it would be a great shame for all the hard work of last year to disappear into a committee somewhere. Perhaps this momentum is something for a BarCamp discussion?

  2. Or perhaps adopt some of the suggestions being offered to the US/UK sites - most of them look site/country-agnostic.

  3. A framework then needs to be built to connect/link countries' data. Why stop at connecting a single nation's data? Connect the world.

  4. It looks like the UK one just has metadata for the datasets, and provides a link to the actual data.

    A central hub like this is great, but works better if each Government department puts their data out there in reusable formats, with nil or reusable copyright terms.

    Take the example, the data is NOT provided in a reusable format like XML (okay the HTML may be XHTML but it is not really optimal), and the copyright status does not allow the reuse of data. A can do its part by adding metadata and linking to, but has to do their part to make the data reusable.

    The government departments should never underestimate the value of a simple csv file.

  5. Thanks, Craig, for collecting the ideas.

    I made a short benchmarking tour of the world's data catalogs while preparing to organize a virtual workshop between catalog people from several countries: (you can't cut.and.paste links here!?)

    The current catalogs are still in their infancy. I believe that the world's data catalogs should work seamlessly together... some sort of data registry standard. After that, the visualization tools etc. that you suggested would work for all.

    -Jogi (Finland)

  6. Nice article. Thought it was very interesting to read.