Just in case you missed this the other week, on 21 January the UK launched the data.gov.uk website with 2,500 government datasets available for access and reuse by the public.
This leapfrogged the US's data.gov, which now has around 1,000 datasets available.
The UK site also extends the government open data space in several other directions. A wiki and forum support discussion and collaboration between people reusing datasets on the site, and an Ideas tool lets people submit ideas on what data should be released and how it should be combined to provide new and useful insights.
The site also includes a gallery of applications developed to make use of government data, making it a central place to locate these applications.
I believe this is the new world leader for open data websites from government - though I look forward to the day when Australia tops it (with data.gov.au).
How could we top it with data.gov.au?
Here are some ideas:
- Build in a data analysis and visualisation module that allows people without technical expertise to combine, model and view datasets, no matter their origin (like IBM's Many Eyes).
- Then allow people to embed these visualisations into their own sites.
- Support community submission of data that can then be shared and used by government alongside government datasets to improve insights and understanding - including allowing the appropriate Creative Commons copyright to be embedded into these datasets as part of the submission process.
- Comments on datasets - give every dataset its own discussion thread, where people can ask questions to clarify what the dataset contains and discuss how it could be presented in a more usable way.
- Allow tagging of datasets and applications - so that over time there's a bottom-up folksonomy that people can use to find related data or search on, rather than relying on government metadata (which may not match the community's mental models).
- Support data correction through the site - if someone detects an error in a dataset there should be a clear path to notify the submitter of the data and have it corrected.
- Vote on applications, allowing the community to provide feedback on how useful and valuable they found them. The voting mechanism should be embeddable alongside applications on other sites, rather than relying on people returning to data.gov.au to vote.
- Geomapping engine, to convert locations in datasets into a form that can be placed directly on maps, rather than requiring people to build their own tools to transform the data.
- Collaborative data modelling projects - where the community is invited to work together to model data, assisting the government and community.
- Data competitions with cash prizes. In the style of the Netflix Prize, provide the tools for government agencies - and even commercial entities - to create competitions to solve tricky data problems through crowdsourcing.
- Create user profiles that include information on how many applications, data visualisations and other activities each user has undertaken in relation to the data site. People respond to competitive challenge and recognition - as in the National Library of Australia's Manyhands project.
- Create webinars and run physical events to raise awareness of the site and to show Australians (developers, corporates, not-for-profits, interested parties) how easy it is to reuse government data.
- Hold annual awards for the best applications, including people's choice awards based on user votes and awards for schools and students to encourage an interest in and innovative uses of data.
To finish up - here's a good presentation from Sir Tim Berners-Lee (who has led the work on data.gov.uk) on why we need to make government data available to the public in raw, reusable form.