Showing posts with label big data.

Monday, March 06, 2023

Artificial Intelligence isn't the silver bullet for bias. We have to keep working on ourselves.

There's been a lot of attention paid to AI ethics over the last few years due to concerns that use of artificial intelligence may further entrench and amplify the impact of subconscious and conscious biases.

This is very warranted. Much of the data humans have collected over the last few hundred years is heavily impacted by bias. 

For example, air-conditioning temperatures are largely set based on research conducted in the 1950s-70s in the US, on offices predominantly occupied by men wearing heavier materials than are worn today. It's common for many folks today to feel cold in offices where air-conditioning is still set for men wearing three-piece suits.

Similarly, many datasets used to train machine learning AI suffer from biases - whether based on gender, race, age or the cultural norms at the time of collection. We only have the data collected over the last century, and it is virtually impossible to 'retrofit' most of it to remove bias.

This affects everything from medical to management research, and when the data is used to train AI its biases can easily affect the AI's capabilities. Consider the incredibly awkward incident just a few years ago when Google's image AI incorrectly identified black people as 'gorillas'.

How did Google solve this? By preventing Google Photos from labelling any image as a gorilla, chimpanzee, or monkey – even pictures of the primates themselves - an expedient but poor solution, as it didn't fix the bias.

So clearly there's a need for us to carefully screen the data we use to train AI, to minimise the introduction or exacerbation of bias. There's also a need to add 'protective measures' on AI outputs to catch instances of bias, both to exclude them from outputs and to use them to identify remaining bias to address.
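As a purely illustrative example of what such a 'protective measure' might look like, the sketch below compares an AI system's decisions across groups and flags large disparities for human review. The field names, sample data and 10% threshold are all hypothetical, not a description of any particular product.

```python
# Hypothetical sketch: flag group-level disparities in an AI system's decisions
# for human review. Field names, data and the 10% threshold are illustrative only.
from collections import defaultdict

DISPARITY_THRESHOLD = 0.10  # flag if approval rates differ by more than 10 points

def approval_rates_by_group(decisions):
    """decisions: list of dicts like {'group': 'A', 'approved': True}"""
    totals, approved = defaultdict(int), defaultdict(int)
    for d in decisions:
        totals[d["group"]] += 1
        if d["approved"]:
            approved[d["group"]] += 1
    return {g: approved[g] / totals[g] for g in totals}

def flag_disparities(decisions):
    rates = approval_rates_by_group(decisions)
    baseline = max(rates.values())
    # Any group falling well below the best-treated group is flagged for review,
    # both to hold back the affected outputs and to investigate the underlying bias.
    return {g: rate for g, rate in rates.items() if baseline - rate > DISPARITY_THRESHOLD}

if __name__ == "__main__":
    sample = (
        [{"group": "A", "approved": True}] * 80 + [{"group": "A", "approved": False}] * 20
        + [{"group": "B", "approved": True}] * 60 + [{"group": "B", "approved": False}] * 40
    )
    print(flag_disparities(sample))  # {'B': 0.6} - a 20-point gap, so review is needed
```

A check like this only catches the disparities someone thought to measure, which is exactly why the human work described below still matters.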

However, none of this work will be effective if we don't continue to work on ourselves.

The root of all AI bias is human bias. 

Even when we catch the obvious data biases and take care when training an AI to minimise potential biases, it's likely to be extremely difficult, if not impossible, to eliminate all bias altogether. In fact, some systemic unconscious biases in society may not even be visible until we see an AI emulating and amplifying them.

As such, no organisation should ever rely on AI to reduce or eliminate the bias exhibited by its human staff, contractors and partners. We need to continue to work on ourselves to eliminate the biases we introduce into data (via biases in the queries, processes and participants) and that we exhibit in our own language, behaviours and intent.

Otherwise, even if we do miraculously train AIs to be entirely bias free, bias will get reintroduced through how humans selectively employ and apply the outputs and decisions of these AIs - sometimes in the belief that they, as humans, are acting without bias.

So if your organisation is considering introducing AI to reduce bias in a given process or decision, make sure you continue working on all the humans who remain involved at any step. Because AI will never be a silver bullet for ending bias while we, as humans, continue to harbour biases ourselves.


Tuesday, January 24, 2017

You've Been Hacked - how far should governments go to protect against the influence of foreign states?

Like most people with a broad digital footprint I've been hacked multiple times, usually in fairly minor ways.

Around ten years ago I had my PayPal account hacked through malware on the Amazon site, costing me $300.

PayPal staff insisted this was a legitimate payment for goods (which I hadn't ordered) being delivered to my legitimate address in Norway (despite my provably never having visited the country). I've been very cautious and limited in my PayPal use since, and never recommend them.

Over Christmas last year my Social Media Planner site was hacked and seeded with malware. Fortunately my IT team was able to identify, isolate and address the matter without affecting visitors, but it still cost me financially (two weeks' downtime). It's fine now BTW, with extra protections in place.

I've had a Skype account taken over by someone in Eastern Europe, who used it for phishing before I could reclaim it, and had basic account details stolen in the Yahoo, LinkedIn, DropBox and a range of other large-scale hacks of commercial services over the last five years - though not the Ashley Madison hack (I've never been a member).

I'm not the only one affected by any means - well over 10 billion accounts were hacked in 2016 alone, with Australian politicians, police and judges outed as affected in at least one of these hacks (and a few in this one too).

Much of this widespread hacking results in the theft of limited personal information. On the surface it may appear to pose little risk to individuals or organisations. 

However, individuals' reuse of passwords and usernames can turn these hacks into a jackpot. It allows hackers, and the clients they sell hacked data to, to access a wider range of an individual's accounts, potentially uncovering richer information useful for identity theft, economic theft, intelligence gathering or influencing decisions and behaviour.
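One practical defence against this kind of credential reuse is to check whether a password already appears in known breach corpora before allowing it. The sketch below uses the publicly documented 'Pwned Passwords' range API from haveibeenpwned.com, which works via k-anonymity (only the first five characters of the password's SHA-1 hash leave your machine); treat the endpoint and response format as assumptions to verify against the current documentation.

```python
# Hedged sketch: check whether a password appears in known breach data using the
# Pwned Passwords range API (k-anonymity: only the first 5 hex chars of the SHA-1
# hash are sent). Endpoint and response format assumed from public documentation.
import hashlib
import urllib.request

def breach_count(password: str) -> int:
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    prefix, suffix = digest[:5], digest[5:]
    url = f"https://api.pwnedpasswords.com/range/{prefix}"
    with urllib.request.urlopen(url) as resp:
        body = resp.read().decode("utf-8")
    # Response is lines of "HASH_SUFFIX:COUNT" for every breached hash with this prefix.
    for line in body.splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip() == suffix:
            return int(count)
    return 0

if __name__ == "__main__":
    print(breach_count("123456"))  # an enormous number - never reuse a password like this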

Despite all the reports of hacking, it seems many people still treat this lightly - the world's most popular password remains '123456'.

Most governments, however, do not. Securing their networks is a major challenge and a significant expense item. The data agencies hold has enormous political and economic value that could be easily misused to the detriment of the state if it falls into the wrong hands, or into the right hands at the wrong time.

It's not simply about troop movements or secret deals - early access to economic or employment data, access to the 'negotiables' and 'non-negotiables' for a trade deal, or even to the locations and movements of senior political figures (to know who they meet and for how long) can be used for the financial and political advantage of foreign interests at the expense of a state's own interests.

For the most part, Australia's government is decent at managing its own network security. This isn't perfect by any means, but there's a good awareness of the importance of security across senior bureaucrats and largely effective ongoing efforts by agencies to protect the secure data they hold.

However, in today's connected world, national interest extends far beyond the networks directly controlled and managed by governments. As we've seen from the US (and now Germany), political parties and individual politicians have also become hacking targets for foreign interests.

This isn't surprising. Politicians, potential politicians and even academics have long been targets for funding assistance and free or subsidised study trips to nations hoping to cultivate influence in various ways. In fact these approaches provide some positive benefits as well - by creating personal relationships between powerful people that can lead to improved national relationships and trade deals, and even avert wars.

Hacking, however, has few of these positives, as we saw in the release of Democratic National Committee emails by Wikileaks, which were most likely obtained through Russian state-sponsored hacking and were likely designed to influence the outcome of the US election.

Whether you believe the cumulative findings of the US intelligence community or not, it is certain that foreign states, and potentially large multi-national corporations, will continue to target political parties and individual politicians, seeking insights into how they think and levers of overt and covert influence for economic and political gain.

Hacking will continue to grow as one of the major tools in this work.

The Australian Government is taking this seriously - and kudos to them for this.

However even this focus on political parties neglects a wide range of channels for influencing current and potential future politicians. What about their other memberships and personal accounts?

Politicians and potential politicians are well-advised to position themselves in various community and business groups to improve their networks, build relationships and future support. They are also just as likely as other Australians to use the internet - for work and personal reasons.

This means they're likely to have numerous online accounts with both domestic and foreign-owned services, with varying levels of security and access control. 

On top of this, it's not simply politicians who may be the targets of influence. Political advisors and activists often shape and write party policy positions, despite never being publicly elected. Influence an advisor and you can influence policy, as the many registered lobbyists know only too well.

Equally, bureaucrats across government are often exposed to material that could, if shared with foreign interests, cause some form of harm to a state. We've seen this in insider trading by an ABS staff member, where the economic gain to the individual public servant outweighed his good judgement and public duty.

While bureaucrats are security assessed to a significant degree (unlike our politicians) and selection processes are in place, backed by rules and penalties, to screen out the 'bad eggs', the potential for public servants to be influenced through hacking of their personal accounts has risen along with their internet use.

Right now we're in an environment where the number of attack vectors on a politician, an advisor and on individual public servants, is much higher than at any past time in history - while our tools for protecting against foreign influences have not kept up.

Of course this goes both ways - our government also has the capacity, and often the desire, to influence decisions or negotiations by other states. We've seen ample evidence of this, although it isn't really a topic our government wants to discuss.

The question for me, and I don't have a solid answer yet, is how far, technically, a government should go to limit the influence of foreign states.

Should governments merely advise political parties on how to secure themselves better?

Or should governments materially support parties with trained personnel, funding or even take over the operation of their networks (with appropriate Chinese walls in place)?

What type of advice, training or support should agencies provide to their staff and Ministerial advisors to help them keep their entire footprint secure, not just their use of work networks, but all their digital endeavours?

And what can be done to protect future politicians, advisors and bureaucrats, from wide sweeps of commercial services collecting data that could be useful for decades to come?

We need to have a more robust debate in this country about how foreign states and commercial interests may be seeking to influence our policies, and decide as citizens the level of risk we're prepared to accept.

Until this occurs, in a mature and informed fashion, Australia is hurtling forward into an unknown future. A future where our political system may be under constant siege from those who seek to influence it, in ways that are invisible to citizens but more wide-reaching and dangerous to our national interest than any expense scandal.

If this isn't the future that we want, then it is up to us to define what we want, and work across government and the community to achieve it.


Tuesday, July 21, 2015

How does government manage the consequences of an imbalance in speed of transparency & speed of accountability?

One of the emerging challenges for governments in the online age is managing the discrepancy between the speed of transparency and the speed of accountability.

With digitalisation and the internet, government information is being made public faster, as it becomes easier to collect, aggregate and publish information and data in near real-time or even real-time.

We see this particularly in public transit data, where many cities around the world now publish real-time data on the location and load of their buses, trains and trams, and in the health industry where a number of states have begun offering near real-time data on the congestion in emergency waiting rooms.

We're also seeing similar near real-time reporting on river levels, dams, traffic congestion and closures, and estimated real-time reports on everything from population to national debt levels.

This trend is expanding, with the Sense-T network in Tasmania pioneering an economy-wide sensor network and data resource. Similarly the Department of Finance in Canberra is working on a system to provide real-time budget information on government expenditure down to every $500 for internal management and public transparency purposes.
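To make the mechanics of this trend concrete, the sketch below polls a hypothetical real-time feed and prints updates as they arrive - the same basic pattern behind real-time transit, emergency department and river-level feeds. The URL, JSON fields and polling interval are invented for illustration; every real feed has its own documented format.

```python
# Illustrative sketch only: poll a hypothetical real-time open data feed and print
# each update. The URL and JSON structure are invented; real feeds (e.g. transit
# GTFS-realtime or ED waiting-time APIs) each have their own documented formats.
import json
import time
import urllib.request

FEED_URL = "https://data.example.gov.au/api/emergency-departments/latest"  # hypothetical

def fetch_snapshot(url: str) -> dict:
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def poll(url: str, interval_seconds: int = 60):
    while True:
        snapshot = fetch_snapshot(url)
        # Assumed shape: {"updated": "...", "hospitals": [{"name": ..., "waiting": ...}]}
        for hospital in snapshot.get("hospitals", []):
            print(snapshot.get("updated"), hospital.get("name"), hospital.get("waiting"))
        time.sleep(interval_seconds)

if __name__ == "__main__":
    poll(FEED_URL)
```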

This trend is a leap forward in government transparency, providing citizens, bureaucrats and politicians with far greater visibility on how our governance systems are performing and far more capability to identify trends or patterns quickly.

We're seeing a similar transparency event at the moment, with the expenses scandal engulfing the Speaker of the House of Representatives, Bronwyn Bishop, related to her use of a helicopter and several charter flights to attend political fund-raising events.

What this event has also highlighted is that while Australia's governance systems are increasing the speed of transparency, our capability to apply that information to accountable decision-making isn't consistently accelerating at the same rate.

In other words, while we increasingly can obtain the information needed for rapid decision-making, the entrenched processes and methods for decision-making in government are lagging far behind.

We see this in the failure rate of IT projects, which can drag on for years after it's clear they will fail; in laws that fail to work as they should yet take months or years to amend; and in cases where the public has judged a politician's actions but parliament can take no formal action for months because it is out of session.

Of course many sound reasons can be, and are, given by bureaucrats and politicians as to why decisions need to take so long.

Decision-makers from the pre-internet world will say that they need to ensure they have all the necessary data, have digested it, reflected on it, considered alternatives and consequences, consulted widely and only then are able to tweak or change a decision.

This is a fair position with many defensible qualities - it reflects the world in which these people grew up, when decision-making could be undertaken leisurely while the world waited.

However both management theory and the behaviour of our communities have changed.

Start-ups grow and become huge companies based on their ability to make decisions rapidly. They are continuously experimenting and testing new approaches to 'tweak' their businesses for greater success. This is underpinned by streams of real-time data which show the consequences of each experimental change, allowing the organisations to adjust their approach in very short time-frames, minimising their potential losses from sub-optimal decisions.

The community equally reacts very quickly to evidence of poor decisions and bad outcomes, with the internet, particularly social media, fuelling this trend.

While this doesn't mean the community is consistently in the right on these matters, it does require decision-makers to respond and address concerns far more rapidly than they've had to in the past - 'holding the line' or 'depriving an issue of oxygen' are no longer effective strategies for delaying decision-making into the leisurely timeframes that older decision-makers grew up with.

This disparity between the speed of transparency (data release) and the speed of accountability (clear and unequivocal response) is growing as more organisations release more data and more of the public collects, collates and releases data from their interactions with organisations.

The imbalance is fast becoming a critical challenge for governments to manage and could lead to some very ugly consequences if politicians and agencies don't rethink their roles and update their approaches.

Of course governments could attempt to sit back and 'tough it out', trying to hold their line against the increasing speed of transparency and accountability. In my view this would result in the worst possible result in the long-term, with increasingly frustrated citizens resorting to more and more active means to have government take accountability for their decisions in the timeframes that citizens regard as appropriate.

My hope is that government can reinvent itself, drawing on both internal and external capabilities and expertise to find a path that matches fast transparency with appropriately fast accountability.

I'd like to see governments challenge themselves to test all of their historic assumptions and approaches - reconsidering how they develop policy, how they consult, how they legislate and how they engage and inform the community, in order to address a world where 'outsiders' (non-public servants) are identifying issues and worrying trends at an accelerated rate.

Perhaps we need radical new ways to develop and enforce laws, providing scope for experimentation within legislation so agencies can reinterpret the letter of a law in order to fulfil its desired outcomes and spirit.

Perhaps we need continuous online consulting processes, supported by traditional face-to-face and phone/mail surveys, which allow government to monitor and understand sentiment throughout policy development and implementation and allow a 'board' of citizens to oversee and adjust programs to maximise their effectiveness over time.

Perhaps we need mechanisms for citizens to put forward policies and legislation for parliament to consider, tools that allow citizens to recall politicians for re-election, or a citizen-led approach to determining what entitlements are legitimate for politicians and what they should be paid - with penalties and appropriate recourse for citizens to sack representatives who fail to uphold the values the community expects, at a far greater speed than the current election cycle.

There are sure to be many other ideas and mechanisms which may help deliver a stable and sustainable democratic state in the digital age of high-speed transparency and accountability - we just need governments to start experimenting - with citizens, not on them - to discover which work best.


Wednesday, February 05, 2014

How Cancer Research UK is using mobile gaming to conduct medical research

Recently the World Health Organisation announced that cancer had overtaken heart disease as the number one killer of Australians, as well as being the number one killer of people globally.

The WHO had another message as well: that cancer is a largely preventable disease.

Humans have lots of medical data about cancer. With millions of cases each year there's a vast amount of data available to researchers that can help them understand how to prevent and treat the disease.

Much of this data needs to be analysed by the human eye as computers are not flexible or sophisticated enough to recognise the patterns that humans can detect.

This is where the bottleneck occurs. Lots of data, but few paid researchers.

To address this issue, Cancer Research UK, a charity focused on cancer research, held a GameJam in London in March 2013, hoping to come up with game concepts that would help analyse cancer data.

Within 48 hours they had 9 working games and 12 game prototypes - different approaches to combining cancer data analysis with fun and replayability.



Over the last year the charity has been working with a game developer to refine several of these games to the level where they could be publicly released.

Now, Cancer Research UK has just launched the first free mobile game (for Android and iOS) that has players analysing cancer data while they're having fun.

In the game, named Genes in Space, players must map their way through subspace then fly the route in a custom spaceship, collecting a fictional substance called Element Alpha and dodging or blowing up asteroids on the way. The more Element Alpha they collect, the more money they make, allowing them to further customise their ship.

Meanwhile cancer researchers harvest the data created by players at two points: when they map their route and when they fly it. The subspace that players map is real genetic data, and while Element Alpha is fictional, what players are actually collecting is data that helps researchers make sense of the genetic structure.
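A rough way to picture how this kind of citizen analysis turns back into research data: if many players trace a route through the same stretch of real data, the positions where their routes pile up are likely to mark genuine features. The sketch below is not Cancer Research UK's actual pipeline, just a hypothetical illustration of aggregating many noisy human traces into a consensus.

```python
# Hypothetical illustration (not Cancer Research UK's actual method): combine many
# players' traced routes over the same data track and keep positions where enough
# players agree, treating consensus as a signal worth a researcher's attention.
from collections import Counter

def consensus_positions(player_routes, min_players=3):
    """player_routes: list of routes, each a list of integer positions a player visited."""
    votes = Counter()
    for route in player_routes:
        votes.update(set(route))          # each player votes once per position
    return sorted(pos for pos, count in votes.items() if count >= min_players)

if __name__ == "__main__":
    routes = [
        [10, 11, 12, 40, 41],
        [11, 12, 13, 41, 42],
        [10, 12, 41, 90],      # one stray point at 90 won't reach consensus
        [12, 40, 41],
    ]
    print(consensus_positions(routes))  # positions most players agree on: [12, 41]
```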

I've long been a fan of combining data with gameplay. We need to make research and science fun to lead more people into the area. If people think they're simply playing a game rather than doing science, that's fine too.

I hope that one day soon we'll see a Grade A game developer take an interest in this area and set out to integrate elements of science data research into a high-quality game.

However, to get there, we'll also need to see research institutes and governments, who hold the data, interested in pursuing new ways to analyse it, rather than relying on a few expensive researchers.

Until that happens, I guess we'll have to be satisfied playing Genes in Space.

Or Cellslider, or FoldIt...



Wednesday, January 15, 2014

Rethinking government IT to support the changing needs of government

We recently saw a change in the federal government in Australia, with a corresponding reorganisation of agency priorities and structures.

Some departments ceased to exist (such as Department of Regional Australia), others split (DEEWR into two departments, Education and Employment) and still others had parts 'broken off' and moved elsewhere (Health and Ageing, which lost Ageing to the (renamed) Department of Social Services).

This isn't a new phenomenon, nor is it limited to changes in government - departments and agencies are often reorganised and reconfigured to serve the priorities of the government of the day and, where possible, create efficiencies - saving money and time.

These adjustments can result in the movement of tens, hundreds or even thousands of staff between agencies and regular restructures inside agencies that result in changing reporting lines and processes.

While these reorganisations and restructures - Machinery of Government changes (or MOGs) as they are known - often look good on paper, in reality it can take time for efficiencies to be realised (if they are actually being measured).

Firstly there's the human factor - changing the priorities and allegiances of staff takes time and empathy, particularly when public servants are committed and passionate about their jobs. They may need to change their location, workplace behaviours and/or learn a new set of processes (if changing agency) while dealing with new personalities and IT systems.

There's the structural factor - when restructured, merged or demerged, public sector organisations need to revisit their priorities and reallocate their resources appropriately. This can extend to creating, closing down or handing over functions, dealing with legal requirements, or documenting procedures that an agency now has to follow or another agency has taken over.

Finally there's the IT factor - bringing together or separating the IT systems used by staff to enable them to do their work.

In my view the IT component has become the hardest to resolve smoothly and cost-effectively due to how government agencies have structured their systems.

Every agency and department has made different IT choices - Lotus Notes here, Microsoft Outlook there, different desktop environments, back-end systems (HR and finance, for example), web management systems, security frameworks, programming environments and outsourced IT partners.

This means that moving even a small group of people from one department to another can be a major IT undertaking. Their personal records, information and archival records about the programs they work on, their desktop systems, emails, files and more must be moved from one secure environment to another, not to mention decoupling any websites they manage from one department's web content management system and mirroring or recreating the environment for another agency.

On top of this are the many IT services people are now using - from social media accounts in Facebook and Twitter, to their email list subscriptions (which break when their emails change) and more.

On top of this are the impacts of IT service changes on individuals. Anyone who has worked in a Lotus Notes environment for email, compared to, for example, Microsoft Outlook, appreciates how different these email clients are and how profoundly the differences impact on workplace behaviour and communication. Switching between systems can be enormously difficult for an individual, let alone an organisation, risking the loss of substantial corporate knowledge - historical conversations and contacts - alongside the frustrations of adapting to how different systems work.

Similarly, websites aren't simply websites. While the quaint notion persists that 'a website' is a discrete entity which can easily be moved from server to server, organisation to organisation, most 'websites' today are better described as interactive front-ends for sophisticated web content management systems. These web content management systems may be used to manage dozens or even hundreds of 'websites' in the same system, storing content and data in integrated tables at the back-end.

This makes it tricky to identify where one website ends and another begins (particularly when content, templates and functionality are shared). Moving a website between agencies isn't as simple as moving some HTML pages from one server to another (or reallocating a server to a new department) - it isn't even as easy as copying some data tables and files out of a content management system. There's enormous complexity involved in identifying what is shared (and so must be cloned) and ensuring that the website retains all the content and functionality required as it moves.
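As a hedged illustration of why 'lifting out' a website is hard, the sketch below walks a toy multi-site content structure and reports which assets are shared with other sites and would therefore need to be cloned rather than simply moved. The data model and field names are invented; real CMS back-ends are far more tangled.

```python
# Toy illustration of the 'what is shared?' problem when moving a site out of a
# multi-site CMS. The data model is invented purely for illustration.
def classify_assets(assets, site_to_move):
    """assets: list of dicts like {'id': 't1', 'type': 'template', 'used_by': {'siteA', 'siteB'}}"""
    move_only, must_clone = [], []
    for asset in assets:
        if site_to_move not in asset["used_by"]:
            continue
        if asset["used_by"] == {site_to_move}:
            move_only.append(asset["id"])     # used by this site alone: can simply move
        else:
            must_clone.append(asset["id"])    # shared: must be cloned, then diverges over time
    return move_only, must_clone

if __name__ == "__main__":
    assets = [
        {"id": "template-base", "type": "template", "used_by": {"siteA", "siteB", "siteC"}},
        {"id": "page-about",    "type": "page",     "used_by": {"siteA"}},
        {"id": "media-library", "type": "media",    "used_by": {"siteA", "siteB"}},
    ]
    print(classify_assets(assets, "siteA"))
    # (['page-about'], ['template-base', 'media-library'])
```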

Changing IT systems can be enormously complex even when an organisation is otherwise unchanged, let alone when teams are changing agencies or agencies merge. In fact I've seen it take three or more years to bring people onto an email system or delink a website from a previous agency.

As government increasingly digitalises - and reflecting on the current government's goal to have all government services delivered online by 2017 - the cost, complexity and time involved in completing these MOG changes will only increase.

This risks crippling some areas of government or restricting the ability of the government of the day to adjust departments to meet their policy objectives - in other words allowing the (IT) tail to wag the (efficient and effective government) dog.

This isn't a far future issue either - I am aware of instances over the past five years where government policy has had to be modified to fit the limitations of agency IT systems - or where services have been delivered by agencies other than the ones responsible, or simply not delivered due to agency IT restrictions, costs or issues.

Note that this isn't an issue with agency IT teams. These groups are doing their best to meet government requirements within the resources they have, however they are trapped between the cost of maintaining ageing legacy systems - which cannot be switched off and which they don't have the budget to substantially replace - and the need to keep up with new technological developments and the increasing thirst for IT-enabled services and gadgets.

They're doing this in an environment where IT spending in government is flat or declining and agencies are attempting to save money around the edges, without being granted the capital amounts they need to invest in 'root and branch' efficiencies by rebuilding systems from the ground up.

So what needs to be done to rethink government IT to support the changing needs of government?

It needs to start with the recognition at political levels that without IT we would not have a functioning government. That IT is fundamental to enabling government to manage a nation as large and complex as Australia - our tax system, health system, social security and defence would all cease to function without the sophisticated IT systems we have in place.

Australia's Prime Minister is also Australia's Chief Technology Officer - almost every decision he makes has an impact on how the government designs, operates or modifies the IT systems that allow Australia to function as a nation.

While IT considerations shouldn't drive national decisions, they need to be considered and adequately resourced in order for the Australian government to achieve its potential, realise efficiencies and deliver the services it provides to citizens.

Beyond this realisation, the importance of IT needs to be top-of-mind for Secretaries, or their equivalents, and their 'C' level team. They need to be sufficiently IT-savvy to understand the consequences of decisions that affect IT systems and appreciate the cost and complexity of meeting the priorities of government.

Once IT's importance is clearly recognised at a political and public sector leadership level, government needs to be clear on what it requires from IT and CIOs need to be clear on the consequences and trade-offs in those decisions.

Government systems could be redesigned from the ground up to make it easy to reorganise, merge and demerge departments - either using common IT platforms and services for staff (such as an APS-wide email system, a standard web content management platform, or single HR and financial systems), or by only selecting vendors whose systems allow easy and standard ways to export and import data - so that a person's email can be rapidly and easily moved from one agency to another, or the HR information of two departments consolidated in a merger at low cost. User interfaces should be largely standardised - so that email works the same way from any computer in any agency in government - and as much code as possible should be reused between agencies to minimise the customisation that results in even similar systems drifting apart over time.
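To illustrate the 'standard export and import' idea, the sketch below serialises a staff record into a simple agreed JSON structure that a receiving agency's system could ingest. The schema, field names and identifier are entirely hypothetical; the point is only that an agreed interchange contract, readable and writable by any vendor's system, is what makes moves cheap.

```python
# Hypothetical interchange format for moving staff records between agency systems
# during a MOG change. The schema is invented purely to illustrate the idea of a
# standard export/import contract.
import json

INTERCHANGE_VERSION = "1.0"

def export_staff_record(record: dict) -> str:
    """Serialise one staff member's core data to the agreed interchange format."""
    payload = {
        "interchange_version": INTERCHANGE_VERSION,
        "person": {
            "staff_id": record["staff_id"],          # hypothetical identifier field
            "name": record["name"],
            "email": record["email"],
        },
        "employment": {
            "classification": record["classification"],
            "current_agency": record["agency"],
        },
    }
    return json.dumps(payload, indent=2)

def import_staff_record(blob: str) -> dict:
    payload = json.loads(blob)
    assert payload["interchange_version"] == INTERCHANGE_VERSION
    return payload

if __name__ == "__main__":
    exported = export_staff_record({
        "staff_id": "12345678", "name": "A. Example",
        "email": "a.example@agency.gov.au",
        "classification": "APS6", "agency": "Department of Example",
    })
    print(import_staff_record(exported)["person"]["name"])
```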

The use of these approaches would significantly cut the cost of MOGs, as well as free up departmental IT to focus on improvements rather than on meeting minimum requirements - a major efficiency saving over time.

Unfortunately I don't think we're, as yet, in a position for this type of significant rethink of whole of government IT to take place.

For the most part government still functions, is reasonably efficient and is managing to keep all the lights on (even if juggling the balls is getting progressively harder).

It took the complete collapse of the Queensland Health payroll project to get the government there to act to rethink their systems, and it is likely to take a similar collapse - of our Medicare, Centrelink or tax system - for similar rethinking to occur federally.

However I would not like to be a member of the government in power when (not if) this occurs.


Friday, October 18, 2013

Suggestions for governments stepping into open data

I've been completing a survey for the Spatial Industries Business Association (SIBA) related to the Queensland Government's open data initiative, where one of the questions asked: 'Can you list or describe any learnings that would be useful in Queensland?'

I've provided a number of my thoughts on this topic, having closely observed open data initiatives by government over the last five years and written periodically on the topic myself.


To share the thoughts I placed in the survey more broadly - for any value they have for other jurisdictions - I've included them below:

  • Data released in unusable formats is less useful - it is important to mandate standards within government to define what is open data and how it should be released and educate broadly within agencies that collect and release data.
  • Need to transform end-to-end data process. Often data is unusable due to poor collection or collation methods or due to contractual terms which limit use. To ensure data can be released in an open format, the entire process may require reinvention.
  • Open data is a tool, not a solution and is only a starting point. Much data remains difficult to use, even when open, as communities and organisations don't have the skills to extract value from it. There needs to be an ongoing focus on demonstrating and facilitating how value can be derived from data, involving hack events, case studies and the integration of easy-to-use analysis tools into the data store to broaden the user pool and the economic and social value. Some consideration should be given to integrating the use and analysis of open data into school work within curriculum frameworks.
  • Data needs to be publicly organised in ways which make sense to its users, rather than to the government agencies releasing it. There is a tendency for governments to organise data like they organise their websites - into a hierarchy that reflects their organisational structures, rather than how users interact with government. Note that the 'behind the scenes' hierarchy can still reflect organisational bias, but the public hierarchy should work for the users over the contributors.
  • Provide methods for the community to improve and supplement the open data, not simply request it. There are many ways in which communities can add value to government data, through independent data sets and correcting erroneous information. This needs to be supported in a managed way.
  • Integrate local with state-based data - aka include council and independent data in the data store, don't keep it state-only. There's a lot of value in integrating datasets, however this can be difficult for non-programmers when large datasets are stored in different formats in different systems (see the sketch after this list).
  • Mandate data champions in every agency, or via a centre of expertise, who are responsible for educating and supporting agency senior and line management to adapt their end-to-end data processes to favour and support open release.
  • Coordinate data efforts across jurisdictions (starting with states and working upwards), using the approach as a way to standardise on methods of data collection, analysis and reporting so that it becomes possible to compare open data apples with apples. Many data sets are far more valuable across jurisdictions and comparisons help both agencies and the public understand which approaches are working better and why - helping improve policy over time.
  • Legislate to prevent politicians or agencies withholding or delaying data releases due to fear of embarrassment. It is better to be embarrassed and improve outcomes than for it to come out later that government withheld data to protect itself while harming citizen interests - this does long-term damage to the reputation of governments and politicians.
  • Involve industry and the community from the beginning of the open data journey. This involves educating them on open data, what it is and the value it can create, as well as in an ongoing oversight role so they share ownership of the process and are more inclined to actively use data.
  • Maintain an active schedule of data release and activities. Open data sites can become graveyards of old data and declining use without constant injections of content to prompt re-engagement. Different data is valuable to different groups, so having a release schedule (publicly published if possible) provides opportunities to re-engage groups as data valuable to them is released.
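As a small illustration of the 'different formats in different systems' problem flagged above, the sketch below joins a hypothetical state CSV with a hypothetical council JSON file on a shared area code. The file names, columns and 'lga_code' key are all invented for the example.

```python
# Illustration of joining state and council open data held in different formats.
# File names, columns and the shared 'lga_code' key are hypothetical.
import csv
import json

def load_state_csv(path):
    with open(path, newline="") as f:
        return {row["lga_code"]: row for row in csv.DictReader(f)}

def load_council_json(path):
    with open(path) as f:
        return {str(item["lga_code"]): item for item in json.load(f)}

def merge(state_rows, council_rows):
    merged = []
    for code, state_row in state_rows.items():
        combined = dict(state_row)
        combined.update(council_rows.get(code, {}))   # council fields win on clashes
        merged.append(combined)
    return merged

if __name__ == "__main__":
    state = load_state_csv("state_population.csv")        # hypothetical file
    council = load_council_json("council_services.json")  # hypothetical file
    for row in merge(state, council)[:5]:
        print(row)
```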


Wednesday, September 04, 2013

Weird and wonderful uses for open data - visualising 250 million protests and mapping electoral preferences

One of the interesting aspects about open data is how creatively it can be used to generate new insights, identify patterns and make information easier to absorb.

Yesterday I encountered two separate visualisations, designed on opposite sides of the world, which illustrated this creativity in very different ways.

First was the animated visualisation of 250 million protests across the world from 1979 to 2013 (see below).

Based on Global Database of Events, Language, and Tone (GDELT) data, John Beieler, a Penn State doctoral candidate, has created a visual feast that busts myths about the decline in physical protests as people move online and exposes the rising concerns people have around the world.

Imagine further encoding this data by protest topic and displaying trends in popular issues in different countries or states, or looking at the locations of protests in more detail to identify 'hot spots' - in fact John has done part of this work already, as can be read about on his blog (http://johnbeieler.org/).
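For anyone wanting to poke at this kind of data themselves, here is a hedged sketch that filters a GDELT-style event extract down to protest events and counts them by country. It assumes a tab-separated extract with named columns, and that root code '14' covers protest events in the CAMEO coding GDELT uses; verify the column names against the actual GDELT file layout before relying on it.

```python
# Hedged sketch: count protest events by country from a GDELT-style extract.
# Assumes a tab-separated file with header columns 'EventRootCode' and
# 'ActionGeo_CountryCode'; check these against the real GDELT documentation.
import csv
from collections import Counter

PROTEST_ROOT_CODE = "14"  # CAMEO root code for protest events

def protest_counts_by_country(path):
    counts = Counter()
    with open(path, newline="") as f:
        for row in csv.DictReader(f, delimiter="\t"):
            if row.get("EventRootCode") == PROTEST_ROOT_CODE:
                counts[row.get("ActionGeo_CountryCode", "unknown")] += 1
    return counts

if __name__ == "__main__":
    for country, count in protest_counts_by_country("gdelt_extract.tsv").most_common(10):
        print(country, count)
```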


Second is the splendid Senate preferences map for the 2013 Australian federal election, developed by Peter Neish from Melbourne.

Developed again from public information, it is the first map I have ever seen detailing the flow of preferences between political parties, and it illustrates some very interesting patterns.

The image below is of NSW Senate candidates - the most complex of the states - but it shows how this type of information can be visualised in ways never before possible by citizens, without the involvement of traditional media or large organisations.

For visualisations of all states and territories, visit Peter's site at http://peterneish.github.io/preferences/


These types of open data visualisation lend themselves to a change in the way the community communicates and offer both an opportunity and a threat to established interests.

Governments and other organisations who grasp the power of data visualisation will be able to cut through much of the chatter and complexity of data to communicate more clearly to the community, whereas agencies and companies who hang back, using complex text and tables, will increasingly find themselves gazumped by those able to present their stories in more visual and understandable forms.

We're beginning to see some government agencies make good use of visualisations and animation; I hope that in the near future more will consider using more than words to convey meaning.


Monday, May 27, 2013

Australian academia beginning to learn to crawl in 2.0 social channels

I've long lamented the slow pace at which academia has embraced the internet, social channels and 2.0 approaches - with limited courses on modern online techniques available to undergraduates and postgraduates, and old-fashioned approaches to research and publication.

There have been hints of brilliance overseas - with US universities placing courses online and UK universities embracing social in a major way - however Australia has largely remained a backwater for higher education in a 2.0 world, with individual exceptions at specific universities, such as Dr Axel Bruns and Julie Posetti.

To demonstrate some of the impact of this Australian academic drought, a few months ago I was approached by a European professor about identifying an Australian academic working in the Gov 2.0 field to write a chapter in an upcoming book on Government 2.0. 

This professor, who I had previously worked with on a major report on global Gov 2.0 for the European Parliament (unfortunately not publicly available), had failed to identify anyone in Australia working in the Gov 2.0 space through her academic channels.

I made initial enquiries through a number of my Gov 2.0 contacts in government, as well as to a range of academics and universities, but was unsuccessful at finding anyone through their systems. In the end I was very lucky to encounter an academic in South Australia with relevant expertise at an event I was speaking at in Adelaide. This academic is now working on the book project and I'm very interested in how it turns out.

We have seen some recent stirring towards greater acknowledgement of 2.0 approaches in the recent ARC (Australian Research Council) moves towards open access publishing of public-funded research, however this is still a very small step.

We have also seen some good debates on the role of the public in science, and some pilots such as the Peer-to-Patent, which strike at the commercial end of the spectrum, and the Atlas of Living Australia, which involves citizens in mapping Australia's biodiversity.

We're also now seeing some steps to move beyond the traditional peer review process to consider new ways of measuring the reach and impact of academic research, with the 'altmetric' movement gaining steam.

What are altmetrics? I admit I hadn't heard of them until recently, and when I first encountered the term I found the name little more than marketing buzz.

Essentially the term describes the use of online social metrics to assist in measuring academic success - mentions on Facebook and Twitter, the level of reuse of raw research datasets via APIs, 'semantic publication' of specific passages and references to academic articles in blogs and forums, and more.
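As a trivial illustration of the kind of arithmetic involved - little more than weighted counting of online attention - the sketch below combines hypothetical mention counts into a single score. The weights and input fields are invented and carry no endorsement of any particular altmetrics product.

```python
# Trivial illustration of an 'altmetric'-style composite score: weighted counts of
# online attention. Weights and input fields are invented for illustration only.
WEIGHTS = {"tweets": 1, "facebook_posts": 1, "blog_mentions": 5, "dataset_api_calls": 0.1}

def composite_score(mentions: dict) -> float:
    return sum(WEIGHTS.get(channel, 0) * count for channel, count in mentions.items())

if __name__ == "__main__":
    paper_attention = {"tweets": 120, "facebook_posts": 30, "blog_mentions": 4, "dataset_api_calls": 900}
    print(composite_score(paper_attention))  # 120 + 30 + 20 + 90 = 260.0
```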

The term altmetrics was developed by the founders of one of the first companies that is spruiking altmetrics solutions to academics, and the biggest supporters of the term are other companies seeking to profit from the same rush to web statistics. Therefore I am still inclined to regard the term itself as marketing buzz for the types of social metrics commercial and public sector organisations have been using for years (see the chart below on the growth of use of the term in Google searches).

However it does signify an important and major change in how academic research is measured and valued.

If academics begin measuring their success in how well discussed and commented on their work is in the public sphere, they will likewise begin talking more about their research publicly in order to grow their buzz and their recognised academic prowess.

This will encourage academics to get out from their lecture theatres into the community, become more proficient at communicating their thoughts and work to a broader lay audience, and make research more accessible, interesting and influential in public debates and policy work.

I also hope that more publicly available research will lead to more people interested in pursuing these careers, greater commercialisation of research work, improved scrutiny of findings and better social outcomes.

However I hope that at some point academics will realise that 'altmetrics' are simply no more than metrics - ones that are already becoming business-as-usual in commercial and public sector spheres - and focus more on involving people in and sharing their research than on the marketing buzz.

For more information on altmetrics, see:


Tuesday, May 21, 2013

Is there really an open data El Dorado?

I was reading a tweet yesterday from Australia's CTO, John Sheridan, and it raised an interesting question for me.
Is government open data really a new goldmine for innovation?

The Economist's article, A new goldmine, makes a strong case for the value of open data through examples such as GPS, the Global Positioning System, which is owned by the US government (which owns the satellites) but has been provided free to organisations around the world since 1983.

I've also seen fantastic studies in the UK and Australia talking about the value in releasing public sector information (PSI) as open data, and great steps have been taken in many jurisdictions around the world, from Australia to Uruguay, to open up government silos and let the (anonymised) data flow.

I agree there's fantastic value in open data; for generating better policy deliberations and decisions, for building trust and respect in institutions and even for stimulating innovation that leads to new commercial services and solutions.


However I do not believe in an open data El Dorado - the equivalent of the fabled city of gold - where every new dataset released unveils new nuggets of information and opportunities for innovation.

Indeed I am beginning to be concerned that we may be approaching a Peak of Inflated Expectations (drawing on Gartner's famous Hype Cycle chart) for open data, expecting it to deliver far more than it actually will - a silver bullet, if you will, for governments seeking to encourage economic growth and transparency, and end world hunger.

Data is a useful tool for understanding the world and ourselves and more data may be more beneficial, however the experience of the internet has been that people struggle when provided with too much data too quickly.

Information overload requires humans to prioritise the information sources they select, potentially reinforcing bias rather than uncovering new approaches. Data can be easily taken out of context, misused, distorted, or used to tell a story exactly the reverse of reality (as anyone closely following the public climate change debate would know).

Why assume that the release of more government data - as the US is doing - will necessarily result in more insights and better decisions, particularly as citizens and organisations come to grips with the new data at their fingertips?

A data flood may result in exactly the reverse, with the sheer volume overwhelming and obscuring the relevant facts, or the tyranny of choice leading to worse or fewer decisions, at least in the short-term.


The analogy of open data as a gold mine may be true in several other respects as well.

The average yield of a gold mine is quite low, with many mines reporting between one and five grams of gold per tonne of extracted material. In fact gold isn't even visible to the naked eye until it reaches 30 grams per tonne.

While several hundred years ago gold was easier to find in high concentrations and therefore easier to extract - leading to many of history's gold rushes - over time people have mined most of the highest gold concentrations.

Extraction has become laborious and costly, averaging US$317 per ounce globally in 2007.

There is definitely gold in open data, value in fresh insights and innovations, opportunities to build trust in institutions and reduce corruption and inefficiency in governance.

However if open data is at all like gold mining, the likelihood is that the earlier explorers will find the highest yields, exploring new datasets to develop insights and innovations.

By the gold mine comparison we are currently in the open data equivalent of the gold rushes, where every individual who can hoist a line of code can dig for riches as a data miner, while data analysis companies sell spades.

Following the analogy, data miners will shift from open data site to open data site, seeking the easy wins and quick insights.

However as the amount of open data grows and most of the easy wins have been found, it will get more expensive to sift increasing amounts of data for fewer insights, requiring greater and greater investments in time and effort to extract the few remaining nuggets of 'gold'.

At that point many government open data sites may become virtual ghost towns, dominated by large organisations with the ability to invest in a lower yield of insights.

Alongside these organisations, only a few tenacious data mining individuals will remain, still sifting the tailings and hoping to find their open data El Dorado.


Wednesday, April 10, 2013

Web and social media reporting can help Communication get a seat at the decision-makers' table

Yesterday morning I attended the first OPC IT WebEx event for the year, where we heard from three great speakers on intranet development, accessibility and the changing face of the media in Australia.

One particular statement that stuck in my mind was from David Pembroke, CEO of Content Group, who said that it was important for communications people to bring numbers to the table to gain a seat alongside other decision-makers, such as CFOs and CIOs who already have numbers in hand to support their positions.

While most agencies now track the traffic to their websites and report raw numbers of followers, comments and mentions on their social channels, I believe there's still a way to go before these numbers are provided in the right way to the right people at the right time to help Communications areas - and particularly Online Communications - have the impact and the influence they deserve.

This has been brought home to me by Slideshare, which recently began sending me reports on the number of views and interactions on the various presentations I've uploaded to the service over the years.

Simply being able to see these basic stats has made me take more notice of the material I'm putting on Slideshare and whether or not it has a wider audience that I should consider when developing my slides.

I'm even considering paying for an account to get more detailed statistics that will help me fine-tune material to better match what audiences want.

When working in Government I put a considerable amount of effort into providing web statistics back to the areas responsible for specific content. I believe this type of reporting is critical to help policy and program areas receive regular and actionable feedback on what they are putting online to inform their customers, clients, stakeholders and other audiences.

In fact, without web reporting many of these areas only receive ad hoc and irregular feedback on the content they are producing - an annual survey, or some Ministerial Correspondence. This makes it harder for them to understand whether their content is targeted correctly and also means they place much lower emphasis on what they are communicating online - what isn't measured isn't managed or valued.
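A minimal sketch of what 'providing web statistics back to content areas' can look like in practice: group page views from an analytics export by the business area that owns each section of the site. The export format, the URL-to-owner mapping and the file name are all hypothetical.

```python
# Minimal sketch: turn a raw page-view export into a per-business-area report so
# content owners see how their material performs. File format and the mapping of
# URL paths to owning areas are hypothetical.
import csv
from collections import Counter

PATH_OWNERS = {            # hypothetical mapping of site sections to content areas
    "/payments": "Payments Branch",
    "/health-programs": "Health Programs Branch",
    "/about": "Communications",
}

def owner_for(path: str) -> str:
    for prefix, owner in PATH_OWNERS.items():
        if path.startswith(prefix):
            return owner
    return "Unassigned"

def views_by_owner(export_path: str) -> Counter:
    counts = Counter()
    with open(export_path, newline="") as f:
        for row in csv.DictReader(f):                 # assumes columns 'page_path', 'views'
            counts[owner_for(row["page_path"])] += int(row["views"])
    return counts

if __name__ == "__main__":
    for owner, views in views_by_owner("pageviews_export.csv").most_common():
        print(f"{owner}: {views} views")
```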

Now with social media in the picture, web reporting needs to jump to a higher level of competency. While agencies might have made some steps to ensure that various areas of their business are receiving reports on the content they are providing through websites, the new frontier is to provide them with actionable information on what people are saying about their programs and policies across the broader web.

This helps areas within agencies not only assess how people are responding to the information they do provide online, but also gives them some understanding of what questions and issues are being discussed due to the lack of content.

In other words, web reporting helps tell agencies the quality and effectiveness of their own website content. Social media reporting helps tell agencies about the community's content needs beyond existing content.

The benefits to agencies of this social media monitoring are immense: not only can we capture known unknowns, but also unknown unknowns - intelligence that could shape the entire way a program or campaign is designed and communicated.

It is also very important to differentiate social media monitoring from media monitoring - something that is getting harder to do as media monitoring companies move to bundle social within their media offerings.

Media monitoring tracks what commentators say about an agency and its activities when posturing to a broad audience.

Social media monitoring tracks what your customers and stakeholders are saying about an agency and its activities to each other.

In other words, social media monitoring can provide a granular and specific view on what your actual customers think and understand about specific programs and how they interact with them in the real world, while media monitoring only provides a shallow reputational view on what people are saying for an audience - which may simply be an act.

So while there is a clear incentive for Online and Communication teams to roll social media monitoring in as an extension to (traditional) media monitoring, it can be dangerous to consider the intelligence received through both avenues in the same light.

As agencies get better at both web reporting and social media monitoring, and develop standardised ways to communicate actionable insights to the right people, at the right time, we're likely to see more ability for the groups providing these insights to have meaningful influence on agency decisions. This is right and proper - better information leads to better decisions and outcomes.

However it is up to Communication and Online teams and their leadership to recognise how web and social monitoring can advance their ability to positively influence decisions and take the lead on providing insights, otherwise they will find themselves on the margins as more traditional numbers-orientated disciplines take over the responsibility for these activities.


Friday, March 22, 2013

Provide your feedback on the Australian Government's big data issues paper

The Australian Government Information Management Office (AGIMO) has released a Big Data Strategy Issues Paper and, while it's not clearly stated in the title of their blog post, is seeking public and industry comment until 5 April 2013.

You can find the paper and the ways in which they are accepting comments and formal responses, at AGIMO's blog, in the post, Released: Big data Strategy Issues Paper.
