Monday, May 04, 2026

Your AI isn't being honest with you. It was never designed to be

A recent Harvard Business Review study found that when researchers asked large language models for strategic advice, they got "trendslop" - recommendations that defaulted to whatever sounds fashionable in contemporary management: 'Innovation', 'Augmentation', 'Long-term thinking'. 

The strategic advice was plausible, confident and, in many cases, largely useless.

This isn't a bug. It's these AI systems working as designed.

Every large language model has been trained with a bias to satisfy the person prompting it.  

A model that refused to answer when asked, or that routinely gave uncomfortable or contrary answers, would not succeed in the market. So models are tuned, through reinforcement learning from human feedback, to please. That bias doesn't switch off when you ask for critical review.

What the research found

Researchers from Esade Business School, the University of Sydney, and NYU Stern tested seven leading LLMs across strategic trade-offs that required genuine binary commitments (several listed below).

Across thousands of simulations, the results didn't vary by much. Almost every model, almost every time, recommended:

  • Differentiation over cost leadership
  • Augmentation over automation
  • Collaboration over competition
  • Long-term thinking over short-term

The company context made little difference. The researchers tested tech startups, hospitals, construction companies, government agencies and multinationals. The recommendations barely shifted.

Why was this? LLMs are essentially probability engines: they pick the next word (token) from a probability distribution, favouring whichever continuation is most likely.

Where do those probabilities come from? From training on billions of public documents, web pages and other content. So the highest-probability output from these AIs reflects what is said most often - which is driven more by social norms than by accuracy.

Essentially, the models are most likely to provide the most socially acceptable answers, and then deliver them in the register of expert advice.
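To make that concrete, here's a toy illustration (invented words and numbers, not a real model): if fashionable advice dominates the training data, the most probable continuation wins every time.

Python
# Toy illustration only - invented tokens and probabilities, not a real model.
# Whatever appears most often in the training data gets the highest probability,
# and the most probable continuation is the one returned.
next_token_probs = {
    "differentiation": 0.52,   # fashionable advice dominates public content
    "cost leadership": 0.18,
    "it depends": 0.30,
}

print(max(next_token_probs, key=next_token_probs.get))  # -> differentiation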

For example, Michael Porter built a foundational economic framework around cost leadership as a legitimate strategic position (which Walmart and Costco built empires on).

Yet the LLMs dismissed this approach, because thousands of websites and TED Talk transcripts advocate for unique value propositions, and these circulate far more than quiet stories about supply chain efficiency.

Prompting won't fix it

The researchers ran over 15,000 trials varying prompt structure, framing, persona and stakes. For differentiation and augmentation, bias shifted less than 2% regardless of how the prompt was written. 

For the others, the average shift was 22% - mostly from one factor: flipping the order in which options were listed. The model didn't reason differently. The option order gave it a target to aim for.

Adding detailed industry context helped slightly - shifting responses by 11% on average. An LLM, given a thorough brief on a cost-pressured government agency in a mature market, still recommended differentiation most of the time.

There's a second failure mode the researchers call the "hybrid trap." When models aren't forced into a binary choice, they frequently recommend doing both - pursue differentiation and cost leadership, pursue radical and incremental innovation. 

That sounds balanced but in practice it's the strategic equivalent of trying to be everything at once, which Porter identified as the most reliable path to competitive failure.

Strategy is about choosing what to stop. A model optimised to please finds that answer difficult to give.

Why this matters for the public sector

Public servants may choose to use AI to pressure-test policy proposals, assess procurement options, review business cases, and stress-test project plans. 

The productivity case is solid: with fewer resources and less time, AI helps fill the gap. The problem is that when you prompt AI as a validator, you get validation - regardless of the quality of the underlying thinking.

Digital transformation narratives will consistently outperform consolidation narratives in LLM-generated advice. Decentralisation will beat centralisation. Long-term will beat short-term. 

The model's recommendation reflects the positive emotional valence of contemporary business language, not the requirements of the specific situation. For APS work, that's a real risk - particularly where the right answer is to consolidate, simplify, or cut scope.

What to do about it

This isn't a reason to stop using AI. It's about using AI more effectively.

The standard advice is to give AI more context and craft better prompts. The research shows this doesn't reliably work. These are more effective approaches:

  • Ask for options, then critique each separately. If you present your shortlist, the model works inside your framing. Ask it instead to make the strongest possible case for each option independently - including unfashionable options. For a procurement brief, that means prompting "make the strongest case for option A" and "make the strongest case for option B" in separate sessions, then applying your own judgement (a minimal sketch follows this list).
  • Ask for criticism explicitly. "Identify the three most significant weaknesses in this policy proposal" works. "What do you think of this approach?" doesn't. The more structurally you frame the critique, the less room the model has to default to encouragement.
  • Strip preference signals from your prompts. Any language suggesting which option you favour - "we're leaning toward," "I think this is probably right" - becomes a target. The model will weigh toward it. The same goes for options you don't favour - "I think this is probably wrong" - the AI will weigh against it. Write prompts as if the options are genuinely open and equivalent.
  • Treat hybrid recommendations as a flag. If the model recommends pursuing both sides of a trade-off, run separate prompts for each option and stress-test the hybrid specifically before accepting it. "What are the risks of pursuing both differentiation and cost leadership simultaneously?" is a more useful prompt than accepting the hybrid as the answer.
  • Track model versions. Biases shift as models are updated. Maintain a record of your key queries and outputs so you can detect changes over time. Be prepared to rerun analysis across models and critically consider why they may give different results.
  • Have different people run the prompts. Many modern LLMs (including Copilot) now store memory about the user 'in the background'. You can usually find, edit, remove and add these memories if you search for them, but they can be a hidden spoiler that biases the AI's response based on what it knows you generally like or dislike. Different people will have different memories retained, so you will get a broader set of viewpoints by having different people run the same prompts separately - or by logging out entirely if that's feasible (not always possible within agencies, particularly using Copilot behind your firewall).
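
Here's a minimal sketch of the first two techniques in practice. The options, brief and prompt wording are illustrative only, and the stub client below simply stands in for whatever approved text-generation tool your agency uses - each call is assumed to run in a fresh session with no shared memory.

Python
# A minimal sketch, not a finished tool. Everything named here is illustrative.
class StubLLM:
    # Stand-in for an approved text-generation client; replace with your own.
    def generate(self, prompt: str) -> str:
        return f"[model response to: {prompt[:60]}...]"

llm = StubLLM()

brief = "Cost-pressured agency in a mature market considering a platform refresh."
options = [
    "Option A: consolidate onto the existing platform",
    "Option B: procure a new platform",
]

def strongest_case(option):
    # One prompt per option, with no hint of which one is favoured.
    return llm.generate(
        f"Make the strongest possible case for this option.\n"
        f"Brief: {brief}\nOption: {option}"
    )

def key_weaknesses(option):
    # Ask for criticism explicitly and structurally.
    return llm.generate(
        f"Identify the three most significant weaknesses of this option.\n"
        f"Brief: {brief}\nOption: {option}"
    )

results = {opt: (strongest_case(opt), key_weaknesses(opt)) for opt in options}
# A human then weighs the cases - the model never sees the shortlist as a whole.
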
Whatever techniques you use, keep in mind that AI doesn't necessarily know more than you about a given strategic or policy decision. It can provide useful critique for testing ideas, identify other options, or surface research you should consider, but at the end of the day humans should be making and approving these decisions.

Saying an AI made the decision is neither defensible, nor wise. And remember, you're paid more than the AI because your critical thinking is valued (hopefully)!

Read full post...

Wednesday, April 29, 2026

Applying the APS Style Manual as a rules engine

I spoke at the IBR Gen AI Transforming Govt PA & Comms conference last week. 


The event handed out copies of the Government Writing Handbook - the APS Style Manual in distilled form - to everyone in the room.

My immediate thought was: Is this available as a GenAI skill yet? Or as a RAG knowledge base? Or in any form a government agency's chosen GenAI language model can comprehensively use as a checkpoint during document generation?

It wasn't. Paper and PDF only. With a blog and a few web pages replicating part of the manual.

However, I did find the Writing style guide for the Singapore Government Design System (SGDS) as a Skill... 

And here's a PDF of the demo, in case your agency can't access Manus.

So I ran a quick test in the room during the 30 minutes before I spoke. Being mindful of the Government Copyright (which wasn't Creative Commons), I extracted a handful of rules from the Writing Handbook, wrapped them in a prompt, and ran a quickly drafted press release through Claude.

The output tightened immediately. Shorter sentences. Active voice. The point surfaced early. It read like something that would get through clearance without being rewritten three times.

Nothing about the model changed. The constraint did the work.

I've written about Rules as Code on this blog before. The argument has always been the same: take policy, legislation and guidance and express it in a form that systems can apply consistently. We've seen this in eligibility engines, compliance checking, and service delivery. The benefits include consistency, transparency and less reliance on individual interpretation under pressure.

This is the same pattern. Applied to government writing.

The Style Manual is a set of rules that every federal public servant should live by. 

Right now, they're expressed as prose, examples and guidance. People interpret and apply them as best they can. Results vary - depending on the writer, the reviewers, the deadline, and how recently they all last read the manual.

However, translate those writing rules into a form a system can execute, and you get consistency at the point of creation.

The test in the room did exactly that. I didn't attempt to ingest the whole manual. Just a handful of rules — plain language, active voice, short sentences, clear structure — enforced as a second pass over the draft.

Python
# `llm` is a stand-in for whichever approved text-generation client your
# agency uses (anything exposing a generate(prompt) -> str method).

def draft(prompt):
    # First pass: generate the draft from the user's request.
    return llm.generate(prompt)

def enforce_style(text):
    # Second pass: rewrite the draft against a handful of APS-style rules.
    return llm.generate(f"""
    Rewrite this text to comply with APS writing principles:
    - use plain language
    - prefer active voice
    - keep sentences concise
    - make the purpose clear early

    Text:
    {text}
    """)

# Usage: styled = enforce_style(draft("Draft a media release about ..."))

Rules as code in its simplest form. The rules are explicit. The system applies them. The output is predictable.

To make it useful at scale, you could structure the manual itself - each rule becomes something the system can retrieve and apply based on context rather than running every rule on every document.

JSON
{
  "rule": "Use plain language",
  "check": "Identify complex terms",
  "rewrite": "Replace with simpler words"
}

Store those rules. Tag them. Retrieve the right ones based on the task.

Writing a brief? Apply structure and clarity rules. 
Writing web content? Apply plain language and accessibility. 
Writing an email? Apply directness and action orientation.

That's a rules engine. The underlying pattern is the same one government has used for years in eligibility and compliance systems.
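
As a rough sketch of that retrieval step (the rule text and tags below are illustrative, not extracted from the Style Manual, and `llm` is the same placeholder client as in the earlier example), applying only the rules relevant to a task might look like this:

Python
# Illustrative rules and tags only - a real implementation would draw these
# from a structured version of the Style Manual.
STYLE_RULES = [
    {"rule": "Use plain language", "tags": ["brief", "web", "email"]},
    {"rule": "Make the purpose clear early", "tags": ["brief", "email"]},
    {"rule": "Meet accessibility guidance for headings and links", "tags": ["web"]},
    {"rule": "State the requested action and deadline", "tags": ["email"]},
]

def rules_for(task):
    # Retrieve only the rules tagged for this kind of document.
    return [r["rule"] for r in STYLE_RULES if task in r["tags"]]

def apply_style(text, task):
    checklist = "\n".join(f"- {rule}" for rule in rules_for(task))
    return llm.generate(
        f"Rewrite this text to comply with these rules:\n{checklist}\n\nText:\n{text}"
    )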

Wrapped into tools like Microsoft Copilot - which is now rolling into agency workflows - this becomes part of the drafting workflow. 

The user requests a particular type of content (with appropriate context and inputs). The system generates it, applies the relevant rules, and returns something already aligned to APS expectations.

From the user's perspective, nothing special is happening, but editing for style is far easier and faster.

There's one practical constraint. The Style Manual isn't available under a Creative Commons license, but under Government Copyright 2026. That limits copying and redistribution of the work, not the extraction and structured implementation of its rules.

The source remains authoritative. The system applies a structured interpretation of it. This is, again, exactly what we already do with legislation and policies. We don't expect staff to memorise every clause. We encode the rules and apply them consistently.

Every agency could do this with their CoPilot instance - or Finance could implement it into GovAI centrally and share the ruleset as a skill.md or RAG (Retrieval-Augmented Generation) - essentially a permanent and updateable memory for Generative AIs.

Suddenly, everyone using Copilot within your agency automatically applies the APS Style Manual in every generation - or it can be applied selectively, based on what they are seeking to generate, via a user setting or prompt.

The result is that the APS Style Manual stops being guidance people try to remember and becomes a checkpoint that authorised Generative AI systems apply every time it is needed - cutting editing and review time.

Plus staff can write their content by hand, and have AI check the style, editing and rewriting where necessary to better meet government writing standards.

Is this a perfect solution? Probably not - yet. AIs still make mistakes and can misapply or fail to apply some style rules. However, it improves a basic Copilot or other GenAI solution for government writing purposes, saving time and raising text quality and productivity.

Read full post...

Tuesday, April 28, 2026

My Presentation from the IBR Conference - The Future of GEN AI for Public Sector Communications and Public Affairs 2026

 Last week, I attended and spoke at the International Business Review (IBR Conferences) event: GEN AI Transforming GOVT PA & COMMS 2026: The Future of GEN AI for Public Sector Communications and Public Affairs 2026 Hybrid Conference.

I've included an excerpt of my presentation notes below for folks who were unable to attend.


The AI Whisperer’s Guide to Practical Deployment

You’re probably here because your organisation is past the question ‘should we use AI?’

Good. Most of us have had that conversation many times.

The question is now: ‘Which battles are worth fighting?’

There are usually more AI opportunities in any organisation than the capacity to pursue them well. 

And the cost of picking the wrong ones isn’t just budget – it’s credibility. Credibility is what gives you the currency to start your next AI project.

I’ve worked with AI in various ways for over 9 years now, building commercial generative AI products, undertaking complex modelling for major infrastructure, and supporting project delivery.

I also have a long history working in Digital, before, during and after Gov 2.0. 

As such I’ve watched some patterns repeat, and others rhyme. Not just in government, in humans.

So let’s dig into practical implementation.


Before anything else, I want to draw a distinction that I think is worth keeping front of mind.

There are often two distinct uses of AI in government work.

There’s AI supporting human judgment – that helps draft, synthesise, analyse and prepare work.

And there’s AI that substitutes for human judgment, that makes or drives decisions.

In practice, it's not always a clean line, but the question is worth asking of every use case: is a human genuinely making the decisions, or are they ratifying what an AI has determined?

Even when an AI excludes or shortlists options, such as when using an AI to screen applicants in a recruitment process, you should consider whether AI bias is driving a decision bias that isn’t defensible.

For different uses, the risks and governance requirements vary. And the consequences of getting them wrong are often very different.

I don’t need to say more than ‘Robodebt’ to make that point.

Human in the loop isn’t just good practice. In the Australian government, it’s increasingly an ethical and legal expectation.

Everything I’m covering today sits on the drafting-and-support side of that line. AI as a contributor. The human still makes the call, still does the editing and still approves the work.

But if you’re working on something that sits closer to the decision-making side, that’s a topic that needs more time than we have today.


So – where do AI investments tend to fall short? In my experience, it comes down to three patterns. And none of them is really about the AI.

The first is the wrong problem.

AI gets applied to a symptom rather than a cause. Or the real problem turns out to be a process gap, a data quality issue, or an ownership question. AI gets suggested as the solution because it’s trendy, safer, easier, less political or more fundable than ‘we need to fix the process.’ We used to see the same with requests to build a website, or create a mobile app. In previous roles in government my job was often to tell people they didn't need to build a new thing, but rather educate them on the digital assets we already had that could be used to meet their goals.

AI won’t necessarily fix a broken process. And it often speeds it up, which can make things far worse, far faster. This is exactly what we saw with digital. Automation can compound success, but also compound failure.

The second is an environment that wasn’t ready.

This can take multiple forms. The data was messier and harder to clean than expected. The permission model created complications that no one had mapped. Governance obligations introduced constraints that only became visible mid-implementation. The workflow proved more sensitive than the business case assumed. Or the people expected to use and engage with the AI weren’t brought along on the journey and may never have wanted it in the first place.

None of these is unusual. They’re normal friction when deploying anything in a complex operating environment. The question is whether you surface them early or late. Some organisations are prepared to take the hit, see staff leave, and things get disrupted before they get better; others pull back and call it a failure.

You really need to know your organisation’s appetite and commitment before leading one of these projects, or you can be left out in the cold.

The third is a lack of an internal owner.

The capability to run, adapt, and govern the AI system either never existed within your organisation or walked out when the people who built it moved on. Nobody inside could improve it later when things changed, or govern it when something went wrong.

That’s a capability-and-ownership question that procurement can’t resolve. And it’s worth thinking about before you sign anything.


I also want to spend a few minutes on something that comes up constantly as a blocker, but where there’s a practical path that many organisations haven’t explored yet.

Data security, but specifically at the front end of the project. How do you test, procure and demonstrate AI systems without exposing sensitive or classified data to vendors?

You can’t load a confidential document into an online AI model to test its capabilities. You can’t hand private citizen data to a vendor for a proof of concept. You often can’t load-test against live operational data or let a bidder build a demo using your grant records or patient data.

This can often kill a potential AI project before it gets started.

But there’s an option worth considering: synthetic data. And AI can build it for you.

That’s not anonymised data - anonymisation has well-documented re-identification risks. Synthetic data is a dataset that is statistically realistic and structurally accurate but contains no real records.

Here’s a concrete example from my prior work.

I needed to load-test a system across a large physical asset network with millions of individual assets. Using real data wasn’t an option: what existed was sensitive, and what didn’t exist yet (the data needed to project future scale) couldn’t easily be modelled.

So I used AI to build a city.

Not a digital twin of the actual asset network – a synthetic city, constructed from scratch using known proportions of asset types, realistic density estimates, and plausible growth trajectories.

It could scale to whatever size we needed, model our asset growth over ten to twenty-five years, and produce a test dataset with no connection whatsoever to real infrastructure.

It allowed us to load test at any scale. The vendor never saw real data. The security risk was zero.
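
For a sense of what this looks like in practice, here is a minimal sketch of generating a synthetic asset register. The asset categories, proportions and growth rate are invented for illustration, not figures from the actual project.

Python
import random

# Invented proportions and growth rate - illustrative assumptions only.
ASSET_MIX = {"poles": 0.55, "cables": 0.25, "transformers": 0.15, "substations": 0.05}
ANNUAL_GROWTH = 0.03

def synthetic_assets(count, years=0):
    # Scale the register to a projected future size, then generate records
    # with realistic proportions but no link to any real asset.
    scaled = int(count * (1 + ANNUAL_GROWTH) ** years)
    types = list(ASSET_MIX)
    weights = list(ASSET_MIX.values())
    for i in range(scaled):
        yield {
            "asset_id": f"SYN-{i:08d}",
            "type": random.choices(types, weights)[0],
            "install_year": random.randint(1970, 2026),
        }

# e.g. a 100,000-asset network projected ten years out, safe to hand to a vendor
sample = list(synthetic_assets(100_000, years=10))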

A similar approach may work for any system dealing with large datasets where the proportions and structures are well understood, but the specific data is sensitive. Health systems. Grant systems. Infrastructure procurement. Regulatory systems. You could even have AI write tens of thousands of synthetic ministerial briefs and the back-and-forth correspondence across parliament on common topics to test and demo new Parliamentary Document Management Systems.

This gives agencies the opportunity to provide vendors with synthetic datasets — reflecting realistic shapes and adjusted for jurisdiction-specific requirements, but containing no sensitive content and creating no security or privacy exposure. Yes they might see the overall shape of the system – but that’s what they’re providing anyway. If they didn’t know the system’s shape to begin with, they wouldn’t have a product to demonstrate.

Vendors build and demonstrate against the synthetic. You evaluate their system properly. You can even use it to test edge cases and load scenarios for existing systems that you couldn’t safely test using real data.

I think of it as a digital cousin. Looking enough like you to size clothing, but not enough to fool your parents.

It helps reframe security from something that blocks deployment into something that enables it safely.


Next I want to go back to my earlier point. Fighting the right battles.

Think of one AI investment your agency has made or is seriously considering. It doesn’t have to be large. It could be a drafting agent, a customer service chatbot, something for data analysis or procurement. Just hold something real in mind.

I’m going to run through four questions. See how your use case sits against each of them.

First: Need.

Is the problem real, recurring and bounded? Is there enough volume to make it worthwhile? Is the domain clear enough to work with?

Or, being honest here, is the underlying problem actually a process gap, a data quality issue, or an ownership question that’s been reframed as an AI opportunity because that’s where the momentum is?

Catching it early is much less painful than catching it after a commitment is made.

Next: Fit.

Does the proposed solution support the systems, content quality, and governance obligations you actually have, rather than the environment a vendor’s demo assumes?

And, coming back to the distinction I drew earlier, is this clearly a drafting and support use? Or has it drifted toward AI making or influencing a decision?

That drift tends to happen quietly. The language shifts from ‘AI will help officers assess’ to ‘the system will flag for review’ to ‘applications below the threshold will be automatically declined.’

Each step is incremental. The cumulative effect is that there’s no human genuinely in the loop anymore.

That’s dangerous territory to step into.

Third: Ownership.

Who inside your organisation will run this, refine it and govern it once it is built and any vendor relationship ends?

Finance’s GovAI initiative is doing solid work building foundational AI capability across the APS. That’s genuinely useful infrastructure, like GovCMS.

For GovCMS, your agency has to own the website at the end of the process. You can’t hand off responsibility to Finance. They’ll keep the system up, but won’t update and improve your content and navigation.

For AI, even when using a GovAI platform, your agency still needs to own the outcome. You need people internally who understand the system well enough to know when it’s going wrong or not keeping up with changing policy and internal needs.

If the ownership question is still being worked out, that’s probably the first thing to resolve. Without internal ownership, the other three questions become someone else’s problem and often get dropped.

Finally: Effect – which you can also read as value.

What actually changes if this initiative works?

Not ‘people will use it more.’ Not ‘staff will find it helpful.’

What changes in citizen experience, policy delivery, workload, risk, consistency or quality? And by how much? And how will you measure that change over time?

If the most compelling metric you can name right now is an adoption rate, it’s worth going a level deeper. Adoption tells you people tried it. It doesn’t tell you what difference it made.


One thought to leave with you that I don’t think anyone has fully worked out yet.

When you build teams that include AI contributors alongside humans, you’re managing a new kind of workforce diversity. Just as changing a single individual can radically alter a team’s dynamics, so can adding an AI contributor advanced enough to produce decent work and take some of the load off human team members.

This is new, different to any workforce change we’ve seen before in human history. And it requires new management practices.

It isn’t a technology challenge. Your IT leaders can’t offer qualified guidance on how to manage humans and AIs as a unified team. It’s a people leadership challenge.

Organisations that work this out deliberately will get more from the same tools and people.

There isn’t a handbook for it yet. But it is worth thinking about.


Read full post...

Friday, December 05, 2025

BOM – Flying Above the Radar

Australia's Bureau of Meteorology (BOM) has copped a downpour of headlines for spending A$96.5 million on what some gleefully call “a website.”

That framing is catchy - but it’s wrong.

The public‑facing interface - the bit we click - cost about A$4.1 million.

The heavy lifting was a complete rebuild and testing of the systems and technology behind it: ingesting vast volumes of observations and model outputs, securing them, and serving them at national scale, reliably, every minute of every day.

In other words, the BOM didn’t buy a slick homepage; it rebuilt the foundation that gets critical environmental intelligence to Australians on a timely basis - whether they’re at the airport, on the farm, or fighting a fire (as I did yesterday).


This effort didn’t happen in isolation. It sits alongside BOM’s seven‑year ROBUST technology program, which upgraded cybersecurity, networks, data centres, a disaster‑recovery supercomputer, and the national observing network (including dual‑polarisation radars). By closure in mid‑2024, ROBUST had invested A$866 million to harden Australia’s environmental intelligence backbone after the bureau’s 2015 cyber breach and outages. That’s the plumbing you don’t see when you load a forecast.

So, what failed? Not the core project goal—modernising national infrastructure—but the communication and release strategy.

The new site launched into severe weather, and ordinary users and power users alike found basic tasks harder: radar colours felt off, local details were buried, familiar navigational cues had shifted.

The backlash was immediate; ministers demanded fixes; BOM reverted radar visuals and committed to rapid improvements. Timing, messaging and usability were misjudged—and they matter.

This is where “low‑hanging fruit” counts. When you’re redeveloping an entire weather reporting and dissemination system for a country, the public expects familiar wins to land early: clearer local pages, consistent radar palettes, quick paths to wind, rain and temperature. BOM missed several of these, and the community told them—loudly. The lesson isn’t “don’t modernise”; it’s design with users, ship the obvious wins first, and narrate the journey.

User testing should never be a checkbox; it’s your pre‑flight weather check. That means beta programs, co‑design with sector users (farmers, SES crews, pilots), and A/B testing for navigation and visuals so you learn what works before you re‑platform at scale.

If industry best practice says test early, test often, test live - heed it. A handful of controlled experiments can surface friction months before a national launch.

But why was there so much fuss over plumbing? Because weather intelligence is critical infrastructure. Agriculture uses it to time planting and spraying; fisheries plan around marine forecasts; aviation and defence rely on tailored briefings and feeds for safe operations; emergency services need resilient ingest and rapid warnings when floods and fires escalate. It’s a national capability upgrade serving safety, the economy and national security.

In a world of constrained trust, how you deliver matters almost as much as what you deliver. Publish the cost breakdown early (front end vs back end). Explain the architecture in plain English. Stage releases. Keep the radar legible and familiar.

And treat feedback as flight data, not turbulence: respond visibly and fast. That’s how you reduce reputational risk while raising technical resilience. It’s also how governments maintain confidence in big tech investments.

If I were coaching an agency on a transformation of this scale, my pragmatic checklist would be:
  • Mission first, interface second. Lead with the purpose: warnings, safety, continuity. Then show the pixels.
  • Ship the obvious wins early. Keep users fed with the basics - clear radar, local conditions, one‑click access to wind/rain - while deeper systems evolve.
  • Co‑design and A/B test. Put farmers, SES, pilots and fishers in the cockpit; instrument the site; run controlled trials before national rollout.
  • Stress‑test comms like the platform. Prepare explainer packs, diagrams and FAQs; brief ministers and media ahead of launch; communicate changes and reversions promptly.
  • Be resilient and transparent. When a severe weather season collides with deployment, delay updates that add risk, and explain why. That’s good ops and good public service.

Bottom line: the BOM didn’t just build a website. It rebuilt resilience.

The next time you check the radar before a weekend BBQ, remember: that little map is powered by one of the most sophisticated and secure data platforms in the world. And that’s worth every cent.

That said, the public frustration is real, and instructive. The BOM tried to fly below the radar, but forgot that it IS the radar. A critical service Australians rely on and mostly love. The criticism becomes that much harsher when things appear to go wrong with something we love.

However when we combine world‑class infrastructure with world‑class user practice - co‑design, beta, A/B testing, signalling changes, staged releases and plain‑English comms - we can get a platform that stands up to cyclones, cyber threats and criticism alike.

That’s the point of flying above the radar: you see any storms sooner, and can steer accordingly.

Read full post...

Friday, November 22, 2024

Submission regarding under 16 access to social media networks

I’m not terribly sympathetic to the Australian Government’s attempt to ban social media access for under-16s.

There are many reasons why a blanket approach is bad - from removing parental control to removing support networks for at-risk and diverse teens.

However, if they’re going to propose it, at least they should have a solid plan on how to implement it effectively in a way that might do some good. Which the current bill doesn’t really offer.

So I worked this morning with ChatGPT to write a submission addressing the failings of the amendment bill - and this would be my submission if there was a way to make a submission (which there is not).

Submission to Online Safety Consultation 

By Craig Thomler

Subject: Response to Proposed Social Media Age Restrictions under the Online Safety Amendment (Social Media Minimum Age) Bill 2024

Introduction

The Online Safety Amendment (Social Media Minimum Age) Bill 2024 is a step in the right direction but risks achieving more harm than good in its current form. While its intention to protect children online is commendable, the proposed measures are overly simplistic and disproportionately punitive, failing to address the nuances of a complex digital landscape.

Blanket restrictions won’t stop harm; they’ll push it into unregulated spaces. Small platforms face extinction under these penalties, while big tech barely flinches. The amendment sacrifices privacy in the name of safety, creating new risks instead of reducing them.

To achieve its goals, the amendment must shift from broad-brush penalties to precision policy by integrating smarter, fairer solutions that work for all stakeholders. Through tailored compliance, strengthened privacy protections, and parental involvement, we can build a safer digital environment without alienating users or stifling innovation.

Harms Caused by the Proposed Amendment


1. Negative Impacts on Smaller Platforms

  • Disproportionate Burden: The flat penalties of 30,000 penalty units fail to consider the operational and financial capacity of small-to-medium-sized platforms. These businesses often lack the resources to implement complex age-verification systems, potentially forcing them out of the market.
  • Stifling Innovation: Smaller platforms, many of which cater to niche communities or serve educational purposes, may cease operations due to compliance costs and risks of high penalties.
2. Insufficient Privacy Safeguards

  • Data Mismanagement Risks: Requiring platforms to collect and store sensitive age-verification data increases the risk of breaches, identity theft, or misuse by malicious actors. The amendment does not include clear guidelines on data minimisation or secure destruction.
  • Intrusiveness: Intrusive mechanisms, such as uploading government IDs, can deter users and create additional risks if such data is mishandled.

3. Inadequate Addressing of Online Harms

  • Overfocus on Age Restrictions: By limiting access based solely on age, the amendment fails to address broader issues such as exposure to harmful content, algorithmic manipulation, and the role of content moderation.
  • Circumvention Risks: Children can easily bypass age restrictions by creating fake accounts or using anonymisation tools (e.g., VPNs), undermining the efficacy of the measures.
4. Alienation and Inequity
  • Exclusion from Social Development: Adolescents (13–16 years) rely heavily on social media for peer interaction, cultural participation, and identity development. Blanket restrictions risk isolating them from their social circles, causing feelings of exclusion and rebellion.
  • Reduced Parental Autonomy: The amendment removes discretion from parents and guardians, who are better positioned to decide when and how their children engage with social media.
5. Unfair Application of Penalties
  • Ineffectiveness for Larger Platforms: For major platforms (e.g., Meta, TikTok), the flat penalty is negligible compared to their revenue. This reduces the amendment’s deterrent effect and gives larger platforms a competitive advantage over smaller networks.

Proposed Improvements

1. Revised Penalty Structure

  • Introduce a sliding-scale penalty system based on platform revenue, user base, and level of non-compliance:
    • Platforms with annual global revenue under $5 million AUD: Maximum fine of 1,000 penalty units.
    • Platforms with annual global revenue between $5 million and $100 million AUD:
      • Base penalty: 2% of Australian revenue.
      • Additional penalties for repeated breaches: Public accountability measures (e.g., public notices or restrictions on ad targeting in Australia).
    • Platforms with annual global revenue exceeding $100 million AUD:
      • Base penalty: 5% of Australian revenue or $10 million AUD, whichever is greater.
      • Additional penalties for repeated breaches:
        • Public reporting of non-compliance on government and platform websites.
        • Platform-wide visibility restrictions in Australia (e.g., throttling algorithmic content recommendations for non-compliant platforms).
2. Verified Parent/Guardian Discretion
  • Introduce a Parental Consent Framework to allow parents/guardians to grant access to age-restricted platforms for children under 16:
    • Consent must be verified through secure processes, such as:
      • Linking to a verified parental account.
      • Submitting proof of guardianship alongside consent forms.
    • Platforms must enable granular parental controls, allowing guardians to monitor and manage their child’s activity.
    • Require platforms to provide education resources for parents on managing online safety risks.
3. Government-Endorsed Digital Age Verification System (DAVS)

  • Develop a Digital Age Verification System (DAVS) that:
    • Verifies user ages anonymously through tokens or hashed data.
    • Provides a standardised, privacy-focused solution for all platforms, reducing compliance costs.
    • DAVS integration to be mandatory for large platforms, while optional (but subsidised) for smaller ones.
4. Strengthened Privacy Protections
  • Mandate data minimisation practices:
    • Platforms must only collect data strictly necessary for age verification.
    • All verification data must be encrypted and destroyed within 14 days of account approval.
    • Introduce regular privacy audits overseen by the Office of the Australian Information Commissioner (OAIC), with public reporting of results.
    • Prohibit platforms from using age-verification data for advertising or algorithmic purposes.
5. Non-Monetary Penalties
  • Public Notices of Non-Compliance: Published on the eSafety Commissioner’s website and the platform’s Australian homepage.
  • Operational Restrictions: For repeated breaches, restrict platform visibility (e.g., content distribution) until compliance is demonstrated.
  • Mandatory User Notifications: Platforms must notify all Australian users about their non-compliance and remediation steps.
6. Grant Support for Smaller Platforms
  • Establish a Small Platform Support Fund to assist platforms with annual global revenue under $5 million AUD in adopting compliant age-verification systems:
    • Grants covering up to 80% of compliance costs (e.g., DAVS integration or equivalent).
    • Eligibility contingent on adherence to a simplified compliance framework.
7. Education and Awareness Campaigns
  • Fund national education initiatives to promote:
    • Digital literacy for children and parents.
    • Awareness of online safety resources and the amendment’s goals.
  • Collaborate with schools to integrate digital safety into curricula.

Conclusion

The Online Safety Amendment (Social Media Minimum Age) Bill 2024, while well-intentioned, risks significant harm if implemented in its current form. The proposed measures disproportionately impact smaller platforms, fail to adequately protect privacy, and overlook the broader risks of online harm and exclusion.
To create an effective and equitable framework, I strongly urge the adoption of the proposed improvements in this submission. These include proportional penalties, a government-endorsed age verification system, strengthened privacy protections, and parental involvement. By addressing these gaps, we can achieve a balanced policy that protects vulnerable users without stifling innovation or overburdening businesses.
I look forward to seeing a revised and refined framework that aligns with these principles.

Thank you for considering this submission.

Read full post...

Wednesday, April 05, 2023

What does AI mean for entrants into the job market?

Last week my youngest daughter graduated from university with an IT-related degree (with second-class honours - she did better than me!)

At her graduation ceremony, the occasional speaker, who was a public and not-for-profit sector CTO, congratulated graduates for being some of the last to graduate without the support and integration of large language model AIs in their degrees. She also commented that her speech was partially written by ChatGPT.

While I've used generative AIs in my work and personal life for more than two years, these comments really struck home for me how integrated these AIs will become in everyone's lives.

No-one being educated today should be denied access to AI support to help them to grasp concepts and elevate their work to the highest possible standard.

No-one working in an office should be denied access to support AI to help them perform at their best every day.

Organisations that seek to cut off access to these AIs, because it's somehow 'cheating' - should note how calculators used to be seen as tools for cheating by teachers, and some workplaces resisted the introduction of computers because executives and managers didn't type themselves - they dictated to typing pools.

Indeed I remember, in the last 20 years, working for a Secretary in a very large Federal government department who did not have a computer in her office. She had her EA print all her emails and responded via dictation, which her EA typed up. To present a new look website to her, we had to print dozens of pages and she'd point to a button and ask what happened if the user clicked it.

This doesn't cut the mustard in the AI Age we've now entered. We need to give our students and workers access to the most modern tools to help them perform at peak.

So when I saw the following tweet, I had to ask GPT4 for a response, using her tweet verbatim as the prompt.


This is what GPT4 generated:

As an AI language model, I can understand why some members of Gen Z might be concerned about the impact of AI on their careers and future prospects. However, I believe that AI is not necessarily a threat but can be an opportunity if approached in the right way. Here are a few tips that might help:

  1. Develop skills that are complementary to AI: Instead of competing with AI, focus on developing skills that complement it. For example, skills that require human judgment, empathy, and creativity are still in demand and will continue to be valuable even as AI advances.
  2. Stay curious and keep learning: AI is still a relatively new field, and there is a lot to learn. Keep up with the latest trends and developments, take online courses or attend workshops to learn new skills, and stay curious about how AI is being applied in different industries.
  3. Collaborate with AI: Rather than seeing AI as a competitor, try to work alongside it. AI can be a powerful tool for automating tasks and making processes more efficient, so be open to collaborating with AI in your work.
  4. Embrace change and be adaptable: The job market is constantly changing, and the skills that are in demand today may not be in demand tomorrow. Be prepared to adapt to new technologies and trends, and don't be afraid to take on new challenges and learn new skills.
  5. Develop a growth mindset: Finally, cultivate a growth mindset and embrace the idea that learning and development are ongoing processes. See challenges as opportunities for growth and be open to new experiences and ideas.

Overall, it's important to remember that AI is still in its early stages, and it's difficult to predict exactly how it will impact the job market. However, by staying curious, learning new skills, and embracing change, you can position yourself to thrive in the new world of work.


Now Jenny felt this was a trifle impersonal to use without a human edit - which is fair enough. 

AI shouldn't be used to automate engagement, but to help streamline it. There's always going to be a role for humans to review and adjust messages to match the audience, using the context a human has that an AI does not.

However, it is possible to get AI to add personality - you simply have to tell it what personality you want it to add.

So I generated three further versions...

Using the voice of Albert Einstein:

Intelligence is not a competition between humans and machines, but rather a collaboration. Instead of fearing the advances of artificial intelligence, we should focus on developing complementary skills that require human judgment, empathy, and creativity. One must always remain curious and keep learning, for knowledge is the key to unlocking new opportunities. Embrace change and be adaptable, for the job market is constantly evolving. Finally, approach challenges with a growth mindset and see them as opportunities for personal development.

Using the voice of Arthur C Clarke:

As I've always maintained, the only way to discover the limits of what is possible is to venture past them into the impossible. In the age of artificial intelligence, we should not fear the rise of the machines but embrace it, for it can offer us a new path to knowledge and discovery. Instead of competing with AI, we should focus on developing skills that are complementary to it, such as our unique human abilities of creativity, intuition, and empathy. Learning must be a lifelong pursuit, and we must keep up with the latest advancements to stay relevant in the changing landscape of the job market. The ability to adapt and evolve is paramount in this new era, and we should see challenges as opportunities for growth and transformation.

And, finally, in the voice of Mahatma Gandhi

Artificial intelligence is a new force in our world, and while it may bring challenges, we must approach it with a peaceful and positive mindset. Rather than viewing AI as a threat, we should strive to develop skills that complement it, such as our unique human capacities for creativity, empathy, and compassion. Learning should be a lifelong pursuit, and we must stay curious and open-minded to the possibilities of AI. We should collaborate with AI, not compete with it, for only by working together can we achieve true progress. Let us embrace change and be adaptable, for the world is constantly evolving, and challenges are opportunities for growth. With a growth mindset and a spirit of cooperation, we can create a future where AI and humanity coexist in harmony.

Whether perfectly true to the originals or not, these variations were produced in a few minutes, far faster than a human could perform the same task. 

With AI the door is open to producing content in varied voices and perspectives at great speed.

Read full post...

Tuesday, April 04, 2023

2nd Australian Responsible AI Index launched - calls for government to regulate sooner rather than later

Today marked the unveiling of the 2nd Australian Responsible AI Index, accompanied by urgent appeals for the government to intervene and curb the potential misuse of artificial intelligence (AI). 

The Australian Financial Review provided comprehensive coverage of this critical topic, revealing that a mere 3% of Australian companies are managing the adoption and continuous use of AI in a responsible manner.

As AI permeates almost every facet of business operations, it is crucial that its management and regulation extend beyond vendors and IT teams, ensuring responsible policies are in place for both the business and society as a whole.

The Index report disclosed several key findings:

  • The average Responsible AI Index score for Australian organisations has remained stagnant at 62 out of 100 since 2021.
  • While a significant 82% of respondents believe they are adopting best-practice approaches to AI, a closer look reveals that only 24% are taking conscious steps to guarantee the responsible development of their AI systems.
  • There has been a growth in organisations with an enterprise-wide AI strategy linked to their broader business strategy, with the figure rising from 51% in 2021 to 60%.
  • Among those organisations, only 34% have a CEO who is personally committed to spearheading the AI strategy.
  • Organisations with CEO-led AI strategies boast a higher RAI Index score of 66, compared to a score of 61 for those without direct CEO involvement.
  • A total of 61% of organisations now recognise that the advantages of adopting a responsible AI approach outweigh the associated costs.

The Responsible AI Index serves as a timely reminder for the Australian government to act swiftly in the face of these findings, reinforcing the need for a more responsible approach towards AI implementation across the board.

Read full post...

Italy bans ChatGPT (over privacy concerns)

As the first major action by a nation to limit the spread and use of generative AI, Italy's government has taken the step to formally ban ChatGPT use not only by government employees, but by all Italians.

As reported by the BBC, "the Italian data-protection authority said there were privacy concerns relating to the model, which was created by US start-up OpenAI and is backed by Microsoft. The regulator said it would ban and investigate OpenAI 'with immediate effect'."

While I believe this concern is rooted in a misunderstanding of how ChatGPT operates - it is a pre-trained AI that doesn't integrate or learn from the prompts and content entered into it - the fact that OpenAI does broadly review this submitted data to improve the AI's responses means there is enough of a concern for a regulator to want to explore it further.

Certainly I would not advise entering content that is private, confidential or classified into ChatGPT, but except in very specific cases, there's little to no privacy risk of your data being reused or somehow repurposed in nefarious ways.

In contrast the Singaporean government has built a tool using ChatGPT's API to give 90,000 public servants a 'Pair' in Microsoft Word and other applications they can use to accelerate writing tasks. The government has a formal agreement with OpenAI over not using any data prompts in future AI training.


What Italy's decision does herald is that nations should begin considering where their line is for AIs. While most of the current generation of large language models are pre-trained, meaning prompts from humans don't become part of their knowledge base, the next generation may include more capability for continuous finetuning, where information can continually be ingested by these AIs to keep improving their performance.

Specific finetuning is available now for certain AIs, such as OpenAI's GPT3 and AI21's Jurassic, which allows an organisation to finetune the AI to 'weight it' towards delivering better results for their knowledge set or specific goals. 

In government terms, this could mean training an AI on all of Australia's legislation to make it better able to review and write new laws, or on all the public/owned research on a given topic to support policy development processes.

It makes sense for governments to proactively understand the current and projected trajectory of AI (particularly generative AI) and set some policy lines to guide the response as those capabilities emerge.

This would help industry develop within a safe envelope rather than exploring avenues which governments believe would create problems for society.

Read full post...

The ease at which bias creeps into AI

Removing AI bias is a critical component in the design and training of AI models. However, despite the care taken, it can be incredibly easy for bias to creep in.

This is because we often don't see our own biases and, even at a macro level as a species, we may hold biases we are not aware of that fundamentally impact how our AIs perform.

A great example I came across the other week was in image generation AIs, when asked to create a selfie of a group of people. There is a tendency for AI models to portray a smiling group, whatever their era or cultural background.

This shows a bias: many groups in history traditionally didn't smile in photos, but the western norm has largely become smiling, and so the training sets these AIs use are biased towards smiling, regardless of the culture or time period of the people represented.

AI generated image of French WW1 soldiers selfie

Read full post...

TikTok banned on Australian government devices

The Australian government has finally banned the installation and use of TikTok on government devices.

This has been a long-time coming, with a number of other nations having already banned the use of TikTok on government devices.

With TikTok owned by a Chinese company with very close ties to the Chinese government, this has long been an area I've been concerned about, and I'm glad they've finally made this decision.

While the vast majority of social networking tools used by Australians are owned overseas, most are domiciled in nations that have long-term alignments with Australian interests, such as the USA. While there's a similar risk that the US government through legislation could access information that Australia's public servants put on these networks, or weight what our government officials see when they use these platforms, this risk is generally considered low due to the alignments.

With TikTok this alignment was far weaker. Chinese hackers are constantly launching cyberattacks against Australian government agencies and companies, seeking commercial and political advantage.

As such, the risk and consequences of an Australian public servant being compromised or hacked, and information from government systems exposed, were significantly greater with TikTok than other services.

Read full post...

Friday, March 31, 2023

An example of GPT4 at work

Given that many folks may be observing but not actually using AIs such as GPT3 and GPT4 as yet, I wanted to give a quick example of how these AIs perform in practice.

Here's an example where I asked an AI to write a treatise on the Westworld notion that a human consciousness only requires 10,247 lines of code.

My prompt:

Hi ChatGPT. In the TV series Westworld, it is suggested that a human consciousness can be defined in 10,247 lines of code. Please write a treatise on why this would be the case and a process that a researcher or psychologist might use to understand and produce this code.

GPT4 response:

Title: The Quest for Human Consciousness: Decoding the 10,247 Lines of Code

Abstract:

The television series Westworld presents the idea that human consciousness can be distilled down to 10,247 lines of code. While this is a fictional concept, it raises intriguing questions about the nature of human consciousness and how it might be replicated or understood through computational means. In this treatise, we explore the theoretical underpinnings of this concept, discuss the possible reasons for this specific number of code lines, and outline a potential process for a researcher or psychologist to decipher and generate this code.

Introduction

The concept of condensing human consciousness into 10,247 lines of code is a fascinating idea that requires an understanding of the human mind's complexity and the computational approaches used to mimic it. This treatise aims to explore this notion, touching upon the interdisciplinary fields of cognitive science, psychology, neuroscience, and artificial intelligence.

Theoretical Underpinnings

The idea that human consciousness can be expressed in a specific number of code lines is built on several assumptions:

a. Reductionism: This perspective argues that complex phenomena, such as consciousness, can be reduced to simpler, more fundamental components. In this case, it implies that human consciousness can be explained and replicated by breaking it down into discrete elements represented by lines of code.

b. Computational Theory of Mind: This theory posits that the human mind operates like an information processor, with mental processes akin to computational algorithms. If human consciousness adheres to this model, it might be possible to represent it as a set of algorithms or code lines.

c. Artificial Intelligence: The rapid advancements in AI research and neural networks support the idea that replicating human-like thinking, learning, and problem-solving abilities is possible through computational means.

Why 10,247 Lines of Code?

The specific number, 10,247, is likely an arbitrary choice made by the creators of Westworld to illustrate the complexity and uniqueness of human consciousness. Nonetheless, the number serves as a captivating starting point for discussions about the feasibility of representing consciousness in code.

A Process for Decoding and Producing the Consciousness Code

To understand and generate the 10,247 lines of code representing human consciousness, a researcher or psychologist might follow this process:

a. Mapping Mental Processes: Start by identifying the key cognitive processes and mental states that contribute to human consciousness. This could involve categorizing various cognitive functions, emotions, and subjective experiences.

b. Investigating Neural Correlates: Examine the brain regions and neural networks responsible for the identified cognitive processes, using techniques such as fMRI, EEG, and MEG.

c. Developing Computational Models: Develop algorithms and computational models that replicate the observed cognitive processes and neural activities. This could involve utilizing artificial neural networks, machine learning, and other AI techniques.

d. Code Generation: Translate the developed computational models into code, keeping track of the number of lines used. Iterate and optimize the code to reduce its length without sacrificing its accuracy in replicating human consciousness.

e. Verification and Validation: Test the generated code by comparing its outputs with empirical data collected from human subjects, adjusting and refining the code as needed.

Conclusion

While the notion of human consciousness being distilled down to 10,247 lines of code is a fictional idea, it sparks fascinating discussions about the nature of consciousness and the potential for replicating it through computational means. By exploring interdisciplinary research in cognitive science, psychology, neuroscience, and artificial intelligence, we can attempt to unravel the mysteries of the human mind and push the boundaries of what is possible in understanding and emulating consciousness.


My follow-on prompt:

So as a neuroscientist, mathematician and psychology, using the treatise above, write the actual code for a human consciousness.

GPT4 response:

As an AI language model, I cannot provide you with the actual code for human consciousness, since it is a highly complex and currently unsolved problem in neuroscience, psychology, and artificial intelligence. The idea of condensing human consciousness into a specific number of lines of code, as presented in the TV series Westworld, is purely fictional.

Moreover, creating an actual code for human consciousness would require a deep understanding of the human brain, which we have not yet achieved. It would also necessitate advancements in artificial intelligence, neural networks, and cognitive modeling that are currently beyond our reach.

However, I can outline some of the components that would potentially be part of the code, based on the current understanding of human cognition and artificial intelligence research:

Sensory input processing: Code to process and interpret inputs from various sensory modalities such as vision, audition, touch, taste, and smell.

Memory systems: Code to implement short-term and long-term memory systems, enabling the storage and retrieval of information.

Attention mechanisms: Code to allocate cognitive resources to relevant stimuli and filter out irrelevant information.

Emotion processing: Code to simulate emotional states and their influence on cognition, decision-making, and behavior.

Language processing: Code to comprehend and generate natural language, enabling communication.

Decision-making and problem-solving: Code to evaluate options and make choices based on goals, constraints, and available information.

Learning mechanisms: Code to adapt and improve cognitive processes through experience, including reinforcement learning, supervised learning, and unsupervised learning.

Self-awareness and introspection: Code to model self-awareness and the ability to reflect on one's own mental states.

Social cognition: Code to understand and interpret the mental states, intentions, and emotions of others, enabling social interaction.

Conscious experience: Code to simulate subjective experiences and the phenomenological aspects of consciousness.

It is important to note that these components are only a rough outline of what might be involved in replicating human consciousness, and the actual implementation would be far more complex and intricate. Developing a comprehensive code for human consciousness remains a formidable challenge that will likely require many more years of research and collaboration across multiple scientific disciplines.


Thursday, March 30, 2023

Is Artificial Intelligence safe? It depends who you ask...

Over the last few days there have been significant calls for a slowdown or pause in artificial intelligence research and development, or at least for a pause on public implementations of AI.

There's significant rationale behind this - with the arrival of GPT-4 (which I've been experimenting with extensively) we have seen another huge leap in AI capability.

We've also seen many large companies announce they are working to integrate this level of AI into their services in ways that allow the public to use (or misuse) this capability.

Some of this is extremely valuable - such as integrating a writing, formula and presentation assistant into Microsoft's suite of tools. Some appears risky - such as Snapchat's release of an AI 'friend' into its paid service in February which, as the video The AI Dilemma (linked) demonstrates, can incidentally be used to help sexual predators groom children (watch from 47 min to 49 min for this specific segment).

We've also seen over a thousand AI luminaries and researchers call for a pause on AIs more sophisticated than GPT-4 (letter here, article about it here) - a call which has received particular attention because Elon Musk signed it, but which is actually notable for the calibre and breadth of the other industry experts and AI company CEOs who signed.

Whether or not government is extensively using AI, AI is now having significant impacts on society. These will only increase - and extremely rapidly.

Examples like using the Snapchat AI for grooming are the tip of the iceberg. It is now possible - with three seconds of audio (using a Microsoft research system) - to create a filter mimicking any voice, rendering voice-recognition security systems largely useless.

In fact there have already been several cases where criminals call individuals to capture their voicemail message (in their voice) or their initial greeting, then use that recording to pass voice authentication on the individual's accounts and steal funds.

Now this specific example isn't new - the first high-profile case occurred in 2019.

However the threshold for accessing and using this type of technology has dramatically come down, making it accessible to almost anyone.

And this is only one scenario. Deep fakes can also mimic appearances, including in video, and AIs can also be used to simulate official documents or conversations with organisations to phish people.

That's alongside CV fakery, using AI to cheat on tests (in schools, universities and the workplace) and the notion of outsourcing your job to AI secretly, which may expose commercially sensitive information to external entities.

And we haven't even gotten to the risks of AI that, in pursuit of its goal or reward, uses means such as replicating itself, breaking laws or coercing humans to support it.

For governments this is an accelerating potential disaster, and it needs the full attention of key teams to ensure they are designing systems that cannot be exploited - for example, by someone asking an AI to read a system's code and identify potential vulnerabilities.

Equally the need to inform and protect citizens is becoming critical - as the Snapchat example demonstrates.

With all this, I remain an AI optimist. AI offers enormous benefits for humanity when used effectively. However, with the proliferation of AI - to the extent that it is now possible to run a GPT-3-level AI on a laptop (using the Alpaca research model) - government's approach to artificial intelligence needs to be proactive.



Monday, March 06, 2023

Artificial Intelligence isn't the silver bullet for bias. We have to keep working on ourselves.

There's been a lot of attention paid to AI ethics over the last few years due to concerns that use of artificial intelligence may further entrench and amplify the impact of subconscious and conscious biases.

This is very warranted. Much of the data humans have collected over the last few hundred years is heavily impacted by bias. 

For example, air-conditioning temperatures are largely set based on research conducted in the 1950s-70s in the US, on offices predominantly inhabited by men and folks wearing heavier materials than worn today. It's common for many folks today to feel cold in offices where air-conditioning is still set for men wearing three-piece suits.

Similarly, many datasets used to teach machine learning AI suffer from biases - whether based on gender, race, age or even cultural norms at the time of collection. We only have the data we have from the last century and it is virtually impossible for most of it to be 'retrofitted' to remove bias.

This affects everything from medical to management research, and when such data is used to train AI, its biases can easily affect the AI's capabilities. Consider, for example, the incredibly awkward incident a few years ago when Google's image-recognition AI incorrectly labelled Black people as 'gorillas'.

How did Google solve this? By preventing Google Photos from labelling any image as a gorilla, chimpanzee, or monkey – even pictures of the primates themselves - an expedient but poor solution, as it didn't fix the bias.

So clearly there's need for us to carefully screen the data we use to train AI to minimise the introduction or exacerbation of bias. And there's also need to add 'protective measures' on AI outputs, to catch instances of bias, both to exclude them from outputs and to use them to identify remaining bias to address.
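As a minimal sketch of what 'screening the data' can look like in practice (the field names, toy data and single check below are purely illustrative - a real fairness audit goes much further), one simple starting point is checking how groups are represented in a training set and whether outcomes differ sharply between them:

from collections import Counter, defaultdict

def screen_for_representation(records, group_field, outcome_field):
    """Report group representation and positive-outcome rates in a training set.

    A large skew in either is a prompt for investigation, not proof of bias -
    and passing this check certainly doesn't prove the data is bias-free.
    """
    counts = Counter(r[group_field] for r in records)
    positives = defaultdict(int)
    for r in records:
        if r[outcome_field]:
            positives[r[group_field]] += 1

    total = len(records)
    for group, n in counts.items():
        rate = positives[group] / n
        print(f"{group}: {n / total:.0%} of records, {rate:.0%} positive outcomes")

# Illustrative toy data only - real screening would cover many more attributes.
training_data = [
    {"gender": "female", "hired": True},
    {"gender": "female", "hired": False},
    {"gender": "male", "hired": True},
    {"gender": "male", "hired": True},
    {"gender": "male", "hired": True},
]
screen_for_representation(training_data, "gender", "hired")

A skew flagged by a check like this should prompt investigation of how the data was collected, rather than simply hiding the attribute.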

However, none of this work will be effective if we don't continue to work on ourselves.

The root of all AI bias is human bias. 

Even when we catch the obvious data biases and take care when training an AI to minimise potential biases, it's likely to be extremely difficult, if not impossible, to eliminate all bias altogether. In fact, some systemic unconscious biases in society may not even be visible until we see an AI emulating and amplifying them.

As such no organisation should ever rely on AI to reduce or eliminate the bias exhibited by its human staff, contractors and partners. We need to continue to work on ourselves to eliminate the biases we introduce into data (via biases in the queries, process and participants) and that we exhibit in our own language, behaviours and intent.

Otherwise, even if we do miraculously train AIs to be entirely bias free, bias will get reintroduced through how humans selectively employ and apply the outputs and decisions of these AIs - sometimes in the belief that they, as humans, are acting without bias.

So if your organisation is considering introducing AI to reduce bias in a given process or decision, make sure you continue working on all the humans that remain involved at any step. Because AI will never be a silver bullet for ending bias while we, as humans, continue to harbour biases of our own.


Wednesday, March 01, 2023

Does Australia have the national compute capability for widespread local AI use?

There's been a lot of attention on the potential benefits and risks of artificial intelligence for Australia - with the Department of Industry developing the Artificial Intelligence (AI) Ethics Framework and the DTA working on formal AI guidelines.

However, comparatively less attention is placed on building our domestic AI compute capability - the local hardware required to operate AIs at scale.

The OECD also sees this as a significant issue - and has released the report 'A blueprint for building national compute capacity for artificial intelligence' specifically to help countries that lack sufficient national compute capability.

As an AI startup in Australia, we've been heavily reliant on commercial AI capabilities out of the US and Europe. This is because there are no local providers of generative AI at commercial prices. We did explore building our own local capability and found that the cost of physical hardware and infrastructure was approximately 10x the cost of the same configurations overseas.

There have been global shortages of the hardware required for large AI models for a number of years. Unfortunately, Australia tends to be at the far end of these supply chains, with even large commercial cloud vendors unable to provision the necessary equipment locally.

As such, while we've trained several large AI models ourselves, we've not been able to locate the hardware or make a commercial case for paying the costs of hosting them in Australia.

For some AI uses an offshore AI is perfectly acceptable, whether provided through a finetuned commercial service or custom-trained using an open-source model. However, there's also many other use cases, particularly in government with jurisdictional security requirements, where a locally trained and hosted AI is mandated.

A smaller AI model, such as Stable Diffusion, can run on a laptop; however, larger AI models require significant dedicated resources, even in a cloud environment. Presently, few organisations can justify the cost of sourcing the hardware and capability to run these within Australian jurisdictions.
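As a rough back-of-envelope sketch (assuming 16-bit weights and ignoring the additional memory needed for activations, context and serving overhead), the gap between 'runs on a laptop' and 'needs dedicated infrastructure' is largely a function of parameter count:

def estimated_weight_memory_gb(parameters, bytes_per_parameter=2):
    # 16-bit (2-byte) weights; training and long-context inference need considerably more.
    return parameters * bytes_per_parameter / 1e9

# Parameter counts are approximate and for illustration only.
models = {
    "Stable Diffusion (~1 billion parameters)": 1e9,
    "A 7-billion-parameter open-source language model": 7e9,
    "A 175-billion-parameter language model": 175e9,
}

for name, params in models.items():
    print(f"{name}: ~{estimated_weight_memory_gb(params):.0f} GB just to hold the weights")

The first two fit within consumer or workstation hardware; the last requires racks of data-centre GPUs - exactly the class of equipment that has been hardest to provision within Australia.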

And that's without considering the challenge in locating sufficient trained human resources to design, implement and manage such a service.

This is an engineering and production challenge, and will likely be resolved over time. However with the high speed of AI evolution, it is a significant structural disadvantage for Australian organisations that require locally hosted AI solutions.

If Australia's key services have to rely on AI technologies that are several generations behind world standards, this will materially impact our global competitiveness and capability to respond.

As such, alongside worrying about the ethics and approaches for using AI, Australian governments should also reflect on what is needed to ensure that Australia has an evolving 'right size' national compute capability to support our AI requirements into the future.

Because this need is only going to grow.


Friday, February 24, 2023

To observe future use of AI, watch Ukraine

I'm not given to commenting regularly on global events, such as major wars, through this blog. However, given the current war, I'm making a brief exception to comment on the active use of advanced technologies, including AI, by both Ukraine and Russia.

The Russian invasion of Ukraine has seen some of the most advanced uses of technology on the battlefield in history. Some of this involves AI, some other technologies, but all of it demonstrates the practical potential of these technologies and should be considered within any nation's future defence planning.

Now drones have been used in previous conflicts in Afghanistan and elsewhere, with militia forces modifying off-the-shelf products into aerial scouts and IEDs. Meanwhile, US forces have used custom-built, human-controlled Predators for over ten years to pinpoint and target enemy commanders and units.

However, the war in Ukraine represents the largest deployment and the most versatile use of drones to-date by regular armies opposing each other in the field.

For example, both Ukraine and Russia have successfully used autonomous watercraft in assaults. These unmanned above and below water drones have been used to inflict damage on opposing manned vessels and plant and clear mines at a fraction of the cost of building, crewing and maintaining manned vessels.

The Ukrainian attack on Russia's Black Sea Fleet in port exhibits signs of being one of the first uses of a drone swarm in combat, with a group of kamikaze above and below water drones, potentially with aerial drone support, used in concert to damage several Russian ships.

While perhaps not as effective as was hoped, this is a prototype for future naval drone use and heralds that we are entering an era where swarms of relatively cheap and disposable drones - partially or completely autonomous to prevent signal blocking or hijacking - are used alongside or instead of manned naval vessels to inflict damage and deter an attacker, acting as a force magnifier.

The use of autonomous mine layers offers the potential for 'lay and forget' minefields astride enemy naval routes and ports, to again limit and deter enemy naval movements. Again the lower cost and disposability of these drones, compared to training explosive handling human divers and building, maintaining, defending and crewing manned naval minelaying and removal vessels, makes them a desirable alternative.

We've also seen extensive use of aerial drones by both sides in the conflict to spot and target enemy combatants, allowing more targeted use of artillery and manpower and helping reduce casualties by lifting aspects of the fog of war. Knowing the numbers and strength of your opponent on the battlefield greatly enhances a unit's capability to successfully defend or assault a position.

While many of these drones are human controlled, there's been use of AI for situations where direct control is blocked by enemy jamming or hacking attempts. AI has also been used for 'patrol routes' and to 'return' home once a drone has completed a mission - complete with ensuring that drones fly an elusive course to conceal the location of their human controllers.

The Ukrainian war has even seen the first publicly video-recorded drone-on-drone aerial combat, with a Ukrainian drone ramming a Russian drone to knock it out of the sky and remove the enemy's best source of intelligence.

These examples of drone use aren't totally new to warfare. 

Hundreds of years ago unmanned fire ships were used in naval combat - loaded with explosives and left to drift into enemy warships and explode. 

Similarly, early aerial combat by humans, in World War One, saw pilots take pistols and bricks aloft to fire at enemy planes or drop on enemy troops below. Just as drones today are being modified to carry grenades to drop on infantry positions or used to ram opposing drones.

The war will help advance the technology and improve the tactics. If nothing else Ukraine is a test ground for learning how to effectively use drones within combined forces to improve overall military effectiveness and reduce casualties.

And artificial intelligence is becoming increasingly important as a control alternative when an enemy blocks signal or attempts to take control of an army's drone assets.

We need to put these learnings to use in our own military planning and acquisitions, so that Australia's military becomes capable of fighting the next war, rather than the last.


Monday, February 20, 2023

Do AIs dream of electric explainability?

One of the primary arguments against artificial intelligence in many processes related to decision-making is lack of explainability.

Explainability (also known as interpretability) refers to being able to explain how a machine learning AI model functions to produce a given output in a way that “makes sense” to a human being at an acceptable level.

Think of maths classes at school, where you may have been asked to 'show your working' - to write out the steps you took to get from the initial problem to your solution.

For learning AIs, those that are trained on massive datasets to reach a level of capability, explainability can be highly challenging.

Even when the initial machine learning algorithms and the dataset used to train the model are made fully transparent, the internal method by which the AI derives a solution may not be explainable, and hence the AI doesn't meet the explainability test.

When presented with identical inputs for a decision process, different AIs, using the same initial algorithms and trained on the same training dataset, might form very different conclusions.

Now this may appear similar to how humans make decisions. 

Give two humans the same information and decision process and, at times, they may arrive at completely different decisions. This might be due to influences from their past experiences (training), emotions, interpretations, or other factors.

When humans make decisions, it is possible to ask them how they arrived at their decision, and how they weighed various factors. And they may be able to honestly tell you.

With AIs it is also possible to ask them how they arrived at a decision.

However, the process they use to respond to this request is the same one they used to arrive at the decision in the first place. The AI model is not self-aware, and as such has no capability for self-reflection or objective consideration.

In fact, most machine learning models have an attention span of only a few thousand words (their context window), so they may not even recall making a decision a few minutes or days before. An AI doesn't have the consciousness to be aware that 'it', as an entity, made the decision.
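As a rough illustration of why that 'forgetting' happens (approximating tokens as words, which real tokenisers don't do, and using an arbitrary window size), here's a minimal sketch of how a long conversation gets trimmed to fit a fixed context window before every request:

def trim_to_context_window(messages, max_tokens=4000):
    """Keep only the most recent messages that fit within the model's context window.

    Anything older is simply never sent to the model, so from the model's
    perspective it never happened.
    """
    kept, used = [], 0
    for message in reversed(messages):
        tokens = len(message.split())  # crude word-count stand-in for real tokenisation
        if used + tokens > max_tokens:
            break
        kept.append(message)
        used += tokens
    return list(reversed(kept))

# A long-running conversation: early messages, including any 'decision' the
# model produced days ago, fall outside the window and are silently dropped.
history = [f"message {i}: " + "words " * 120 for i in range(100)]
print(len(trim_to_context_window(history)), "of", len(history), "messages survive")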

This is unlike a human, who might forget a decision they made, but be conscious they made it and able to 'think back' to when they did to provide reasons for their decision-making.

Asking an AI to explain a decision does not necessarily provide explainability for that decision. What you are getting is the machine learning model's probabilistic choice of letters and words. These may form what seems to be a plausible reason, but it isn't the actual reason at all.

You can even simply tell a machine learning AI that it made a given decision and ask it why it did, and it will write something plausible that justifies that decision.

At a basic level I can easily explain how a machine learning AI, such as ChatGPT or Jurassic, arrives at a given output. It takes the input, parses it through a huge probability engine, then writes an output by selecting probabilistically likely words.

For variability it doesn't always select the highest probability every time, which is why the same input doesn't always result in the same output.
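To make that concrete, here's a minimal sketch (in Python, with made-up scores for a handful of candidate words rather than a real model's vocabulary) of how sampling with a 'temperature' setting turns the same input into different outputs:

import math
import random

def softmax(logits, temperature=1.0):
    # Convert the model's raw scores into a probability distribution.
    # Lower temperature sharpens it towards the top choice; higher flattens it.
    scaled = [score / temperature for score in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_word(words, logits, temperature=1.0):
    # Pick one word at random, weighted by its probability.
    return random.choices(words, weights=softmax(logits, temperature), k=1)[0]

# Made-up candidates and scores for illustration only.
words = ["approve", "defer", "reject", "escalate"]
logits = [2.3, 2.0, 1.1, 0.4]

print("Always picking the top probability:", words[logits.index(max(logits))])
for _ in range(3):
    print("Sampling with temperature 0.8:", sample_next_word(words, logits, temperature=0.8))

The 'decision' that comes back is whichever word the weighted dice landed on; nothing in the process records a reason that could later be retrieved and explained.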

However, this doesn't explain how an AI makes a 'decision' - that is, prefers one specific option over others. It does explain why the same AI, asked the same 'question' (input), may produce diametrically opposed decisions when asked to regenerate its response.

The AI isn't interested in whether a decision is 'better' or 'worse' - simply that it provides an output that satisfies the end user.

There's a Chinese proverb that describes this perfectly:
“A bird does not sing because it has an answer. It sings because it has a song.”
This is why no current machine learning models can be explainable in their decision-making. And why we should not use them in situations where they are making decisions.

Now if you wish to use them as a way to provide information to assist decision-making, or to help write up the decision once it has been made, they have enormous utility.

But if you want explainability in decision making, don't use a machine learning AI.


Wednesday, February 15, 2023

DTA chooses a cautious path for generative AI

The DTA's CEO, Chris Fechner, has advised public servants to be cautious in their use of ChatGPT and other generative AI, as reported by InnovationAus.

This is an unsurprising but positive response. While it suggests public servants use caution, it doesn't close down experimentation and prototyping.

Given how recently generative AI became commercially useful and that most commercial generative AIs are currently based overseas (noting my company has rolled out local prototypes), there are significant confidentiality/security challenges with generative AI for government use, alongside the challenges of accuracy/factualness and quality assurance.

Given I have a public sector background, I began experimenting with these AIs for government use from October 2020. Within a few weeks I was pleasantly surprised at how well an AI such as GPT-3 could produce minutes and briefing papers from associated information, accurately adopting the necessary tone and approach that I had used and encountered during my years in the APS.

Subsequently I've used generative AI to develop simulated laws and policy documents, and to provide insightful advice based on regulations and laws.

This is just the tip of the iceberg for generative AI in government. 

I see potential to accelerate the production of significant amounts of internal correspondence, reports, strategies, intranet content and various project, product and user documentation using the assistance of AI.

There's also enormous potential to streamline the production and repurposing of externally focused content; turning reports into media releases, summaries and social posts; supporting engagement processes through the analysis of responses; development and repurposing of communications materials; and much more.

However, it's important to do this within the context of the public service - which means ensuring that the generative AIs used are appropriately trained and finetuned to the needs of an agency.

Also critical is recognising that generative AI, like digital, should not be controlled by IT teams. It is a business solution that requires skills that few IT teams possess. For example, finetuning and prompt engineering both require strong language capabilities and business knowledge to ensure that an AI is appropriately finetuned and prompted to deliver the outcomes required.

Unlike traditional computing, where applications can be programmed to select from a controlled set of options, or a whitelist used to restrict them to safe and appropriate choices, generative AIs must instead be trained and guided towards the desired behaviour - more akin to parenting than programming.
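To illustrate the contrast (the generate function below is a hypothetical stand-in for whatever model API or locally hosted model an agency uses, not a real library call), a traditional application can enforce its options in code, whereas a generative AI can only be steered by instructions and examples, with checks applied afterwards:

# Traditional computing: the application enforces a fixed set of options.
ALLOWED_CATEGORIES = {"payroll enquiry", "leave request", "IT support"}

def classify_traditionally(user_choice):
    if user_choice not in ALLOWED_CATEGORIES:
        raise ValueError("Option not permitted")
    return user_choice

# Generative AI: outputs can't be enumerated in advance, so the model is
# guided by instructions, then its output is still checked before use.
SYSTEM_PROMPT = (
    "You are a correspondence assistant for an Australian government agency. "
    "Use plain English, do not speculate about policy, and never include "
    "personal details beyond those in the source material."
)

def draft_reply(generate, source_material):
    # `generate` is a hypothetical stand-in for the chosen model interface.
    draft = generate(system=SYSTEM_PROMPT,
                     prompt=f"Draft a reply based on:\n{source_material}")
    # A crude output check - the generative equivalent of the whitelist above.
    if any(term in draft.lower() for term in ("tax file number", "date of birth")):
        raise ValueError("Draft failed the output check - needs human review")
    return draft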

I'm certain that the folks experimenting with generative AI in government are more likely to be on the business end than the IT end - as we saw with digital services several decades ago.

And I hope the public sector remembers the lessons from that period, and that the battles between business and IT are resolved faster and more smoothly than they were with digital.


Thursday, February 09, 2023

AI is not going to destroy humanity

I've read a few pieces recently, one even quoting an Australian MP, Julian Hill, claiming "catastrophic risks" to humanity from AI.

Some of the claims are that "ChatGPT diminishes the ability for humans to think critically", that "AI will eliminate white collar jobs" or even that "AI will destroy humanity".

Even a session I ran yesterday for business owners, about how to productively use ChatGPT in their businesses, had several folks who expressed concern and fear about how AI would impact society.

It's time to take a deep breath and reflect.

I recall similar sentiments at the dawn of the internet, and history records the same at the invention of the printing press. There were similarly many fearful articles and books published in 1999 ahead of the 'Y2K bug' that predicted planes would fall out of the sky and tax systems crash. Even the response of some commentators to the recent Chinese balloon over the US bears the same hallmarks of fear and doubt.

It's perfectly normal for many folks to feel concerned when something new comes along - one could even say it's biologically driven, designed to protect our nomadic ancestors from unknown threats as they traversed new lands.

Stoking these fears of a new technology heralding an unknown future is the stock-in-trade of sensationalists and attention seekers, whereas providing calm and reasoned perspectives doesn't attract the same level of engagement.

Yes, new technology often heralds change and uncertainty. There's inevitably a transition period that occurs once a new technology becomes visible to the public and before it becomes an invisible part of the background.

I'd suggest that AI has existed as a future fear for humanity for many years. It is used by popular entertainment creators to denote the 'other' that we fear - a malevolent non-human intelligence that only wishes us harm. 

From Skynet to Ultron to M3gan, AI has been an easy plot device to provide an external threat for human protagonists (and occasionally 'good' AIs like Vision) to overcome. 

With the arrival of ChatGPT, and the wave of media attention to this improvement to OpenAI's GPT-3, AI stopped being a future fiction and became a present fear for many.

Anyone can register to use the preview for free, and marvel at ChatGPT's ability to distill the knowledge of mankind into beautifully written (if often inaccurate) prose.

And yet, and yet...

We are still in the dawn of the AI revolution. Tools like ChatGPT, while having significant utility and range, are still in their infancy and only offer a fraction of the capabilities we'll see in the next several years.

Despite this, my view is that AI is no threat to humanity, other than to our illusions. 

It is an assistive tool, not an invading force. Like other tools it may be put to both positive and negative uses, however it is an extension of humanity's potential, serving our goals and ambitions.

To me AI is a bigger opportunity than even the internet to hold a mirror up to humanity and see ourselves in a new light.

Humanity is enriched by diverse perspectives, but until now these have largely come from other humans. While we've learnt from nature, using evolved designs to inform our own design, we've never co-inhabited the planet with a non-human intelligence equivalent to, but different from, our own.

AI will draw on all of humanity's knowledge, art and expertise to come to new insights that a human may never consider.

This isn't merely theoretical. It's already been demonstrated by the more specialised AIs we've developed to play games such as Go. When AlphaGo defeated Lee Sedol, one of the world's strongest Go players, 4-1, it taught human players new ways to look at and play the game - approaches no human had ever considered.

Imagine the possibilities that could be unlocked in business and governance by accessing more diverse non-human perspectives. New pathways for improvement will open, and less effective pathways, the illusions that humans are often drawn to, will be exposed.

I use AI daily for many different tasks. This week alone I've used it to help my wife write a much-praised eulogy for her father, to roleplay a difficult customer service situation and work through remediation options, to develop business and marketing plans, to write songs, answer questions, tell jokes and produce tender responses.

AI will change society. Some jobs will be modified, some new ones will be created. It will be harder for humans to hide truth behind beliefs and comfortable illusions.

And we will be the better for it.

