
How Should We Learn in an Age of ‘AI’?

“How Should We Learn in an Age of ‘AI’?” is the first in a series of articles addressing this topical subject.


I’ve created and taught courses on technical subjects for about 20 years now.  I started when I inherited a half-finished course on software supportability in 2001. The Royal Air Force relied on software in all its combat aircraft but knew precious little about software, and less about how to support it.  We needed that course.

After I left the Air Force, I joined a firm called QinetiQ. I discovered that we had a contract to teach safety to all UK Ministry of Defence staff who required it; the classroom was just down the road from our office. I joined the instructing team.

With that experience, I created and taught bespoke safety courses for the Typhoon, Harrier and Raytheon Sentinel platforms.  I also helped create a safety course for the UK Military Aviation Authority.  Since moving to Australia, I have created and sold courses commercially, teaching home workers online for the first time.

It’s still difficult to access system safety training in Australia, and that’s why I started the Safety Artisan. In my business, I teach only online.

The Problem

Recently I’ve been in discussions with colleagues in industry and academia about improving system safety education in Australia.  Because of the COVID-19 pandemic, learning has gone through a revolution.  We are now learning online much more than we ever did; in fact, it’s the ‘New Normal’.

Now another revolution has occurred: generative Artificial Intelligence (AI).

“Generative AI is a set of algorithms, capable of generating seemingly new, realistic content—such as text, images, or audio—from the training data. The most powerful generative AI algorithms are built on top of foundation models that are trained on a vast quantity of unlabeled data in a self-supervised way to identify underlying patterns for a wide range of tasks.”

© 2023 Boston Consulting Group

This presents a challenge to anyone designing an online course that leads to a certification or award. How do we assess students online, when we know that they can use an AI to help them answer the questions?

In some circumstances, the AI could be generating the entire answer and the student would not be tested at all.  What we would really be testing them on is how good they were at using the AI.  (I’m not being facetious. As AI is such a wonderful research assistant, perhaps we should be training students to use it – wisely.)

Enter GPT-4

OpenAI, the creator of GPT-4, makes some big claims for its product.

“GPT-4 is more creative and collaborative than ever before. It can generate, edit, and iterate with users on creative and technical writing tasks, such as composing songs, writing screenplays, or learning a user’s writing style.”


“GPT-4 can accept images as inputs and generate captions, classifications, and analyses.”


“GPT-4 is capable of handling over 25,000 words of text, allowing for use cases like long form content creation, extended conversations, and document search and analysis.”


But perhaps most significant of all is GPT-4’s claimed ‘safety’:

“We spent 6 months making GPT-4 safer and more aligned. GPT-4 is 82% less likely to respond to requests for disallowed content and 40% more likely to produce factual responses than GPT-3.5 on our internal evaluations.”


In other words, GPT-4:

  • Is less likely to regurgitate nasty sludge from the bottom of the web; and
  • Is more likely* to not make stuff up.

*Notice that they said “more likely” – this is not certain or assured.  (More on this in a later article.)

This is because the creators were more selective about the data they used to train the model.  Presumably, this implies that previous efforts just used any old rubbish scraped off the web, but nobody is admitting to that!

The Beginning of an Answer…

One of the academics I’ve met (sorry, but I can’t give them credit, yet) has studied this problem.  They’ve come up with some interesting answers.

In their experiments with GPT-4, they found that it was very good at the things you would expect it to be. It was great at answering questions by gathering and collating facts and presenting written answers.

But it wasn’t good at everything. For example, it could not reflect on the learning that the student had experienced. Similarly, it could not extrapolate from what the student had been taught and apply it to new scenarios or contexts.

Therefore, the way to assess whether students really know their stuff is to get them to do these things. Most assessment marks can still come from straightforward questions, which an AI could help answer. But a few marks, maybe only 20%, should require the student to reflect on what they have learnt and to extrapolate it to a new situation, which they must come up with themselves. This part of the assessment would separate the also-rans from the stars.
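To make the arithmetic of such a split concrete, here is a rough sketch in Python. The 80/20 weights come from the "maybe only 20%" figure above; the function name and the example scores are purely illustrative assumptions, not a published marking scheme.

```python
# Illustrative weights only: 80% of marks from straightforward questions,
# 20% from reflection and extrapolation (an assumed split, not a standard).
FACTUAL_WEIGHT = 80
REFLECTIVE_WEIGHT = 20


def final_mark(factual_pct: float, reflective_pct: float) -> float:
    """Combine two component scores (each 0-100) into a weighted final mark."""
    for score in (factual_pct, reflective_pct):
        if not 0 <= score <= 100:
            raise ValueError("component scores must be between 0 and 100")
    total = FACTUAL_WEIGHT * factual_pct + REFLECTIVE_WEIGHT * reflective_pct
    return total / (FACTUAL_WEIGHT + REFLECTIVE_WEIGHT)


# Hypothetical student who aces the factual questions but scores
# nothing on the reflective part.
ai_reliant = final_mark(100, 0)

# Hypothetical student who does well on both parts.
engaged = final_mark(85, 90)

print(ai_reliant, engaged)
```

Under these assumed weights, a student who scores nothing on the reflective part can never exceed 80%, however good the rest of their answers are.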

…And a Lot More Questions

Now there are obvious, mechanistic reasons why the AI could not perform these tasks. It had not been exposed to a student’s learning and therefore could not process it. Even more difficult would be to take a student’s life and work experience – also unknown to the AI – and use that to extrapolate from the taught content.

(Okay, so there are possible countermeasures to these mechanistic problems.  The next stage is that the AI is exposed to all the online learning alongside the student.  The student also uploads their resume and as much detail as they can about their work to teach the AI.  But this would be a lot of work for the student, just to get those last 20% of the marks. That would probably negate the advantage of using an AI.)

However, the fact is that GPT-4 and its brethren struggle to do certain things that come naturally to us. Humans are great at recognising patterns and making associations, even when they are not logical (e.g. ‘whales’ and ‘Wales’). We also have imagination and emotion. And we can process problems at multiple levels of cognition, coming up with multiple responses that we can then choose from. We also have personal experience and individuality. We are truly creative – original. Most AI still struggles to do these things, or even pretend to.

So, if we want to truly test the human learner, we have to assess things that an AI can’t do well.  This will drive the assessment strategies of all educators who want to teach online and award qualifications.  

And, guess what?  This is where the $$$ are, so it will happen. Before COVID-19, education was a massive export earner: “Australia’s education exports totalled $40bn in 2019.” This is according to the Strategy, Policy, and Research in Education (SPRE).  

This then raises the question:

What Else Can Humans do that AI Can’t (Yet)?

Why? Because if these are the skills on which we will be assessed, then we need to focus on being good at them. They will get us the best marks, so we can compete for the best jobs and wages.  These skills might also protect us from being made redundant (from those well-paid jobs) by some pesky AI!

This is what I’m going to explore in subsequent articles.
