
Researchers have a plan to give the GPT-3 AI some "common sense"

Scientists are giving the best language AI the power of vision, allowing it to become even more indistinguishable from human speech.

Hope Corrigan
2 min read

As anyone who’s awkwardly explained what the word ‘virgin’ means to a young child who turns out to be enquiring about the sticker on some olive oil can attest, context is very important.

Language is intricate and twisted, and English is arguably one of the worst offenders: nonsensical from incorporating a myriad of sources over the course of human history, engulfing jokes and sayings until their origins are cryptic and the words themselves have changed, from the thin water of the womb to the thickest of blood.

It’s constantly evolving in the way that only a lit AF living language can, regardless of how much sense the words actually make.

AI can be tricky enough without human intricacies thrown in. Recently, a football-watching AI focussed on a bald man’s head for most of a match, while a driverless car drove straight into a wall.

This is why, despite being incredibly impressive, the human-language-generating GPT-3 AI can get simple questions wrong.

GPT-3, or Generative Pre-trained Transformer 3, is the third-generation model of an AI developed by OpenAI that uses deep learning on human language to produce text as indistinguishable as possible from that of a real human.

It does a pretty good job, but as MIT Technology Review explains, it can be tripped up because it lacks context, or the seemingly inaptly named human phenomenon of common sense.

Little things like asking it the colour of a sheep can result in an equal chance of ‘black’ or ‘white’ as the answer. Our language has so many nods to black sheep that it’s no surprise the AI assumes both are likely answers, without realising that those references specifically imply the rarity of the variation.

So researchers at the University of North Carolina have decided to give the damn thing sight.

It’s not as easy as jamming a current visual-learning AI together with a text-based one, because again, there needs to be more context. Most visual-learning data sets, like Microsoft Common Objects in Context (MS COCO), only pair images with a few words to distinguish objects, so the researchers had to come up with something better.

While still using MS COCO, the researchers added a technique they call "vokenization", which matches language tokens with related images to provide more visual context. Rather than simply an image labelled ‘sheep’, the model gets richer information about what a sheep looks like in context.
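The core idea can be sketched in a few lines. This is a toy illustration only, assuming the essence of vokenization is matching each word to its most visually similar image via embedding similarity; all the vectors, words, and image names below are made up for demonstration and are not from the researchers’ actual models.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical token embeddings (in practice these would come from a
# language model, not hand-written numbers).
token_vecs = {
    "sheep": [0.9, 0.1, 0.0],
    "field": [0.1, 0.9, 0.0],
}

# Hypothetical image embeddings (in practice, from a vision model
# trained on a captioned data set such as MS COCO).
image_vecs = {
    "white_sheep.jpg": [0.85, 0.20, 0.05],
    "green_field.jpg": [0.05, 0.95, 0.10],
}

def vokenize(token):
    """Return the image ('voken') most similar to the token's embedding."""
    vec = token_vecs[token]
    return max(image_vecs, key=lambda img: cosine(vec, image_vecs[img]))

for tok in token_vecs:
    print(tok, "->", vokenize(tok))
```

In this sketch, ‘sheep’ pairs with the sheep photo and ‘field’ with the field photo, so each word in a sentence picks up a visual grounding rather than just co-occurrence statistics from text.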

This visual information would likely help, showing the AI that images of black sheep are far less common than those of white ones. Furthermore, the AI could see that sheep are usually in fields, not jumping over fences or being counted by individuals desperate for sleep.

But of course, this won’t be enough. The AI can’t tell how bad sheep can smell, in a real and true nose-wrinkling way. It seems there’ll always be something to add to the ever-growing knowledge of unsupervised AI learning.
