I use plenty of technologies without actually knowing exactly how they work. Excel, Spotify, electricity … for the most part, I don’t need to know much about the back end because it doesn’t pose a risk to the way I use the tool.
That’s not true for AI. Right now, AI chatbots seem like magic, and that’s dangerous. It makes it easy to overestimate their abilities. You don’t need to know every technical detail of how AI works – but you do need to know enough to understand its limitations.
Today we’ll cover a quick guide to how AI works, including:
- Why AI is super creative (and kind of a liar)
- What LLMs mean when they say, “I understand”
- What all of this means for you as a user
Want to get better at using AI in your day-to-day? Join our AI for Personal Productivity workshop on Dec. 13. Use code EFFICIENCY for 20% off a workshop seat or PRODUCTIVITY for 25% off membership.
5 insights on how AI works (and what they mean for you)
1. Large language models are text-predictors (but that’s cooler than it sounds).
Chat-based AI tools (like ChatGPT, Claude, and Poe) are built on top of large language models (LLMs) – machine learning models that predict and generate plausible text based on what you feed them.
When you ask the LLM a question, the AI’s neural network takes your input (“What should I wear to work tomorrow?”), runs it through an algorithm, and spits out an output.
In the most basic terms, it’s looking at the internet’s information and predicting the next word based on what a lot of actual critical thinkers (you, me, and other humans) have said in the past. So when it looks at a sentence like:
When it’s cold outside, I put on ________.
It might estimate the following probabilities:
- A sweater: 10.9%
- Gloves: 8.5%
- A hat: 6.2%
- A log on the fire: 2.5%
- A record: 0.2%
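If you want to see that prediction step for yourself, here’s a minimal sketch using the small, open GPT-2 model via Hugging Face’s transformers library. It’s an illustration of next-word probabilities, not what any particular chatbot runs in production, and the numbers you get will differ from the made-up ones above.

```python
# A minimal sketch of next-word prediction with the open GPT-2 model.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "When it's cold outside, I put on"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # scores for every token at every position

# Look only at the scores for the *next* token and turn them into probabilities.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)

# Show the five most likely continuations, analogous to the list above.
top = torch.topk(next_token_probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(token_id)):>12}  {prob.item():.1%}")
```

Every chat turn is this same step, repeated one word (well, one token) at a time.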
But calling AI a text-predictor is burying the cleverness. How it does it is the cool part. A neural network is designed to mimic how we think our brains work, with lots of cross-connecting neurons. The AI has ingested on the order of a trillion words of text, and along the way, it’s built up a statistical picture of how those words are connected.
Let’s use an example: the words king and queen. Those words are connected, but they’re connected in many different dimensions that mean many different things. King and queen are:
- Royal positions
- Chess pieces
- Playing cards
- Prom court designations
- Gen Z slang for people killing it
- Types of drag performers
- Types of beds
The context of “king” and “queen” in conversation totally changes the meaning. Large language models track relationships like these across thousands of dimensions and billions of learned parameters, tuned on an enormous amount of text. The subtlety they can capture is astounding.
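To make “dimensions” a little more concrete, here’s a toy sketch of words as points in a multi-dimensional space. The vectors and dimension labels below are invented for illustration; real models learn thousands of unlabeled dimensions directly from data.

```python
# Toy word embeddings: each word is a point in a shared space, and related
# words sit close together along the dimensions they share.
import numpy as np

# Hypothetical 4-dimensional vectors (real models use thousands of dimensions).
#                  royalty  chess  bedding  slang
king  = np.array([  0.9,    0.7,   0.3,     0.4])
queen = np.array([  0.9,    0.7,   0.4,     0.6])
pawn  = np.array([  0.1,    0.8,   0.0,     0.0])

def cosine_similarity(a, b):
    """How closely two word-vectors point in the same direction (1.0 = identical)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(king, queen))  # high: related in many senses
print(cosine_similarity(king, pawn))   # lower: related mainly via chess
```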
2. LLMs are statistical, but they’re not logical (unlike Alexa or Siri).
Alexa, Siri, and the version of IBM Watson that won on Jeopardy are all symbolic, rule-based systems. Everything they know has been programmed into them, so they’re very reliable.
When you ask, “What’s the weather in Paris tomorrow?,” they’ll parse your words into a sentence with verbs and nouns to try and understand what you’re asking. Then they’ll see if they have a pre-programmed routine to handle that request (access weather app > input parameters on Paris > search for tomorrow).
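Here’s a simplified sketch of that symbolic, rule-based approach: parse the request into a known intent, then run a pre-programmed routine. The intent pattern and the get_forecast helper are hypothetical stand-ins for illustration.

```python
# A rule-based assistant in miniature: match the request to a known pattern,
# then hand it off to a pre-programmed routine.
import re

def get_forecast(city: str, day: str) -> str:
    # Stand-in for a real weather-API call.
    return f"The forecast for {city.title()} {day} is 12°C and cloudy."

def handle_request(utterance: str) -> str:
    match = re.search(r"weather in (\w+) (today|tomorrow)", utterance.lower())
    if match:
        city, day = match.group(1), match.group(2)
        return get_forecast(city=city, day=day)
    return "Sorry, I don't have a routine for that request."

print(handle_request("What's the weather in Paris tomorrow?"))
```

If a request doesn’t match a routine that was programmed in, the system simply can’t handle it, which is exactly why these assistants feel so rigid compared to LLMs.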
LLMs don’t work this way – they are always just generating the next, most likely word in a series, based on a mind-bogglingly complex analysis of a huge amount of text. Whatever reasoning they display is limited to the patterns baked into our language, rather than the product of genuine critical thinking.
Which leads us to our next important point …
3. LLMs don’t know what you’re talking about.
It feels like LLMs understand us, but they don’t. They are, in essence, a very, very advanced version of autocomplete.
This might be of interest (or concern) if you use an LLM for something like therapy or personal advice. When you describe a difficult personal situation and the LLM says, “It sounds like you’re having a hard time,” it’s forming predictive text based on what hundreds of thousands of empathetic therapists have said in the past. It doesn’t actually understand the concept of a “hard time,” nor has it experienced one.
The deep question here: Does it matter? Does AI need to “feel” what we’re feeling to empathize? There are plenty of humans who feel what we’re feeling and do a horrible job talking to us about it.
In the 1960s, an early chatbot called ELIZA was released, and apparently some college students chatted with it for hours, self-therapizing:
Is this response any worse than what your best friend might say (if they don’t just pivot the conversation back to themselves)?
From Inflection’s Pi
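For a sense of how little machinery it takes to produce a reply that feels empathetic, here’s a tiny sketch in the spirit of ELIZA’s pattern-matching approach. The rules below are invented for illustration; the real ELIZA used a much larger script of patterns, but the principle – reflect the user’s own words back as a question – is the same.

```python
# An ELIZA-style reply: no understanding, just pattern-matching and reflection.
import re

REFLECTIONS = {"i": "you", "my": "your", "me": "you", "am": "are"}

def reflect(text: str) -> str:
    # Swap pronouns so the statement can be mirrored back at the user.
    return " ".join(REFLECTIONS.get(word, word) for word in text.lower().split())

def eliza_reply(statement: str) -> str:
    match = re.search(r"i feel (.*)", statement.lower())
    if match:
        return f"Why do you feel {reflect(match.group(1))}?"
    return "Tell me more."

print(eliza_reply("I feel overwhelmed by my job"))
# -> "Why do you feel overwhelmed by your job?"
```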
4. Large language models have a ton of information, but not a single source of truth.
LLMs are trained on basically everything we’ve ever written online. They’re very good at being creative because they’ve read all the ideas – literally, all the blog posts, video transcripts, literary works, literary criticism, you name it – and can echo the best ones back to us.
But they don’t have a concept of truth, and they’re biased toward pleasing us. It’s easy to push them into an incorrect answer simply by hinting at the answer you want to hear. So they’re prone to error and inconsistency. They’re only as good as the data they’re trained on, and unlike humans, they don’t have a “gut check” mechanism to say, “Wait a minute – does that actually make any sense?”
Take the example below. Sherlock Holmes never broke anyone’s nose, but when I suggest that he did, Claude follows my cue and hallucinates a scene to back it up. When I push Claude to show proof, it comes back with a passage that contains no broken nose.
5. AI chatbots (like ChatGPT, Poe, etc.) add another layer of intelligence on top of LLMs.
ChatGPT, Claude, Bing, and Bard are built on top of LLMs, but they’re also more than LLMs. They connect to non-LLM services (like Google Search) to perform tasks that LLMs aren’t good at, like math, fact-finding, data analysis, audio input, coding, and lots more to come.
This combination of LLMs with other models and tools is sometimes called “multi-model AI,” and we’ll start to see a lot more of it.
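Here’s a simplified sketch of that “LLM plus tools” pattern: route requests the model is bad at (like arithmetic) to a deterministic tool, and leave the open-ended language work to the model. The routing rule and the llm_generate stub are hypothetical, for illustration only.

```python
# Route math to a calculator tool; send everything else to the language model.
import math
import re

def calculator_tool(expression: str) -> str:
    # Deterministic math, e.g. "sqrt(4568331)".
    match = re.match(r"sqrt\((\d+)\)", expression)
    if match:
        return f"{math.sqrt(int(match.group(1))):.4f}"
    return "unsupported expression"

def llm_generate(prompt: str) -> str:
    # Stand-in for a call to a large language model.
    return f"[LLM-generated answer to: {prompt}]"

def answer(question: str) -> str:
    # Hypothetical router: math goes to the tool, open-ended requests to the LLM.
    match = re.search(r"square root of ([\d,]+)", question.lower())
    if match:
        number = match.group(1).replace(",", "")
        return calculator_tool(f"sqrt({number})")
    return llm_generate(question)

print(answer("What is the square root of 4,568,331?"))  # exact, via the tool
print(answer("Give me 50 marketing campaign ideas"))     # open-ended, via the LLM
```

This is roughly why chatbots with more integrations tend to be more reliable on math and facts: the LLM isn’t doing the arithmetic, the tool is.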
That said, even with these connections, AI chatbots can still be much more inconsistent than a Google search when it comes to the truth, especially ones that take a more liberal approach to hallucination or incorporate less human feedback.
I asked Claude the square root of 4,568,331:
It was correct! I guess I can trust Claude to do all my math problems from now on …
Hmm. I guess not.
What all of this means for you:
- Use AI for creativity; use Google or Siri for facts. If you want to know the weather, ask Google. If you want 50 ideas for a marketing campaign around the heart-healthy benefits of orange juice, ask GPT.
- Fact-check your chatbot’s work. Right now, AI chatbots are like a new team member. They have a ton of potential, but you need to check their work. If you are asking your chatbot for research or data, don’t blindly accept its answers – ask for a source, and then check the source. (You’ll sometimes find they’re made up).
- Ask AI to explain its reasoning. This trick can make it easier to spot weak logic or fallacies.
- For the best quality, pay for AI tools with more integrations. You get what you pay for, and right now, paid systems like GPT-4 have more integrations with non-LLM tools.
- When you want real understanding, you’ve still got humans. People can’t be beat when it comes to empathy. GPT probably won’t replace you in a 1:1 any time soon.
Want to get even better at using AI in your day-to-day? Join our AI for Personal Productivity workshop on Dec. 13. Use code EFFICIENCY for 20% off a workshop seat or PRODUCTIVITY for 25% off membership.