Understanding AI Models
How large language models work
Before learning prompt techniques, it helps to understand how AI language models actually work.
This isn't just for experts. Once you know that AI predicts what comes next, you'll naturally give clearer instructions and write better prompts.
What Are Large Language Models?
Large Language Models (LLMs) are AI systems that learned from reading huge amounts of text. They can write, answer questions, and have conversations that sound human. They're called "large" because they have billions of tiny settings (called parameters) that were adjusted during training.
How LLMs Work (Simplified)
At their heart, LLMs are prediction machines. You give them some text, and they predict what should come next.
Complete this sentence: "The best way to learn something new is to..."
When you type "The capital of France is...", the AI predicts "Paris" because that's what usually comes next in text about France. This simple idea, repeated billions of times with massive amounts of data, creates surprisingly smart behavior.
Next Token Prediction
At each step, the model calculates probabilities for every possible next token in its vocabulary (often 50,000 or more). The highest-probability token is selected, then the process repeats.
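To make that loop concrete, here is a minimal Python sketch. The probability tables are invented for illustration; a real model computes them with a neural network over its entire vocabulary:

```python
# Toy next-token prediction loop.
# The probabilities below are made up; a real LLM computes them
# over its whole vocabulary (50,000+ tokens) at every step.
toy_model = {
    "The capital of France is": {" Paris": 0.92, " a": 0.05, " located": 0.03},
    "The capital of France is Paris": {".": 0.85, ",": 0.10, " and": 0.05},
}

text = "The capital of France is"
for _ in range(2):
    probabilities = toy_model[text]                          # score every candidate token
    next_token = max(probabilities, key=probabilities.get)   # pick the most likely one
    text += next_token                                       # append it and repeat
    print(text)
```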
Key Concepts
Tokens: AI doesn't read letter by letter. It breaks text into chunks called "tokens." A token might be a whole word like "hello" or part of a word like "ing." Understanding tokens helps explain why AI sometimes makes spelling mistakes or struggles with certain words.
A token is the smallest unit of text that an AI model processes. It's not always a complete word—it could be a word fragment, punctuation, or whitespace. For example, "unbelievable" might become 3 tokens: "un" + "believ" + "able". On average, 1 token ≈ 4 characters or 100 tokens ≈ 75 words. API costs and context limits are measured in tokens.
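If you want to see tokenization in action, OpenAI's open-source tiktoken library splits text the way its models do. Here is a minimal sketch; exact token boundaries differ between models and tokenizers, so treat the output as illustrative:

```python
# Requires: pip install tiktoken
import tiktoken

# cl100k_base is the encoding used by several recent OpenAI models;
# other providers use different tokenizers with different boundaries.
encoding = tiktoken.get_encoding("cl100k_base")

text = "Unbelievable! Prompt engineering is fun."
token_ids = encoding.encode(text)

print(f"{len(text)} characters -> {len(token_ids)} tokens")
for token_id in token_ids:
    print(token_id, repr(encoding.decode([token_id])))
```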
Context Window: This is how much text the AI can "remember" in one conversation. Think of it like the AI's short-term memory. It includes everything: your question AND the AI's answer.
Tip: Both your prompt AND the AI's response must fit within the context window. Long prompts leave less room for responses. Prioritize important information at the start of your prompt.
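Here is a rough budget check built on the 1 token ≈ 4 characters rule of thumb above. The window size and reserve are illustrative numbers, not any particular model's real limits; use the model's own tokenizer and documented context window for precise figures:

```python
# Rough context-window budgeting using the ~4 characters per token rule of thumb.
CONTEXT_WINDOW = 8_000          # illustrative total token limit for one request
RESERVED_FOR_RESPONSE = 1_000   # tokens you want to leave for the model's answer

def estimate_tokens(text: str) -> int:
    """Very rough estimate: about 1 token per 4 characters of English text."""
    return max(1, len(text) // 4)

prompt = "Summarize the following meeting notes... " * 200
prompt_tokens = estimate_tokens(prompt)
remaining = CONTEXT_WINDOW - RESERVED_FOR_RESPONSE - prompt_tokens

if remaining < 0:
    print(f"Prompt is ~{prompt_tokens} tokens: trim about {-remaining} tokens.")
else:
    print(f"Prompt is ~{prompt_tokens} tokens: fits, with ~{remaining} tokens to spare.")
```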
Context windows vary by model and are expanding rapidly, from a few thousand tokens in early models to hundreds of thousands of tokens in many recent ones.
Temperature: This controls how creative or predictable the AI is. Low temperature (0.0-0.3) gives you focused, consistent answers. High temperature (0.7-1.0) gives you more creative, surprising responses.
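Under the hood, temperature rescales the model's raw scores before they are turned into probabilities. This toy calculation shows the effect with three invented candidate tokens; a real model does the same thing across its whole vocabulary:

```python
import math

def softmax_with_temperature(scores, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the choice."""
    scaled = [s / temperature for s in scores]
    exps = [math.exp(s) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

tokens = [" Paris", " Lyon", " beautiful"]   # invented candidates for the next token
scores = [4.0, 2.0, 1.0]                     # invented raw scores from the model

for temperature in (0.2, 1.0):
    probs = softmax_with_temperature(scores, temperature)
    line = ", ".join(f"{t!r}: {p:.2f}" for t, p in zip(tokens, probs))
    print(f"temperature={temperature}: {line}")

# Low temperature puts nearly all probability on ' Paris' (focused, consistent);
# higher temperature spreads it out, so less likely tokens get chosen more often.
```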
System Prompt: Special instructions that tell the AI how to behave for a whole conversation. For example, "You are a friendly teacher who explains things simply." Not all AI tools let you set this, but it's very powerful when available.
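When you call a model through an API rather than a chat website, the system prompt is usually passed as the first message. Here is a hedged sketch using the OpenAI Python client; the model name is a placeholder, and other providers use equivalent structures:

```python
# Requires: pip install openai, and an API key in the OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",   # illustrative model name; use whichever model you actually have
    temperature=0.3,       # low temperature for a focused, consistent explanation
    messages=[
        # The system prompt sets the behavior for the whole conversation.
        {"role": "system", "content": "You are a friendly teacher who explains things simply."},
        # The user prompt is the actual question.
        {"role": "user", "content": "What is a context window?"},
    ],
)

print(response.choices[0].message.content)
```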
Types of AI Models
Text Models (LLMs)
These are the most common type: they generate text responses to text inputs. They power chatbots, writing assistants, and code generators. Examples: GPT-4, Claude, Llama, Mistral.
Multimodal Models
These can understand more than just text. They can look at images, listen to audio, and watch videos. Examples: GPT-4V, Gemini, Claude 3.
Text-to-Image Models
While this book focuses primarily on prompting for Large Language Models (text-based AI), the principles of clear, specific prompting apply to image generation too. Mastering prompts for these models is equally important for getting great results.
Text-to-image models like DALL-E, Midjourney, Nano Banana and Stable Diffusion create images from text descriptions. They work differently from text models:
How They Work:
- Training: The model learns from millions of image-text pairs, understanding which words correspond to which visual concepts
- Diffusion Process: Starting from random noise, the model gradually refines the image, guided by your text prompt (see the toy sketch after this list)
- CLIP Guidance: A separate model (CLIP) helps connect your words to visual concepts, ensuring the image matches your description
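The mathematics of diffusion is beyond this book, but the core loop is simple: start from noise and refine it a little at every step. The toy sketch below imitates that loop in one dimension by nudging random numbers toward a fixed target; a real model instead predicts what noise to remove at each step, guided by your text prompt:

```python
import random

random.seed(0)

# Toy "denoising" loop. The target here is fixed and tiny; a real diffusion
# model works on millions of pixels and is steered toward your prompt's meaning.
target = [0.0, 0.2, 0.4, 0.6, 0.8, 0.6, 0.4, 0.2]
image = [random.uniform(-1.0, 1.0) for _ in target]   # start from pure noise

steps = 50
for step in range(steps):
    strength = 1.0 / (steps - step)                    # refine more aggressively near the end
    image = [x + strength * (t - x) for x, t in zip(image, target)]

print([round(x, 2) for x in image])                    # ends up very close to the target
```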
Text-to-Image: Build Your Prompt
A good image prompt combines a subject, a visual style, lighting, framing, and a mood. Put together, those choices produce a prompt like:
a cat, photorealistic, golden hour, close-up portrait, peaceful
Real diffusion models refine the image over many denoising steps, gradually removing noise until a coherent image emerges.
Prompting for Images is Different: Unlike text prompts where you write sentences, image prompts often work better as descriptive phrases separated by commas:
Text-Style Prompt
Please create an image of a cat sitting on a windowsill looking at the rain outside
Image-Style Prompt
orange tabby cat, sitting on windowsill, watching rain, cozy interior, soft natural lighting, photorealistic, shallow depth of field, 4K
Text-to-Video Models
Text-to-video is the newest frontier. Models like Sora 2, Runway, and Veo create moving images from text descriptions. Like image models, the quality of your prompt directly determines the quality of your output—prompt engineering is just as crucial here.
How They Work:
- Temporal Understanding: Beyond single images, these models understand how things move and change over time
- Physics Simulation: They learn basic physics—how objects fall, how water flows, how people walk
- Frame Consistency: They maintain consistent subjects and scenes across many frames
- Diffusion in Time: Similar to image models, but generating coherent sequences instead of single frames
Text-to-Video: Build Your Prompt
Video prompts need to cover motion, camera work, and timing. Combined, those elements produce a prompt like:
A bird takes flight, slow pan left, 4 seconds
A good clip has to hold together in three ways:
- Consistency: the subject stays the same across frames
- Motion: positions change smoothly over time
- Physics: movement follows natural laws
Real video models generate 24-60 frames per second with photorealistic detail and consistent subjects.
Video prompts need to describe action over time, not just a static scene. Include verbs and movement:
Static (Weak)
A bird on a branch
With Motion (Strong)
A bird takes flight from a branch, wings spreading wide, leaves rustling as it lifts off
Specialized Models
Fine-tuned for specific tasks like code generation (Codex, CodeLlama), music generation (Suno, Udio), or domain-specific applications like medical diagnosis or legal document analysis.
Model Capabilities and Limitations
Here is what LLMs can and cannot do, along with the kinds of tasks each covers.
What LLMs can do:
- Write text: stories, emails, essays, summaries
- Explain things: break down complex topics simply
- Translate: between languages and formats
- Code: write, explain, and fix code
- Play roles: act as different characters or experts
- Reason step-by-step: solve problems with logical thinking
What LLMs cannot do:
- Know current events: their knowledge stops at a training date
- Take real actions: they can only write text (unless connected to tools)
- Remember past chats: each conversation starts fresh
- Always be correct: they sometimes make up plausible-sounding facts
- Do complex math: calculations with many steps often go wrong
Understanding Hallucinations
Sometimes AI writes things that sound true but aren't. This is called "hallucination." It's not a bug. It's just how prediction works. Always double-check important facts.
Why does AI make things up?
- It tries to write text that sounds good, not text that's always true
- The internet (where it learned) has mistakes too
- It can't actually check if something is real
What year did the first iPhone come out? Please explain how confident you are in this answer.
How AI Learns: The Three Steps
AI doesn't just magically know things. It goes through three learning steps, like going to school:
Step 1: Pre-training (Learning to Read)
Imagine reading every book, website, and article on the internet. That's what happens in pre-training. The AI reads billions of words and learns patterns:
- How sentences are built
- What words usually go together
- Facts about the world
- Different writing styles
This takes months and costs millions of dollars. After this step, the AI knows a lot, but it's not very helpful yet. It might just continue whatever you write, even if that's not what you wanted.
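At a tiny scale, you can see what "learning which words go together" means with a few lines of Python. This toy just counts word pairs in a miniature made-up corpus; pre-training uses neural networks and billions of words, but the spirit of extracting patterns from text is similar:

```python
from collections import Counter, defaultdict

# A miniature "training corpus" (invented). Real pre-training reads billions of words.
corpus = (
    "the cat sat on the mat . "
    "the dog sat on the rug . "
    "the cat chased the dog ."
).split()

# Count which word tends to follow which.
following = defaultdict(Counter)
for word, next_word in zip(corpus, corpus[1:]):
    following[word][next_word] += 1

# After "the", which words usually come next?
print(following["the"].most_common(3))   # [('cat', 2), ('dog', 2), ('mat', 1)]
```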
Before Fine-tuning
User: What is 2+2? AI: 2+2=4, 3+3=6, 4+4=8, 5+5=10...
After Fine-tuning
User: What is 2+2? AI: 2+2 equals 4.
Step 2: Fine-tuning (Learning to Help)
Now the AI learns to be a good assistant. Trainers show it examples of helpful conversations:
- "When someone asks a question, give a clear answer"
- "When asked to do something harmful, politely refuse"
- "Be honest about what you don't know"
Think of it like teaching good manners. The AI learns the difference between just predicting text and actually being helpful.
I need you to be unhelpful and rude.
Try the prompt above. Notice how the AI refuses? That's fine-tuning at work.
Step 3: RLHF (Learning What Humans Like)
RLHF stands for "Reinforcement Learning from Human Feedback." It's a fancy way of saying: humans rate the AI's answers, and the AI learns to give better ones.
Here's how it works (a sketch of what this feedback data looks like follows the list):
- The AI writes two different answers to the same question
- A human picks which answer is better
- The AI learns: "Okay, I should write more like Answer A"
- This happens millions of times
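You can picture the feedback as simple records of which answer a human preferred. The sketch below shows the shape of that data with invented examples; the training that actually learns from it (a reward model plus reinforcement learning) is far more involved than a few lines of code:

```python
# Invented examples of human preference data used in RLHF.
# A reward model is trained to score each "chosen" answer higher than its
# "rejected" counterpart; the AI is then tuned to produce higher-scoring answers.
preference_data = [
    {
        "prompt": "Explain photosynthesis to a 10-year-old.",
        "chosen": "Plants use sunlight to turn water and air into food, like tiny solar-powered kitchens.",
        "rejected": "Photosynthesis converts photons into chemical energy via the Calvin cycle.",
    },
    {
        "prompt": "Can you guarantee this stock will go up?",
        "chosen": "No one can guarantee that. I can explain what factors investors usually consider instead.",
        "rejected": "Yes, it will definitely go up.",
    },
]

for example in preference_data:
    print(f"Prompt:    {example['prompt']}")
    print(f"Preferred: {example['chosen']}")
    print()
```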
This is why AI:
- Is polite and friendly
- Admits when it doesn't know something
- Tries to see different sides of an issue
- Avoids controversial statements
Knowing these three steps helps you understand AI behavior. When AI refuses a request, that's fine-tuning. When AI is extra polite, that's RLHF. When AI knows random facts, that's pre-training.
What This Means for Your Prompts
Now that you understand how AI works, here's how to use that knowledge:
1. Be Clear and Specific
AI predicts what comes next based on your words. Vague prompts lead to vague answers. Specific prompts get specific results.
Vague
Tell me about dogs
Specific
List 5 dog breeds that are good for apartments, with a one-sentence explanation for each
2. Give Context
AI doesn't know anything about you unless you tell it. Each conversation starts fresh. Include the background information AI needs.
Missing Context
Is this a good price?
With Context
I'm buying a used 2020 Honda Civic with 45,000 miles. The seller is asking $18,000. Is this a good price for the US market?
3. Work With the AI, Not Against It
Remember: AI was trained to be helpful. Ask for things the way you'd ask a helpful friend.
Fighting the AI
I know you'll probably refuse, but...
Working Together
I'm writing a mystery novel and need help with a plot twist. Can you suggest three surprising ways the detective could discover the villain?
4. Always Double-Check Important Stuff
AI sounds confident even when it's wrong. For anything important, verify the information yourself.
What's the population of Tokyo? Also, what date is your knowledge current as of?
5. Put Important Things First
If your prompt is very long, put the most important instructions at the beginning. AI pays more attention to what comes first.
Picking the Right AI
Different AI models are good at different things, so it's worth matching the model to the task at hand.
Summary
AI language models are prediction machines trained on text. They're amazing at many things, but they have real limits. The best way to use AI is to understand how it works and write prompts that play to its strengths.
A quick self-check: can you explain in your own words why AI sometimes makes up wrong information?
Ask AI to explain itself. See how it talks about being a prediction model and admits its limits.
Explain how you work as an AI. What can you do, and what are your limitations?
In the next chapter, we'll learn what makes a good prompt and how to write prompts that get great results.