Text generation is the most visible capability of modern AI. LLMs can write emails, explain concepts, summarise documents, translate languages, write code, compose poetry, and hold extended conversations. Understanding how this works — and where it breaks down — makes you a much better user of these tools.
How text is generated — token by token
When you send a message to an LLM, it doesn't generate the whole response at once. It generates one token at a time, each token selection influenced by everything that came before. This is why you see the text "streaming" in: each chunk that appears is a freshly generated token, not the model typing character by character.
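The token-by-token loop can be sketched with a toy stand-in for the model. Here a hard-coded bigram table (entirely made up for illustration) plays the role of the neural network that, in a real LLM, produces the next-token distribution from the full context:

```python
import random

# Toy stand-in for a language model: a bigram table mapping each token
# to a distribution over possible next tokens. A real LLM computes this
# distribution with a neural network conditioned on the whole context.
BIGRAMS = {
    "<start>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.5},
    "a": {"cat": 0.5, "dog": 0.5},
    "cat": {"sat": 0.7, "<end>": 0.3},
    "dog": {"sat": 0.7, "<end>": 0.3},
    "sat": {"<end>": 1.0},
}

def generate(max_tokens=10, seed=None):
    """Generate one token at a time until <end> or max_tokens."""
    rng = random.Random(seed)
    tokens = ["<start>"]
    for _ in range(max_tokens):
        dist = BIGRAMS[tokens[-1]]          # distribution over next tokens
        choices, weights = zip(*dist.items())
        next_token = rng.choices(choices, weights=weights)[0]
        if next_token == "<end>":
            break
        tokens.append(next_token)           # the context grows by one token
    return tokens[1:]  # drop the <start> marker
```

The structure is the same as in a real system: sample one token, append it to the context, repeat. Only the way the distribution is computed differs.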
At each step, the model produces a probability distribution over every token in its vocabulary — typically tens of thousands of tokens, with roughly 50,000 being common. The temperature setting controls how the model samples from this distribution:
- Low temperature (0.1–0.3) — almost always picks the most probable token. Consistent, predictable, useful for factual or code tasks.
- High temperature (0.8–1.2) — more likely to pick less probable tokens. Creative, varied, sometimes surprising. Useful for creative writing, brainstorming.
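Mechanically, temperature divides the model's raw scores (logits) before the softmax: low values sharpen the distribution, high values flatten it. A minimal sketch, using made-up logits for three candidate tokens:

```python
import math
import random

def sample_with_temperature(logits, temperature, rng):
    """Sample a token from logits scaled by temperature.

    Dividing logits by the temperature before the softmax sharpens the
    distribution (low T) or flattens it (high T).
    """
    scaled = {tok: l / temperature for tok, l in logits.items()}
    max_l = max(scaled.values())  # subtract the max for numerical stability
    exp = {tok: math.exp(l - max_l) for tok, l in scaled.items()}
    total = sum(exp.values())
    tokens = list(exp)
    weights = [exp[t] / total for t in tokens]
    return rng.choices(tokens, weights=weights)[0]

# Hypothetical logits for the next token after "The capital of France is".
logits = {"Paris": 5.0, "Lyon": 2.0, "banana": 0.1}
```

At temperature 0.1 this picks "Paris" essentially every time; at 2.0 the lower-scored tokens start appearing, which is the varied behaviour the bullet points above describe.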
What LLMs are genuinely excellent at
- Drafting, editing, and improving text — emails, reports, essays, documentation
- Summarisation — distilling long documents into key points
- Translation — approaching human quality in major language pairs
- Explanation — breaking down complex topics in accessible language
- Code generation — writing, debugging, and explaining code across many languages
- Classification — categorising text by topic, sentiment, or intent
- Extraction — pulling structured information from unstructured text
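A common pattern for the extraction task in the list above is to ask the model for JSON and parse the reply. A sketch of that pattern, with the model call stubbed out (`call_llm` is a placeholder returning a canned reply, not a real API):

```python
import json

EXTRACTION_PROMPT = """Extract the person's name and email from the text below.
Reply with JSON only, in the form {{"name": ..., "email": ...}}.

Text: {text}"""

def call_llm(prompt):
    """Placeholder for a real LLM API call. Returns a canned reply here
    so the surrounding parsing logic can be shown end to end."""
    return '{"name": "Ada Lovelace", "email": "ada@example.com"}'

def extract_contact(text):
    reply = call_llm(EXTRACTION_PROMPT.format(text=text))
    try:
        return json.loads(reply)
    except json.JSONDecodeError:
        # Models sometimes wrap JSON in prose or code fences; production
        # code needs a fallback (re-prompting, or stripping the fences).
        raise ValueError(f"Model reply was not valid JSON: {reply!r}")
```

The design point is that the prompt constrains the output format, and the calling code still validates it rather than trusting the model.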
The hallucination problem — why LLMs confabulate
LLMs generate text by predicting the most plausible continuation — not by retrieving verified facts. When asked about something outside their training data, or something requiring precise recall, they confidently generate text that sounds correct but may be entirely fabricated.
This isn't a bug being actively fixed — it's a fundamental consequence of how these systems work. Mitigation strategies include: grounding models in retrieved documents (RAG), giving models access to search tools, and training models to express uncertainty. But hallucination cannot be fully eliminated in pure text generation systems.
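The grounding idea behind RAG can be sketched with a toy keyword retriever. Real systems use embedding-based search over a vector index; everything here (the scoring, the prompt wording) is illustrative:

```python
def retrieve(query, documents, k=1):
    """Rank documents by word overlap with the query (a toy stand-in
    for embedding-based retrieval) and return the top k."""
    q_words = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_grounded_prompt(query, documents):
    """Prepend retrieved passages so the model can answer from the
    provided text instead of confabulating from memory."""
    context = "\n".join(retrieve(query, documents))
    return ("Answer using ONLY the context below. "
            "If the answer is not there, say so.\n\n"
            f"Context:\n{context}\n\nQuestion: {query}")
```

The mitigation is structural: the prompt both supplies verified text and instructs the model to admit when the answer is absent, which reduces (but does not eliminate) confabulation.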
Treat LLM outputs as a smart first draft that needs fact-checking, not as a source of truth. Use LLMs for generation and drafting; use other tools and your own knowledge for verification.
Prompt sensitivity
LLM outputs are highly sensitive to how you phrase your input. "Summarise this article" and "Give me the three most important takeaways from this article in bullet points" will produce very different outputs — even with the same input. This is why prompt engineering is a genuinely useful skill, not just marketing jargon.
Key takeaways
- LLMs generate text one token at a time, sampling from a probability distribution
- Temperature controls creativity: low = consistent, high = creative and varied
- LLMs excel at drafting, summarising, translating, explaining, and coding
- Hallucination is fundamental to how LLMs work — always verify factual claims
- Prompt phrasing significantly affects output — prompt engineering matters