What is RAG? The technology behind an AI that truly understands your documents

The knowledge is there. It sits in reports, manuals, policy documents, contracts, meeting minutes and dozens of folders on the network drive. And yet it takes a new employee three hours to find the answer to a question that an experienced colleague answers in thirty seconds, simply because that colleague knows where to look. That knowledge monopoly is fragile. People leave, fall ill, get busy or become unreachable. And in the meantime the organisation depends on whoever happens to have exactly the right folder open on their screen.

This is precisely the problem that a knowledge-base agent built on RAG solves. But what is RAG, exactly?

What does an ordinary AI do with a question?

A large language model, such as the AI behind ChatGPT or comparable tools, is trained on an enormous amount of text from the public internet. That makes it impressively versatile. But that training has a hard limit: it has a cut-off date, and the AI has never had access to your internal documents, contracts or procedures.

Ask such a model about something outside its training data, for instance a specific collective-labour-agreement clause, the latest version of an internal construction protocol or the terms in a supplier contract, and it is prone to something known as "hallucinating". The model does not know the answer, but behaves as though it does. It produces a plausible-sounding answer that is not grounded in fact. Not because the model is poor, but because this is simply how a language model works: it predicts which words are likely to follow one another, based on statistics rather than on factual verification.

For a casual writing task that is acceptable. For an employee looking up a policy rule, a lawyer checking a contract clause or a buyer consulting supplier agreements, it absolutely is not.

What RAG does differently

RAG stands for Retrieval-Augmented Generation: retrieve first, then generate. Instead of letting the language model answer a question straight away, the system first searches a dedicated document base. The most relevant text fragments are retrieved and passed to the model together with the question. Only then does the model formulate an answer, and it is instructed to base that answer on the retrieved fragments rather than on its own memory.

Think of the difference between a librarian who first looks up the relevant page before answering, and someone who answers purely from memory. The first can point to her sources. The second sometimes invents something that feels like a memory but misses the mark entirely. RAG is that librarian: it searches first and generates afterwards.

Concretely: an employee asks the knowledge-base agent, "What are the rules around notice periods in our standard supplier contract?" An ordinary AI gives a generic legal answer based on general contract law. A RAG system retrieves the exact clause from the contract in question, shows the relevant passage and formulates an answer based on that, including a reference to the document it came from.
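The retrieve-then-generate flow described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the names `Fragment`, `build_prompt` and `answer` are hypothetical, and `retrieve` and `generate` are placeholders for whatever search component and language model a real system would use.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Fragment:
    """A retrieved piece of text plus the document it came from (hypothetical type)."""
    document: str
    text: str


def build_prompt(question: str, fragments: List[Fragment]) -> str:
    """Combine the retrieved fragments and the question into a single prompt."""
    sources = "\n".join(
        f"[{i}] ({f.document}) {f.text}" for i, f in enumerate(fragments, start=1)
    )
    return (
        "Answer the question using only the sources below. "
        "If the sources do not contain the answer, say so.\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question}"
    )


def answer(
    question: str,
    retrieve: Callable[[str], List[Fragment]],  # assumption: returns fragments, best first
    generate: Callable[[str], str],             # assumption: wraps a language-model call
    top_k: int = 3,
) -> str:
    fragments = retrieve(question)[:top_k]      # step 1: retrieve
    prompt = build_prompt(question, fragments)  # step 2: augment the question
    return generate(prompt)                     # step 3: generate
```

Keeping `retrieve` and `generate` as parameters makes the order of the steps explicit: the model only sees the question after the fragments have been attached to it.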

How vector search works (explained simply)

Classic search works on keywords: you search for the term "notice period" and the search engine finds only documents that contain those exact words. If a document says "termination period" or "contract duration upon cancellation", it is not found, even though it holds exactly the information you are looking for.

Vector search thinks in meaning rather than in letters. Every document, or more precisely every fragment of a document, is converted into a long sequence of numbers that captures the meaning of that text. This is called an embedding. Pieces of text with a similar meaning are given embeddings that lie close together in a mathematical space. When an employee asks a question, that question is converted in the same way, and the system finds the fragments whose meaning is closest to that of the question.

The result is a search engine that understands that "when can we terminate the contract" and "supplier notice period" refer to the same thing, even when those exact words do not appear in the document where the answer is found. For large document bases in which the wording varies from one document to the next, this is a fundamental improvement over traditional search.
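To make "close together in a mathematical space" concrete, here is a small sketch that ranks fragments by cosine similarity, one common way to compare embeddings. The fragment texts and the three-dimensional vectors are made up for illustration; in a real system the embeddings come from an embedding model and have hundreds or thousands of dimensions.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors: close to 1.0 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical fragments with hand-made toy embeddings (not produced by a real model).
FRAGMENTS: Dict[str, List[float]] = {
    "Either party may end the agreement with three months' written notice.": [0.9, 0.1, 0.0],
    "Invoices are payable within thirty days.": [0.1, 0.9, 0.1],
    "Safety helmets are mandatory on site.": [0.0, 0.1, 0.9],
}


def search(query_embedding: List[float], fragments: Dict[str, List[float]], top_k: int = 1) -> List[str]:
    """Return the top_k fragment texts whose embeddings are closest to the query."""
    ranked = sorted(
        fragments,
        key=lambda text: cosine_similarity(query_embedding, fragments[text]),
        reverse=True,
    )
    return ranked[:top_k]


# Made-up embedding standing in for the question "when can we terminate the contract".
query_embedding = [0.8, 0.2, 0.1]
print(search(query_embedding, FRAGMENTS))
```

Note that the best-ranked fragment shares no keyword with the question; it is found because its vector points in nearly the same direction as the query vector.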

When is RAG worthwhile for your organisation?

RAG only comes into its own above a certain scale. It is the right approach when you are dealing with thousands of documents that are unstructured, keep growing, and cannot be narrowed down in advance to the parts that are relevant to a specific question.

Concrete situations where RAG clearly adds value: a construction company with tens of thousands of technical specifications, safety instructions, municipal zoning plans and tender documents; a municipality with hundreds of policy memos, council decisions and implementation plans; a legal department with extensive contract portfolios; a healthcare institution with a large library of guidelines, protocols and case files.

Strong signals that RAG is worthwhile are questions such as: "Where was that exception to the warranty scheme again?", "What exactly does our health and safety policy say about working from home during pregnancy?", or "In which of our 47 framework contracts is there a price-revision clause?" These are questions on which an employee would otherwise spend hours searching by hand. The answer is somewhere in the documents, but access to that knowledge is the bottleneck.

When is RAG overkill?

It is fair to say that RAG is not the right choice for everyone, and it is not cheap. Anyone with twenty or thirty manageable documents that rarely change is better off using a simpler approach: sending the full context directly to the language model, without an intermediate search step.

With small document bases, RAG can even make answers worse: the retrieval step sometimes misses the most relevant fragment, because the passage that is semantically most similar to the question is not always the most informative one. An employee asking about the margin on type B assignments might then get back general text about margins instead of the passage that actually answers the question. Simpler approaches are then more accurate, cheaper and easier to maintain.

The rule of thumb: with a few hundred documents or fewer, start with a simpler approach. RAG justifies its complexity only when scale demands it: thousands of documents that grow continuously and no longer fit within a single context window.
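One part of this rule of thumb, whether the documents still fit within a single context window, can be checked with a rough estimate. The sketch below is an assumption-laden illustration: the figure of about four characters per token is a coarse heuristic for English text, and the window size and reserve are hypothetical defaults, not properties of any particular model.

```python
from typing import List


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate; a real tokenizer gives the exact count."""
    return int(len(text) / chars_per_token)


def choose_approach(
    documents: List[str],
    context_window_tokens: int = 128_000,  # assumption: depends on the model used
    reserve_tokens: int = 8_000,           # assumption: room for the question and the answer
) -> str:
    """Return "full-context" if all documents fit in one prompt, else "rag"."""
    total = sum(estimate_tokens(doc) for doc in documents)
    budget = context_window_tokens - reserve_tokens
    return "full-context" if total <= budget else "rag"


small_base = ["x" * 400] * 20        # twenty short documents
large_base = ["x" * 4_000] * 1_000   # a thousand longer documents
print(choose_approach(small_base), choose_approach(large_base))
```

Fitting in the window is only one criterion. How often the documents change and how much each request may cost weigh in as well.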

What source fragments mean for trust and compliance

An additional benefit of RAG, and one that often proves decisive for organisations with compliance requirements, is that the system shows its sources. With every answer, the exact fragments on which the answer is based can be displayed, including the document name, page number and the literal text.

That is a qualitative leap compared with an ordinary language model, which simply cannot indicate where its answer comes from, because the answer is assembled from statistical patterns and not from traceable sources. For organisations in construction, government, healthcare or finance, where decisions must be demonstrably substantiated, that is a fundamental difference.

Take the example of a procurement manager who wants to know whether a supplier contract contains an escalation procedure. With a RAG system he gets not only "yes, it is there" as an answer, but also the exact clause from the exact document, so that he can judge for himself whether the interpretation is correct. That is the difference between an AI that advises and an AI that provides source citations, and in professional settings the latter is the only acceptable standard.
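Showing sources is possible because every fragment carries its origin with it from the moment it is indexed. A minimal sketch of that idea follows; the `SourceFragment` type, the `format_answer` helper and the document name in the example are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SourceFragment:
    """A fragment together with the metadata needed to cite it."""
    document: str  # document name
    page: int      # page number within that document
    text: str      # literal text of the fragment


def format_answer(answer: str, sources: List[SourceFragment]) -> str:
    """Append a numbered source list to the generated answer."""
    lines = [answer, "", "Sources:"]
    for i, src in enumerate(sources, start=1):
        lines.append(f'[{i}] {src.document}, p. {src.page}: "{src.text}"')
    return "\n".join(lines)


# Hypothetical example data for the escalation-procedure question.
clause = SourceFragment(
    document="supplier_contract.pdf",
    page=12,
    text="Disputes are first escalated to the account managers of both parties.",
)
print(format_answer("Yes, the contract contains an escalation procedure [1].", [clause]))
```

Because the literal text is shown next to the answer, the reader can check the interpretation without opening the document first.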

At the same time, honesty is in order: RAG strongly reduces hallucinations, but does not eliminate them entirely. If the retrieval step returns an irrelevant fragment, the model will still build on it. The quality of a RAG system depends largely on how well the documents are indexed, how sensibly they are split into fragments, and how carefully the system is set up. RAG is a solid architecture, not a magic solution that compensates for poor source quality.
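How fragments are delimited is one of the few knobs that is easy to show in code. The sketch below uses fixed-size word windows with overlap, which is one simple strategy among several (splitting on headings or paragraphs is another). The function name and the default sizes are assumptions for illustration.

```python
from typing import List


def chunk_words(text: str, chunk_size: int = 200, overlap: int = 40) -> List[str]:
    """Split text into chunks of chunk_size words; consecutive chunks share overlap words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


sample = " ".join(str(n) for n in range(10))
print(chunk_words(sample, chunk_size=4, overlap=1))
# prints ['0 1 2 3', '3 4 5 6', '6 7 8 9']
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one fragment, at the cost of storing some text twice.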

Bouwend Nederland: from document to insight

A concrete example of RAG in practice is the knowledge-base tool that NewWorks developed for Bouwend Nederland. The organisation works with large volumes of documentation: position papers, policy documents, reports and memos. The information was almost always there, but finding it was slow, especially when several documents had to be compared on the same questions at once.

NewWorks built a tool that indexes those documents in a vector database and makes them queryable in natural language. What makes it special is the so-called query matrix: a table format in which each row is a topic (labour market, nitrogen, housing construction) and each column a focus point (position, concrete measure, justification). This allows dozens of documents to be queried at once, in a structured and repeatable way, with source fragments as evidence for every result.

The result: what previously took hours of manual summarising becomes, in minutes, a comparable overview that is immediately usable for decision-making, without the analyst losing contact with the sources.
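The idea of a query matrix can be illustrated with a short sketch. This is not the NewWorks implementation, only an illustration of the row-and-column structure described above: the question template and the `ask` callable, which stands in for a full RAG query against the document base, are assumptions.

```python
from typing import Callable, Dict, List

# Rows and columns taken from the examples in the text.
TOPICS = ["labour market", "nitrogen", "housing construction"]
FOCUS_POINTS = ["position", "concrete measure", "justification"]


def build_query_matrix(topics: List[str], focus_points: List[str]) -> Dict[str, Dict[str, str]]:
    """One question per cell: each topic (row) combined with each focus point (column)."""
    return {
        topic: {focus: f"What is the {focus} on {topic}?" for focus in focus_points}
        for topic in topics
    }


def run_matrix(
    matrix: Dict[str, Dict[str, str]],
    ask: Callable[[str], str],  # assumption: runs one question through the RAG system
) -> Dict[str, Dict[str, str]]:
    """Fill every cell with the answer to its question."""
    return {
        topic: {focus: ask(question) for focus, question in row.items()}
        for topic, row in matrix.items()
    }


matrix = build_query_matrix(TOPICS, FOCUS_POINTS)
print(matrix["nitrogen"]["position"])
```

Because the same questions are generated for every topic, the resulting table is repeatable: running it again on a new set of documents yields an overview with the same shape.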

NewWorks builds knowledge-base agents based on RAG for Dutch organisations that want to unlock their document knowledge. In the knowledge-base agent track, NewWorks guides organisations from first exploration to a working system: from indexing and vectorising the existing document base to an agent that answers employees directly, verifiably and with source citations. The Bouwend Nederland case illustrates what this approach looks like in practice: from a large document archive to a structured decision layer in a single tool.
