Source Checking & Citations
AI states things confidently regardless of whether they are true. It cites papers that do not exist, invents statistics that sound plausible, and attributes quotes to people who never said them — all with the same authoritative tone it uses for verified facts. Knowing how to check AI-generated claims is not optional: it is the skill that separates people who use AI safely from people who spread misinformation without realising it.
The Core Problem: Confident Wrongness
The most dangerous property of AI hallucinations is not that they exist — it is that they are indistinguishable from correct information by appearance alone. A hallucinated citation looks exactly like a real one: it has an author name, a paper title, a journal, a year, sometimes even a DOI or URL. A made-up statistic reads the same as a verified one. The prose is fluent, the logic sounds coherent, the tone is authoritative.
This is not a bug that will be fixed in the next model version. It is a structural property of how language models work: they generate text that is statistically plausible, not text that is factually verified. They do not look information up — they produce text that resembles what an accurate answer would look like.
The NeurIPS 2025 warning: experts missed it too
In January 2026, GPTZero analysed over 4,000 research papers accepted to NeurIPS 2025 — one of the world's top AI conferences — and found more than 100 AI-hallucinated citations across 51 papers. These fake references had slipped past 3–5 expert peer reviewers per paper. The fabrications fell into three types: fully invented citations (fake author, fake title, fake journal); "vibe citations" (real-sounding blends of multiple real papers); and subtle modifications of real papers (wrong author name, paraphrased title, added or removed co-authors). If expert AI researchers cannot reliably catch hallucinated citations in their field, the rest of us need systematic verification habits — not just the hope that something sounds right.
Types of AI Source Hallucination
| Type | What It Looks Like | How to Catch It |
|---|---|---|
| Fully invented citation | Author, title, journal, year — all fabricated. URL leads nowhere or does not exist. | Search Google Scholar / DOI lookup — paper simply does not exist |
| "Vibe citing" | Elements blended from multiple real papers into one plausible-sounding citation. Title and authors are believable but the specific paper never existed. | Find each real paper it may have drawn from — none say what the AI claimed |
| Modified real citation | A real paper exists, but the author's name, title, or year is slightly wrong. The paper does not say what the AI claims. | Find the actual paper — check that the cited claim appears in it verbatim |
| Invented statistic | "Studies show 73% of users..." — no study cited, or cited study does not contain the statistic. | Find the original report — search for the exact figure in primary sources |
| Misattributed quote | Real person, real-sounding quote — but they never said it. Or real quote, wrong attribution. | Search the exact quote text — find original context and speaker |
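The existence checks in the table's first column can be partly automated. Below is a minimal Python sketch, assuming the `requests` library, that asks Crossref (the public registry behind most scholarly DOIs) whether a DOI is registered at all. A hit only proves the DOI exists, not that the paper supports the claim, and some real DOIs (datasets registered elsewhere, for instance) are not in Crossref, so treat a miss as "check manually" rather than proof of fabrication.

```python
import requests

def doi_exists(doi: str) -> bool:
    """Ask the public Crossref registry whether this DOI is registered."""
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10)
    return resp.status_code == 200  # Crossref returns 404 for unknown DOIs

# Illustrative calls: the second DOI is deliberately fake
print(doi_exists("10.1038/nature14539"))         # a real, registered DOI
print(doi_exists("10.9999/fake.citation.2024"))  # unregistered, returns False
```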
The Fundamental Rule
Never cite a source you found through AI without verifying that:
- The source actually exists
- The source says what the AI claims it says
- You have read the relevant passage yourself
This is not optional for professional, academic, or published work. Claiming you were misled by AI is not a defence against publishing false information.
Tools That Show Their Sources (and Their Limits)
Some AI tools are specifically designed to show citations alongside their answers — which is better than a plain chat assistant that gives you no references at all. But even cited answers require verification.
Perplexity
Cites numbered sources for every claim and links to the original pages. In 2025 accuracy tests, Perplexity tied every claim to a source for about 78% of complex research questions (vs. 62% for ChatGPT). Still: the citations link to real web pages, but the AI sometimes misattributes — claiming a source says something it does not. Always click and read the cited source for any claim you will use.
Consensus
Restricts all references to published, peer-reviewed papers. The safest tool for academic citations — every source it cites is at least a real, published paper. Still requires you to verify that the paper actually supports the specific claim being made, not just that the paper exists.
NotebookLM
Works only from documents you upload — so it cannot cite a paper that does not exist in your notebook. Every answer cites the exact passage from your source. The safest citation approach for working with documents you already have: it cannot invent sources from training data, though it can still misread a passage, so spot-check the cited excerpts.
How to Verify AI Claims — Practical Techniques
Lateral Reading
Lateral reading is the technique professional fact-checkers use: instead of reading one source deeply to evaluate it, you open several other sources about the claim in new browser tabs and check whether independent sources corroborate it. This is faster and more reliable than trying to evaluate a single source in isolation.
Lateral reading workflow:
- Identify the specific, testable claim (isolate it from the surrounding prose)
- Open a new browser tab — search for the claim independently, not for the source AI mentioned
- Look for the same claim in 2–3 independent sources (not all citing each other)
- If multiple independent, credible sources confirm it: probably true
- If you cannot find independent confirmation: treat as unverified
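A crude version of the independence check in the third step can be mechanised. The sketch below, standard library only and with hypothetical URLs, collapses the pages you collected while reading laterally down to their distinct domains. It cannot tell whether two sites are quoting each other, so it narrows the work rather than replacing the skim.

```python
from urllib.parse import urlparse

def independent_domains(urls: list[str]) -> set[str]:
    """Collapse URLs to their hostnames and return the distinct set."""
    return {urlparse(u).hostname or "" for u in urls}

# Hypothetical URLs gathered while reading laterally:
found = [
    "https://www.example-news.com/story",
    "https://research.example.edu/report.pdf",
    "https://www.example-news.com/follow-up",
]
domains = independent_domains(found)
print(f"{len(domains)} distinct domains: {sorted(domains)}")
# Fewer than 2-3 distinct domains: treat the claim as unverified.
```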
Verifying Academic Citations
Step by step:
- Take the exact citation (author, title, journal, year) and search Google Scholar
- If the paper does not appear: it likely does not exist — discard the citation
- If it does appear: find the DOI link and access the full paper or abstract
- Search the paper (Ctrl+F) for the exact claim the AI attributed to it
- If the paper does not contain the claim: the AI misattributed — discard
- If it does: you now have a verified, real citation you can use
Useful tools: Google Scholar (scholar.google.com), Semantic Scholar (semanticscholar.org), DOI resolver (doi.org), PubMed for biomedical papers (pubmed.ncbi.nlm.nih.gov).
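If you verify citations regularly, the first two steps can be scripted. This sketch, assuming `requests`, queries the public bibliographic search of Crossref (the registry behind the doi.org resolver) with the cited title and prints the closest real matches for comparison; the cited title below is a hypothetical placeholder.

```python
import requests

def search_crossref(title: str, rows: int = 3) -> list[dict]:
    """Return the closest Crossref matches for a bibliographic title query."""
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.bibliographic": title, "rows": rows},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["message"]["items"]

cited_title = "A Hypothetical Study of Example Phenomena"  # the title the AI gave you
for item in search_crossref(cited_title):
    title = item.get("title", ["(no title)"])[0]
    authors = [a.get("family", "?") for a in item.get("author", [])]
    print(title, "|", authors, "|", item.get("DOI"))
# Compare title, authors, and year against the AI's citation exactly:
# a near-match with the wrong authors is the "modified real citation" pattern.
```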
Verifying Statistics and Data Claims
How to find the primary source:
- Search the exact statistic + likely source organisation (e.g. "42% CAGR AI education market" + "report")
- Look for the original report — government body, research firm, academic institution — not a news article summarising it
- Find the specific table, page, or section in the report that contains the figure
- Check the methodology: what was measured, how, when, and in what population?
- If you cannot find the original source that contains the exact number: do not use the statistic
Watch for: AI often cites a projected figure (e.g. "will reach $41 billion by 2030") as if it is a current fact. Projections are estimates — they carry uncertainty and depend on methodology. Make sure you distinguish current data from forecasts.
Verifying Quotes
- Search the exact quote text in quotation marks in a search engine
- If results appear: check that the attributed person actually said it in the found context
- If no results appear: the quote may be fabricated or paraphrased from something that was said differently
- For important quotes: find the original interview, book, or speech — not a secondary source quoting it
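Once you have a candidate primary source, the exact-match step can be scripted too. A minimal sketch, assuming `requests` and a hypothetical URL, that normalises typographic quotes and whitespace before testing for the quote, since curly quotes alone can produce a false negative:

```python
import re
import requests

def quote_in_page(quote: str, url: str) -> bool:
    """Fetch a page and test for the quote, ignoring typographic variation."""
    def normalise(text: str) -> str:
        text = text.replace("\u2018", "'").replace("\u2019", "'")  # curly -> straight
        text = text.replace("\u201c", '"').replace("\u201d", '"')
        return re.sub(r"\s+", " ", text).lower()

    page = requests.get(url, timeout=10).text
    # HTML tags inside the quoted span can still cause a false negative,
    # so a miss means "check the page by hand", not "fabricated".
    return normalise(quote) in normalise(page)

# quote_in_page("exact words you are checking", "https://example.org/transcript")
```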
Prompting AI to Be Honest About Uncertainty
By default, AI will answer questions it is uncertain about with the same confident tone it uses for things it knows reliably. You can change this behaviour with specific prompts that force it to flag uncertainty.
Uncertainty-flagging prompts
- "Answer my question, but flag any specific fact, statistic, or citation you are not highly confident about. Use phrases like 'I'm uncertain about this' or 'you should verify this' where appropriate."
- "After answering, tell me: which specific claims in your response would be hardest to verify, and where should I look to check them?"
- "If you are about to cite a specific paper, statistic, or quote, tell me how confident you are that it exists exactly as you describe it, on a scale of low / medium / high."
- "For this response: do not include any citation you are not highly confident actually exists. If unsure, describe the general concept without a specific citation rather than guessing."
These prompts help, but they are not a guarantee. AI may still misidentify its own uncertainty. Use them as one layer of protection, not a substitute for verification.
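If you work through an API rather than a chat window, the same instruction can be baked into a system message so every response is uncertainty-flagged by default. A minimal sketch using the OpenAI Python SDK; the model name is an assumption, and the same pattern works with any provider's chat API:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",  # assumed model name; substitute whatever you use
    messages=[
        {
            "role": "system",
            "content": (
                "Flag any specific fact, statistic, or citation you are not "
                "highly confident about with [VERIFY]. If you are unsure a "
                "citation exists, describe the concept without citing."
            ),
        },
        {"role": "user", "content": "Summarise the evidence on spaced repetition."},
    ],
)
print(response.choices[0].message.content)
```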
Red Flags That Demand Extra Verification
Very specific statistics without an obvious source
"Studies show that 73% of..." or "According to a 2024 report, approximately 4.2 million..." — suspiciously precise numbers with vague attribution are a hallucination red flag. Real statistics come from specific, findable reports. If you cannot find the source, the number is probably invented.
Citations from niche or obscure journals you cannot find
If a cited journal does not appear in Google Scholar, PubMed, or a basic web search, it may not exist. Real journals — even minor ones — are indexed somewhere. A journal that exists only in an AI citation is fabricated.
Claims about very recent events (post-training cutoff)
AI training data has a cutoff date. If you ask about events after that date, the model may plausibly confabulate details rather than admitting it does not know. Be especially sceptical of specific details about recent news, current prices, current officeholders, or recent research published after the model's training cutoff.
Legal, medical, or financial specifics
For high-stakes domains — drug interactions, legal precedents, tax rules, financial regulations — errors have real consequences. AI gets these wrong with the same confidence as it gets them right. Any specific legal, medical, or financial claim should be verified by a qualified professional or against official primary sources, not treated as reliable from AI alone.
Quotes from real people in natural language
AI is good at generating text that sounds like how a specific person talks. A quote that sounds exactly like something a public figure might say — in their recognisable style, with their usual vocabulary — may be fabricated rather than real. Always find the original interview, book, or speech rather than relying on an AI-generated quote.
Source Verification Workflow
For quick fact-checking (2 minutes)
- Identify the specific claim you want to verify
- Open Perplexity and search for the claim itself, not the full AI-generated answer that contained it
- Check that 2–3 independent sources cited by Perplexity confirm the claim
- If Perplexity cannot find confirmation: the claim is likely inaccurate or unverifiable
For academic citations (5–10 minutes)
- Take the citation to Google Scholar — search by title and author
- If not found: discard the citation entirely
- If found: access the paper and use Ctrl+F to search for the specific claim
- If the claim is not in the paper: discard it, noting that the paper exists but does not say what was claimed
- If found: note the exact page and quote — you now have a real, usable citation
For statistics and data (5–15 minutes)
- Search the statistic + likely source type ("[stat] site:gov" or "[stat] research report")
- Find the primary report — not a blog or news article that cites it
- Locate the exact figure in the primary document
- Check methodology: sample size, date, geography, what was actually measured
- If you cannot trace it to a primary source: do not use the figure
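The query-building step can be made routine. A small sketch that turns a statistic into the primary-source searches described above; the operators are ordinary search-engine syntax and the example inputs are illustrative:

```python
def verification_queries(stat: str, topic: str) -> list[str]:
    """Build search queries that target primary sources for a statistic."""
    quoted = f'"{stat}"'
    return [
        f"{quoted} {topic} site:gov",            # government data
        f"{quoted} {topic} site:edu",            # academic institutions
        f"{quoted} {topic} report methodology",  # the original report itself
        f"{quoted} {topic} -news -blog",         # filter out secondary coverage
    ]

# Illustrative inputs, echoing the example above:
for query in verification_queries("42% CAGR", "AI education market"):
    print(query)
```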
Using AI to help verify AI (the right approach)
- Use Perplexity (cited) or Consensus (academic papers only) rather than plain ChatGPT for research
- Ask a different AI assistant to check the same claim independently — disagreement is a signal
- Ask: "What is the most credible primary source I should check for [claim]?" — then go check that source yourself
- Upload the actual document to NotebookLM and ask questions — it cannot hallucinate beyond what you gave it
When AI Claims Are Generally Reliable
Well-established conceptual explanations
AI is very reliable when explaining widely taught concepts — how photosynthesis works, what a derivative is, how TCP/IP functions, what the French Revolution was. These are covered extensively in training data and the general explanation is stable. You may still want to verify fine details, but the conceptual framework is usually sound.
Widely known, frequently documented facts
Facts that appear in thousands of sources — major historical dates, well-known scientific constants, the capitals of countries, the authors of famous works — are very unlikely to be wrong. The more extensively documented a fact is, the more reliably AI gets it right.
Reasoning and logic (not facts)
AI's reasoning about a set of facts you provide is generally more reliable than its recall of external facts. If you give the AI accurate data and ask it to analyse, compare, structure, or reason about it — as distinct from asking it to recall facts from memory — the output tends to be much more dependable.
What Is New in 2025–2026
Hallucination rates have improved — but not disappeared
The best models in 2025–2026 have meaningfully lower hallucination rates than earlier generations. Gemini 2.0 Flash achieved a hallucination rate as low as 0.7% on some benchmarks in April 2025. But a lower hallucination rate does not mean zero — and even 0.7% means roughly 1 in 140 claims is wrong, stated with confidence. At scale, that remains a significant problem.
NeurIPS 2025: fake citations in elite AI research (January 2026)
The discovery that 100+ hallucinated citations slipped through peer review at the world's top AI conference was a significant moment in 2026. GPTZero coined the term "vibe citing" for the pattern: AI derives plausible-sounding citations from combinations of real papers, producing references that look accurate to casual inspection but fall apart under close examination. The submission volume to NeurIPS grew 220% from 2020 to 2025, straining review processes and making this problem harder to catch.
C2PA watermarking becoming standard for AI content
The Coalition for Content Provenance and Authenticity (C2PA) standard is being adopted by OpenAI, Google, Adobe, and Microsoft to embed invisible metadata in AI-generated images, audio, and video indicating their AI origin. This helps receivers of content verify whether it was AI-generated, though watermarks can be stripped and detection is not foolproof.
AI fact-checking tools emerging
Dedicated AI hallucination detection tools — like GPTZero's Hallucination Detector — can automatically scan AI-generated text for unsupported claims and hallucinated sources. MIT's ContextCite research tool tracks how AI attributes information to sources, making it possible to trace errors back to where the AI went wrong. These tools are still early but signal a growing infrastructure for AI output verification.
Checklist: Do You Understand This?
- Can you explain why AI hallucinations are particularly dangerous — specifically why they are hard to detect by reading alone?
- Can you name the three types of AI citation hallucination and give an example of each?
- Can you describe what "vibe citing" means, and what real-world incident from 2026 made it a widely known term?
- Can you walk through the 5-step process for verifying an academic citation AI produced?
- Can you explain what lateral reading is and how it differs from reading a source deeply to evaluate it?
- Can you name two red flags in an AI response that should trigger extra verification effort?
- Can you write a prompt that asks AI to flag its own uncertain claims as it responds?
- Can you describe why NotebookLM is the safest tool for citation work when you already have the source documents?
- Can you explain what types of AI claims are generally reliable vs. unreliable, and why?