SQL Generation & Validation
AI is genuinely useful for SQL — explaining what a complex query does, drafting queries from a description, and catching common mistakes. But a query that executes without error can still return wrong results. Validation before running AI-generated SQL in production is not optional. This page covers the prompting patterns and the validation workflow.
What AI SQL Generation Does Well
Strong use cases
- Drafting queries from plain-language descriptions
- Explaining what an existing complex query does
- Translating queries between SQL dialects (MySQL → PostgreSQL)
- Generating common patterns: aggregations, window functions, CTEs
- Spotting syntax errors and suggesting fixes
- Writing boilerplate: CRUD statements, index creation, schema comparisons
Where you must validate carefully
- Join logic — AI often guesses join conditions and gets cardinality wrong
- Filters and WHERE clauses — may silently miss required conditions
- NULL handling — easy to get wrong, hard to spot in results
- Aggregations with GROUP BY — missing group columns cause silent data loss
- DELETE and UPDATE — no row limit by default; can affect far more rows than intended
- Dialect-specific functions — AI may use functions that don't exist in your version
Prompt Patterns That Work
1. Provide the Schema, Always
Database: PostgreSQL 15
Tables (relevant to this query):
users (id INT PK, email VARCHAR, created_at TIMESTAMP, plan VARCHAR)
orders (id INT PK, user_id INT FK users.id, amount DECIMAL, status VARCHAR, created_at TIMESTAMP)
Write a query that: [DESCRIBE WHAT YOU WANT IN PLAIN LANGUAGE]
Requirements:
- Return only [LIST THE COLUMNS YOU NEED]
- Filter for [CONDITIONS]
- Use [specific dialect / optimisation preferences if any]
Without schema context, AI invents table and column names. Providing the actual schema dramatically increases accuracy and reduces hallucinated joins.
2. Explain Query Intent
Explain what this query does, step by step. For each clause (FROM, JOIN, WHERE, GROUP BY, SELECT), explain in plain English what it is doing and why.
Also identify: any potential issues with NULL handling, potential performance problems, or cases where the query might return unexpected results.
[PASTE QUERY]
Use this to understand AI-generated queries before running them, or to understand existing queries you inherited.
3. Safe DELETE / UPDATE Drafting
Write a DELETE query that removes [WHAT TO DELETE] from the [TABLE] table where [CONDITIONS].
Requirements:
- First write it as a SELECT so I can verify the rows that would be affected
- Then write the DELETE with the same WHERE clause
- Add a LIMIT [N] to cap the rows affected per execution
- Include a comment with the expected row count based on the conditions
Always verify with a SELECT before running destructive queries. A LIMIT prevents runaway deletes from a too-broad WHERE clause.
Validation Before Running AI-Generated SQL
Validation checklist (required before production):
- EXPLAIN / EXPLAIN ANALYZE: Check the query plan — verify the query is using the expected indexes and not doing full table scans on large tables
- Run with LIMIT on a copy or staging DB: Confirm the result shape and sample values look correct before running against production data
- Verify row count: For UPDATE/DELETE, run a SELECT COUNT(*) with the same WHERE clause first — confirm the number of rows matches expectations
- Check NULL handling: Ask AI "are there any NULLs in columns X, Y, Z that would affect this query?" and test with NULL values in development
- Review JOINs: Confirm every JOIN produces the expected cardinality — a many-to-many join can silently multiply rows
The correct results trap
A query can execute successfully and return plausible-looking data while being subtly wrong. Never confirm a query is correct just because it ran without error. Review a sample of actual results against what you know to be true about your data.
Security: Preventing SQL Injection
AI-generated SQL in applications must use parameterised queries, not string concatenation. This is the single most important rule.
Never do this (SQL injection)
# DANGEROUS — string concatenation
query = f"SELECT * FROM users WHERE email = '{email}'"
Always use parameterised queries
# SAFE — parameterised
query = "SELECT * FROM users WHERE email = %s"
cursor.execute(query, (email,))
When asking AI to generate application code that includes SQL, explicitly instruct: "Use parameterised queries — never string concatenation for user-supplied values."
Checklist: Do You Understand This?
- Why is providing the schema in your SQL prompt more important than describing the query in detail?
- What is the "correct results trap" — why does a query executing without error not confirm it is correct?
- What three steps should you take before running an AI-generated DELETE query in production?
- What is SQL injection, and what single rule prevents it in AI-generated application code?
- Ask AI to explain this query to identify its potential issues:
SELECT user_id, SUM(amount) FROM orders GROUP BY user_id - What does EXPLAIN ANALYZE tell you that running the query does not?