A practical guide for getting the most out of the Text2SQL Agent when querying your data with natural language.
- Getting Started
- How to Ask Questions
- Understanding Results
- Query Examples
- Tips for Better Results
- Understanding the Reasoning Trace
- Common Pitfalls
- Troubleshooting
The Text2SQL Agent converts your natural language questions into SQL queries. You don't need to know SQL - just describe what data you want in plain English.
Simply ask a question about your data:
"Show me all customers from California"
The agent will:
- Analyze your question
- Inspect your database schema
- Generate the appropriate SQL
- Execute it and return results
| Field | Description |
|---|---|
| SQL | The generated SQL query |
| Results | The data from your database |
| Confidence | How confident the agent is (0-100%) |
| Reasoning | Step-by-step explanation of how it arrived at the query |
The more specific your question, the better the results.
| Less Effective | More Effective |
|---|---|
| "Show me sales" | "Show me total sales by month for 2024" |
| "Find customers" | "Find customers who haven't ordered in 90 days" |
| "Get products" | "Get the top 10 products by revenue" |
Write questions as you would ask a colleague who knows your data.
Good examples:
- "What were our total sales last quarter?"
- "Which customers have placed more than 5 orders?"
- "Show me the average order value by region"
- "List all products that are out of stock"
Avoid:
- Vague pronouns: "Show me their orders" (whose orders?)
- Ambiguous terms: "Show me the best customers" (best by what metric?)
- Multiple unrelated questions in one: "Show sales and also update the inventory"
Be explicit about dates and time periods.
| Ambiguous | Clear |
|---|---|
| "Recent orders" | "Orders from the last 7 days" |
| "Old customers" | "Customers who joined before 2023" |
| "This month's sales" | "Sales from December 2024" |
When asking for aggregations, specify how to calculate them.
| Vague | Specific |
|---|---|
| "Best products" | "Products with highest total revenue" |
| "Active users" | "Users who logged in within the last 30 days" |
| "Top customers" | "Customers ranked by number of orders" |
The confidence score (0-100%) indicates how certain the agent is about the generated SQL.
| Score | Meaning | Recommendation |
|---|---|---|
| 90-100% | Very confident | Results are likely accurate |
| 70-89% | Confident | Review the SQL if results seem off |
| 50-69% | Moderate | Verify the SQL matches your intent |
| Below 50% | Low confidence | Rephrase your question |
✅ Trust the results when:
- Confidence is above 80%
- The SQL looks correct for your question
- Row counts make sense
- Values are in expected ranges
- Confidence is between 50-80%
- Results are unexpectedly empty
- Numbers seem too high or too low
- You're making business decisions based on the data
Even if you don't know SQL, you can often verify the query is correct:
SELECT name, email, total_orders
FROM customers
WHERE state = 'California'
ORDER BY total_orders DESC
LIMIT 10SELECT- the columns being retrieved (name, email, total_orders)FROM- the table being queried (customers)WHERE- the filter being applied (state = 'California')ORDER BY- how results are sorted (by total_orders, highest first)LIMIT- maximum rows returned (10)
Simple filter:
"Show all orders with status 'pending'"
Multiple conditions:
"Find customers from California or New York who have spent over $1000"
Date ranges:
"Show orders placed between January 1 and March 31, 2024"
Negative filters:
"List products that are not in the 'Electronics' category"
Counting:
"How many orders were placed last month?"
Summing:
"What is the total revenue from the Western region?"
Averaging:
"What is the average order value per customer?"
Grouping:
"Show total sales by product category"
Multiple aggregations:
"For each salesperson, show their total sales, number of orders, and average order value"
Top N:
"Show the top 5 customers by lifetime value"
Bottom N:
"List the 10 worst-performing products by sales"
Rankings:
"Rank stores by monthly revenue"
Period over period:
"Compare sales this month vs last month"
Category comparisons:
"Show average order value for new vs returning customers"
Trends:
"Show monthly sales trend for 2024"
Outliers:
"Find orders with unusually high discounts (over 50%)"
Missing data:
"Which customers have no email address on file?"
Multi-step analysis:
"For each product category, show the best-selling product and its total revenue"
Conditional logic:
"Classify customers as 'VIP' if they've spent over $10,000, 'Regular' if over $1,000, otherwise 'New'"
Joins across tables:
"Show customer names along with their most recent order date and total spent"
If you know your schema, use the actual names:
# Good - uses actual column name
"Show the created_at date for all users"
# Less precise - agent has to guess
"Show when users signed up"
Always mention if you want ascending or descending:
"Show products sorted by price from lowest to highest"
"List customers by total spent, highest first"
Avoid overwhelming results by specifying limits:
"Show the first 50 orders from today"
"List up to 100 customers matching these criteria"
Instead of one complex question, try multiple simpler ones:
Instead of:
"Show me which sales reps are underperforming compared to the team average, what their pipeline looks like, and which deals they're likely to close this quarter"
Try:
1. "What is the average sales per rep this quarter?"
2. "Which reps are below the team average?"
3. "Show the pipeline for rep John Smith"
Define what you mean by subjective terms:
# Ambiguous
"Show active users"
# Clear
"Show users who have logged in within the last 7 days"
Stick to the same terms throughout your queries:
# Inconsistent (might confuse the agent)
"Show customers" then "List clients" then "Find buyers"
# Consistent
"Show customers" then "Filter these customers" then "Count customers"
When you enable show_reasoning, you can see how the agent arrived at its answer.
Question: "Which product category has the highest revenue?"
Step 1: [Schema Inspection]
Thought: I need to find tables related to products and revenue
Action: Inspecting database schema
Observation: Found 'products' table with 'category' column and 'order_items' table with 'price' and 'quantity'
Step 2: [Query Planning]
Thought: I need to join products with order_items, calculate revenue, and group by category
Action: Planning SQL structure
Observation: Will use SUM(price * quantity) grouped by category
Step 3: [SQL Generation]
Thought: Generating SQL with aggregation and ordering
Action: Creating query
Observation: Generated SELECT statement with GROUP BY and ORDER BY
Step 4: [Validation]
Thought: Checking if the query is correct
Action: Validating SQL syntax and logic
Observation: Query is valid and matches the question intent
| Step Type | What It Does |
|---|---|
| Schema Inspection | Agent examines your database structure |
| Query Planning | Agent decides how to structure the SQL |
| SQL Generation | Agent writes the actual SQL query |
| Validation | Agent checks the query makes sense |
| Execution | Agent runs the query and gets results |
| Self-Correction | Agent fixes errors if the query failed |
If results aren't what you expected, check the reasoning trace:
- Wrong table selected? → Specify the table name in your question
- Wrong column used? → Use the exact column name
- Missing join? → Mention both entities ("customers and their orders")
- Wrong aggregation? → Specify the calculation ("sum of revenue, not count")
❌ Won't work well:
"Show me churned customers"
✅ Better:
"Show customers who haven't made a purchase in 90 days"
The agent doesn't know your definition of "churned" - be explicit.
❌ Might miss data:
"Show customers where discount is not 10%"
This won't include customers with NULL discounts. If you want them:
✅ Better:
"Show customers where discount is not 10% or discount is empty"
❌ Unclear:
"Show last week's orders"
✅ Clear:
"Show orders from December 9-15, 2024"
Or be explicit about what "last week" means:
"Show orders from the previous 7 days"
❌ Confusing:
"Show me the customer with the most order"
✅ Clear:
"Show me the customer with the most orders"
❌ Ambiguous:
"Now show their revenue"
✅ Clear:
"Show revenue for the top 10 customers from my previous query"
Or better, combine into one query:
"Show the top 10 customers by order count, including their total revenue"
Possible causes:
- Filter is too restrictive
- Date range has no data
- Table is empty
- Column values don't match your filter
Try:
- Remove some filters to broaden the search
- Check if the date range is correct
- Verify the filter values exist in your data
Possible causes:
- Agent interpreted your question differently
- Wrong table or column was used
- Aggregation logic is different than expected
Try:
- Check the reasoning trace to see the agent's interpretation
- Review the generated SQL
- Rephrase with more specific terms
Possible causes:
- Question is ambiguous
- Schema doesn't have obvious matches
- Complex query with many possible interpretations
Try:
- Use exact table/column names if you know them
- Break into simpler questions
- Add more context about what you want
Possible causes:
- Large dataset without filters
- Complex joins across many tables
- No LIMIT specified
Try:
- Add a date range filter
- Specify a LIMIT (e.g., "show first 100")
- Narrow down to specific categories or segments
Possible causes:
- Database connection issue
- Table or column doesn't exist
- Syntax error in generated SQL
Try:
- Check if the database is accessible
- Verify table/column names in your question
- Simplify the question and try again
| Intent | Phrase |
|---|---|
| List data | "Show me...", "List all...", "Find..." |
| Count | "How many...", "Count the..." |
| Sum | "What is the total...", "Sum of..." |
| Average | "What is the average...", "Mean..." |
| Compare | "Compare X vs Y...", "Difference between..." |
| Rank | "Top N...", "Best/worst...", "Rank by..." |
| Trend | "Show trend...", "Over time...", "By month..." |
| Filter | "Where...", "Only include...", "Exclude..." |
| Expression | Example |
|---|---|
| Specific date | "on December 15, 2024" |
| Date range | "between Jan 1 and Mar 31, 2024" |
| Relative | "in the last 7 days", "past month" |
| Period | "Q4 2024", "fiscal year 2024" |
| Comparison | "this month vs last month" |
| Word | SQL Equivalent |
|---|---|
| total, sum | SUM() |
| average, mean | AVG() |
| count, number of | COUNT() |
| maximum, highest | MAX() |
| minimum, lowest | MIN() |
| distinct, unique | COUNT(DISTINCT) |
If you're still having trouble:
- Check the reasoning trace - understand how the agent interpreted your question
- Simplify your question - start simple and add complexity gradually
- Use exact names - reference actual table and column names
- Review the SQL - even basic SQL knowledge helps verify correctness
For technical issues, see Troubleshooting or contact your administrator.