How to Query AWS Redshift with Natural Language

Marcus Chen · Solutions Engineer · March 28, 2026 · 7 min read

AWS Redshift is one of the most widely used data warehouses in the world. It stores petabytes of business data for thousands of companies (sales records, clickstream events, financial transactions, user behavior), and it's phenomenally fast at querying all of it.

The catch: Redshift speaks SQL, and a fairly specific dialect of it. Most business users can't write SQL. And even for analysts who can, writing optimised Redshift queries requires understanding distribution keys, sort keys, and query execution plans that most people never need to learn.

This guide explains how to query Redshift in plain English, with no SQL required, so that analysts, ops managers, and founders can get answers directly from their data warehouse without waiting on a data engineer.

Why Redshift Queries Are Harder Than Typical SQL

Redshift is based on PostgreSQL but with important differences. If you write standard Postgres queries against Redshift, they'll often work, but they won't necessarily be fast.

For example, a simple query like this can take seconds or minutes depending on how the table's sort and distribution keys are set up:

SELECT
  event_date,
  COUNT(DISTINCT user_id) AS daily_active_users
FROM events
WHERE event_date >= DATEADD(day, -30, GETDATE())
GROUP BY event_date
ORDER BY event_date;

Redshift queries conventionally use DATEADD and GETDATE() where PostgreSQL queries would use NOW() and INTERVAL arithmetic; NOW() in particular is a deprecated, leader-node-only function in Redshift. NVL is available as a synonym for COALESCE. Redshift's window functions, sorting behavior, and join strategies are tuned for columnar storage, which means poorly structured queries can be 10x slower than they need to be.

For non-technical users, none of this matters. They just want to know "how many users were active each day last month?" The underlying dialect complexity shouldn't be their problem.

What Natural Language Querying Looks Like in Practice

Here's the basic workflow with a tool like AI for Database:

  • Connect your Redshift cluster (connection string, database name, credentials)
  • Type a question in plain English
  • The AI translates it to valid, optimised Redshift SQL, runs it, and returns results as a table or chart

Questions that work well:

  • "Show me total revenue by region for Q1 2026"
  • "Which products had the highest return rate in the last 90 days?"
  • "Compare new vs. returning customer orders month over month"
  • "What's the average order value for customers who signed up via paid search?"

Each of these translates to a multi-table SQL query with joins, aggregations, and date handling. The AI writes that SQL for you, handles Redshift-specific syntax, and returns the answer.

Connecting Redshift to AI for Database

Connecting is straightforward. You need:

  • Your Redshift cluster endpoint (e.g., my-cluster.abc123.us-east-1.redshift.amazonaws.com)
  • Port (default: 5439)
  • Database name
  • A read-only database user (recommended for safety)

For the database user, create a restricted one specifically for AI queries:

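    -- Run this in your Redshift cluster as a superuser.
    -- Replace the placeholder password: Redshift requires 8-64 characters with
    -- at least one uppercase letter, one lowercase letter, and one digit.
    CREATE USER aifordatabase_reader PASSWORD 'your-secure-password';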
    
    -- Grant SELECT on the schemas you want to expose
    GRANT USAGE ON SCHEMA public TO aifordatabase_reader;
    GRANT SELECT ON ALL TABLES IN SCHEMA public TO aifordatabase_reader;
    
    -- If you have other schemas (e.g., analytics, reporting)
    GRANT USAGE ON SCHEMA analytics TO aifordatabase_reader;
    GRANT SELECT ON ALL TABLES IN SCHEMA analytics TO aifordatabase_reader;

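Using a read-only user means the AI can answer questions but can never modify, delete, or insert data. This is a sensible default for any team giving non-technical users query access. Note that GRANT SELECT ON ALL TABLES covers only the tables that exist when it runs; re-run it, or use ALTER DEFAULT PRIVILEGES, so that tables created later are visible too.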
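
If you want to sanity-check the read-only user from your own machine before handing the credentials to any tool, the connection details listed above can be gathered in one place. This is a minimal sketch, not part of AI for Database: the helper name `redshift_conn_params`, the database name `analytics`, and the `sslmode` choice are assumptions made for illustration. The dictionary keys are standard libpq keyword names, so a PostgreSQL-compatible driver such as psycopg2 should accept them via `psycopg2.connect(**params)`.

```python
def redshift_conn_params(host, dbname, user, password, port=5439):
    """Collect Redshift connection details into one dict of libpq keywords."""
    if not isinstance(port, int) or not (0 < port < 65536):
        raise ValueError(f"invalid port: {port!r}")
    return {
        "host": host,
        "port": port,          # 5439 is the Redshift default
        "dbname": dbname,
        "user": user,
        "password": password,
        "sslmode": "require",  # assumption: always encrypt traffic to the cluster
    }


params = redshift_conn_params(
    host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",
    dbname="analytics",               # hypothetical database name
    user="aifordatabase_reader",      # the read-only user created above
    password="your-secure-password",  # placeholder, never hard-code real secrets
)

# Print everything except the password.
print({key: value for key, value in params.items() if key != "password"})
```

Keeping the parameters in a plain dictionary makes it easy to load the password from an environment variable or a secrets manager instead of source code.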

Once connected, AI for Database inspects your Redshift schema (tables, columns, data types, relationships) and uses that context to translate natural language into accurate SQL.
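
How AI for Database performs this inspection internally isn't described here, so treat the following as an illustration of the general idea only. It assumes the schema is read from the PostgreSQL-style `information_schema.columns` view (Redshift also offers `SVV_COLUMNS`) and condensed into one line per table that a language model could use as context. The `schema_summary` helper and the sample rows are hypothetical.

```python
from collections import defaultdict

# Catalog query a tool might run with the read-only user (assumed, not confirmed).
SCHEMA_QUERY = """
SELECT table_name, column_name, data_type
FROM information_schema.columns
WHERE table_schema = 'public'
ORDER BY table_name, ordinal_position;
"""


def schema_summary(rows):
    """Turn (table, column, data_type) rows into one compact line per table."""
    tables = defaultdict(list)
    for table, column, data_type in rows:
        tables[table].append(f"{column} {data_type}")
    return "\n".join(
        f"{table}({', '.join(columns)})" for table, columns in tables.items()
    )


# Hypothetical rows, shaped like the result of SCHEMA_QUERY.
rows = [
    ("orders", "order_id", "bigint"),
    ("orders", "order_total", "numeric"),
    ("users", "user_id", "bigint"),
]
summary = schema_summary(rows)
print(summary)
```

A compact summary like this is what lets a question such as "total revenue by month" be mapped onto real table and column names rather than guessed ones.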

Common Redshift Analytics Queries (and Their Plain-English Equivalents)

Here are the kinds of Redshift queries that come up constantly in business analytics, and what you'd type to get them.

Revenue by Time Period

Ask: "What was our total revenue by month for the past year?"

Generated SQL:

    SELECT
      DATE_TRUNC('month', order_date) AS month,
      SUM(order_total) AS total_revenue
    FROM orders
    WHERE order_date >= DATEADD(year, -1, GETDATE())
    GROUP BY 1
    ORDER BY 1;

Funnel Drop-off Analysis

Ask: "Show me the conversion rate from signup to first purchase, by acquisition channel."

Generated SQL:

    SELECT
      u.acquisition_channel,
      COUNT(DISTINCT u.user_id) AS signups,
      COUNT(DISTINCT o.user_id) AS converted,
      ROUND(
        COUNT(DISTINCT o.user_id) * 100.0 / NULLIF(COUNT(DISTINCT u.user_id), 0),
        2
      ) AS conversion_rate
    FROM users u
    LEFT JOIN orders o ON u.user_id = o.user_id
    GROUP BY u.acquisition_channel
    ORDER BY signups DESC;

Cohort Retention

Ask: "What percentage of users who signed up in January 2026 made a purchase in each of the following 3 months?"

Cohort analysis is notoriously verbose SQL. In natural language it's one sentence; the AI writes the dozen-line query it takes to compute it correctly.
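
To give a sense of what that query involves, here is a self-contained sketch of the cohort logic. It is not output from AI for Database. It assumes hypothetical `users(user_id, signup_date)` and `orders(user_id, order_date)` tables with made-up rows, and it runs on an in-memory SQLite database so it can be executed anywhere. On Redshift the month bucketing would more likely use `DATE_TRUNC('month', order_date)` than SQLite's `strftime`.

```python
import sqlite3

COHORT_SQL = """
WITH cohort AS (              -- users who signed up in January 2026
    SELECT user_id
    FROM users
    WHERE signup_date >= '2026-01-01' AND signup_date < '2026-02-01'
),
activity AS (                 -- one row per cohort user per purchase month
    SELECT DISTINCT o.user_id, strftime('%Y-%m', o.order_date) AS month
    FROM orders o
    JOIN cohort c ON c.user_id = o.user_id
    WHERE o.order_date >= '2026-02-01' AND o.order_date < '2026-05-01'
)
SELECT month,
       ROUND(COUNT(*) * 100.0 / (SELECT COUNT(*) FROM cohort), 1) AS pct_retained
FROM activity
GROUP BY month
ORDER BY month;
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER, signup_date TEXT);
CREATE TABLE orders (user_id INTEGER, order_date TEXT);
-- Four January signups plus one February signup (outside the cohort).
INSERT INTO users VALUES
    (1, '2026-01-03'), (2, '2026-01-15'), (3, '2026-01-28'),
    (4, '2026-01-30'), (5, '2026-02-02');
INSERT INTO orders VALUES
    (1, '2026-02-10'), (1, '2026-02-20'), (2, '2026-02-05'),
    (1, '2026-03-03'), (3, '2026-04-09'),
    (2, '2026-01-20'),  -- before the retention window, ignored
    (5, '2026-02-15');  -- not in the January cohort, ignored
""")

retention = conn.execute(COHORT_SQL).fetchall()
for month, pct in retention:
    print(month, pct)
```

One limitation of this shape: a month in which no cohort member purchased produces no row at all, rather than a row showing 0%. A production query would typically join against a generated list of months to fill those gaps.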

Top N Analysis

Ask: "Which 10 customers generated the most revenue in Q4 2025?"

Generated SQL:

    SELECT
      customer_id,
      customer_name,
      SUM(order_total) AS total_revenue
    FROM orders
    JOIN customers USING (customer_id)
    -- Half-open range: keeps all of Dec 31 even if order_date is a timestamp
    WHERE order_date >= '2025-10-01' AND order_date < '2026-01-01'
    GROUP BY customer_id, customer_name
    ORDER BY total_revenue DESC
    LIMIT 10;

Building Dashboards on Top of Redshift

Single queries are useful. Dashboards that refresh automatically are better.

AI for Database lets you describe the charts you want, builds the underlying Redshift queries, and then keeps those charts live on a schedule you control. Business metrics that used to require a data analyst spinning up a scheduled job are now a plain-English description and a refresh cadence.

A typical Redshift-backed dashboard for an e-commerce company might include:

  • Daily and weekly revenue trend (line chart, refreshes nightly)
  • Top 20 products by units sold this month (table, refreshes daily)
  • New customer acquisition by channel (bar chart, refreshes weekly)
  • Average order value trend (line chart, refreshes nightly)
  • Return rate by product category (table, refreshes weekly)

Each of these is a Redshift query running on a schedule. With a natural language interface, you don't need to write or maintain those queries; you describe what you want to see.

Setting Up Alerts on Redshift Data

Beyond dashboards, AI for Database action workflows let you monitor Redshift for specific conditions and trigger notifications automatically.

For example:

  • Daily revenue alert: "If today's revenue is more than 20% below yesterday's, send a Slack message to the business team."
  • Anomalous refund rate: "If the refund rate for any product category exceeds 15% in the past 7 days, email the ops team."
  • Inventory threshold: "If any SKU drops below 100 units in the inventory table, send a webhook to the purchasing system."

These run as scheduled workflows against your live Redshift data, with no stored procedures, no Lambda functions, and no custom code.
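
To make the first rule concrete, the condition behind the daily revenue alert comes down to a single comparison. The sketch below is illustrative only: the function name, the message format, and the treatment of a zero baseline are assumptions, not a description of how AI for Database workflows are implemented.

```python
def revenue_drop_alert(today, yesterday, threshold=0.20):
    """Return an alert message if today's revenue is more than `threshold`
    (a fraction, 0.20 = 20%) below yesterday's; otherwise return None."""
    if yesterday <= 0:
        return None  # assumption: no meaningful baseline, so stay silent
    drop = (yesterday - today) / yesterday
    if drop > threshold:
        return (
            f"Revenue down {drop:.0%} vs. yesterday "
            f"({today:,.2f} vs. {yesterday:,.2f})"
        )
    return None


# Hypothetical daily totals, as a scheduled query might return them.
print(revenue_drop_alert(today=700.0, yesterday=1000.0))  # 30% drop: alert
print(revenue_drop_alert(today=850.0, yesterday=1000.0))  # 15% drop: None
```

The strict `>` comparison matches the wording "more than 20% below", so a drop of exactly 20% does not fire.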

What About Redshift Serverless?

Redshift Serverless works the same way. You connect using the serverless endpoint and the same database credentials. The SQL dialect is identical. AI for Database connects to it exactly as it would a provisioned Redshift cluster.

If you're on Redshift Serverless, your endpoint looks like:

    default.123456789012.us-east-1.redshift-serverless.amazonaws.com

Use port 5439 and your workgroup/namespace credentials.

Redshift vs. BigQuery vs. Snowflake for Natural Language Querying

If you're choosing a data warehouse and wondering which works best with natural language interfaces, the short answer is that all three (Redshift, BigQuery, and Snowflake) work well with AI-generated SQL. The differences are mostly dialect-level:

  • Redshift uses PostgreSQL-like syntax with Redshift-specific date functions (DATEADD, DATEDIFF, GETDATE)
  • BigQuery uses Standard SQL with DATE_SUB, CURRENT_DATE, and backtick-quoted identifiers
  • Snowflake uses Standard SQL with some proprietary extensions and its own date/time functions

A well-designed natural language layer handles all three correctly by being aware of the target database dialect. AI for Database does this: it knows whether you're connected to Redshift, BigQuery, or Snowflake and generates valid, optimised SQL for each.
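
As a small illustration of what dialect awareness means, the same "last N days" filter renders differently for each warehouse. This is a hand-written sketch with a hypothetical `last_n_days_filter` helper, not how AI for Database generates SQL. The Redshift form matches the query shown earlier in this guide, and the BigQuery and Snowflake forms use each platform's documented date functions.

```python
# One template per dialect for "column is within the last N days".
_TEMPLATES = {
    "redshift": "{col} >= DATEADD(day, -{n}, GETDATE())",
    "bigquery": "`{col}` >= DATE_SUB(CURRENT_DATE(), INTERVAL {n} DAY)",
    "snowflake": "{col} >= DATEADD(day, -{n}, CURRENT_DATE())",
}


def last_n_days_filter(dialect, column, days):
    """Render a WHERE-clause fragment for the given warehouse dialect."""
    if dialect not in _TEMPLATES:
        raise ValueError(f"unsupported dialect: {dialect!r}")
    if not isinstance(days, int) or days <= 0:
        raise ValueError("days must be a positive integer")
    return _TEMPLATES[dialect].format(col=column, n=days)


for dialect in ("redshift", "bigquery", "snowflake"):
    print(f"{dialect:<10}", last_n_days_filter(dialect, "event_date", 30))
```

The column name is interpolated directly into the SQL, so a real implementation would need to validate identifiers against the inspected schema before building a query this way.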
