Database Compliance Made Simple: GDPR, SOC 2, and HIPAA for Small Teams

Dr. Elena Vasquez · AI Research Lead · March 18, 2026 · 10 min read

Database compliance sounds like a problem for enterprises with dedicated legal and security teams. In practice, it hits small teams hardest — a 12-person SaaS startup is expected to meet the same GDPR requirements as a 5,000-person company, with a fraction of the resources.

The good news: most of what compliance actually requires is not that complicated. The hard part is knowing which requirements apply to you, understanding what your database needs to do to satisfy them, and keeping evidence that you're actually doing it.

This guide covers the three frameworks that come up most often for growing SaaS and data-driven companies — GDPR, SOC 2, and HIPAA — and explains what they mean in concrete, database-level terms.

Which Frameworks Apply to You?

Before worrying about what to implement, figure out what you're actually required to comply with.

GDPR (General Data Protection Regulation) applies if you process personal data of people in the European Union — regardless of where your company is based. If a user in Germany signs up for your product, GDPR applies. Personal data is broadly defined: names, email addresses, IP addresses, and any identifier that can be linked to a specific person.

HIPAA (Health Insurance Portability and Accountability Act) applies if you handle Protected Health Information (PHI) in the US as a covered entity (a healthcare provider, health plan, or clearinghouse) or as a business associate that processes PHI on a covered entity's behalf. PHI includes medical records, diagnoses, treatment data, and anything that links health information to an identifiable individual. If you're building healthcare software, a medical practice management tool, or anything that stores patient data for such organisations, HIPAA almost certainly applies.

SOC 2 is different — it's not a legal requirement but a voluntary audit standard. Customers (especially enterprise buyers) increasingly require a SOC 2 Type II report before signing contracts. It covers security, availability, processing integrity, confidentiality, and privacy across your whole system, but your database is a central piece of the audit evidence.

You may need all three, two of them, or just one. If you're a general SaaS product serving EU customers, GDPR is the baseline. If you want to sell to mid-market or enterprise, SOC 2 matters. If you're in healthcare, HIPAA is non-negotiable.

GDPR: What Your Database Actually Needs to Do

GDPR is primarily about data subject rights and data minimisation. Here's what that means in database terms.

Know what personal data you store and where

You need a data map: a record of what personal data you store, which tables it lives in, what you use it for, and how long you keep it. This sounds bureaucratic, but it's genuinely useful. Most teams have personal data scattered across tables they've forgotten about.

A basic audit query for a PostgreSQL database:

-- Find tables likely to contain personal data
SELECT table_name, column_name, data_type
FROM information_schema.columns
WHERE table_schema = 'public'
  AND (
    column_name ILIKE '%email%'
    OR column_name ILIKE '%phone%'
    OR column_name ILIKE '%address%'
    OR column_name ILIKE '%name%'
    -- Match ip, ip_address, last_ip; a bare '%ip%' would also match zip, description, shipping
    OR column_name ~* '(^|_)ip(_|$)'
  )
ORDER BY table_name, column_name;

This won't catch everything, but it's a starting point to identify where personal data lives.
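
One way to keep the data map next to the data it describes is a small catalogue table. The following is a minimal sketch rather than a prescribed format: the table name data_map, its columns, and the example purposes are all assumptions made for illustration.

```sql
-- Hypothetical data map table; every name here is an illustrative assumption.
CREATE TABLE IF NOT EXISTS data_map (
  table_name     text NOT NULL,
  column_name    text NOT NULL,
  data_category  text NOT NULL,      -- e.g. 'contact', 'network identifier'
  purpose        text NOT NULL,      -- why the data is collected
  retention      interval NOT NULL,  -- how long it may be kept
  PRIMARY KEY (table_name, column_name)
);

-- Example entries (purposes and retention periods are placeholders)
INSERT INTO data_map (table_name, column_name, data_category, purpose, retention)
VALUES
  ('users', 'email',      'contact',            'account login and notifications', INTERVAL '3 years'),
  ('users', 'ip_address', 'network identifier', 'fraud prevention',                INTERVAL '3 years')
ON CONFLICT (table_name, column_name) DO NOTHING;

-- Columns that look personal but are not yet recorded in the data map
SELECT c.table_name, c.column_name
FROM information_schema.columns c
LEFT JOIN data_map m
  ON m.table_name = c.table_name::text
 AND m.column_name = c.column_name::text
WHERE c.table_schema = 'public'
  AND c.table_name::text <> 'data_map'
  AND c.column_name::text ~* '(email|phone|address|name)'
  AND m.column_name IS NULL
ORDER BY c.table_name, c.column_name;
```

Running the last query after each migration gives a rough list of columns that still need a documented purpose and retention period.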

Support data subject requests (DSARs)

GDPR gives individuals the right to request their data (Subject Access Request), correct it, or have it deleted (Right to Erasure, also called the "right to be forgotten"). You need to be able to fulfil these requests within one month of receipt; the deadline can be extended by up to two further months for complex or numerous requests.

That means you need to be able to:

  • Export all data you hold about a specific person
  • Delete or anonymise it on request (within legal retention limits)

An example deletion query pattern:

-- Anonymise rather than delete (preserves referential integrity).
-- :user_id is a parameter supplied by the caller.
BEGIN;

UPDATE users
SET
  email = CONCAT('deleted-', id, '@anon.invalid'),
  name = 'Deleted User',
  phone = NULL,
  ip_address = NULL,
  deleted_at = NOW()
WHERE id = :user_id;

-- Also clean related tables in the same transaction,
-- so a failure cannot leave the request half-applied
UPDATE orders SET customer_email = NULL WHERE user_id = :user_id;
UPDATE audit_logs SET ip_address = NULL WHERE user_id = :user_id;

COMMIT;

Anonymisation is usually safer than hard deletion because foreign key constraints often make hard deletes messy. The key is that the person is no longer identifiable from any remaining column.
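
The export half of a request can be handled in a similar way. Below is a minimal sketch that gathers one person's rows into a single JSON document. It reuses the users, orders, and audit_logs tables from the deletion example and assumes the related tables carry a user_id column; a real schema will have more tables to include.

```sql
-- Sketch of a subject access export; table and column names follow the
-- deletion example above and are assumptions about your schema.
SELECT jsonb_build_object(
  'user', to_jsonb(u),
  'orders', COALESCE(
    (SELECT jsonb_agg(to_jsonb(o)) FROM orders o WHERE o.user_id = u.id),
    '[]'::jsonb),
  'audit_logs', COALESCE(
    (SELECT jsonb_agg(to_jsonb(a)) FROM audit_logs a WHERE a.user_id = u.id),
    '[]'::jsonb)
) AS subject_export
FROM users u
WHERE u.id = :user_id;
```

The COALESCE calls keep the output shape stable when a user has no orders or log entries, which makes the export easier to hand to a downstream formatter.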

Data retention limits

GDPR requires that you keep personal data no longer than necessary. Define retention periods per data type:

-- Anonymise inactive users' personal data after 3 years
UPDATE users
SET email = CONCAT('expired-', id, '@anon.invalid'),
    name = 'Expired User',
    phone = NULL,
    ip_address = NULL,
    deleted_at = NOW()  -- marks the row as processed so later runs skip it
WHERE last_login_at < NOW() - INTERVAL '3 years'
  AND deleted_at IS NULL;
-- Note: users who never logged in (last_login_at IS NULL) are not matched here

Run this as a scheduled job, not a one-off. Retention must be ongoing.
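
One way to schedule the retention query inside PostgreSQL itself is the pg_cron extension. This is a sketch under the assumption that pg_cron is installed and listed in shared_preload_libraries; the job name is hypothetical, and an external scheduler would work equally well.

```sql
-- Assumes the pg_cron extension is available on this server.
CREATE EXTENSION IF NOT EXISTS pg_cron;

-- Run the retention update every day at 03:00 (in pg_cron's configured time zone)
SELECT cron.schedule(
  'anonymise-inactive-users',   -- hypothetical job name
  '0 3 * * *',
  $$
  UPDATE users
  SET email = CONCAT('expired-', id, '@anon.invalid'),
      name = 'Expired User',
      phone = NULL,
      deleted_at = NOW()        -- keeps later runs from touching the same rows
  WHERE last_login_at < NOW() - INTERVAL '3 years'
    AND deleted_at IS NULL
  $$
);

-- Confirm the job is registered
SELECT jobid, jobname, schedule, active FROM cron.job;
```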

Encryption and access controls

Personal data should be encrypted at rest and in transit. At the database level, this typically means:

  • TLS enforced for all connections
  • Disk-level or tablespace encryption for at-rest data
  • Column-level encryption for particularly sensitive fields (medical data, payment info)

Access controls matter too: not everyone on the team should be able to run SELECT * FROM users. Use database roles to limit who can query what.
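
The points above can be sketched in PostgreSQL as follows. Treat this as an illustration under assumptions: support_readonly and phone_encrypted are hypothetical names, the phone column is assumed to be text, and the encryption key is passed in as a psql variable so that it is not stored in the database.

```sql
-- 1. List current network connections that are NOT using TLS
SELECT a.usename, a.client_addr, s.ssl
FROM pg_stat_ssl s
JOIN pg_stat_activity a USING (pid)
WHERE a.client_addr IS NOT NULL   -- skip local sockets and background processes
  AND NOT s.ssl;

-- 2. Column-level privileges: this role can read only non-sensitive columns,
--    so SELECT * FROM users fails for it.
CREATE ROLE support_readonly;                                -- hypothetical role
GRANT SELECT (id, deleted_at) ON users TO support_readonly;

-- 3. Column-level encryption with the pgcrypto extension
CREATE EXTENSION IF NOT EXISTS pgcrypto;
ALTER TABLE users ADD COLUMN phone_encrypted bytea;          -- hypothetical column
UPDATE users
SET phone_encrypted = pgp_sym_encrypt(phone, :'encryption_key')
WHERE phone IS NOT NULL;
```

Reading the value back uses pgp_sym_decrypt(phone_encrypted, :'encryption_key'), so only sessions that hold the key can see the plaintext.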

SOC 2: What Auditors Will Look For in Your Database

SOC 2 audits cover five Trust Services Criteria: Security, Availability, Processing Integrity, Confidentiality, and Privacy. Security is the only mandatory criterion; most small teams start with just Security and Availability.

At the database level, auditors focus on:

Access control and authentication

Who can connect to your database, and how? Auditors want to see:

  • No shared credentials (every team member and service has a unique user)
  • Multi-factor authentication on administrative access
  • Principle of least privilege (read-only users can't write; application users can't drop tables)
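
-- PostgreSQL: create a read-only analytics role (a group role that cannot log in)
CREATE ROLE analytics_readonly;
GRANT CONNECT ON DATABASE mydb TO analytics_readonly;
GRANT USAGE ON SCHEMA public TO analytics_readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO analytics_readonly;

-- The grant above only covers tables that already exist; this extends it
-- to tables created later by the role that runs this statement
ALTER DEFAULT PRIVILEGES IN SCHEMA public
  GRANT SELECT ON TABLES TO analytics_readonly;

-- Apply to a specific user (replace the placeholder password with a generated secret)
CREATE USER priya WITH PASSWORD 'strongpassword';
GRANT analytics_readonly TO priya;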
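
To check that least privilege actually holds, it can help to list which roles hold write privileges on which tables. The query below is a minimal sketch using the standard information_schema views; run it as a superuser or table owner, since other roles only see grants that involve them.

```sql
-- Roles with write access to tables in the public schema
SELECT grantee,
       table_name,
       string_agg(privilege_type, ', ' ORDER BY privilege_type) AS write_privileges
FROM information_schema.table_privileges
WHERE table_schema = 'public'
  AND privilege_type IN ('INSERT', 'UPDATE', 'DELETE', 'TRUNCATE')
GROUP BY grantee, table_name
ORDER BY grantee, table_name;
```

Any read-only role that appears in this output has more access than intended.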

Audit logging

SOC 2 requires evidence that you know who did what and when. Your database should log:

  • Successful and failed login attempts
  • Schema changes (DDL events)
  • Privileged operations

In PostgreSQL, enable the pgAudit extension (pgaudit) for structured audit logging. In MySQL, use the audit log plugin. Cloud databases (RDS, Cloud SQL, AlloyDB) have managed audit logging you can enable with a config change.
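
On a self-hosted PostgreSQL server, the logging items above could be switched on roughly like this. It is a sketch that assumes pgaudit is already installed and listed in shared_preload_libraries (which needs a restart); managed services usually expose the same settings through their own parameter configuration instead of ALTER SYSTEM.

```sql
-- Assumes pgaudit is installed and preloaded on this server
CREATE EXTENSION IF NOT EXISTS pgaudit;

-- Log schema changes (ddl) and role or privilege changes (role)
ALTER SYSTEM SET pgaudit.log = 'ddl, role';

-- Log connections; failed logins show up in the server log as FATAL entries
ALTER SYSTEM SET log_connections = on;
ALTER SYSTEM SET log_disconnections = on;

-- Ask the server to reload its configuration
SELECT pg_reload_conf();

-- In a new session, confirm the setting took effect:
-- SHOW pgaudit.log;
```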

Change management

Every schema migration should be tracked: who approved it, when it ran, what it changed. Use a migration tool like Flyway, Liquibase, or Prisma Migrate. Keep the migration history in version control. Auditors will ask for this.
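
Migration tools record the changes they run themselves. To also catch schema changes made outside the tool, one option is a PostgreSQL event trigger that writes every DDL command to a log table. This is a sketch, not part of any of the tools named above: ddl_change_log, log_ddl_change, and track_ddl_changes are hypothetical names, and creating event triggers requires superuser-level privileges.

```sql
-- Hypothetical log table for schema changes
CREATE TABLE IF NOT EXISTS ddl_change_log (
  id              bigserial PRIMARY KEY,
  executed_at     timestamptz NOT NULL DEFAULT now(),
  executed_by     text NOT NULL DEFAULT session_user,
  command_tag     text NOT NULL,   -- e.g. 'CREATE TABLE', 'ALTER TABLE'
  object_identity text             -- e.g. 'public.users'
);

-- Event trigger function: one log row per DDL command in the statement
CREATE OR REPLACE FUNCTION log_ddl_change() RETURNS event_trigger
LANGUAGE plpgsql AS $$
DECLARE
  cmd record;
BEGIN
  FOR cmd IN SELECT * FROM pg_event_trigger_ddl_commands() LOOP
    INSERT INTO ddl_change_log (command_tag, object_identity)
    VALUES (cmd.command_tag, cmd.object_identity);
  END LOOP;
END;
$$;

-- Fires after CREATE/ALTER commands complete.
-- DROP commands need a separate trigger on the sql_drop event.
CREATE EVENT TRIGGER track_ddl_changes
  ON ddl_command_end
  EXECUTE FUNCTION log_ddl_change();
```

Comparing this log against the migration history highlights any change that bypassed the approved process.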

Backup and recovery

You need documented, tested backup procedures. "We have automated backups" isn't enough; auditors want to see that you've tested restoration. Document:

  • Backup frequency and retention period
  • Last tested restoration date and result
  • Recovery time objective (RTO) and recovery point objective (RPO)
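
The items above can live in a wiki page, but keeping them in a table makes the evidence easy to query. A minimal sketch, assuming a hypothetical backup_restore_tests table that someone fills in after each restore drill:

```sql
-- Hypothetical table for recording restore drills; all names are illustrative.
CREATE TABLE IF NOT EXISTS backup_restore_tests (
  id               bigserial PRIMARY KEY,
  tested_at        timestamptz NOT NULL DEFAULT now(),
  tested_by        text NOT NULL DEFAULT session_user,
  backup_taken_at  timestamptz NOT NULL,
  restore_duration interval NOT NULL,  -- compare against your RTO
  succeeded        boolean NOT NULL,
  notes            text
);

-- Most recent drill and its outcome
SELECT tested_at, tested_by, succeeded, restore_duration
FROM backup_restore_tests
ORDER BY tested_at DESC
LIMIT 1;
```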

HIPAA: Database Requirements for Healthcare Data

HIPAA has two main rules relevant to databases: the Privacy Rule and the Security Rule.

The Security Rule has specific technical safeguards:

Encryption: PHI should be encrypted in transit (TLS 1.2 or later is the usual baseline) and at rest. The Security Rule classes encryption as an "addressable" safeguard, meaning you either implement it or document an equivalent alternative, so in practice treat it as required. For cloud databases, this is usually a checkbox. For self-hosted, you need to verify it's actually enabled.

Access controls: Each person accessing PHI needs a unique identifier. No shared logins. Role-based access must be documented and reviewed regularly.

Audit controls: You must have hardware, software, or procedural mechanisms to examine activity in systems containing PHI. This is the same audit logging requirement as SOC 2, but with legal teeth.

Automatic logoff: Application sessions must time out. This is usually an application-layer concern, but if your team uses database GUI tools (like TablePlus or DBeaver) to access PHI directly, those sessions need to time out too.
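
For direct database sessions, PostgreSQL can enforce idle timeouts on the server side regardless of which GUI tool is connected. A minimal sketch, assuming PostgreSQL 14 or later (idle_session_timeout does not exist in earlier versions); the role name follows the earlier example and the timeout values are placeholders to adjust to your policy.

```sql
-- Disconnect sessions for this role after 15 minutes of inactivity
ALTER ROLE priya SET idle_session_timeout = '15min';

-- Also end sessions that sit idle inside an open transaction
ALTER ROLE priya SET idle_in_transaction_session_timeout = '5min';

-- The settings apply to new sessions; review them with:
SELECT rolname, rolconfig
FROM pg_roles
WHERE rolname = 'priya';
```

These are ordinary session settings, so a user can still override them within a session; treat them as a default rather than a hard control.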

Data backup and disaster recovery: Documented backup procedures with tested restoration. Same as SOC 2 requirements.

A critical HIPAA point: every vendor who has access to PHI needs a signed Business Associate Agreement (BAA). If you're using a managed database service (AWS RDS, Google Cloud SQL, Azure Database), check that your cloud provider will sign a BAA. Most major providers will. If you're using AI for Database to query a database containing PHI, the same applies: any tool that touches the data needs a BAA in place.

Monitoring Compliance in Your Database

Compliance isn't a one-time project. It requires ongoing monitoring:

  • Access reviews: Who has database access? Run this monthly and revoke stale credentials.
  • Data inventory checks: Has new personal data been added to tables without going through your data map process?
  • Retention enforcement: Is your automated deletion job actually running?
  • Failed login monitoring: Spikes in failed logins can indicate a brute-force attempt.

A practical query to review database users in PostgreSQL:

SELECT
  rolname AS username,
  rolsuper AS is_superuser,
  rolcreatedb AS can_create_db,
  rolcreaterole AS can_create_roles,
  rolcanlogin AS can_login,
  rolvaliduntil AS password_valid_until
FROM pg_catalog.pg_roles
WHERE rolcanlogin = true
ORDER BY rolname;

Run this, compare it to your expected user list, and immediately revoke access for anyone who shouldn't be there.
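
The retention check from the list above can be automated in the same spirit. This sketch assumes the users table and the three-year period from the retention example, and it recognises already-processed rows by the 'expired-' email prefix that the retention query writes.

```sql
-- Users who are past the retention period but have not been anonymised.
-- A non-zero count suggests the retention job is failing or not running.
SELECT count(*)           AS overdue_users,
       min(last_login_at) AS oldest_last_login
FROM users
WHERE last_login_at < NOW() - INTERVAL '3 years'
  AND deleted_at IS NULL
  AND email NOT LIKE 'expired-%';
```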

With AI for Database, you can build this kind of compliance monitoring into a dashboard. Connect your database, set up queries that check for access anomalies or retention violations, and schedule them to run automatically every week. You'll get an alert if something unexpected shows up, without needing to remember to check.

Building a Minimum Viable Compliance Programme

If you're starting from zero, here's a practical sequence:

  1. Data map: Document what personal data you store, where, and why. A spreadsheet is fine to start.
  2. Access audit: Remove all shared credentials. Give everyone a unique login. Restrict to least privilege.
  3. Enable audit logging: Turn on whatever your database or cloud provider offers. You need a record of who touched what.
  4. Implement retention: Write the anonymisation queries. Schedule them to run automatically.
  5. Test backup restoration: Actually run through restoring from backup. Document it.
  6. Handle DSARs: Build a simple internal process to export and delete user data within one month of a request.

This isn't everything: a real compliance programme involves policies, training, incident response plans, and more. But these six steps handle the majority of the database-level requirements across GDPR, SOC 2, and HIPAA, and they're all achievable by a small team without a dedicated security hire.

The Bottom Line

Database compliance feels overwhelming because the regulatory language is dense and the requirements feel abstract. But most of what GDPR, SOC 2, and HIPAA actually require at the database level is: know what you have, control who can see it, log what's done to it, and delete it when you no longer need it.

You don't need an enterprise compliance team. You need good database hygiene and the discipline to maintain it. Start with a data map and an access audit this week. The rest follows from there.

If ongoing monitoring is the hard part (remembering to check things, running audits on a schedule, alerting when something looks off), AI for Database can help. Connect your database, set up monitoring queries, and schedule them to run automatically. Free to try at aifordatabase.com.
