Data Model

Version 1.0 — Initial schema PostgreSQL 16 April 2026 10 tables · 3 data stores

Overview

Groundwork is a shared reality platform that brings transparency to residential renovation. The data model supports three primary actors — homeowners, contractors, and system automations — across a project lifecycle from contract signing through final payment. This document is the authoritative implementation reference for the PostgreSQL schema.

Design Principles

  • All primary keys are UUID (v4 via gen_random_uuid()) — avoids enumeration attacks and enables client-side ID generation
  • All timestamps are TIMESTAMPTZ stored in UTC — clients localize on display
  • Soft state via status enums — records are never hard-deleted by application code
  • JSONB for computed/AI analysis fields — avoids premature column normalization on evolving AI outputs
  • Row Level Security enabled on all tables — enforced at the database layer, not the application layer

Naming Conventions

  • Tables: lowercase snake_case, plural nouns
  • Columns: lowercase snake_case
  • Foreign keys: {referenced_table_singular}_id
  • Indexes: idx_{table}_{columns}
  • Constraints: chk_{table}_{rule}
  • RLS policies: pol_{table}_{actor}_{operation}
  • Enums: {domain}_{purpose} (e.g., project_status)

Entity Relationship Diagram

All ten entities and their foreign key relationships. Crow's foot notation: || = exactly one, |< = one-to-many, >< = many-to-many.

Core entity Financial Audit / Log AI / Scores users PK id: uuid email: text name: text role: enum contractor_profiles PK id: uuid FK user_id: uuid company_name: text license_number: text license_verified: bool avg_co_pct: numeric projects PK id: uuid FK homeowner_id: uuid FK contractor_id: uuid? name: text address / city / state status: enum current_total: numeric milestones PK id: uuid FK project_id: uuid name: text sequence_order: int status: enum FK depends_on_id: uuid? planned_start/end change_orders PK id: uuid FK project_id: uuid FK submitted_by: uuid amount: numeric type: enum status: enum analysis: jsonb payments PK id: uuid FK project_id: uuid FK milestone_id: uuid? amount: numeric status: enum due_date / paid_date documents PK id: uuid FK project_id: uuid FK uploaded_by: uuid file_key: text category: enum file_size_bytes: int contract_health_scores PK id: uuid FK project_id: uuid overall_score: int payment_sched: int timeline_spec: int ... 7 more scores analysis: jsonb activity_feed PK id: uuid FK project_id: uuid FK actor_id: uuid? event_type: enum metadata: jsonb audit_log PK id: uuid FK user_id: uuid? entity_type: text old/new: jsonb ip_address: inet 1:1 1:N 0:N 1:N self-ref 1:N 1:N 0:N 1:N 1:N 1:N
Reading the diagram: Solid lines indicate required foreign keys. Dashed lines indicate nullable (optional) foreign keys. Self-referencing arrows indicate recursive relationships. All entities link back to projects as the primary aggregate root, with users as the authentication and actor anchor.

Entity Reference

Each entity's columns, data types, constraints, and notes. PK = primary key, FK = foreign key, IX = indexed, NN = NOT NULL.

users

Authentication & actor identity
ColumnTypeConstraintsNotes
PK idUUIDDEFAULT gen_random_uuid()Immutable. Never reused.
emailTEXTNOT NULL, UNIQUEPII. Stored lowercase. Used for auth.
nameTEXTNOT NULLPII. Display name.
roleuser_roleNOT NULL, DEFAULT 'homeowner'Enum: homeowner / contractor / admin
password_hashTEXTNOT NULLbcrypt or Argon2id. Never expose via API.
IX created_atTIMESTAMPTZNOT NULL, DEFAULT now()UTC. Immutable after insert.
updated_atTIMESTAMPTZNOT NULL, DEFAULT now()Updated via trigger.

contractor_profiles

Public contractor data & reputation metrics
ColumnTypeConstraintsNotes
PK idUUIDDEFAULT gen_random_uuid()
FK IX user_idUUIDNOT NULL, UNIQUE, REFERENCES users(id) ON DELETE CASCADE1:1 with users. Cascade cleans profile on user delete.
company_nameTEXTNOT NULL
license_numberTEXTPII. Encrypted at rest via pgcrypto.
license_stateCHAR(2)ISO 3166-2 state code.
license_verifiedBOOLEANNOT NULL, DEFAULT falseSet by admin/verification service.
insurance_verifiedBOOLEANNOT NULL, DEFAULT false
years_in_businessSMALLINTCHECK (years_in_business >= 0)
avg_timeline_variance_pctNUMERIC(6,2)Computed nightly. Positive = over schedule.
avg_change_order_pctNUMERIC(6,2)Avg CO amount / original contract value.
total_projects_on_platformINTEGERNOT NULL, DEFAULT 0Denormalized counter. Updated via trigger.
created_atTIMESTAMPTZNOT NULL, DEFAULT now()
updated_atTIMESTAMPTZNOT NULL, DEFAULT now()

projects

Primary aggregate root — the renovation job
ColumnTypeConstraintsNotes
PK idUUIDDEFAULT gen_random_uuid()
FK IX homeowner_idUUIDNOT NULL, REFERENCES users(id) ON DELETE RESTRICTRESTRICT prevents orphaned projects.
FK IX contractor_idUUIDREFERENCES users(id) ON DELETE SET NULLNullable. NULL = contractor not yet assigned.
nameTEXTNOT NULLe.g. "Kitchen Remodel 2026"
addressTEXTNOT NULLPII. Street address of property.
cityTEXTNOT NULL
IX stateCHAR(2)NOT NULLISO 3166-2. Indexed for regional analytics.
zipCHAR(10)NOT NULLZIP or ZIP+4 format.
original_contract_amountNUMERIC(12,2)NOT NULL, CHECK > 0Immutable after project activation.
current_totalNUMERIC(12,2)NOT NULLDenormalized: original + approved COs.
start_dateDATE
expected_end_dateDATE
actual_end_dateDATESet when status transitions to completed.
IX statusproject_statusNOT NULL, DEFAULT 'setup'Enum: setup/active/paused/completed/disputed
created_atTIMESTAMPTZNOT NULL, DEFAULT now()
updated_atTIMESTAMPTZNOT NULL, DEFAULT now()

milestones

Ordered work phases with dependency graph
ColumnTypeConstraintsNotes
PK idUUIDDEFAULT gen_random_uuid()
FK IX project_idUUIDNOT NULL, REFERENCES projects(id) ON DELETE CASCADE
nameTEXTNOT NULLe.g. "Demo", "Framing", "Drywall"
descriptionTEXT
sequence_orderSMALLINTNOT NULLDisplay order within project. UNIQUE per project.
planned_startDATE
planned_endDATE
actual_startDATE
actual_endDATE
IX statusmilestone_statusNOT NULL, DEFAULT 'upcoming'Enum: upcoming/in_progress/completed/blocked
FK depends_on_milestone_idUUIDREFERENCES milestones(id) ON DELETE SET NULLNullable. Self-referential DAG for sequencing.
created_atTIMESTAMPTZNOT NULL, DEFAULT now()
updated_atTIMESTAMPTZNOT NULL, DEFAULT now()

change_orders

Scope and cost modifications with AI analysis
ColumnTypeConstraintsNotes
PK idUUIDDEFAULT gen_random_uuid()
FK IX project_idUUIDNOT NULL, REFERENCES projects(id) ON DELETE CASCADE
FK submitted_byUUIDNOT NULL, REFERENCES users(id) ON DELETE RESTRICTAudit trail must be preserved.
amountNUMERIC(12,2)NOT NULLCan be negative (credit COs).
markup_percentageNUMERIC(5,2)DEFAULT 0Contractor markup over cost. E.g. 15.00 = 15%.
descriptionTEXTNOT NULL
typechange_order_typeNOT NULLEnum: unforeseen/homeowner_request/code_requirement
IX statuschange_order_statusNOT NULL, DEFAULT 'pending'Enum: pending/approved/declined/modified
groundwork_analysisJSONBAI output: reasonableness_score (0-100), type_classification, comparable_market_rates, flags[]
signed_atTIMESTAMPTZSet when homeowner approves.
created_atTIMESTAMPTZNOT NULL, DEFAULT now()
updated_atTIMESTAMPTZNOT NULL, DEFAULT now()

payments

Payment schedule tied to project milestones
ColumnTypeConstraintsNotes
PK idUUIDDEFAULT gen_random_uuid()
FK IX project_idUUIDNOT NULL, REFERENCES projects(id) ON DELETE CASCADE
FK IX milestone_idUUIDREFERENCES milestones(id) ON DELETE SET NULLNullable. Not all payments are milestone-gated.
amountNUMERIC(12,2)NOT NULL, CHECK > 0
IX statuspayment_statusNOT NULL, DEFAULT 'scheduled'Enum: scheduled/due/paid/overdue
due_dateDATENOT NULL
paid_dateDATESet when payment is confirmed.
created_atTIMESTAMPTZNOT NULL, DEFAULT now()No updated_at — payments are append-only.

documents

File metadata — actual bytes live in Cloudflare R2
ColumnTypeConstraintsNotes
PK idUUIDDEFAULT gen_random_uuid()
FK IX project_idUUIDNOT NULL, REFERENCES projects(id) ON DELETE CASCADE
FK uploaded_byUUIDNOT NULL, REFERENCES users(id) ON DELETE RESTRICT
nameTEXTNOT NULLUser-facing filename.
file_keyTEXTNOT NULL, UNIQUER2 object key. Format: {project_id}/{uuid}/{filename}
file_typeTEXTNOT NULLMIME type. e.g. application/pdf
file_size_bytesBIGINTNOT NULL, CHECK > 0
IX categorydocument_categoryNOT NULLEnum: contract/permit/inspection/change_order/insurance/license
created_atTIMESTAMPTZNOT NULL, DEFAULT now()Immutable. Documents are never updated, only replaced.

contract_health_scores

AI-computed contract analysis — append-only history
ColumnTypeConstraintsNotes
PK idUUIDDEFAULT gen_random_uuid()
FK IX project_idUUIDNOT NULL, REFERENCES projects(id) ON DELETE CASCADEMultiple scores per project as contract is revised.
overall_scoreSMALLINTNOT NULL, CHECK 0–100
payment_schedule_scoreSMALLINTNOT NULL, CHECK 0–100
timeline_specificity_scoreSMALLINTNOT NULL, CHECK 0–100
change_order_clause_scoreSMALLINTNOT NULL, CHECK 0–100
dispute_resolution_scoreSMALLINTNOT NULL, CHECK 0–100
completion_definition_scoreSMALLINTNOT NULL, CHECK 0–100
warranty_scoreSMALLINTNOT NULL, CHECK 0–100
insurance_licensing_scoreSMALLINTNOT NULL, CHECK 0–100
lien_waiver_scoreSMALLINTNOT NULL, CHECK 0–100
cancellation_terms_scoreSMALLINTNOT NULL, CHECK 0–100
analysis_jsonJSONBNOT NULLFull AI output: per-clause findings, recommendations[], severity[]
created_atTIMESTAMPTZNOT NULL, DEFAULT now()Append-only. Use latest row for current score.

activity_feed

Immutable event stream for the project timeline
ColumnTypeConstraintsNotes
PK idUUIDDEFAULT gen_random_uuid()
FK IX project_idUUIDNOT NULL, REFERENCES projects(id) ON DELETE CASCADE
FK actor_idUUIDREFERENCES users(id) ON DELETE SET NULLNullable — system events have no actor.
sourceactivity_sourceNOT NULLEnum: contractor/system/city/homeowner
IX event_typeactivity_event_typeNOT NULLEnum: milestone_update/change_order/delivery/inspection/payment/note
descriptionTEXTNOT NULLHuman-readable event summary.
metadataJSONBEvent-specific data. Keys vary by event_type. E.g. for payment: {amount, payment_id}
IX created_atTIMESTAMPTZNOT NULL, DEFAULT now()Immutable. The feed is append-only.

audit_log

Compliance trail — every write to every table
ColumnTypeConstraintsNotes
PK idUUIDDEFAULT gen_random_uuid()
FK IX user_idUUIDREFERENCES users(id) ON DELETE SET NULLNullable — SET NULL preserves the audit trail if user is deleted.
actionTEXTNOT NULLINSERT / UPDATE / DELETE
IX entity_typeTEXTNOT NULLTable name. e.g. 'projects', 'change_orders'
IX entity_idUUIDNOT NULLPK of the modified row.
old_valueJSONBNULL on INSERT.
new_valueJSONBNULL on DELETE.
ip_addressINETPII. IPv4 or IPv6.
IX created_atTIMESTAMPTZNOT NULL, DEFAULT now()Immutable. Never update or delete audit rows.

CREATE TABLE Statements

Full PostgreSQL 16 DDL. Run in order — foreign key references require earlier tables to exist. All tables are created in the public schema and should be moved to a dedicated schema (e.g. app) in production.

000_enums.sql SQL
-- ─────────────────────────────────────────────────────────────
-- Enum Types
-- Create before tables. ALTER TYPE ... ADD VALUE is safe for
-- adding new members without a migration lock.
-- ─────────────────────────────────────────────────────────────

CREATE TYPE user_role AS ENUM (
  'homeowner',
  'contractor',
  'admin'
);

CREATE TYPE project_status AS ENUM (
  'setup',      -- contract not yet signed
  'active',     -- work underway
  'paused',     -- work temporarily halted
  'completed',  -- final walkthrough done
  'disputed'    -- active dispute in progress
);

CREATE TYPE milestone_status AS ENUM (
  'upcoming',
  'in_progress',
  'completed',
  'blocked'
);

CREATE TYPE change_order_type AS ENUM (
  'unforeseen',        -- hidden condition discovered during work
  'homeowner_request', -- scope addition requested by owner
  'code_requirement'   -- mandated by inspection / permit
);

CREATE TYPE change_order_status AS ENUM (
  'pending',
  'approved',
  'declined',
  'modified'  -- counter-offered by homeowner
);

CREATE TYPE payment_status AS ENUM (
  'scheduled',
  'due',
  'paid',
  'overdue'
);

CREATE TYPE document_category AS ENUM (
  'contract',
  'permit',
  'inspection',
  'change_order',
  'insurance',
  'license'
);

CREATE TYPE activity_source AS ENUM (
  'contractor',
  'system',
  'city',
  'homeowner'
);

CREATE TYPE activity_event_type AS ENUM (
  'milestone_update',
  'change_order',
  'delivery',
  'inspection',
  'payment',
  'note'
);
001_users.sql SQL
CREATE TABLE users (
  id            UUID        PRIMARY KEY DEFAULT gen_random_uuid(),
  email         TEXT        NOT NULL UNIQUE
                              CHECK (lower(email) = email),
  name          TEXT        NOT NULL CHECK (length(name) >= 1),
  role          user_role   NOT NULL DEFAULT 'homeowner',
  password_hash TEXT        NOT NULL,
  created_at    TIMESTAMPTZ NOT NULL DEFAULT now(),
  updated_at    TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- Enforce lowercase email at DB level to prevent duplicate accounts
CREATE UNIQUE INDEX idx_users_email_lower
  ON users (lower(email));

-- Auto-update updated_at on any row change
CREATE OR REPLACE FUNCTION set_updated_at()
RETURNS TRIGGER LANGUAGE plpgsql AS $$
BEGIN
  NEW.updated_at = now();
  RETURN NEW;
END;
$$;

CREATE TRIGGER trg_users_updated_at
  BEFORE UPDATE ON users
  FOR EACH ROW EXECUTE FUNCTION set_updated_at();
002_contractor_profiles.sql SQL
CREATE TABLE contractor_profiles (
  id                          UUID        PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id                     UUID        NOT NULL UNIQUE
                                            REFERENCES users(id)
                                            ON DELETE CASCADE,
  company_name                TEXT        NOT NULL,
  license_number              TEXT,        -- encrypted via pgcrypto in application layer
  license_state               CHAR(2),
  license_verified            BOOLEAN     NOT NULL DEFAULT FALSE,
  insurance_verified          BOOLEAN     NOT NULL DEFAULT FALSE,
  years_in_business           SMALLINT    CHECK (years_in_business >= 0),
  avg_timeline_variance_pct   NUMERIC(6,2), -- positive = over schedule, negative = early
  avg_change_order_pct        NUMERIC(6,2),
  total_projects_on_platform  INTEGER     NOT NULL DEFAULT 0
                                            CHECK (total_projects_on_platform >= 0),
  created_at                  TIMESTAMPTZ NOT NULL DEFAULT now(),
  updated_at                  TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE INDEX idx_contractor_profiles_user_id
  ON contractor_profiles (user_id);

CREATE INDEX idx_contractor_profiles_verified
  ON contractor_profiles (license_verified, insurance_verified)
  WHERE license_verified = TRUE;   -- partial index: only verified contractors

CREATE TRIGGER trg_contractor_profiles_updated_at
  BEFORE UPDATE ON contractor_profiles
  FOR EACH ROW EXECUTE FUNCTION set_updated_at();
003_projects.sql SQL
CREATE TABLE projects (
  id                       UUID           PRIMARY KEY DEFAULT gen_random_uuid(),
  homeowner_id             UUID           NOT NULL
                                           REFERENCES users(id)
                                           ON DELETE RESTRICT,
  contractor_id            UUID           REFERENCES users(id)
                                           ON DELETE SET NULL,
  name                     TEXT           NOT NULL,
  address                  TEXT           NOT NULL,
  city                     TEXT           NOT NULL,
  state                    CHAR(2)        NOT NULL,
  zip                      CHAR(10)       NOT NULL,
  original_contract_amount NUMERIC(12,2)  NOT NULL
                                           CHECK (original_contract_amount > 0),
  current_total            NUMERIC(12,2)  NOT NULL,
  start_date               DATE,
  expected_end_date        DATE,
  actual_end_date          DATE,
  status                   project_status NOT NULL DEFAULT 'setup',
  created_at               TIMESTAMPTZ    NOT NULL DEFAULT now(),
  updated_at               TIMESTAMPTZ    NOT NULL DEFAULT now(),

  CONSTRAINT chk_projects_end_after_start
    CHECK (expected_end_date IS NULL OR start_date IS NULL
           OR expected_end_date >= start_date),
  CONSTRAINT chk_projects_current_gte_original
    CHECK (current_total >= 0)
);

CREATE INDEX idx_projects_homeowner_id   ON projects (homeowner_id);
CREATE INDEX idx_projects_contractor_id  ON projects (contractor_id)
  WHERE contractor_id IS NOT NULL;
CREATE INDEX idx_projects_status        ON projects (status);
CREATE INDEX idx_projects_state_status  ON projects (state, status);

CREATE TRIGGER trg_projects_updated_at
  BEFORE UPDATE ON projects
  FOR EACH ROW EXECUTE FUNCTION set_updated_at();
004_milestones.sql SQL
CREATE TABLE milestones (
  id                      UUID             PRIMARY KEY DEFAULT gen_random_uuid(),
  project_id              UUID             NOT NULL
                                           REFERENCES projects(id)
                                           ON DELETE CASCADE,
  name                    TEXT             NOT NULL,
  description             TEXT,
  sequence_order          SMALLINT         NOT NULL,
  planned_start           DATE,
  planned_end             DATE,
  actual_start            DATE,
  actual_end              DATE,
  status                  milestone_status NOT NULL DEFAULT 'upcoming',
  depends_on_milestone_id UUID             REFERENCES milestones(id)
                                           ON DELETE SET NULL,
  created_at              TIMESTAMPTZ      NOT NULL DEFAULT now(),
  updated_at              TIMESTAMPTZ      NOT NULL DEFAULT now(),

  CONSTRAINT chk_milestones_no_self_dep
    CHECK (depends_on_milestone_id <> id),
  CONSTRAINT uq_milestones_project_order
    UNIQUE (project_id, sequence_order)
);

CREATE INDEX idx_milestones_project_id       ON milestones (project_id);
CREATE INDEX idx_milestones_project_status   ON milestones (project_id, status);
CREATE INDEX idx_milestones_depends_on       ON milestones (depends_on_milestone_id)
  WHERE depends_on_milestone_id IS NOT NULL;

CREATE TRIGGER trg_milestones_updated_at
  BEFORE UPDATE ON milestones
  FOR EACH ROW EXECUTE FUNCTION set_updated_at();
005_change_orders.sql SQL
CREATE TABLE change_orders (
  id                   UUID                PRIMARY KEY DEFAULT gen_random_uuid(),
  project_id           UUID                NOT NULL
                                           REFERENCES projects(id)
                                           ON DELETE CASCADE,
  submitted_by         UUID                NOT NULL
                                           REFERENCES users(id)
                                           ON DELETE RESTRICT,
  amount               NUMERIC(12,2)       NOT NULL,
  markup_percentage    NUMERIC(5,2)        NOT NULL DEFAULT 0
                                           CHECK (markup_percentage >= 0),
  description          TEXT                NOT NULL,
  type                 change_order_type   NOT NULL,
  status               change_order_status NOT NULL DEFAULT 'pending',
  groundwork_analysis  JSONB,
  signed_at            TIMESTAMPTZ,
  created_at           TIMESTAMPTZ         NOT NULL DEFAULT now(),
  updated_at           TIMESTAMPTZ         NOT NULL DEFAULT now(),

  CONSTRAINT chk_co_signed_only_when_approved
    CHECK (signed_at IS NULL OR status = 'approved')
);

CREATE INDEX idx_change_orders_project_id     ON change_orders (project_id);
CREATE INDEX idx_change_orders_project_status ON change_orders (project_id, status);
CREATE INDEX idx_change_orders_submitted_by   ON change_orders (submitted_by);

-- GIN index for JSONB analysis querying
CREATE INDEX idx_change_orders_analysis_gin
  ON change_orders USING GIN (groundwork_analysis)
  WHERE groundwork_analysis IS NOT NULL;

CREATE TRIGGER trg_change_orders_updated_at
  BEFORE UPDATE ON change_orders
  FOR EACH ROW EXECUTE FUNCTION set_updated_at();
006_payments.sql SQL
CREATE TABLE payments (
  id           UUID           PRIMARY KEY DEFAULT gen_random_uuid(),
  project_id   UUID           NOT NULL
                               REFERENCES projects(id)
                               ON DELETE CASCADE,
  milestone_id UUID           REFERENCES milestones(id)
                               ON DELETE SET NULL,
  amount       NUMERIC(12,2)  NOT NULL CHECK (amount > 0),
  status       payment_status NOT NULL DEFAULT 'scheduled',
  due_date     DATE           NOT NULL,
  paid_date    DATE,
  created_at   TIMESTAMPTZ    NOT NULL DEFAULT now(),

  CONSTRAINT chk_payments_paid_date_requires_paid_status
    CHECK (paid_date IS NULL OR status = 'paid')
);

CREATE INDEX idx_payments_project_id        ON payments (project_id);
CREATE INDEX idx_payments_milestone_id      ON payments (milestone_id)
  WHERE milestone_id IS NOT NULL;
CREATE INDEX idx_payments_project_status    ON payments (project_id, status);
-- Partial index: only overdue/due payments — most common dashboard query
CREATE INDEX idx_payments_due_overdue       ON payments (due_date, project_id)
  WHERE status IN ('due', 'overdue');
007_documents.sql SQL
CREATE TABLE documents (
  id              UUID              PRIMARY KEY DEFAULT gen_random_uuid(),
  project_id      UUID              NOT NULL
                                    REFERENCES projects(id)
                                    ON DELETE CASCADE,
  uploaded_by     UUID              NOT NULL
                                    REFERENCES users(id)
                                    ON DELETE RESTRICT,
  name            TEXT              NOT NULL,
  file_key        TEXT              NOT NULL UNIQUE,  -- R2 object key
  file_type       TEXT              NOT NULL,
  file_size_bytes BIGINT            NOT NULL CHECK (file_size_bytes > 0),
  category        document_category NOT NULL,
  created_at      TIMESTAMPTZ       NOT NULL DEFAULT now()
);

CREATE INDEX idx_documents_project_id          ON documents (project_id);
CREATE INDEX idx_documents_project_category    ON documents (project_id, category);
CREATE INDEX idx_documents_uploaded_by         ON documents (uploaded_by);
008_contract_health_scores.sql SQL
CREATE TABLE contract_health_scores (
  id                        UUID        PRIMARY KEY DEFAULT gen_random_uuid(),
  project_id                UUID        NOT NULL
                                        REFERENCES projects(id)
                                        ON DELETE CASCADE,
  overall_score             SMALLINT    NOT NULL CHECK (overall_score BETWEEN 0 AND 100),
  payment_schedule_score    SMALLINT    NOT NULL CHECK (payment_schedule_score BETWEEN 0 AND 100),
  timeline_specificity_score SMALLINT   NOT NULL CHECK (timeline_specificity_score BETWEEN 0 AND 100),
  change_order_clause_score SMALLINT    NOT NULL CHECK (change_order_clause_score BETWEEN 0 AND 100),
  dispute_resolution_score  SMALLINT    NOT NULL CHECK (dispute_resolution_score BETWEEN 0 AND 100),
  completion_definition_score SMALLINT  NOT NULL CHECK (completion_definition_score BETWEEN 0 AND 100),
  warranty_score            SMALLINT    NOT NULL CHECK (warranty_score BETWEEN 0 AND 100),
  insurance_licensing_score SMALLINT    NOT NULL CHECK (insurance_licensing_score BETWEEN 0 AND 100),
  lien_waiver_score         SMALLINT    NOT NULL CHECK (lien_waiver_score BETWEEN 0 AND 100),
  cancellation_terms_score  SMALLINT    NOT NULL CHECK (cancellation_terms_score BETWEEN 0 AND 100),
  analysis_json             JSONB       NOT NULL,
  created_at                TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE INDEX idx_chs_project_id          ON contract_health_scores (project_id);
CREATE INDEX idx_chs_project_created     ON contract_health_scores (project_id, created_at DESC);
CREATE INDEX idx_chs_overall_score       ON contract_health_scores (overall_score);
CREATE INDEX idx_chs_analysis_gin        ON contract_health_scores USING GIN (analysis_json);
009_activity_feed.sql SQL
CREATE TABLE activity_feed (
  id          UUID                 PRIMARY KEY DEFAULT gen_random_uuid(),
  project_id  UUID                 NOT NULL
                                   REFERENCES projects(id)
                                   ON DELETE CASCADE,
  actor_id    UUID                 REFERENCES users(id)
                                   ON DELETE SET NULL,
  source      activity_source      NOT NULL,
  event_type  activity_event_type  NOT NULL,
  description TEXT                 NOT NULL,
  metadata    JSONB,
  created_at  TIMESTAMPTZ          NOT NULL DEFAULT now()
);

-- Primary access pattern: project feed ordered by time (most recent first)
CREATE INDEX idx_activity_project_created
  ON activity_feed (project_id, created_at DESC);
CREATE INDEX idx_activity_project_event_type
  ON activity_feed (project_id, event_type);
-- GIN for metadata key searches (e.g. find events referencing a specific change_order_id)
CREATE INDEX idx_activity_metadata_gin
  ON activity_feed USING GIN (metadata)
  WHERE metadata IS NOT NULL;
010_audit_log.sql SQL
CREATE TABLE audit_log (
  id          UUID        PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id     UUID        REFERENCES users(id)
                          ON DELETE SET NULL,
  action      TEXT        NOT NULL CHECK (action IN ('INSERT', 'UPDATE', 'DELETE')),
  entity_type TEXT        NOT NULL,
  entity_id   UUID        NOT NULL,
  old_value   JSONB,
  new_value   JSONB,
  ip_address  INET,
  created_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- Lookup by user for compliance review
CREATE INDEX idx_audit_user_id         ON audit_log (user_id, created_at DESC);
-- Lookup all changes to a specific record
CREATE INDEX idx_audit_entity          ON audit_log (entity_type, entity_id, created_at DESC);
CREATE INDEX idx_audit_created_at      ON audit_log (created_at);  -- for time-range compliance exports

-- audit_log rows must never be deleted or updated — enforce via RLS
ALTER TABLE audit_log ENABLE ROW LEVEL SECURITY;

Index Recommendations

All indexes listed with their type, selectivity rationale, and maintenance cost. Prefer partial indexes for status-filtered queries — they are smaller, faster to scan, and cheaper to maintain.

Index Name Table Columns Type Rationale
idx_users_email_loweruserslower(email)Unique B-treeLogin lookup. Expression index avoids duplicate accounts differing only by case.
idx_contractor_profiles_user_idcontractor_profilesuser_idB-treeJoin from users to their profile on every contractor dashboard load.
idx_contractor_profiles_verifiedcontractor_profileslicense_verified, insurance_verifiedPartial B-treeHomeowner directory search for verified contractors only. Partial keeps index small.
idx_projects_homeowner_idprojectshomeowner_idB-treeHomeowner dashboard: "show all my projects." High-frequency read.
idx_projects_contractor_idprojectscontractor_idPartial B-treeContractor job list. Partial (WHERE NOT NULL) excludes unassigned projects.
idx_projects_state_statusprojectsstate, statusB-treeAdmin analytics by region and status. Composite covers both filter + covering scan.
idx_milestones_project_statusmilestonesproject_id, statusB-treeFrequent: "show in_progress milestones for project." Composite avoids table heap fetch.
idx_milestones_depends_onmilestonesdepends_on_milestone_idPartial B-treeDependency graph traversal. Partial (WHERE NOT NULL) skips root milestones.
idx_change_orders_project_statuschange_ordersproject_id, statusB-treeDashboard: pending COs per project. Very frequent homeowner and contractor read.
idx_change_orders_analysis_ginchange_ordersgroundwork_analysisGINEnables @> containment queries on AI analysis JSON (e.g., find all COs with reasonableness_score < 50).
idx_payments_due_overduepaymentsdue_date, project_idPartial B-treeOverdue payment alert cron. Partial on status = due/overdue keeps index tiny — most payments are paid.
idx_chs_project_createdcontract_health_scoresproject_id, created_at DESCB-treeFetch latest score per project using LIMIT 1 with index-only scan.
idx_chs_analysis_gincontract_health_scoresanalysis_jsonGINAggregate analytics over recommendation flags across all contracts.
idx_activity_project_createdactivity_feedproject_id, created_at DESCB-treeThe single most critical index: project timeline view, paginated newest-first.
idx_activity_metadata_ginactivity_feedmetadataGINFind all feed events referencing a specific entity ID stored in metadata JSON.
idx_audit_entityaudit_logentity_type, entity_id, created_at DESCB-treeCompliance: full change history for a specific row. Composite covers the typical query.
GIN vs B-tree for JSONB: Use GIN (jsonb_ops) when you query with @>, ?, or ?| operators across the full document. Use a B-tree expression index like ((col->>'key')::int) when filtering on a single known key with range comparisons. The groundwork_analysis and analysis_json columns warrant GIN because the query shape is dynamic.

Data Store Decision Matrix

Three storage tiers — PostgreSQL (Neon or self-hosted), Cloudflare R2, and an in-memory cache (Redis or Cloudflare Workers KV). Each piece of data belongs in exactly one primary store.

DataPrimary StoreRationaleCache?TTL
All relational data (users, projects, milestones, etc.) PostgreSQL ACID guarantees, relational integrity, RLS enforcement Selective
File uploads (contracts, permits, photos, inspection PDFs) Cloudflare R2 Binary objects, no egress fees, S3-compatible API. Only the file_key reference lives in Postgres. No
Presigned R2 upload/download URLs Cache (KV) Generated on demand, valid for 15 minutes. Never persist to Postgres. Yes 15 min
Active project dashboard (milestones + payments + latest CO) PostgreSQL Source of truth. Cached for UI performance. Yes 60 sec
Latest contract_health_score per project PostgreSQL Queried on every dashboard load. Warm cache avoids repeated DESC + LIMIT queries. Yes 5 min
Contractor profile (public-facing reputation data) PostgreSQL Written infrequently (nightly batch). Cache aggressively. Yes 10 min
Activity feed (project timeline events) PostgreSQL Append-only. Cache the first page only; invalidate on new event. Yes 30 sec
AI analysis output (groundwork_analysis, analysis_json) PostgreSQL (JSONB) Append-only once written. JSONB allows future structured querying without schema migration. No
Audit log PostgreSQL Never cache. Always query source for compliance queries. No
Session tokens / JWTs Cache (KV) Short-lived, high-read, never needs relational joins. Revocation list also in KV. Yes Session lifetime
R2 object metadata (size, content-type) PostgreSQL Stored in documents table at upload time to avoid S3 HEAD requests. No

Row Count Projections

Estimates based on: Year 1 — 500 active projects; Year 2 — 3,000 active projects; Year 3 — 12,000 active projects. Assumptions: avg 8 milestones/project, avg 4 change orders/project, avg 6 payments/project, avg 8 documents/project, avg 50 activity events/project/month, 3 active months average project duration.

TableYear 1Year 2Year 3Growth DriverNotes
users~2,500~14,000~55,000Project participants + orphan signupsSmallest table. No partitioning needed.
contractor_profiles~400~2,200~8,5001:1 with contractor usersTrivial size.
projects~600~3,500~14,000Core growth metricIncludes historical completed projects.
milestones~5,000~28,000~112,0008 avg per projectManageable. No partitioning through Y3.
change_orders~2,400~14,000~56,0004 avg per projectManageable. High-value table for analytics.
payments~3,600~21,000~84,0006 avg per projectManageable. Low churn after paid status.
documents~4,800~28,000~112,0008 avg per projectMetadata only. Heavy growth = monitor R2 spend.
contract_health_scores~700~4,200~16,8001-3 scores per project lifecycleAppend-only. Very small table.
activity_feed~75,000~450,000~5,400,000High event volume, 50 events/project/month x 3 months avgFirst table to partition. Partition by created_at at ~2M rows.
audit_log~120,000~720,000~8,640,000Every write to every table creates a rowLargest table. Partition by month from day one. Archive to cold storage after 2 years.
Watch these two: audit_log and activity_feed grow at a rate 10-100x other tables. Both should be range-partitioned on created_at before launch (especially audit_log). This decision is much cheaper to make at schema creation than to retrofit at 1M rows.

Scaling Strategy

Read Replicas

Add a read replica when: p95 read latency on the primary exceeds 50ms, or read connections routinely exceed 70% of max_connections. Route all SELECTs for dashboard views, activity feed, and contractor profiles to the replica. Writes always go to the primary.
Y1

Single primary, no replica

At <15,000 rows on the hot tables, a properly indexed single Postgres instance handles the load trivially. Use PgBouncer in transaction mode for connection pooling from the first day.

Y2

Primary + 1 read replica

Route analytics queries, contractor profile lookups, and activity feed reads to the replica. Primary handles all writes and transactional reads. Monitor replication lag — target <100ms.

Y3

Consider horizontal read scaling or CQRS

If read replicas saturate, introduce a separate read model for analytics (e.g., materialized views refreshed on a schedule, or an OLAP sidecar). Do not prematurely shard — PostgreSQL handles millions of rows in a single instance well.

Table Partitioning

audit_log — range partition by month (implement at launch) SQL
-- Convert audit_log to a partitioned table.
-- Do this at schema creation — retrofitting requires pg_partman or table swap.

CREATE TABLE audit_log (
  ...  -- same columns as above
  created_at TIMESTAMPTZ NOT NULL DEFAULT now()
) PARTITION BY RANGE (created_at);

-- Create monthly partitions (use pg_partman to automate)
CREATE TABLE audit_log_2026_04
  PARTITION OF audit_log
  FOR VALUES FROM ('2026-04-01') TO ('2026-05-01');

-- Old partitions can be detached + archived to cold storage (e.g. S3/R2 pg_dump)
-- without affecting the live table
ALTER TABLE audit_log DETACH PARTITION audit_log_2024_01;

Connection Pooling

PgBouncer Configuration

  • Mode: transaction pooling — required for Postgres functions and short API requests
  • max_client_conn: 1000 (API instances)
  • default_pool_size: 25 per database
  • server_lifetime: 3600 seconds
  • server_idle_timeout: 60 seconds
  • Avoid session pooling — incompatible with RLS using SET LOCAL

RLS + Connection Pooling

Row Level Security relies on current_setting('app.current_user_id'). With transaction-mode pooling, set this at the start of every transaction:

BEGIN;
SET LOCAL app.current_user_id = '<uuid>';
SET LOCAL app.current_role   = 'homeowner';
-- queries here
COMMIT;

PII Inventory

Personal Identifiable Information inventory per CCPA/GDPR requirements. All PII columns are flagged in the entity reference above with a yellow background.

TableColumnPII CategoryEncryption ApproachRetention
usersemail Contact identifier At-rest encryption via database-level encryption (TDE). Application layer: no additional encryption — must be queryable for auth. Until account deletion
usersname Personal name TDE only. Queried frequently for display. Until account deletion
userspassword_hash Credential (hashed) Argon2id with per-user salt. Never logged, never exposed in API responses or audit_log new_value. Until account deletion
contractor_profileslicense_number Government-issued identifier Application-layer encryption with pgcrypto.pgp_sym_encrypt(). Decrypted only for verification workflows. Until contractor account deletion
projectsaddress Physical address of property TDE. Accessible to project participants (homeowner + contractor) via RLS. Not exposed to admin without audit trail. 7 years (construction record requirement)
audit_logip_address Network identifier (PII in some jurisdictions) TDE. Truncate to /24 (IPv4) or /48 (IPv6) after 90 days for analytics. Original preserved for fraud investigation. 2 years, then anonymize
audit_logold_value / new_value May contain any PII from any table Exclude password_hash from audit_log via trigger logic. JSONB redaction applied before export. 2 years per compliance
Audit log JSONB redaction: The audit trigger must explicitly exclude password_hash from the new_value JSONB snapshot. Use a trigger-level function that strips known sensitive keys before writing: new_value = new_value - 'password_hash'. Never log raw credential fields.

Row Level Security Policies

RLS is enabled on all tables. The application sets app.current_user_id and app.current_role at the start of every transaction. A service_role database role bypasses RLS for internal batch jobs and migrations.

rls_setup.sql — enable and configure SQL
-- Enable RLS on all application tables
ALTER TABLE users                  ENABLE ROW LEVEL SECURITY;
ALTER TABLE contractor_profiles    ENABLE ROW LEVEL SECURITY;
ALTER TABLE projects              ENABLE ROW LEVEL SECURITY;
ALTER TABLE milestones            ENABLE ROW LEVEL SECURITY;
ALTER TABLE change_orders        ENABLE ROW LEVEL SECURITY;
ALTER TABLE payments             ENABLE ROW LEVEL SECURITY;
ALTER TABLE documents            ENABLE ROW LEVEL SECURITY;
ALTER TABLE contract_health_scores ENABLE ROW LEVEL SECURITY;
ALTER TABLE activity_feed        ENABLE ROW LEVEL SECURITY;

-- Helper functions used across policies
CREATE OR REPLACE FUNCTION current_user_id() RETURNS UUID LANGUAGE sql STABLE AS $$
  SELECT nullif(current_setting('app.current_user_id', TRUE), '')::UUID
$$;

CREATE OR REPLACE FUNCTION current_user_role() RETURNS TEXT LANGUAGE sql STABLE AS $$
  SELECT current_setting('app.current_role', TRUE)
$$;

CREATE OR REPLACE FUNCTION is_project_participant(p_id UUID)
RETURNS BOOLEAN LANGUAGE sql STABLE SECURITY DEFINER AS $$
  SELECT EXISTS (
    SELECT 1 FROM projects
    WHERE id = p_id
      AND (homeowner_id = current_user_id()
           OR contractor_id = current_user_id())
  )
$$;

projects policies

pol_projects_homeowner_select
CREATE POLICY pol_projects_homeowner_select ON projects
  FOR SELECT USING (
    homeowner_id = current_user_id()
    OR contractor_id = current_user_id()
    OR current_user_role() = 'admin'
  );
pol_projects_homeowner_insert
CREATE POLICY pol_projects_homeowner_insert ON projects
  FOR INSERT WITH CHECK (
    homeowner_id = current_user_id()   -- only create projects for yourself
  );
pol_projects_update
CREATE POLICY pol_projects_update ON projects
  FOR UPDATE USING (
    homeowner_id = current_user_id()    -- homeowner owns status transitions
    OR contractor_id = current_user_id() -- contractor updates progress fields
    OR current_user_role() = 'admin'
  );

milestones, change_orders, payments, documents, activity_feed

Pattern: participant-scoped via project FK
-- All child tables share this pattern. Replace {table} with the table name.

CREATE POLICY pol_{table}_select ON {table}
  FOR SELECT USING (
    is_project_participant(project_id)
    OR current_user_role() = 'admin'
  );

CREATE POLICY pol_{table}_insert ON {table}
  FOR INSERT WITH CHECK (
    is_project_participant(project_id)
  );

CREATE POLICY pol_{table}_update ON {table}
  FOR UPDATE USING (
    is_project_participant(project_id)
    OR current_user_role() = 'admin'
  );

-- activity_feed and audit_log: no DELETE policy (append-only)

audit_log policy

pol_audit_log_admin_select — audit log is admin-only
-- Regular users never read or write audit_log directly.
-- Writes go through a SECURITY DEFINER trigger function.
-- Reads are restricted to admin role and compliance exports.

CREATE POLICY pol_audit_log_admin_select ON audit_log
  FOR SELECT USING (current_user_role() = 'admin');

CREATE POLICY pol_audit_log_system_insert ON audit_log
  FOR INSERT WITH CHECK (TRUE);  -- only reachable via SECURITY DEFINER trigger

-- No UPDATE or DELETE policies — audit rows are immutable.
Performance note: is_project_participant() is a SECURITY DEFINER function that queries the projects table. It will use idx_projects_homeowner_id and idx_projects_contractor_id. Because it is marked STABLE, PostgreSQL can cache the result within a single query. For bulk queries across multiple projects, consider a lateral join instead of per-row function invocations.

Migration Strategy

Schema changes are deployed via versioned SQL migration files managed by a migration runner (Flyway or the node-postgres ecosystem's postgres-migrations). Every change goes through the same pipeline: local dev → staging → production.

File Naming Convention

migration file naming convention
V{version}__{description}.sql

Examples:
V001__create_enum_types.sql
V002__create_users.sql
V010__create_audit_log.sql
V011__add_contractor_profiles_insurance_expiry.sql
V012__add_index_projects_state_status.sql

-- Repeatable migrations (views, functions, triggers)
-- Re-run whenever the file changes:
R__set_updated_at_function.sql
R__rls_policies_projects.sql

Safe Migration Checklist

ADD COLUMN
Always ADD COLUMN ... DEFAULT ... NULL first, then backfill, then add NOT NULL constraint in a separate migration. Never add NOT NULL without a default in a single step — it acquires an ACCESS EXCLUSIVE lock that blocks all reads on large tables.
ADD INDEX
Use CREATE INDEX CONCURRENTLY on production. Does not block reads or writes. Cannot run inside a transaction — execute as a standalone migration step outside BEGIN/COMMIT.
RENAME COLUMN
Never rename directly in production. The safe path: (1) add new column, (2) dual-write old + new at the application layer, (3) backfill new from old, (4) cut over reads to new column, (5) drop old column after one release cycle.
ALTER TYPE
Adding values to an existing ENUM with ALTER TYPE ... ADD VALUE is safe and does not lock the table in PostgreSQL 12+. Removing or reordering enum values requires creating a new type and migrating — schedule during a maintenance window.
DROP TABLE
Never drop in the same release that removes application code references. Sequence: (1) deploy code that no longer writes to the table, (2) wait one release cycle, (3) deploy the DROP in a separate migration. Use DROP TABLE IF EXISTS for idempotency.
PARTITIONING
Converting an existing table to partitioned (e.g., audit_log) cannot be done in-place. Use the pg_partman approach: create a new partitioned parent table, swap via table rename during a brief maintenance window, backfill with INSERT INTO ... SELECT using batched chunks.

Migration Runner Integration

deploy pipeline shell
# Run in CI/CD before deploying application code
# Migrations must be backward-compatible with the previous app version

# Flyway example
flyway -url="jdbc:postgresql://host/groundwork" \
       -user="$DB_USER" \
       -password="$DB_PASSWORD" \
       -locations="filesystem:./migrations" \
       migrate

# Verify: check applied migrations match expected count
flyway info | grep "Pending" | wc -l  # should be 0 after successful migrate

Seed Data

Data that ships with the database in every environment (dev, staging, production). Seed data is idempotent — running it twice produces the same state.

Enum values are defined in the DDL (not as lookup tables) so they do not require seed rows. The seed covers: a default admin user for local development, and any reference data used by the application layer.
seed_dev.sql — development environment only SQL
-- Development seed: creates a known admin user.
-- Password: 'devpassword123' (Argon2id hash shown below is illustrative)
-- DO NOT run this seed in production.

INSERT INTO users (id, email, name, role, password_hash)
VALUES (
  '00000000-0000-0000-0000-000000000001',
  'admin@groundwork.dev',
  'Groundwork Admin',
  'admin',
  '$argon2id$v=19$m=65536,t=3,p=4$...'
)
ON CONFLICT (id) DO NOTHING;

-- Sample homeowner
INSERT INTO users (id, email, name, role, password_hash)
VALUES (
  '00000000-0000-0000-0000-000000000002',
  'homeowner@groundwork.dev',
  'Jane Homeowner',
  'homeowner',
  '$argon2id$v=19$m=65536,t=3,p=4$...'
)
ON CONFLICT (id) DO NOTHING;

-- Sample contractor + profile
INSERT INTO users (id, email, name, role, password_hash)
VALUES (
  '00000000-0000-0000-0000-000000000003',
  'contractor@groundwork.dev',
  'Bob Builder',
  'contractor',
  '$argon2id$v=19$m=65536,t=3,p=4$...'
)
ON CONFLICT (id) DO NOTHING;

INSERT INTO contractor_profiles
  (user_id, company_name, license_state,
   license_verified, insurance_verified, years_in_business)
VALUES (
  '00000000-0000-0000-0000-000000000003',
  'Bob Builder LLC',
  'CA',
  TRUE, TRUE, 12
)
ON CONFLICT (user_id) DO NOTHING;
seed_production.sql — safe for all environments SQL
-- No user records in production seed.
-- Enums are in DDL. No lookup tables required.
--
-- What does ship to production:
--   1. Database configuration settings
--   2. pg_partman setup for audit_log partitioning
--   3. Nightly maintenance job configuration

-- Ensure pg_partman extension is available
CREATE EXTENSION IF NOT EXISTS pg_partman SCHEMA partman;
CREATE EXTENSION IF NOT EXISTS pgcrypto;
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Register audit_log with pg_partman for automatic monthly partition creation
SELECT partman.create_parent(
  p_parent_table => 'public.audit_log',
  p_control      => 'created_at',
  p_interval     => 'monthly',
  p_premake      => 3   -- pre-create 3 months ahead
);

-- Activity feed: partition when it approaches 500K rows (Year 2+)
-- Register same way when partitioning is added via migration.

Top 10 Query Patterns

The highest-frequency, highest-importance queries. Each includes expected result cardinality, index coverage, and notes on optimization.

Q1 — Homeowner dashboard: all my active projects

~5 rows idx_projects_homeowner_id
SELECT
  p.id, p.name, p.status, p.current_total,
  p.expected_end_date,
  (SELECT COUNT(*)::INT FROM change_orders co
   WHERE co.project_id = p.id AND co.status = 'pending')   AS pending_co_count,
  (SELECT COUNT(*)::INT FROM payments pm
   WHERE pm.project_id = p.id AND pm.status = 'overdue')  AS overdue_payments,
  chs.overall_score
FROM projects p
LEFT JOIN LATERAL (
  SELECT overall_score
  FROM contract_health_scores
  WHERE project_id = p.id
  ORDER BY created_at DESC LIMIT 1
) chs ON TRUE
WHERE p.homeowner_id = $1
  AND p.status IN ('active', 'paused', 'disputed')
ORDER BY p.updated_at DESC;

-- Index coverage: idx_projects_homeowner_id + idx_chs_project_created
-- LATERAL avoids N+1 for latest score. Subqueries use partial indexes on status.

Q2 — Project activity feed (paginated, newest first)

~20 rows idx_activity_project_created
SELECT
  af.id, af.event_type, af.source,
  af.description, af.metadata, af.created_at,
  u.name AS actor_name
FROM activity_feed af
LEFT JOIN users u ON u.id = af.actor_id
WHERE af.project_id = $1
  AND af.created_at < $2   -- cursor for keyset pagination
ORDER BY af.created_at DESC
LIMIT 20;

-- Keyset pagination on created_at is O(log N) vs OFFSET which is O(N).
-- Index: idx_activity_project_created covers (project_id, created_at DESC).

Q3 — Project milestone timeline with payment linkage

~8 rows idx_milestones_project_id
SELECT
  m.id, m.name, m.sequence_order,
  m.status, m.planned_start, m.planned_end,
  m.actual_start, m.actual_end,
  m.depends_on_milestone_id,
  p.id AS payment_id,
  p.amount AS payment_amount,
  p.status AS payment_status
FROM milestones m
LEFT JOIN payments p ON p.milestone_id = m.id
WHERE m.project_id = $1
ORDER BY m.sequence_order;

Q4 — Pending change orders for homeowner review

~3 rows idx_change_orders_project_status
SELECT
  co.id, co.description, co.amount,
  co.markup_percentage, co.type,
  co.groundwork_analysis->>'reasonableness_score' AS ai_score,
  co.groundwork_analysis->>'type_classification'   AS ai_classification,
  co.created_at,
  u.name AS submitted_by_name
FROM change_orders co
JOIN  users u ON u.id = co.submitted_by
WHERE co.project_id = $1
  AND co.status = 'pending'
ORDER BY co.created_at ASC;

-- ->> operator extracts text from JSONB without GIN — fine for display.
-- GIN index is used only for @> containment queries across many rows.

Q5 — Contractor's active job list

~15 rows idx_projects_contractor_id
SELECT
  p.id, p.name, p.city, p.state,
  p.status, p.current_total,
  p.start_date, p.expected_end_date,
  (SELECT status FROM milestones
   WHERE project_id = p.id AND status = 'in_progress'
   LIMIT 1)                      AS active_milestone_status,
  u.name                          AS homeowner_name
FROM projects p
JOIN users u ON u.id = p.homeowner_id
WHERE p.contractor_id = $1
  AND p.status IN ('active', 'paused')
ORDER BY p.expected_end_date ASC;

Q6 — Overdue payments across all projects (system cron job)

~50 rows idx_payments_due_overdue
-- Runs daily to identify and transition scheduled payments to overdue
UPDATE payments
SET    status = 'overdue'
WHERE  status = 'due'
  AND  due_date < CURRENT_DATE
RETURNING id, project_id, amount, due_date;

-- Partial index idx_payments_due_overdue covers status IN ('due','overdue')
-- so this update touches only the relevant fraction of the payments table.

Q7 — Latest contract health score per project

1 row idx_chs_project_created
SELECT
  overall_score,
  payment_schedule_score, timeline_specificity_score,
  change_order_clause_score, dispute_resolution_score,
  completion_definition_score, warranty_score,
  insurance_licensing_score, lien_waiver_score,
  cancellation_terms_score,
  analysis_json->>'summary' AS summary,
  created_at
FROM contract_health_scores
WHERE project_id = $1
ORDER BY created_at DESC
LIMIT 1;

-- idx_chs_project_created (project_id, created_at DESC) makes this
-- an index-only scan followed by a single heap fetch. Extremely fast.

Q8 — Admin: change orders with low AI reasonableness score

~20 rows idx_change_orders_analysis_gin
-- Find COs where AI flagged potential overcharging (score < 40)
SELECT
  co.id, co.project_id, co.amount, co.description,
  (co.groundwork_analysis->>'reasonableness_score')::INT AS score,
  co.groundwork_analysis->'flags' AS flags,
  co.created_at
FROM change_orders co
WHERE
  co.groundwork_analysis IS NOT NULL
  AND (co.groundwork_analysis->>'reasonableness_score')::INT < 40
  AND co.status = 'pending'
ORDER BY score ASC
LIMIT 50;

-- For this single-key range query, a B-tree expression index is more
-- efficient than GIN. Add in a future migration once the query pattern
-- is confirmed: CREATE INDEX ON change_orders
-- (((groundwork_analysis->>'reasonableness_score')::int));
-- The GIN index supports ad-hoc JSONB exploration in the meantime.

Q9 — Contractor reputation: historical timeline variance

~30 rows idx_projects_contractor_id
-- Nightly batch: recompute avg_timeline_variance_pct for all contractors
WITH project_variances AS (
  SELECT
    p.contractor_id,
    AVG(
      CASE WHEN p.actual_end_date IS NOT NULL
           THEN (p.actual_end_date - p.expected_end_date)
                / NULLIF((p.expected_end_date - p.start_date), 0) * 100.0
      END
    )::NUMERIC(6,2)  AS avg_variance,
    COUNT(*)          AS project_count
  FROM projects p
  WHERE p.contractor_id IS NOT NULL
    AND p.status = 'completed'
    AND p.start_date IS NOT NULL
  GROUP BY p.contractor_id
)
UPDATE contractor_profiles cp
SET
  avg_timeline_variance_pct = pv.avg_variance,
  total_projects_on_platform = pv.project_count,
  updated_at = now()
FROM project_variances pv
WHERE cp.user_id = pv.contractor_id;

Q10 — Audit trail for a specific project (compliance export)

~200 rows idx_audit_entity
-- Fetch all audit events touching a project and its child entities.
-- Used for dispute resolution and legal discovery.
SELECT
  al.created_at,
  al.action,
  al.entity_type,
  al.entity_id,
  u.email  AS actor_email,
  al.ip_address,
  al.old_value,
  al.new_value
FROM audit_log al
LEFT JOIN users u ON u.id = al.user_id
WHERE
  (al.entity_type = 'projects'     AND al.entity_id = $1)
  OR (al.entity_type = 'milestones'   AND al.entity_id IN
      (SELECT id FROM milestones WHERE project_id = $1))
  OR (al.entity_type = 'change_orders' AND al.entity_id IN
      (SELECT id FROM change_orders WHERE project_id = $1))
  OR (al.entity_type = 'payments'    AND al.entity_id IN
      (SELECT id FROM payments      WHERE project_id = $1))
ORDER BY al.created_at ASC;

-- idx_audit_entity (entity_type, entity_id, created_at DESC) covers each arm.
-- For a monthly partitioned audit_log, Postgres will prune irrelevant partitions
-- when a date range is added to the WHERE clause:
-- AND al.created_at >= $2 AND al.created_at < $3