Data Model

Version 1.0 — Initial schema PostgreSQL 16 April 2026 10 tables · 3 data stores

Overview

Groundwork is a shared reality platform that brings transparency to residential renovation. The data model supports three primary actors — homeowners, contractors, and system automations — across a project lifecycle from contract signing through final payment. This document is the authoritative implementation reference for the PostgreSQL schema.

Design Principles

All primary keys are UUID (v4 via gen_random_uuid()) — avoids enumeration attacks and enables client-side ID generation
All timestamps are TIMESTAMPTZ stored in UTC — clients localize on display
Soft state via status enums — records are never hard-deleted by application code
JSONB for computed/AI analysis fields — avoids premature column normalization on evolving AI outputs
Row Level Security enabled on all tables — enforced at the database layer, not the application layer

Naming Conventions

Tables: lowercase snake_case, plural nouns
Columns: lowercase snake_case
Foreign keys: {referenced_table_singular}_id
Indexes: idx_{table}_{columns}
Constraints: chk_{table}_{rule}
RLS policies: pol_{table}_{actor}_{operation}
Enums: {domain}_{purpose} (e.g., project_status)

Entity Relationship Diagram

All ten entities and their foreign key relationships. Crow's foot notation: || = exactly one, |< = one-to-many, >< = many-to-many.

Reading the diagram: Solid lines indicate required foreign keys. Dashed lines indicate nullable (optional) foreign keys. Self-referencing arrows indicate recursive relationships. All entities link back to projects as the primary aggregate root, with users as the authentication and actor anchor.

Entity Reference

Each entity's columns, data types, constraints, and notes. PK = primary key, FK = foreign key, IX = indexed, NN = NOT NULL.

users

Authentication & actor identity

Column	Type	Constraints	Notes
PK id	UUID	`DEFAULT gen_random_uuid()`	Immutable. Never reused.
email	TEXT	NOT NULL, UNIQUE	PII. Stored lowercase. Used for auth.
name	TEXT	NOT NULL	PII. Display name.
role	user_role	NOT NULL, DEFAULT 'homeowner'	Enum: homeowner / contractor / admin
password_hash	TEXT	NOT NULL	bcrypt or Argon2id. Never expose via API.
IX created_at	TIMESTAMPTZ	NOT NULL, DEFAULT now()	UTC. Immutable after insert.
updated_at	TIMESTAMPTZ	NOT NULL, DEFAULT now()	Updated via trigger.

contractor_profiles

Public contractor data & reputation metrics

Column	Type	Constraints	Notes
PK id	UUID	DEFAULT gen_random_uuid()
FK IX user_id	UUID	NOT NULL, UNIQUE, REFERENCES users(id) ON DELETE CASCADE	1:1 with users. Cascade cleans profile on user delete.
company_name	TEXT	NOT NULL
license_number	TEXT		PII. Encrypted at rest via pgcrypto.
license_state	CHAR(2)		ISO 3166-2 state code.
license_verified	BOOLEAN	NOT NULL, DEFAULT false	Set by admin/verification service.
insurance_verified	BOOLEAN	NOT NULL, DEFAULT false
years_in_business	SMALLINT	CHECK (years_in_business >= 0)
avg_timeline_variance_pct	NUMERIC(6,2)		Computed nightly. Positive = over schedule.
avg_change_order_pct	NUMERIC(6,2)		Avg CO amount / original contract value.
total_projects_on_platform	INTEGER	NOT NULL, DEFAULT 0	Denormalized counter. Updated via trigger.
created_at	TIMESTAMPTZ	NOT NULL, DEFAULT now()
updated_at	TIMESTAMPTZ	NOT NULL, DEFAULT now()

projects

Primary aggregate root — the renovation job

Column	Type	Constraints	Notes
PK id	UUID	DEFAULT gen_random_uuid()
FK IX homeowner_id	UUID	NOT NULL, REFERENCES users(id) ON DELETE RESTRICT	RESTRICT prevents orphaned projects.
FK IX contractor_id	UUID	REFERENCES users(id) ON DELETE SET NULL	Nullable. NULL = contractor not yet assigned.
name	TEXT	NOT NULL	e.g. "Kitchen Remodel 2026"
address	TEXT	NOT NULL	PII. Street address of property.
city	TEXT	NOT NULL
IX state	CHAR(2)	NOT NULL	ISO 3166-2. Indexed for regional analytics.
zip	CHAR(10)	NOT NULL	ZIP or ZIP+4 format.
original_contract_amount	NUMERIC(12,2)	NOT NULL, CHECK > 0	Immutable after project activation.
current_total	NUMERIC(12,2)	NOT NULL	Denormalized: original + approved COs.
start_date	DATE
expected_end_date	DATE
actual_end_date	DATE		Set when status transitions to completed.
IX status	project_status	NOT NULL, DEFAULT 'setup'	Enum: setup/active/paused/completed/disputed
created_at	TIMESTAMPTZ	NOT NULL, DEFAULT now()
updated_at	TIMESTAMPTZ	NOT NULL, DEFAULT now()

milestones

Ordered work phases with dependency graph

Column	Type	Constraints	Notes
PK id	UUID	DEFAULT gen_random_uuid()
FK IX project_id	UUID	NOT NULL, REFERENCES projects(id) ON DELETE CASCADE
name	TEXT	NOT NULL	e.g. "Demo", "Framing", "Drywall"
description	TEXT
sequence_order	SMALLINT	NOT NULL	Display order within project. UNIQUE per project.
planned_start	DATE
planned_end	DATE
actual_start	DATE
actual_end	DATE
IX status	milestone_status	NOT NULL, DEFAULT 'upcoming'	Enum: upcoming/in_progress/completed/blocked
FK depends_on_milestone_id	UUID	REFERENCES milestones(id) ON DELETE SET NULL	Nullable. Self-referential DAG for sequencing.
created_at	TIMESTAMPTZ	NOT NULL, DEFAULT now()
updated_at	TIMESTAMPTZ	NOT NULL, DEFAULT now()

change_orders

Scope and cost modifications with AI analysis

Column	Type	Constraints	Notes
PK id	UUID	DEFAULT gen_random_uuid()
FK IX project_id	UUID	NOT NULL, REFERENCES projects(id) ON DELETE CASCADE
FK submitted_by	UUID	NOT NULL, REFERENCES users(id) ON DELETE RESTRICT	Audit trail must be preserved.
amount	NUMERIC(12,2)	NOT NULL	Can be negative (credit COs).
markup_percentage	NUMERIC(5,2)	DEFAULT 0	Contractor markup over cost. E.g. 15.00 = 15%.
description	TEXT	NOT NULL
type	change_order_type	NOT NULL	Enum: unforeseen/homeowner_request/code_requirement
IX status	change_order_status	NOT NULL, DEFAULT 'pending'	Enum: pending/approved/declined/modified
groundwork_analysis	JSONB		AI output: reasonableness_score (0-100), type_classification, comparable_market_rates, flags[]
signed_at	TIMESTAMPTZ		Set when homeowner approves.
created_at	TIMESTAMPTZ	NOT NULL, DEFAULT now()
updated_at	TIMESTAMPTZ	NOT NULL, DEFAULT now()

payments

Payment schedule tied to project milestones

Column	Type	Constraints	Notes
PK id	UUID	DEFAULT gen_random_uuid()
FK IX project_id	UUID	NOT NULL, REFERENCES projects(id) ON DELETE CASCADE
FK IX milestone_id	UUID	REFERENCES milestones(id) ON DELETE SET NULL	Nullable. Not all payments are milestone-gated.
amount	NUMERIC(12,2)	NOT NULL, CHECK > 0
IX status	payment_status	NOT NULL, DEFAULT 'scheduled'	Enum: scheduled/due/paid/overdue
due_date	DATE	NOT NULL
paid_date	DATE		Set when payment is confirmed.
created_at	TIMESTAMPTZ	NOT NULL, DEFAULT now()	No updated_at — payments are append-only.

documents

File metadata — actual bytes live in Cloudflare R2

Column	Type	Constraints	Notes
PK id	UUID	DEFAULT gen_random_uuid()
FK IX project_id	UUID	NOT NULL, REFERENCES projects(id) ON DELETE CASCADE
FK uploaded_by	UUID	NOT NULL, REFERENCES users(id) ON DELETE RESTRICT
name	TEXT	NOT NULL	User-facing filename.
file_key	TEXT	NOT NULL, UNIQUE	R2 object key. Format: `{project_id}/{uuid}/{filename}`
file_type	TEXT	NOT NULL	MIME type. e.g. application/pdf
file_size_bytes	BIGINT	NOT NULL, CHECK > 0
IX category	document_category	NOT NULL	Enum: contract/permit/inspection/change_order/insurance/license
created_at	TIMESTAMPTZ	NOT NULL, DEFAULT now()	Immutable. Documents are never updated, only replaced.

contract_health_scores

AI-computed contract analysis — append-only history

Column	Type	Constraints	Notes
PK id	UUID	DEFAULT gen_random_uuid()
FK IX project_id	UUID	NOT NULL, REFERENCES projects(id) ON DELETE CASCADE	Multiple scores per project as contract is revised.
overall_score	SMALLINT	NOT NULL, CHECK 0–100
payment_schedule_score	SMALLINT	NOT NULL, CHECK 0–100
timeline_specificity_score	SMALLINT	NOT NULL, CHECK 0–100
change_order_clause_score	SMALLINT	NOT NULL, CHECK 0–100
dispute_resolution_score	SMALLINT	NOT NULL, CHECK 0–100
completion_definition_score	SMALLINT	NOT NULL, CHECK 0–100
warranty_score	SMALLINT	NOT NULL, CHECK 0–100
insurance_licensing_score	SMALLINT	NOT NULL, CHECK 0–100
lien_waiver_score	SMALLINT	NOT NULL, CHECK 0–100
cancellation_terms_score	SMALLINT	NOT NULL, CHECK 0–100
analysis_json	JSONB	NOT NULL	Full AI output: per-clause findings, recommendations[], severity[]
created_at	TIMESTAMPTZ	NOT NULL, DEFAULT now()	Append-only. Use latest row for current score.

activity_feed

Immutable event stream for the project timeline

Column	Type	Constraints	Notes
PK id	UUID	DEFAULT gen_random_uuid()
FK IX project_id	UUID	NOT NULL, REFERENCES projects(id) ON DELETE CASCADE
FK actor_id	UUID	REFERENCES users(id) ON DELETE SET NULL	Nullable — system events have no actor.
source	activity_source	NOT NULL	Enum: contractor/system/city/homeowner
IX event_type	activity_event_type	NOT NULL	Enum: milestone_update/change_order/delivery/inspection/payment/note
description	TEXT	NOT NULL	Human-readable event summary.
metadata	JSONB		Event-specific data. Keys vary by event_type. E.g. for payment: {amount, payment_id}
IX created_at	TIMESTAMPTZ	NOT NULL, DEFAULT now()	Immutable. The feed is append-only.

audit_log

Compliance trail — every write to every table

Column	Type	Constraints	Notes
PK id	UUID	DEFAULT gen_random_uuid()
FK IX user_id	UUID	REFERENCES users(id) ON DELETE SET NULL	Nullable — SET NULL preserves the audit trail if user is deleted.
action	TEXT	NOT NULL	INSERT / UPDATE / DELETE
IX entity_type	TEXT	NOT NULL	Table name. e.g. 'projects', 'change_orders'
IX entity_id	UUID	NOT NULL	PK of the modified row.
old_value	JSONB		NULL on INSERT.
new_value	JSONB		NULL on DELETE.
ip_address	INET		PII. IPv4 or IPv6.
IX created_at	TIMESTAMPTZ	NOT NULL, DEFAULT now()	Immutable. Never update or delete audit rows.

CREATE TABLE Statements

Full PostgreSQL 16 DDL. Run in order — foreign key references require earlier tables to exist. All tables are created in the public schema and should be moved to a dedicated schema (e.g. app) in production.

000_enums.sql SQL

-- ─────────────────────────────────────────────────────────────
-- Enum Types
-- Create before tables. ALTER TYPE ... ADD VALUE is safe for
-- adding new members without a migration lock.
-- ─────────────────────────────────────────────────────────────

CREATE TYPE user_role AS ENUM (
  'homeowner',
  'contractor',
  'admin'
);

CREATE TYPE project_status AS ENUM (
  'setup',      -- contract not yet signed
  'active',     -- work underway
  'paused',     -- work temporarily halted
  'completed',  -- final walkthrough done
  'disputed'    -- active dispute in progress
);

CREATE TYPE milestone_status AS ENUM (
  'upcoming',
  'in_progress',
  'completed',
  'blocked'
);

CREATE TYPE change_order_type AS ENUM (
  'unforeseen',        -- hidden condition discovered during work
  'homeowner_request', -- scope addition requested by owner
  'code_requirement'   -- mandated by inspection / permit
);

CREATE TYPE change_order_status AS ENUM (
  'pending',
  'approved',
  'declined',
  'modified'  -- counter-offered by homeowner
);

CREATE TYPE payment_status AS ENUM (
  'scheduled',
  'due',
  'paid',
  'overdue'
);

CREATE TYPE document_category AS ENUM (
  'contract',
  'permit',
  'inspection',
  'change_order',
  'insurance',
  'license'
);

CREATE TYPE activity_source AS ENUM (
  'contractor',
  'system',
  'city',
  'homeowner'
);

CREATE TYPE activity_event_type AS ENUM (
  'milestone_update',
  'change_order',
  'delivery',
  'inspection',
  'payment',
  'note'
);

001_users.sql SQL

CREATE TABLE users (
  id            UUID        PRIMARY KEY DEFAULT gen_random_uuid(),
  email         TEXT        NOT NULL UNIQUE
                              CHECK (lower(email) = email),
  name          TEXT        NOT NULL CHECK (length(name) >= 1),
  role          user_role   NOT NULL DEFAULT 'homeowner',
  password_hash TEXT        NOT NULL,
  created_at    TIMESTAMPTZ NOT NULL DEFAULT now(),
  updated_at    TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- Enforce lowercase email at DB level to prevent duplicate accounts
CREATE UNIQUE INDEX idx_users_email_lower
  ON users (lower(email));

-- Auto-update updated_at on any row change
CREATE OR REPLACE FUNCTION set_updated_at()
RETURNS TRIGGER LANGUAGE plpgsql AS $$
BEGIN
  NEW.updated_at = now();
  RETURN NEW;
END;
$$;

CREATE TRIGGER trg_users_updated_at
  BEFORE UPDATE ON users
  FOR EACH ROW EXECUTE FUNCTION set_updated_at();

002_contractor_profiles.sql SQL

CREATE TABLE contractor_profiles (
  id                          UUID        PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id                     UUID        NOT NULL UNIQUE
                                            REFERENCES users(id)
                                            ON DELETE CASCADE,
  company_name                TEXT        NOT NULL,
  license_number              TEXT,        -- encrypted via pgcrypto in application layer
  license_state               CHAR(2),
  license_verified            BOOLEAN     NOT NULL DEFAULT FALSE,
  insurance_verified          BOOLEAN     NOT NULL DEFAULT FALSE,
  years_in_business           SMALLINT    CHECK (years_in_business >= 0),
  avg_timeline_variance_pct   NUMERIC(6,2), -- positive = over schedule, negative = early
  avg_change_order_pct        NUMERIC(6,2),
  total_projects_on_platform  INTEGER     NOT NULL DEFAULT 0
                                            CHECK (total_projects_on_platform >= 0),
  created_at                  TIMESTAMPTZ NOT NULL DEFAULT now(),
  updated_at                  TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE INDEX idx_contractor_profiles_user_id
  ON contractor_profiles (user_id);

CREATE INDEX idx_contractor_profiles_verified
  ON contractor_profiles (license_verified, insurance_verified)
  WHERE license_verified = TRUE;   -- partial index: only verified contractors

CREATE TRIGGER trg_contractor_profiles_updated_at
  BEFORE UPDATE ON contractor_profiles
  FOR EACH ROW EXECUTE FUNCTION set_updated_at();

003_projects.sql SQL

CREATE TABLE projects (
  id                       UUID           PRIMARY KEY DEFAULT gen_random_uuid(),
  homeowner_id             UUID           NOT NULL
                                           REFERENCES users(id)
                                           ON DELETE RESTRICT,
  contractor_id            UUID           REFERENCES users(id)
                                           ON DELETE SET NULL,
  name                     TEXT           NOT NULL,
  address                  TEXT           NOT NULL,
  city                     TEXT           NOT NULL,
  state                    CHAR(2)        NOT NULL,
  zip                      CHAR(10)       NOT NULL,
  original_contract_amount NUMERIC(12,2)  NOT NULL
                                           CHECK (original_contract_amount > 0),
  current_total            NUMERIC(12,2)  NOT NULL,
  start_date               DATE,
  expected_end_date        DATE,
  actual_end_date          DATE,
  status                   project_status NOT NULL DEFAULT 'setup',
  created_at               TIMESTAMPTZ    NOT NULL DEFAULT now(),
  updated_at               TIMESTAMPTZ    NOT NULL DEFAULT now(),

  CONSTRAINT chk_projects_end_after_start
    CHECK (expected_end_date IS NULL OR start_date IS NULL
           OR expected_end_date >= start_date),
  CONSTRAINT chk_projects_current_gte_original
    CHECK (current_total >= 0)
);

CREATE INDEX idx_projects_homeowner_id   ON projects (homeowner_id);
CREATE INDEX idx_projects_contractor_id  ON projects (contractor_id)
  WHERE contractor_id IS NOT NULL;
CREATE INDEX idx_projects_status        ON projects (status);
CREATE INDEX idx_projects_state_status  ON projects (state, status);

CREATE TRIGGER trg_projects_updated_at
  BEFORE UPDATE ON projects
  FOR EACH ROW EXECUTE FUNCTION set_updated_at();

004_milestones.sql SQL

CREATE TABLE milestones (
  id                      UUID             PRIMARY KEY DEFAULT gen_random_uuid(),
  project_id              UUID             NOT NULL
                                           REFERENCES projects(id)
                                           ON DELETE CASCADE,
  name                    TEXT             NOT NULL,
  description             TEXT,
  sequence_order          SMALLINT         NOT NULL,
  planned_start           DATE,
  planned_end             DATE,
  actual_start            DATE,
  actual_end              DATE,
  status                  milestone_status NOT NULL DEFAULT 'upcoming',
  depends_on_milestone_id UUID             REFERENCES milestones(id)
                                           ON DELETE SET NULL,
  created_at              TIMESTAMPTZ      NOT NULL DEFAULT now(),
  updated_at              TIMESTAMPTZ      NOT NULL DEFAULT now(),

  CONSTRAINT chk_milestones_no_self_dep
    CHECK (depends_on_milestone_id <> id),
  CONSTRAINT uq_milestones_project_order
    UNIQUE (project_id, sequence_order)
);

CREATE INDEX idx_milestones_project_id       ON milestones (project_id);
CREATE INDEX idx_milestones_project_status   ON milestones (project_id, status);
CREATE INDEX idx_milestones_depends_on       ON milestones (depends_on_milestone_id)
  WHERE depends_on_milestone_id IS NOT NULL;

CREATE TRIGGER trg_milestones_updated_at
  BEFORE UPDATE ON milestones
  FOR EACH ROW EXECUTE FUNCTION set_updated_at();

005_change_orders.sql SQL

CREATE TABLE change_orders (
  id                   UUID                PRIMARY KEY DEFAULT gen_random_uuid(),
  project_id           UUID                NOT NULL
                                           REFERENCES projects(id)
                                           ON DELETE CASCADE,
  submitted_by         UUID                NOT NULL
                                           REFERENCES users(id)
                                           ON DELETE RESTRICT,
  amount               NUMERIC(12,2)       NOT NULL,
  markup_percentage    NUMERIC(5,2)        NOT NULL DEFAULT 0
                                           CHECK (markup_percentage >= 0),
  description          TEXT                NOT NULL,
  type                 change_order_type   NOT NULL,
  status               change_order_status NOT NULL DEFAULT 'pending',
  groundwork_analysis  JSONB,
  signed_at            TIMESTAMPTZ,
  created_at           TIMESTAMPTZ         NOT NULL DEFAULT now(),
  updated_at           TIMESTAMPTZ         NOT NULL DEFAULT now(),

  CONSTRAINT chk_co_signed_only_when_approved
    CHECK (signed_at IS NULL OR status = 'approved')
);

CREATE INDEX idx_change_orders_project_id     ON change_orders (project_id);
CREATE INDEX idx_change_orders_project_status ON change_orders (project_id, status);
CREATE INDEX idx_change_orders_submitted_by   ON change_orders (submitted_by);

-- GIN index for JSONB analysis querying
CREATE INDEX idx_change_orders_analysis_gin
  ON change_orders USING GIN (groundwork_analysis)
  WHERE groundwork_analysis IS NOT NULL;

CREATE TRIGGER trg_change_orders_updated_at
  BEFORE UPDATE ON change_orders
  FOR EACH ROW EXECUTE FUNCTION set_updated_at();

006_payments.sql SQL

CREATE TABLE payments (
  id           UUID           PRIMARY KEY DEFAULT gen_random_uuid(),
  project_id   UUID           NOT NULL
                               REFERENCES projects(id)
                               ON DELETE CASCADE,
  milestone_id UUID           REFERENCES milestones(id)
                               ON DELETE SET NULL,
  amount       NUMERIC(12,2)  NOT NULL CHECK (amount > 0),
  status       payment_status NOT NULL DEFAULT 'scheduled',
  due_date     DATE           NOT NULL,
  paid_date    DATE,
  created_at   TIMESTAMPTZ    NOT NULL DEFAULT now(),

  CONSTRAINT chk_payments_paid_date_requires_paid_status
    CHECK (paid_date IS NULL OR status = 'paid')
);

CREATE INDEX idx_payments_project_id        ON payments (project_id);
CREATE INDEX idx_payments_milestone_id      ON payments (milestone_id)
  WHERE milestone_id IS NOT NULL;
CREATE INDEX idx_payments_project_status    ON payments (project_id, status);
-- Partial index: only overdue/due payments — most common dashboard query
CREATE INDEX idx_payments_due_overdue       ON payments (due_date, project_id)
  WHERE status IN ('due', 'overdue');

007_documents.sql SQL

CREATE TABLE documents (
  id              UUID              PRIMARY KEY DEFAULT gen_random_uuid(),
  project_id      UUID              NOT NULL
                                    REFERENCES projects(id)
                                    ON DELETE CASCADE,
  uploaded_by     UUID              NOT NULL
                                    REFERENCES users(id)
                                    ON DELETE RESTRICT,
  name            TEXT              NOT NULL,
  file_key        TEXT              NOT NULL UNIQUE,  -- R2 object key
  file_type       TEXT              NOT NULL,
  file_size_bytes BIGINT            NOT NULL CHECK (file_size_bytes > 0),
  category        document_category NOT NULL,
  created_at      TIMESTAMPTZ       NOT NULL DEFAULT now()
);

CREATE INDEX idx_documents_project_id          ON documents (project_id);
CREATE INDEX idx_documents_project_category    ON documents (project_id, category);
CREATE INDEX idx_documents_uploaded_by         ON documents (uploaded_by);

008_contract_health_scores.sql SQL

CREATE TABLE contract_health_scores (
  id                        UUID        PRIMARY KEY DEFAULT gen_random_uuid(),
  project_id                UUID        NOT NULL
                                        REFERENCES projects(id)
                                        ON DELETE CASCADE,
  overall_score             SMALLINT    NOT NULL CHECK (overall_score BETWEEN 0 AND 100),
  payment_schedule_score    SMALLINT    NOT NULL CHECK (payment_schedule_score BETWEEN 0 AND 100),
  timeline_specificity_score SMALLINT   NOT NULL CHECK (timeline_specificity_score BETWEEN 0 AND 100),
  change_order_clause_score SMALLINT    NOT NULL CHECK (change_order_clause_score BETWEEN 0 AND 100),
  dispute_resolution_score  SMALLINT    NOT NULL CHECK (dispute_resolution_score BETWEEN 0 AND 100),
  completion_definition_score SMALLINT  NOT NULL CHECK (completion_definition_score BETWEEN 0 AND 100),
  warranty_score            SMALLINT    NOT NULL CHECK (warranty_score BETWEEN 0 AND 100),
  insurance_licensing_score SMALLINT    NOT NULL CHECK (insurance_licensing_score BETWEEN 0 AND 100),
  lien_waiver_score         SMALLINT    NOT NULL CHECK (lien_waiver_score BETWEEN 0 AND 100),
  cancellation_terms_score  SMALLINT    NOT NULL CHECK (cancellation_terms_score BETWEEN 0 AND 100),
  analysis_json             JSONB       NOT NULL,
  created_at                TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE INDEX idx_chs_project_id          ON contract_health_scores (project_id);
CREATE INDEX idx_chs_project_created     ON contract_health_scores (project_id, created_at DESC);
CREATE INDEX idx_chs_overall_score       ON contract_health_scores (overall_score);
CREATE INDEX idx_chs_analysis_gin        ON contract_health_scores USING GIN (analysis_json);

009_activity_feed.sql SQL

CREATE TABLE activity_feed (
  id          UUID                 PRIMARY KEY DEFAULT gen_random_uuid(),
  project_id  UUID                 NOT NULL
                                   REFERENCES projects(id)
                                   ON DELETE CASCADE,
  actor_id    UUID                 REFERENCES users(id)
                                   ON DELETE SET NULL,
  source      activity_source      NOT NULL,
  event_type  activity_event_type  NOT NULL,
  description TEXT                 NOT NULL,
  metadata    JSONB,
  created_at  TIMESTAMPTZ          NOT NULL DEFAULT now()
);

-- Primary access pattern: project feed ordered by time (most recent first)
CREATE INDEX idx_activity_project_created
  ON activity_feed (project_id, created_at DESC);
CREATE INDEX idx_activity_project_event_type
  ON activity_feed (project_id, event_type);
-- GIN for metadata key searches (e.g. find events referencing a specific change_order_id)
CREATE INDEX idx_activity_metadata_gin
  ON activity_feed USING GIN (metadata)
  WHERE metadata IS NOT NULL;

010_audit_log.sql SQL

CREATE TABLE audit_log (
  id          UUID        PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id     UUID        REFERENCES users(id)
                          ON DELETE SET NULL,
  action      TEXT        NOT NULL CHECK (action IN ('INSERT', 'UPDATE', 'DELETE')),
  entity_type TEXT        NOT NULL,
  entity_id   UUID        NOT NULL,
  old_value   JSONB,
  new_value   JSONB,
  ip_address  INET,
  created_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- Lookup by user for compliance review
CREATE INDEX idx_audit_user_id         ON audit_log (user_id, created_at DESC);
-- Lookup all changes to a specific record
CREATE INDEX idx_audit_entity          ON audit_log (entity_type, entity_id, created_at DESC);
CREATE INDEX idx_audit_created_at      ON audit_log (created_at);  -- for time-range compliance exports

-- audit_log rows must never be deleted or updated — enforce via RLS
ALTER TABLE audit_log ENABLE ROW LEVEL SECURITY;

Index Recommendations

All indexes listed with their type, selectivity rationale, and maintenance cost. Prefer partial indexes for status-filtered queries — they are smaller, faster to scan, and cheaper to maintain.

Index Name	Table	Columns	Type	Rationale
`idx_users_email_lower`	users	`lower(email)`	Unique B-tree	Login lookup. Expression index avoids duplicate accounts differing only by case.
`idx_contractor_profiles_user_id`	contractor_profiles	`user_id`	B-tree	Join from users to their profile on every contractor dashboard load.
`idx_contractor_profiles_verified`	contractor_profiles	`license_verified, insurance_verified`	Partial B-tree	Homeowner directory search for verified contractors only. Partial keeps index small.
`idx_projects_homeowner_id`	projects	`homeowner_id`	B-tree	Homeowner dashboard: "show all my projects." High-frequency read.
`idx_projects_contractor_id`	projects	`contractor_id`	Partial B-tree	Contractor job list. Partial (WHERE NOT NULL) excludes unassigned projects.
`idx_projects_state_status`	projects	`state, status`	B-tree	Admin analytics by region and status. Composite covers both filter + covering scan.
`idx_milestones_project_status`	milestones	`project_id, status`	B-tree	Frequent: "show in_progress milestones for project." Composite avoids table heap fetch.
`idx_milestones_depends_on`	milestones	`depends_on_milestone_id`	Partial B-tree	Dependency graph traversal. Partial (WHERE NOT NULL) skips root milestones.
`idx_change_orders_project_status`	change_orders	`project_id, status`	B-tree	Dashboard: pending COs per project. Very frequent homeowner and contractor read.
`idx_change_orders_analysis_gin`	change_orders	`groundwork_analysis`	GIN	Enables `@>` containment queries on AI analysis JSON (e.g., find all COs with reasonableness_score < 50).
`idx_payments_due_overdue`	payments	`due_date, project_id`	Partial B-tree	Overdue payment alert cron. Partial on status = due/overdue keeps index tiny — most payments are paid.
`idx_chs_project_created`	contract_health_scores	`project_id, created_at DESC`	B-tree	Fetch latest score per project using LIMIT 1 with index-only scan.
`idx_chs_analysis_gin`	contract_health_scores	`analysis_json`	GIN	Aggregate analytics over recommendation flags across all contracts.
`idx_activity_project_created`	activity_feed	`project_id, created_at DESC`	B-tree	The single most critical index: project timeline view, paginated newest-first.
`idx_activity_metadata_gin`	activity_feed	`metadata`	GIN	Find all feed events referencing a specific entity ID stored in metadata JSON.
`idx_audit_entity`	audit_log	`entity_type, entity_id, created_at DESC`	B-tree	Compliance: full change history for a specific row. Composite covers the typical query.

GIN vs B-tree for JSONB: Use GIN (jsonb_ops) when you query with @>, ?, or ?| operators across the full document. Use a B-tree expression index like ((col->>'key')::int) when filtering on a single known key with range comparisons. The groundwork_analysis and analysis_json columns warrant GIN because the query shape is dynamic.

Data Store Decision Matrix

Three storage tiers — PostgreSQL (Neon or self-hosted), Cloudflare R2, and an in-memory cache (Redis or Cloudflare Workers KV). Each piece of data belongs in exactly one primary store.

Data	Primary Store	Rationale	Cache?	TTL
All relational data (users, projects, milestones, etc.)	PostgreSQL	ACID guarantees, relational integrity, RLS enforcement	Selective	—
File uploads (contracts, permits, photos, inspection PDFs)	Cloudflare R2	Binary objects, no egress fees, S3-compatible API. Only the `file_key` reference lives in Postgres.	No	—
Presigned R2 upload/download URLs	Cache (KV)	Generated on demand, valid for 15 minutes. Never persist to Postgres.	Yes	15 min
Active project dashboard (milestones + payments + latest CO)	PostgreSQL	Source of truth. Cached for UI performance.	Yes	60 sec
Latest contract_health_score per project	PostgreSQL	Queried on every dashboard load. Warm cache avoids repeated DESC + LIMIT queries.	Yes	5 min
Contractor profile (public-facing reputation data)	PostgreSQL	Written infrequently (nightly batch). Cache aggressively.	Yes	10 min
Activity feed (project timeline events)	PostgreSQL	Append-only. Cache the first page only; invalidate on new event.	Yes	30 sec
AI analysis output (groundwork_analysis, analysis_json)	PostgreSQL (JSONB)	Append-only once written. JSONB allows future structured querying without schema migration.	No	—
Audit log	PostgreSQL	Never cache. Always query source for compliance queries.	No	—
Session tokens / JWTs	Cache (KV)	Short-lived, high-read, never needs relational joins. Revocation list also in KV.	Yes	Session lifetime
R2 object metadata (size, content-type)	PostgreSQL	Stored in `documents` table at upload time to avoid S3 HEAD requests.	No	—

Row Count Projections

Estimates based on: Year 1 — 500 active projects; Year 2 — 3,000 active projects; Year 3 — 12,000 active projects. Assumptions: avg 8 milestones/project, avg 4 change orders/project, avg 6 payments/project, avg 8 documents/project, avg 50 activity events/project/month, 3 active months average project duration.

Table	Year 1	Year 2	Year 3	Growth Driver	Notes
`users`	~2,500	~14,000	~55,000	Project participants + orphan signups	Smallest table. No partitioning needed.
`contractor_profiles`	~400	~2,200	~8,500	1:1 with contractor users	Trivial size.
`projects`	~600	~3,500	~14,000	Core growth metric	Includes historical completed projects.
`milestones`	~5,000	~28,000	~112,000	8 avg per project	Manageable. No partitioning through Y3.
`change_orders`	~2,400	~14,000	~56,000	4 avg per project	Manageable. High-value table for analytics.
`payments`	~3,600	~21,000	~84,000	6 avg per project	Manageable. Low churn after paid status.
`documents`	~4,800	~28,000	~112,000	8 avg per project	Metadata only. Heavy growth = monitor R2 spend.
`contract_health_scores`	~700	~4,200	~16,800	1-3 scores per project lifecycle	Append-only. Very small table.
`activity_feed`	~75,000	~450,000	~5,400,000	High event volume, 50 events/project/month x 3 months avg	First table to partition. Partition by created_at at ~2M rows.
`audit_log`	~120,000	~720,000	~8,640,000	Every write to every table creates a row	Largest table. Partition by month from day one. Archive to cold storage after 2 years.

Watch these two: audit_log and activity_feed grow at a rate 10-100x other tables. Both should be range-partitioned on created_at before launch (especially audit_log). This decision is much cheaper to make at schema creation than to retrofit at 1M rows.

Scaling Strategy

Read Replicas

Add a read replica when: p95 read latency on the primary exceeds 50ms, or read connections routinely exceed 70% of max_connections. Route all SELECTs for dashboard views, activity feed, and contractor profiles to the replica. Writes always go to the primary.

Single primary, no replica

At <15,000 rows on the hot tables, a properly indexed single Postgres instance handles the load trivially. Use PgBouncer in transaction mode for connection pooling from the first day.

Primary + 1 read replica

Route analytics queries, contractor profile lookups, and activity feed reads to the replica. Primary handles all writes and transactional reads. Monitor replication lag — target <100ms.

Consider horizontal read scaling or CQRS

If read replicas saturate, introduce a separate read model for analytics (e.g., materialized views refreshed on a schedule, or an OLAP sidecar). Do not prematurely shard — PostgreSQL handles millions of rows in a single instance well.

Table Partitioning

audit_log — range partition by month (implement at launch) SQL

-- Convert audit_log to a partitioned table.
-- Do this at schema creation — retrofitting requires pg_partman or table swap.

CREATE TABLE audit_log (
  ...  -- same columns as above
  created_at TIMESTAMPTZ NOT NULL DEFAULT now()
) PARTITION BY RANGE (created_at);

-- Create monthly partitions (use pg_partman to automate)
CREATE TABLE audit_log_2026_04
  PARTITION OF audit_log
  FOR VALUES FROM ('2026-04-01') TO ('2026-05-01');

-- Old partitions can be detached + archived to cold storage (e.g. S3/R2 pg_dump)
-- without affecting the live table
ALTER TABLE audit_log DETACH PARTITION audit_log_2024_01;

Connection Pooling

PgBouncer Configuration

Mode: transaction pooling — required for Postgres functions and short API requests
max_client_conn: 1000 (API instances)
default_pool_size: 25 per database
server_lifetime: 3600 seconds
server_idle_timeout: 60 seconds
Avoid session pooling — incompatible with RLS using SET LOCAL

RLS + Connection Pooling

Row Level Security relies on current_setting('app.current_user_id'). With transaction-mode pooling, set this at the start of every transaction:

BEGIN;
SET LOCAL app.current_user_id = '<uuid>';
SET LOCAL app.current_role   = 'homeowner';
-- queries here
COMMIT;

PII Inventory

Personal Identifiable Information inventory per CCPA/GDPR requirements. All PII columns are flagged in the entity reference above with a yellow background.

Table	Column	PII Category	Encryption Approach	Retention
`users`	`email`	Contact identifier	At-rest encryption via database-level encryption (TDE). Application layer: no additional encryption — must be queryable for auth.	Until account deletion
`users`	`name`	Personal name	TDE only. Queried frequently for display.	Until account deletion
`users`	`password_hash`	Credential (hashed)	Argon2id with per-user salt. Never logged, never exposed in API responses or audit_log new_value.	Until account deletion
`contractor_profiles`	`license_number`	Government-issued identifier	Application-layer encryption with `pgcrypto.pgp_sym_encrypt()`. Decrypted only for verification workflows.	Until contractor account deletion
`projects`	`address`	Physical address of property	TDE. Accessible to project participants (homeowner + contractor) via RLS. Not exposed to admin without audit trail.	7 years (construction record requirement)
`audit_log`	`ip_address`	Network identifier (PII in some jurisdictions)	TDE. Truncate to /24 (IPv4) or /48 (IPv6) after 90 days for analytics. Original preserved for fraud investigation.	2 years, then anonymize
`audit_log`	`old_value / new_value`	May contain any PII from any table	Exclude `password_hash` from audit_log via trigger logic. JSONB redaction applied before export.	2 years per compliance

Audit log JSONB redaction: The audit trigger must explicitly exclude password_hash from the new_value JSONB snapshot. Use a trigger-level function that strips known sensitive keys before writing: new_value = new_value - 'password_hash'. Never log raw credential fields.

Row Level Security Policies

RLS is enabled on all tables. The application sets app.current_user_id and app.current_role at the start of every transaction. A service_role database role bypasses RLS for internal batch jobs and migrations.

rls_setup.sql — enable and configure SQL

-- Enable RLS on all application tables
ALTER TABLE users                  ENABLE ROW LEVEL SECURITY;
ALTER TABLE contractor_profiles    ENABLE ROW LEVEL SECURITY;
ALTER TABLE projects              ENABLE ROW LEVEL SECURITY;
ALTER TABLE milestones            ENABLE ROW LEVEL SECURITY;
ALTER TABLE change_orders        ENABLE ROW LEVEL SECURITY;
ALTER TABLE payments             ENABLE ROW LEVEL SECURITY;
ALTER TABLE documents            ENABLE ROW LEVEL SECURITY;
ALTER TABLE contract_health_scores ENABLE ROW LEVEL SECURITY;
ALTER TABLE activity_feed        ENABLE ROW LEVEL SECURITY;

-- Helper functions used across policies
CREATE OR REPLACE FUNCTION current_user_id() RETURNS UUID LANGUAGE sql STABLE AS $$
  SELECT nullif(current_setting('app.current_user_id', TRUE), '')::UUID
$$;

CREATE OR REPLACE FUNCTION current_user_role() RETURNS TEXT LANGUAGE sql STABLE AS $$
  SELECT current_setting('app.current_role', TRUE)
$$;

CREATE OR REPLACE FUNCTION is_project_participant(p_id UUID)
RETURNS BOOLEAN LANGUAGE sql STABLE SECURITY DEFINER AS $$
  SELECT EXISTS (
    SELECT 1 FROM projects
    WHERE id = p_id
      AND (homeowner_id = current_user_id()
           OR contractor_id = current_user_id())
  )
$$;

projects policies

pol_projects_homeowner_select

CREATE POLICY pol_projects_homeowner_select ON projects
  FOR SELECT USING (
    homeowner_id = current_user_id()
    OR contractor_id = current_user_id()
    OR current_user_role() = 'admin'
  );

pol_projects_homeowner_insert

CREATE POLICY pol_projects_homeowner_insert ON projects
  FOR INSERT WITH CHECK (
    homeowner_id = current_user_id()   -- only create projects for yourself
  );

pol_projects_update

CREATE POLICY pol_projects_update ON projects
  FOR UPDATE USING (
    homeowner_id = current_user_id()    -- homeowner owns status transitions
    OR contractor_id = current_user_id() -- contractor updates progress fields
    OR current_user_role() = 'admin'
  );

milestones, change_orders, payments, documents, activity_feed

Pattern: participant-scoped via project FK

-- All child tables share this pattern. Replace {table} with the table name.

CREATE POLICY pol_{table}_select ON {table}
  FOR SELECT USING (
    is_project_participant(project_id)
    OR current_user_role() = 'admin'
  );

CREATE POLICY pol_{table}_insert ON {table}
  FOR INSERT WITH CHECK (
    is_project_participant(project_id)
  );

CREATE POLICY pol_{table}_update ON {table}
  FOR UPDATE USING (
    is_project_participant(project_id)
    OR current_user_role() = 'admin'
  );

-- activity_feed and audit_log: no DELETE policy (append-only)

audit_log policy

pol_audit_log_admin_select — audit log is admin-only

-- Regular users never read or write audit_log directly.
-- Writes go through a SECURITY DEFINER trigger function.
-- Reads are restricted to admin role and compliance exports.

CREATE POLICY pol_audit_log_admin_select ON audit_log
  FOR SELECT USING (current_user_role() = 'admin');

CREATE POLICY pol_audit_log_system_insert ON audit_log
  FOR INSERT WITH CHECK (TRUE);  -- only reachable via SECURITY DEFINER trigger

-- No UPDATE or DELETE policies — audit rows are immutable.

Performance note: is_project_participant() is a SECURITY DEFINER function that queries the projects table. It will use idx_projects_homeowner_id and idx_projects_contractor_id. Because it is marked STABLE, PostgreSQL can cache the result within a single query. For bulk queries across multiple projects, consider a lateral join instead of per-row function invocations.

Migration Strategy

Schema changes are deployed via versioned SQL migration files managed by a migration runner (Flyway or the node-postgres ecosystem's postgres-migrations). Every change goes through the same pipeline: local dev → staging → production.

File Naming Convention

migration file naming convention

V{version}__{description}.sql

Examples:
V001__create_enum_types.sql
V002__create_users.sql
V010__create_audit_log.sql
V011__add_contractor_profiles_insurance_expiry.sql
V012__add_index_projects_state_status.sql

-- Repeatable migrations (views, functions, triggers)
-- Re-run whenever the file changes:
R__set_updated_at_function.sql
R__rls_policies_projects.sql

Safe Migration Checklist

ADD COLUMN

Always ADD COLUMN ... DEFAULT ... NULL first, then backfill, then add NOT NULL constraint in a separate migration. Never add NOT NULL without a default in a single step — it acquires an ACCESS EXCLUSIVE lock that blocks all reads on large tables.

ADD INDEX

Use CREATE INDEX CONCURRENTLY on production. Does not block reads or writes. Cannot run inside a transaction — execute as a standalone migration step outside BEGIN/COMMIT.

RENAME COLUMN

Never rename directly in production. The safe path: (1) add new column, (2) dual-write old + new at the application layer, (3) backfill new from old, (4) cut over reads to new column, (5) drop old column after one release cycle.

ALTER TYPE

Adding values to an existing ENUM with ALTER TYPE ... ADD VALUE is safe and does not lock the table in PostgreSQL 12+. Removing or reordering enum values requires creating a new type and migrating — schedule during a maintenance window.

DROP TABLE

Never drop in the same release that removes application code references. Sequence: (1) deploy code that no longer writes to the table, (2) wait one release cycle, (3) deploy the DROP in a separate migration. Use DROP TABLE IF EXISTS for idempotency.

PARTITIONING

Converting an existing table to partitioned (e.g., audit_log) cannot be done in-place. Use the pg_partman approach: create a new partitioned parent table, swap via table rename during a brief maintenance window, backfill with INSERT INTO ... SELECT using batched chunks.

Migration Runner Integration

deploy pipeline shell

# Run in CI/CD before deploying application code
# Migrations must be backward-compatible with the previous app version

# Flyway example
flyway -url="jdbc:postgresql://host/groundwork" \
       -user="$DB_USER" \
       -password="$DB_PASSWORD" \
       -locations="filesystem:./migrations" \
       migrate

# Verify: check applied migrations match expected count
flyway info | grep "Pending" | wc -l  # should be 0 after successful migrate

Seed Data

Data that ships with the database in every environment (dev, staging, production). Seed data is idempotent — running it twice produces the same state.

Enum values are defined in the DDL (not as lookup tables) so they do not require seed rows. The seed covers: a default admin user for local development, and any reference data used by the application layer.

seed_dev.sql — development environment only SQL

-- Development seed: creates a known admin user.
-- Password: 'devpassword123' (Argon2id hash shown below is illustrative)
-- DO NOT run this seed in production.

INSERT INTO users (id, email, name, role, password_hash)
VALUES (
  '00000000-0000-0000-0000-000000000001',
  'admin@groundwork.dev',
  'Groundwork Admin',
  'admin',
  '$argon2id$v=19$m=65536,t=3,p=4$...'
)
ON CONFLICT (id) DO NOTHING;

-- Sample homeowner
INSERT INTO users (id, email, name, role, password_hash)
VALUES (
  '00000000-0000-0000-0000-000000000002',
  'homeowner@groundwork.dev',
  'Jane Homeowner',
  'homeowner',
  '$argon2id$v=19$m=65536,t=3,p=4$...'
)
ON CONFLICT (id) DO NOTHING;

-- Sample contractor + profile
INSERT INTO users (id, email, name, role, password_hash)
VALUES (
  '00000000-0000-0000-0000-000000000003',
  'contractor@groundwork.dev',
  'Bob Builder',
  'contractor',
  '$argon2id$v=19$m=65536,t=3,p=4$...'
)
ON CONFLICT (id) DO NOTHING;

INSERT INTO contractor_profiles
  (user_id, company_name, license_state,
   license_verified, insurance_verified, years_in_business)
VALUES (
  '00000000-0000-0000-0000-000000000003',
  'Bob Builder LLC',
  'CA',
  TRUE, TRUE, 12
)
ON CONFLICT (user_id) DO NOTHING;

seed_production.sql — safe for all environments SQL

-- No user records in production seed.
-- Enums are in DDL. No lookup tables required.
--
-- What does ship to production:
--   1. Database configuration settings
--   2. pg_partman setup for audit_log partitioning
--   3. Nightly maintenance job configuration

-- Ensure pg_partman extension is available
CREATE EXTENSION IF NOT EXISTS pg_partman SCHEMA partman;
CREATE EXTENSION IF NOT EXISTS pgcrypto;
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Register audit_log with pg_partman for automatic monthly partition creation
SELECT partman.create_parent(
  p_parent_table => 'public.audit_log',
  p_control      => 'created_at',
  p_interval     => 'monthly',
  p_premake      => 3   -- pre-create 3 months ahead
);

-- Activity feed: partition when it approaches 500K rows (Year 2+)
-- Register same way when partitioning is added via migration.

Top 10 Query Patterns

The highest-frequency, highest-importance queries. Each includes expected result cardinality, index coverage, and notes on optimization.

Q1 — Homeowner dashboard: all my active projects

~5 rows idx_projects_homeowner_id

SELECT
  p.id, p.name, p.status, p.current_total,
  p.expected_end_date,
  (SELECT COUNT(*)::INT FROM change_orders co
   WHERE co.project_id = p.id AND co.status = 'pending')   AS pending_co_count,
  (SELECT COUNT(*)::INT FROM payments pm
   WHERE pm.project_id = p.id AND pm.status = 'overdue')  AS overdue_payments,
  chs.overall_score
FROM projects p
LEFT JOIN LATERAL (
  SELECT overall_score
  FROM contract_health_scores
  WHERE project_id = p.id
  ORDER BY created_at DESC LIMIT 1
) chs ON TRUE
WHERE p.homeowner_id = $1
  AND p.status IN ('active', 'paused', 'disputed')
ORDER BY p.updated_at DESC;

-- Index coverage: idx_projects_homeowner_id + idx_chs_project_created
-- LATERAL avoids N+1 for latest score. Subqueries use partial indexes on status.

Q2 — Project activity feed (paginated, newest first)

~20 rows idx_activity_project_created

SELECT
  af.id, af.event_type, af.source,
  af.description, af.metadata, af.created_at,
  u.name AS actor_name
FROM activity_feed af
LEFT JOIN users u ON u.id = af.actor_id
WHERE af.project_id = $1
  AND af.created_at < $2   -- cursor for keyset pagination
ORDER BY af.created_at DESC
LIMIT 20;

-- Keyset pagination on created_at is O(log N) vs OFFSET which is O(N).
-- Index: idx_activity_project_created covers (project_id, created_at DESC).

Q3 — Project milestone timeline with payment linkage

~8 rows idx_milestones_project_id

SELECT
  m.id, m.name, m.sequence_order,
  m.status, m.planned_start, m.planned_end,
  m.actual_start, m.actual_end,
  m.depends_on_milestone_id,
  p.id AS payment_id,
  p.amount AS payment_amount,
  p.status AS payment_status
FROM milestones m
LEFT JOIN payments p ON p.milestone_id = m.id
WHERE m.project_id = $1
ORDER BY m.sequence_order;

Q4 — Pending change orders for homeowner review

~3 rows idx_change_orders_project_status

SELECT
  co.id, co.description, co.amount,
  co.markup_percentage, co.type,
  co.groundwork_analysis->>'reasonableness_score' AS ai_score,
  co.groundwork_analysis->>'type_classification'   AS ai_classification,
  co.created_at,
  u.name AS submitted_by_name
FROM change_orders co
JOIN  users u ON u.id = co.submitted_by
WHERE co.project_id = $1
  AND co.status = 'pending'
ORDER BY co.created_at ASC;

-- ->> operator extracts text from JSONB without GIN — fine for display.
-- GIN index is used only for @> containment queries across many rows.

Q5 — Contractor's active job list

~15 rows idx_projects_contractor_id

SELECT
  p.id, p.name, p.city, p.state,
  p.status, p.current_total,
  p.start_date, p.expected_end_date,
  (SELECT status FROM milestones
   WHERE project_id = p.id AND status = 'in_progress'
   LIMIT 1)                      AS active_milestone_status,
  u.name                          AS homeowner_name
FROM projects p
JOIN users u ON u.id = p.homeowner_id
WHERE p.contractor_id = $1
  AND p.status IN ('active', 'paused')
ORDER BY p.expected_end_date ASC;

Q6 — Overdue payments across all projects (system cron job)

~50 rows idx_payments_due_overdue

-- Runs daily to identify and transition scheduled payments to overdue
UPDATE payments
SET    status = 'overdue'
WHERE  status = 'due'
  AND  due_date < CURRENT_DATE
RETURNING id, project_id, amount, due_date;

-- Partial index idx_payments_due_overdue covers status IN ('due','overdue')
-- so this update touches only the relevant fraction of the payments table.

Q7 — Latest contract health score per project

1 row idx_chs_project_created

SELECT
  overall_score,
  payment_schedule_score, timeline_specificity_score,
  change_order_clause_score, dispute_resolution_score,
  completion_definition_score, warranty_score,
  insurance_licensing_score, lien_waiver_score,
  cancellation_terms_score,
  analysis_json->>'summary' AS summary,
  created_at
FROM contract_health_scores
WHERE project_id = $1
ORDER BY created_at DESC
LIMIT 1;

-- idx_chs_project_created (project_id, created_at DESC) makes this
-- an index-only scan followed by a single heap fetch. Extremely fast.

Q8 — Admin: change orders with low AI reasonableness score

~20 rows idx_change_orders_analysis_gin

-- Find COs where AI flagged potential overcharging (score < 40)
SELECT
  co.id, co.project_id, co.amount, co.description,
  (co.groundwork_analysis->>'reasonableness_score')::INT AS score,
  co.groundwork_analysis->'flags' AS flags,
  co.created_at
FROM change_orders co
WHERE
  co.groundwork_analysis IS NOT NULL
  AND (co.groundwork_analysis->>'reasonableness_score')::INT < 40
  AND co.status = 'pending'
ORDER BY score ASC
LIMIT 50;

-- For this single-key range query, a B-tree expression index is more
-- efficient than GIN. Add in a future migration once the query pattern
-- is confirmed: CREATE INDEX ON change_orders
-- (((groundwork_analysis->>'reasonableness_score')::int));
-- The GIN index supports ad-hoc JSONB exploration in the meantime.

Q9 — Contractor reputation: historical timeline variance

~30 rows idx_projects_contractor_id

-- Nightly batch: recompute avg_timeline_variance_pct for all contractors
WITH project_variances AS (
  SELECT
    p.contractor_id,
    AVG(
      CASE WHEN p.actual_end_date IS NOT NULL
           THEN (p.actual_end_date - p.expected_end_date)
                / NULLIF((p.expected_end_date - p.start_date), 0) * 100.0
      END
    )::NUMERIC(6,2)  AS avg_variance,
    COUNT(*)          AS project_count
  FROM projects p
  WHERE p.contractor_id IS NOT NULL
    AND p.status = 'completed'
    AND p.start_date IS NOT NULL
  GROUP BY p.contractor_id
)
UPDATE contractor_profiles cp
SET
  avg_timeline_variance_pct = pv.avg_variance,
  total_projects_on_platform = pv.project_count,
  updated_at = now()
FROM project_variances pv
WHERE cp.user_id = pv.contractor_id;

Q10 — Audit trail for a specific project (compliance export)

~200 rows idx_audit_entity

-- Fetch all audit events touching a project and its child entities.
-- Used for dispute resolution and legal discovery.
SELECT
  al.created_at,
  al.action,
  al.entity_type,
  al.entity_id,
  u.email  AS actor_email,
  al.ip_address,
  al.old_value,
  al.new_value
FROM audit_log al
LEFT JOIN users u ON u.id = al.user_id
WHERE
  (al.entity_type = 'projects'     AND al.entity_id = $1)
  OR (al.entity_type = 'milestones'   AND al.entity_id IN
      (SELECT id FROM milestones WHERE project_id = $1))
  OR (al.entity_type = 'change_orders' AND al.entity_id IN
      (SELECT id FROM change_orders WHERE project_id = $1))
  OR (al.entity_type = 'payments'    AND al.entity_id IN
      (SELECT id FROM payments      WHERE project_id = $1))
ORDER BY al.created_at ASC;

-- idx_audit_entity (entity_type, entity_id, created_at DESC) covers each arm.
-- For a monthly partitioned audit_log, Postgres will prune irrelevant partitions
-- when a date range is added to the WHERE clause:
-- AND al.created_at >= $2 AND al.created_at < $3