PII Inventory

Every column in the BreezyCorp database that holds personally identifiable information, classified per the Singapore PDPA. This is the source of truth for:

  • Pino log redaction config (apps/api/src/app.ts)
  • Sentry beforeSend scrub rules (Phase 3G, post-MVP)
  • Access audit queries during a tenant-leak investigation
  • DPIA / data subject access requests

Classification levels (in order of increasing sensitivity):

  • PUBLIC — no access control needed
  • INTERNAL — Spade employees only
  • CONFIDENTIAL — client + Spade ops only; logged reads
  • RESTRICTED — encrypted at rest; logged reads + writes
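
As an illustration only, the four levels and the handling rules stated above could be encoded as an ordered type so that code can compare sensitivity. The names `LEVELS`, `rank`, `handlingFor`, and `maxLevel` are hypothetical and do not come from the BreezyCorp codebase; the rules are read directly off the list above.

```typescript
// Classification levels from this inventory, ordered least to most sensitive.
const LEVELS = ["PUBLIC", "INTERNAL", "CONFIDENTIAL", "RESTRICTED"] as const;
type Classification = (typeof LEVELS)[number];

// Handling requirements implied by the level descriptions above.
interface Handling {
  logReads: boolean;
  logWrites: boolean;
  encryptAtRest: boolean;
}

// Numeric rank so that two levels can be compared.
function rank(level: Classification): number {
  return LEVELS.indexOf(level);
}

// CONFIDENTIAL and above: logged reads. RESTRICTED: also logged writes
// and encryption at rest.
function handlingFor(level: Classification): Handling {
  return {
    logReads: rank(level) >= rank("CONFIDENTIAL"),
    logWrites: level === "RESTRICTED",
    encryptAtRest: level === "RESTRICTED",
  };
}

// Most sensitive level in a set, e.g. for a JSON blob that mixes fields.
function maxLevel(levels: Classification[]): Classification {
  return levels.reduce<Classification>(
    (a, b) => (rank(b) > rank(a) ? b : a),
    "PUBLIC",
  );
}
```

A blob such as snapshot_json would take the `maxLevel` of the fields it contains, which is why it is listed as RESTRICTED below.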

Staff users

| Column | Classification | At rest | Redacted in logs | Notes |
|---|---|---|---|---|
| staff_users.email | CONFIDENTIAL | plain | yes (regex in breadcrumbs) | Used for login |
| staff_users.name | CONFIDENTIAL | plain | no | Displayed in audit trail |
| staff_users.password_hash | RESTRICTED | bcrypt (cost 12) | yes (*.password*) | Not reversible |
| staff_users.mfa_secret | RESTRICTED | AES-256-GCM | yes (*.mfaSecret, *.mfa_secret) | Encrypted with MFA_SECRET_KEY |
| staff_sessions.token_hash | RESTRICTED | SHA-256 | yes (*.token*) | Raw token never persisted |
| staff_sessions.csrf_token | CONFIDENTIAL | plain | yes (req.headers["x-csrf-token"]) | Double-submit pattern |

Client contacts (portal users)

| Column | Classification | At rest | Redacted in logs | Notes |
|---|---|---|---|---|
| client_contacts.email | CONFIDENTIAL | plain | yes (regex) | Used for magic-link delivery |
| client_contacts.name | CONFIDENTIAL | plain | no | Displayed in audit |
| cycle_requests.token_hash | RESTRICTED | SHA-256 | yes | Magic-link token hashed |
| cycle_requests.recipient_email | CONFIDENTIAL | plain | yes (regex) | |

Employee data (the high-sensitivity tier)

| Column | Classification | At rest | Redacted in logs | Notes |
|---|---|---|---|---|
| employee_shadow_snapshots.external_employee_id | CONFIDENTIAL | plain | no | Client-assigned ref; not globally unique |
| employee_shadow_snapshots.employee_name | CONFIDENTIAL | plain | no | |
| employee_shadow_snapshots.gross_pay | RESTRICTED | plain (DB-level encryption at rest via provider) | yes (*.grossPay, *.gross_pay) | Salary data |
| employee_shadow_snapshots.net_pay | RESTRICTED | plain | yes (*.netPay, *.net_pay) | |
| employee_shadow_snapshots.cpf_employee | RESTRICTED | plain | yes (*.cpfEmployee, *.cpf_employee) | CPF filing data |
| employee_shadow_snapshots.cpf_employer | RESTRICTED | plain | yes | |
| employee_shadow_snapshots.ytd_ordinary_wage | RESTRICTED | plain | yes | |
| employee_shadow_snapshots.ytd_additional_wage | RESTRICTED | plain | yes | CPF AW ceiling calc |
| employee_shadow_snapshots.is_foreign | CONFIDENTIAL | plain | no | IR21 trigger |
| employee_shadow_snapshots.snapshot_json | RESTRICTED | plain JSONB | yes (whole blob) | Contains all of the above plus provider-specific fields |

Submission + payload data

| Column | Classification | At rest | Redacted in logs | Notes |
|---|---|---|---|---|
| submissions.attestation_text | INTERNAL | plain | no | Free-text client attestation |
| submissions.monthly_declaration | CONFIDENTIAL | plain JSONB | partial (*.declarantName) | Declarant name is PII |
| submission_items.employee_ref | CONFIDENTIAL | plain | no | |
| submission_items.payload_json | RESTRICTED | plain JSONB | yes (whole blob) | Contains salary/NRIC/bank depending on change type |

Payload JSON fields commonly contain:

  • previousSalary, newSalary — RESTRICTED
  • bankAccount, bankCode — RESTRICTED (redacted as *.bank_account, *.bankAccount)
  • nric, fin — RESTRICTED (redacted as *.nric, *.fin)
  • reason, position — CONFIDENTIAL
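
A minimal, dependency-free sketch of scrubbing the RESTRICTED payload keys before a payload is logged or forwarded to an error tracker. This is not the pino or Sentry implementation; the `scrub` function, the `Json` type, and the `[REDACTED]` censor string are assumptions made for illustration. The key set is taken from the list above.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Keys listed as RESTRICTED in the payload field list above.
const RESTRICTED_KEYS = new Set([
  "previousSalary",
  "newSalary",
  "bankAccount",
  "bankCode",
  "nric",
  "fin",
]);

// Replacement value; the actual censor string is an assumption.
const CENSOR = "[REDACTED]";

// Returns a deep copy in which any RESTRICTED key, at any depth,
// has its value replaced. The input is never mutated.
function scrub(value: Json): Json {
  if (Array.isArray(value)) return value.map(scrub);
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const [key, child] of Object.entries(value)) {
      out[key] = RESTRICTED_KEYS.has(key) ? CENSOR : scrub(child);
    }
    return out;
  }
  return value;
}

// Example with placeholder values only (no real identifiers).
const scrubbedExample = scrub({
  reason: "promotion",
  newSalary: 5200,
  employee: { nric: "PLACEHOLDER", position: "Analyst" },
});
console.log(JSON.stringify(scrubbedExample));
```

CONFIDENTIAL keys such as reason and position pass through unchanged here; whether they should also be masked in a given sink is a separate decision.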

Files (uploaded documents)

| Item | Classification | Notes |
|---|---|---|
| File bytes in S3 | RESTRICTED | Contains offer letters, payslips, IR21 forms; highly sensitive. Encrypted at rest via S3 SSE-KMS. |
| files.original_name | CONFIDENTIAL | May leak identity via filename |
| files.sha256 | INTERNAL | Content hash only |
| document_classifications.provider_payload_json | RESTRICTED | OCR output may contain extracted PII |
| extracted_fields.raw_value | RESTRICTED | OCR extractions; redacted via *.rawValue |

Audit events

| Column | Classification | At rest | Redacted in logs | Notes |
|---|---|---|---|---|
| audit_events.event_data_json | Varies | plain | yes (selective: the outer blob is logged, sensitive fields inside are redacted via pino's deep-path config) | Contains context for each audit event |
| audit_events.actor_id | INTERNAL | plain | no | Staff user id or 'portal' |

Audit events are retained 7 years and never purged by the retention job (see docs/retention-policy.md).

Cross-reference: redaction configs

Pino redact paths (in apps/api/src/app.ts)

Every RESTRICTED column above must have a corresponding redact path. Current config covers:

req.headers.authorization
req.headers.cookie
req.headers["x-csrf-token"]
req.headers["x-ocr-signature"]
res.headers["set-cookie"]
*.password            *.passwordHash       *.password_hash
*.oldPassword         *.newPassword        *.totpCode
*.mfaSecret           *.mfa_secret         *.sessionToken     *.session_token
*.token               *.tokenHash          *.token_hash
*.csrfToken           *.csrf_token         *.secret
*.apiKey              *.api_key
*.nric                *.fin                *.bankAccount      *.bank_account

Missing from the current redact config, even though several tables above list these paths as redacted (TODO: wire these before post-MVP Sentry):

  • *.grossPay, *.gross_pay, *.netPay, *.net_pay
  • *.cpfEmployee, *.cpf_employee, *.cpfEmployer, *.cpf_employer
  • *.ytdOrdinaryWage, *.ytd_ordinary_wage, *.ytdAdditionalWage, *.ytd_additional_wage
  • *.previousSalary, *.newSalary, *.salary
  • *.snapshot_json, *.snapshotJson
  • *.payload_json, *.payloadJson
  • *.raw_value, *.rawValue
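
The rule that every RESTRICTED column needs a redact path can be checked mechanically. The sketch below is one way to do that, assuming the inventory is mirrored in code as a map from column to expected log keys; `RESTRICTED_COLUMNS` (abridged to four columns here) and `missingRedactPaths` are hypothetical names, while `CONFIGURED_PATHS` copies the wildcard entries from the current config listed above.

```typescript
// Wildcard redact paths from the current config listed above.
const CONFIGURED_PATHS: readonly string[] = [
  "*.password", "*.passwordHash", "*.password_hash",
  "*.oldPassword", "*.newPassword", "*.totpCode",
  "*.mfaSecret", "*.mfa_secret", "*.sessionToken", "*.session_token",
  "*.token", "*.tokenHash", "*.token_hash",
  "*.csrfToken", "*.csrf_token", "*.secret",
  "*.apiKey", "*.api_key",
  "*.nric", "*.fin", "*.bankAccount", "*.bank_account",
];

// Abridged, hypothetical mirror of the inventory: RESTRICTED column mapped to
// the log keys (camelCase and snake_case) that should be redacted.
const RESTRICTED_COLUMNS: Record<string, string[]> = {
  "staff_users.password_hash": ["passwordHash", "password_hash"],
  "staff_users.mfa_secret": ["mfaSecret", "mfa_secret"],
  "employee_shadow_snapshots.gross_pay": ["grossPay", "gross_pay"],
  "submission_items.payload_json": ["payloadJson", "payload_json"],
};

// Returns every expected "*.key" path that is absent from the config.
function missingRedactPaths(
  columns: Record<string, readonly string[]>,
  configured: readonly string[],
): string[] {
  const have = new Set(configured);
  const gaps: string[] = [];
  for (const keys of Object.values(columns)) {
    for (const key of keys) {
      const path = `*.${key}`;
      if (!have.has(path)) gaps.push(path);
    }
  }
  return gaps;
}

// Reports the grossPay and payloadJson variants, matching the TODO list above.
console.log(missingRedactPaths(RESTRICTED_COLUMNS, CONFIGURED_PATHS));
```

Run in CI, a non-empty result could fail the build, which would turn the TODO list into an enforced check rather than a manual one.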

These should land with Phase 3G (Sentry integration). Until then, dev log output is verbose, but production runs with LOG_LEVEL=info, which keeps request bodies out of the logs at that level.

Data subject access / deletion requests (PDPA)

When a data subject (employee) requests access / deletion:

  1. Locate: query employee_shadow_snapshots, submission_items, extracted_fields for the employee ref
  2. Scope: verify the employee is associated with the requesting client (tenant scope)
  3. Decide: for deletion requests under PDPA, check whether the record is still within the 5-year tax retention window. Tax records cannot be deleted on request; they must age out.
  4. Act: via scripts/gdpr-export.ts / scripts/gdpr-delete.ts (TODO — build when first request lands)
  5. Log: every DSAR action creates an audit_events row with event_type = 'dsar.access' or 'dsar.delete'
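
Since the DSAR scripts are not built yet, the Decide and Log steps can only be sketched. The following assumes the 5-year window is counted from the record's creation date, which this document does not specify, and uses hypothetical names (`retentionEnds`, `decideDeletion`, `dsarAuditEvent`). The audit row fields follow the audit_events columns listed earlier.

```typescript
const TAX_RETENTION_YEARS = 5;

interface DsarAuditEvent {
  event_type: "dsar.access" | "dsar.delete";
  actor_id: string;
  event_data_json: { employeeRef: string; outcome: string };
}

// End of the retention window. ASSUMPTION: the window starts at the
// record's creation date; the real anchor may differ.
function retentionEnds(recordDate: Date): Date {
  const end = new Date(recordDate.getTime());
  end.setUTCFullYear(end.getUTCFullYear() + TAX_RETENTION_YEARS);
  return end;
}

// Step 3: a record inside the window is retained and left to age out.
function decideDeletion(recordDate: Date, now: Date): "delete" | "retain" {
  return now.getTime() >= retentionEnds(recordDate).getTime() ? "delete" : "retain";
}

// Step 5: build the audit row for a DSAR action (persisting it is out of scope).
function dsarAuditEvent(
  kind: "access" | "delete",
  actorId: string,
  employeeRef: string,
  outcome: string,
): DsarAuditEvent {
  return {
    event_type: kind === "access" ? "dsar.access" : "dsar.delete",
    actor_id: actorId,
    event_data_json: { employeeRef, outcome },
  };
}
```

A deletion request that resolves to "retain" would still produce a 'dsar.delete' audit row, with the outcome recorded in event_data_json.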

Review cadence

  • Quarterly: engineering reviews this doc against the schema for drift
  • Annually: legal + DPO review for regulatory change
  • On schema change: any PR adding a column to employee_shadow_snapshots, submission_items, or files must update this inventory in the same commit

Internal use only — BreezyCorp