Data Retention Policy
Per-record-type retention based on Singapore regulatory requirements. Verify with legal before production launch — the values here are engineering's best-effort reading of the applicable acts.
Regulatory basis
| Act | Scope | Typical retention |
|---|---|---|
| Employment Act (MOM) § 95 | Employment records | 2 years after employment ends |
| IRAS (Income Tax Act + CPF Act) | Tax records, CPF filings | 5 years from the year of assessment |
| Personal Data Protection Act (PDPA) | Any personal data | As long as necessary + purpose limitation |
These requirements overlap: payroll records contain tax-relevant data, so the 5-year IRAS requirement takes precedence over the Employment Act's 2-year minimum.
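The precedence rule above amounts to taking the longest applicable minimum. A minimal sketch, assuming hypothetical names (`Regime`, `RETENTION_FLOOR_YEARS`, `effectiveRetentionYears`) that are not in the codebase, and using the unverified values from the table:

```typescript
// Hypothetical sketch: the longest applicable statutory minimum wins.
// Values mirror the table above and are pending legal review.
type Regime = "EMPLOYMENT_ACT" | "IRAS";

const RETENTION_FLOOR_YEARS: Record<Regime, number> = {
  EMPLOYMENT_ACT: 2,
  IRAS: 5,
};

// Returns the retention period in years for a record covered by one or
// more regimes. Throws when no regime applies, because PDPA alone gives
// no fixed number of years.
function effectiveRetentionYears(regimes: Regime[]): number {
  if (regimes.length === 0) {
    throw new Error("no statutory floor applies; decide retention by purpose (PDPA)");
  }
  return Math.max(...regimes.map((r) => RETENTION_FLOOR_YEARS[r]));
}
```

For a payroll record, `effectiveRetentionYears(["EMPLOYMENT_ACT", "IRAS"])` evaluates to 5, which is why most entities below carry a 5-year period.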
Per-entity retention
| Entity | Retention | Basis | Purge strategy |
|---|---|---|---|
| EmployeeShadowSnapshot | 5 years from cycle archive | IRAS (CPF YTD, wage data) | Batch delete in retention-purge job |
| ExportRow | 5 years from cycle archive | IRAS (constitutes the payroll filing record) | Batch delete with ExportBatch |
| ExportBatch | 5 years from cycle archive | IRAS | Cascade delete rows + file |
| OutputRow | 5 years from cycle archive | IRAS (reconciliation evidence) | Batch delete |
| OutputBatch | 5 years from cycle archive | IRAS | Cascade |
| Submission + SubmissionItem | 5 years from cycle archive | IRAS (supporting payroll decisions) | Cascade |
| File (fileKind=UPLOAD) | 5 years from cycle archive | IRAS (supporting documents) | Delete DB row + S3 object |
| File (fileKind=GENERATED_EXPORT) | 5 years from cycle archive | IRAS | Delete DB row + S3 object |
| File (fileKind=IMPORTED_OUTPUT) | 5 years from cycle archive | IRAS | Delete DB row + S3 object |
| AuditEvent | 7 years | Best practice + PDPA breach trail | Never purged by retention job; manual archival to cold storage |
| StaffSession | 90 days | Security hygiene | Daily purge of expired sessions |
| CycleRequest (magic-link) | 30 days after expiresAt | Security hygiene | Daily purge |
| DocumentClassification, DocumentExtraction, ExtractedField | Same as parent File | PDPA (derived from personal data) | Cascade with file delete |
| PostPayrollEvidence | 5 years from cycle archive | IRAS | Delete with cycle |
| ValidationRun, ValidationResult, WorkflowIssue | 5 years from cycle archive | IRAS supporting evidence | Cascade |
| OutboxEvent (processed) | 30 days after processedAt | Operational, not regulated | Daily purge |
| OutboxEvent (dead-lettered) | Retained indefinitely | Incident investigation | Manual review + dismissal |
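
The table mixes three kinds of period: a number of years from an anchor date, a number of days from an anchor date, and indefinite retention. One way to express that as data is sketched below. The `RetentionRule` type and `isPastRetention` function are illustrative assumptions, not names from the codebase.

```typescript
// Hypothetical sketch of the table's retention periods as data.
type RetentionRule =
  | { kind: "years"; years: number } // e.g. 5 years from cycle archive
  | { kind: "days"; days: number }   // e.g. 30 days after expiresAt
  | { kind: "indefinite" };          // e.g. dead-lettered OutboxEvent

const DAY_MS = 24 * 60 * 60 * 1000;

// Returns true when a record whose retention clock started at `anchor`
// is past its retention period at `now`.
function isPastRetention(rule: RetentionRule, anchor: Date, now: Date): boolean {
  switch (rule.kind) {
    case "indefinite":
      return false;
    case "days":
      return now.getTime() - anchor.getTime() > rule.days * DAY_MS;
    case "years": {
      // Calendar years, computed in UTC to avoid local-timezone drift.
      const cutoff = new Date(anchor.getTime());
      cutoff.setUTCFullYear(cutoff.getUTCFullYear() + rule.years);
      return now.getTime() > cutoff.getTime();
    }
  }
}
```

The anchor differs per entity (cycle archive date, `expiresAt`, `processedAt`), so the caller is expected to pass the right one for each row.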
What the retention-purge job does
`apps/worker/src/handlers/retention-purge.ts`:
- Finds cycles where `overallStatus = 'ARCHIVED'` AND `closedAt < now - 5 years`
- For each such cycle, emits a `retention.purge_started` audit event before deleting anything (so the audit trail captures the purge itself)
- Deletes in FK-safe order: export_rows → export_batches → output_rows → output_batches → validation_results → validation_runs → workflow_issues → submission_items → submissions → employee_shadow_snapshots → post_payroll_evidence → files → cycle_requests → payroll_cycles
- For each deleted file, enqueues a `delete-s3-object` job so the S3 bytes are removed
- Emits a `retention.purge_completed` audit event with the count of rows removed
- Never touches `audit_events`, `staff_users`, `clients`, `client_contacts`
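
The steps above can be sketched as follows. This is not the actual handler. The `Store` interface is a hypothetical stand-in for the real database and queue clients, whose APIs this document does not show, and the sketch is synchronous for brevity where a real handler would presumably be async and transactional.

```typescript
// Hypothetical sketch of the purge flow described above.
const DELETE_ORDER = [
  "export_rows", "export_batches", "output_rows", "output_batches",
  "validation_results", "validation_runs", "workflow_issues",
  "submission_items", "submissions", "employee_shadow_snapshots",
  "post_payroll_evidence", "files", "cycle_requests", "payroll_cycles",
] as const;

type Table = (typeof DELETE_ORDER)[number];

interface Store {
  // Deletes all rows of `table` belonging to the cycle; returns the row count.
  deleteForCycle(table: Table, cycleId: string): number;
  // Lists S3 keys of the cycle's files (read before the rows are deleted).
  fileKeysForCycle(cycleId: string): string[];
  emitAudit(eventType: string, payload: Record<string, unknown>): void;
  enqueue(job: string, payload: Record<string, unknown>): void;
}

// Cycles closed before this instant are old enough to purge (now minus 5 years).
function purgeCutoff(now: Date): Date {
  const cutoff = new Date(now.getTime());
  cutoff.setUTCFullYear(cutoff.getUTCFullYear() - 5);
  return cutoff;
}

function purgeCycle(store: Store, cycleId: string): number {
  // Audit first, so the trail records the purge even if a delete fails.
  store.emitAudit("retention.purge_started", { cycleId });
  const keys = store.fileKeysForCycle(cycleId);
  let removed = 0;
  // Children before parents, so no delete violates a foreign key.
  for (const table of DELETE_ORDER) {
    removed += store.deleteForCycle(table, cycleId);
  }
  // S3 bytes are removed asynchronously, one job per file.
  for (const key of keys) {
    store.enqueue("delete-s3-object", { key });
  }
  store.emitAudit("retention.purge_completed", { cycleId, removed });
  return removed;
}
```

Keeping the delete order as a single constant makes it easy to review against the schema whenever a new cycle-owned table is added.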
Cadence
- Scheduled monthly, 03:00 SGT on the 1st of the month
- Runs in a dry-run mode (log only, no deletes) during the first month of any new deployment
- Completion of each retention run is monitored; a missed run pages on-call
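
As an illustration of the schedule, the sketch below computes the next run instant. It assumes a hypothetical helper name (`nextRetentionRun`) and relies on SGT being a fixed UTC+8 offset with no daylight saving. In a cron-based scheduler evaluated in Asia/Singapore, the equivalent expression would be `0 3 1 * *`.

```typescript
// SGT is UTC+8 with no daylight saving, so a fixed offset is sufficient.
const SGT_OFFSET_MS = 8 * 60 * 60 * 1000;

// Returns the next 03:00 SGT on the 1st of a month, strictly after `from`.
function nextRetentionRun(from: Date): Date {
  // Shift into SGT wall-clock time so the UTC getters read SGT fields.
  const sgt = new Date(from.getTime() + SGT_OFFSET_MS);
  const year = sgt.getUTCFullYear();
  const month = sgt.getUTCMonth();
  // Candidate: 03:00 SGT on the 1st of the current SGT month, as a UTC instant.
  let run = Date.UTC(year, month, 1, 3) - SGT_OFFSET_MS;
  if (run <= from.getTime()) {
    // Already passed; Date.UTC normalises month 12 into the next year.
    run = Date.UTC(year, month + 1, 1, 3) - SGT_OFFSET_MS;
  }
  return new Date(run);
}
```

Note that 03:00 SGT on the 1st is 19:00 UTC on the last day of the previous month, which matters if the scheduler or the monitoring alert is configured in UTC.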
Safeguards
- Client-level opt-out: a `clients.retentionExempt` flag (not yet implemented, TODO) lets compliance teams pin a client's data for legal hold
- Legal hold override: manually setting `payroll_cycles.retentionHoldUntil` (TODO) blocks purge even if the age threshold is met
- Dry-run mode: `RETENTION_DRY_RUN=true` logs what would be deleted without executing
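
A sketch of how the three safeguards might combine into a single per-cycle decision. Both `retentionExempt` and `retentionHoldUntil` are TODO columns, so the shapes and names here (`CycleForPurge`, `PurgeDecision`, `decidePurge`, `isDryRun`) are assumptions rather than existing code.

```typescript
// Hypothetical shapes: the exempt flag and hold date do not exist in the schema yet.
interface CycleForPurge {
  closedAt: Date;
  retentionHoldUntil: Date | null; // payroll_cycles.retentionHoldUntil (TODO)
  clientRetentionExempt: boolean;  // clients.retentionExempt (TODO)
}

type PurgeDecision =
  | "SKIP_EXEMPT"
  | "SKIP_HOLD"
  | "SKIP_TOO_RECENT"
  | "DRY_RUN"
  | "PURGE";

// Holds and exemptions are checked before age, so they always win.
function decidePurge(
  cycle: CycleForPurge,
  cutoff: Date,
  now: Date,
  dryRun: boolean,
): PurgeDecision {
  if (cycle.clientRetentionExempt) return "SKIP_EXEMPT";
  if (cycle.retentionHoldUntil !== null && cycle.retentionHoldUntil.getTime() > now.getTime()) {
    return "SKIP_HOLD";
  }
  if (cycle.closedAt.getTime() >= cutoff.getTime()) return "SKIP_TOO_RECENT";
  // In dry-run mode the caller logs the cycle instead of deleting it.
  return dryRun ? "DRY_RUN" : "PURGE";
}

// Only the exact string "true" enables dry-run, matching RETENTION_DRY_RUN=true.
function isDryRun(env: Record<string, string | undefined>): boolean {
  return env["RETENTION_DRY_RUN"] === "true";
}
```

Returning a named decision rather than a boolean lets the dry-run log and the purge audit event record why a cycle was skipped.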
What's NOT purged
- `clients` and `client_contacts`: these are master data, retained indefinitely
- `audit_events`: 7-year retention, archived manually to cold storage after that
- `staff_users`: kept indefinitely for audit attribution (deactivated via `isActive = false` instead)
- `document_requirement_rules` and `client_auth_policies`: configuration, retained indefinitely
- `export_template_versions`: template history must survive for re-parse capability
Verification
After every retention run, an ops engineer checks:
```sql
-- No ARCHIVED cycles older than 5 years should exist
SELECT COUNT(*) FROM payroll_cycles
WHERE overall_status = 'ARCHIVED' AND closed_at < NOW() - INTERVAL '5 years';

-- Audit events for the purge run
SELECT * FROM audit_events
WHERE event_type IN ('retention.purge_started', 'retention.purge_completed')
  AND occurred_at > NOW() - INTERVAL '1 day';
```
Open TODOs
- [ ] Legal review of the Singapore retention values
- [ ] Implement `clients.retentionExempt` flag
- [ ] Implement `payroll_cycles.retentionHoldUntil`
- [ ] Archive job that copies `audit_events` older than 7 years to cold storage before any deletion is considered
- [ ] Build the retention-purge scheduled job (currently the handler exists but is not scheduled; see `apps/worker/src/schedules/index.ts`)