ADR 018: Database Patterns

Status: Proposed | Date: 2025-07-28

Context

Applications need managed persistent storage for databases, datalakes, and objects with automatic scaling and jurisdiction-compliant backup strategies. Workloads that need shared file-system access are covered by ADR 019: Shared File Access.

Decision

Databases: Use Aurora Serverless v2, deployed outside EKS clusters, with automated scaling, multi-AZ deployment, and a dual backup strategy (AWS automated snapshots plus backups per ADR 014: Object Storage Backups).

Datalakes: Separate the storage format from the access layer:

  • Storage layer: store analytical data in object storage with open table formats
  • Lightweight access layer: use DuckLake with a DuckDB client for local development, scheduled jobs, and simpler analytical workloads
  • Serverless Iceberg access layer: use Amazon S3 Tables for managed Apache Iceberg tables when workloads need AWS-managed table maintenance or multi-engine access
  • Distributed query access layer: use Trino or equivalent Iceberg-compatible engines when workloads need concurrent or larger-scale querying

DuckLake and S3 Tables are not an either/or decision. Choose the access layer per workload while keeping data in object storage and open table formats where practical. See Reference Architecture: Data Pipelines for full datalake patterns.
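The per-workload choice described above can be sketched as a small decision function. This is only an illustration: the `Workload` fields, the function name, and the order in which the criteria are checked are assumptions, not part of this ADR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of an analytical workload (not defined by the ADR)."""
    name: str
    needs_managed_maintenance: bool = False        # wants AWS-managed table maintenance
    needs_multi_engine_access: bool = False        # several engines read the same tables
    needs_concurrent_or_large_scale: bool = False  # many concurrent users or large scans


def choose_access_layer(workload: Workload) -> str:
    """Map a workload to one of the access layers named in the decision.

    Assumption: scale requirements take precedence over managed-maintenance
    requirements. The ADR itself does not define a precedence order.
    """
    if workload.needs_concurrent_or_large_scale:
        return "Trino or equivalent Iceberg-compatible engine"
    if workload.needs_managed_maintenance or workload.needs_multi_engine_access:
        return "Amazon S3 Tables"
    # Default: local development, scheduled jobs, simpler analytical workloads.
    return "DuckLake with a DuckDB client"


if __name__ == "__main__":
    for w in (
        Workload("nightly-report"),
        Workload("shared-catalog", needs_multi_engine_access=True),
        Workload("bi-dashboards", needs_concurrent_or_large_scale=True),
    ):
        print(f"{w.name}: {choose_access_layer(w)}")
```

Whichever layer is chosen, the data itself stays in object storage in an open table format, so a workload can move to a different access layer later without a storage migration.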

Implementation

  • Database: Aurora Serverless v2 (PostgreSQL/MySQL) with automatic capacity scaling; add Amazon RDS Proxy where workloads need connection pooling
  • Datalake Storage: S3-compatible object storage with open table formats for analytics data
  • Datalake Access: DuckDB clients for DuckLake workloads; S3 Tables, Trino, or equivalent Iceberg-compatible engines for serverless or distributed access
  • Object Storage: Amazon S3 for files and objects. Use ADR 019: Shared File Access when workloads need file-system access to object-backed files
  • Deployment: Outside the EKS cluster, as an AWS-managed service, so that scaling and failover are handled by AWS rather than by in-cluster operators
  • Credentials: Follow ADR 005: Secrets Management for endpoint and credential management
  • Backup: Follow ADR 014: Object Storage Backups, in addition to AWS automated snapshots
  • Security: Follow ADR 007: Centralised Security Logging and ADR 012: Privileged Remote Access
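As a rough illustration of the database settings above, the following sketch builds the parameter dictionaries that could be passed to the boto3 RDS client's `create_db_cluster` and `create_db_instance` calls. The helper name, identifiers, availability zones, capacity range, username, and retention period are all assumptions chosen for the example. Letting RDS manage the master password is also an assumption; ADR 005 may prescribe a different mechanism.

```python
from typing import Any


def aurora_serverless_v2_params(
    cluster_id: str,
    availability_zones: list[str],
    min_acu: float = 0.5,   # assumed lower bound, in Aurora capacity units
    max_acu: float = 16.0,  # assumed upper bound, in Aurora capacity units
    engine: str = "aurora-postgresql",
) -> tuple[dict[str, Any], list[dict[str, Any]]]:
    """Build request parameters for a multi-AZ Aurora Serverless v2 cluster.

    Hypothetical helper: it only assembles dictionaries and makes no AWS calls.
    """
    if len(availability_zones) < 2:
        raise ValueError("multi-AZ deployment needs at least two availability zones")
    if not 0 < min_acu <= max_acu:
        raise ValueError("capacity range must satisfy 0 < min_acu <= max_acu")

    cluster = {
        "DBClusterIdentifier": cluster_id,
        "Engine": engine,
        # Serverless v2 scales between these bounds automatically.
        "ServerlessV2ScalingConfiguration": {
            "MinCapacity": min_acu,
            "MaxCapacity": max_acu,
        },
        "MasterUsername": "app_admin",     # assumed name
        "ManageMasterUserPassword": True,  # RDS stores the password in Secrets Manager
        "StorageEncrypted": True,
        "BackupRetentionPeriod": 14,       # days of automated snapshots (assumed)
        "DeletionProtection": True,
    }
    # One serverless instance per availability zone: a writer plus reader(s).
    instances = [
        {
            "DBInstanceIdentifier": f"{cluster_id}-{index}",
            "DBClusterIdentifier": cluster_id,
            "Engine": engine,
            "DBInstanceClass": "db.serverless",
            "AvailabilityZone": zone,
        }
        for index, zone in enumerate(availability_zones, start=1)
    ]
    return cluster, instances


if __name__ == "__main__":
    cluster, instances = aurora_serverless_v2_params(
        "example-db", ["ap-southeast-2a", "ap-southeast-2b"]
    )
    print(cluster["ServerlessV2ScalingConfiguration"])
    print([i["AvailabilityZone"] for i in instances])
```

With AWS credentials configured, the dictionaries could be applied with `rds = boto3.client("rds")`, then `rds.create_db_cluster(**cluster)` followed by `rds.create_db_instance(**instance)` for each instance. In practice the same settings would more likely live in the platform's infrastructure-as-code tooling than in an ad hoc script.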

Consequences

Benefits:

  • Serverless scaling reduces operational costs during low-usage periods
  • Automated high availability with managed backup strategies per ADR 014: Object Storage Backups
  • Compliance with jurisdiction requirements through the dual backup approach

Risks if not implemented:

  • High operational overhead managing database infrastructure
  • Inconsistent backup strategies across database systems
  • Cost inefficiency from overprovisioned database resources