Database
A database is an organized system for storing, retrieving, and changing data. Storage is only part of its job: it also gives the application a model for querying, consistency, durability, and coordination.
Different databases optimize for different access patterns. A relational database, a search engine, a key-value store, and a columnar analytical database all store data, but they make different tradeoffs around schema, latency, transactions, indexing, and scale.
Types of databases
- Relational database — tables, SQL, joins, constraints, and transactions.
- Document database — JSON-like documents grouped into collections.
- Key-value database — direct lookup by key, usually optimized for speed and simplicity.
- Columnar database — analytical storage optimized for scanning columns across many rows.
- Graph database — nodes and edges for relationship-heavy data.
- Time-series database — timestamped events, metrics, and measurements.
- Search database — indexes for text search, relevance, and filtering.
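To make some of these categories concrete, here is a minimal sketch that lays out the same small dataset in three of the shapes listed above (key-value, document, columnar) and shows the operation each shape makes cheap. It uses plain Python containers rather than a real database; the `orders` data and all variable names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample data: three orders placed by two customers.
orders = [
    {"id": 1, "customer": "ada", "total": 30.0, "items": ["pen", "ink"]},
    {"id": 2, "customer": "bob", "total": 12.5, "items": ["paper"]},
    {"id": 3, "customer": "ada", "total": 7.5, "items": ["eraser"]},
]

# Key-value shape: one opaque value per key, so lookup by key is direct.
kv_store = {f"order:{o['id']}": o for o in orders}

# Document shape: one nested document per customer, read in a single fetch.
documents = defaultdict(lambda: {"orders": []})
for o in orders:
    documents[o["customer"]]["orders"].append(
        {"id": o["id"], "total": o["total"], "items": o["items"]}
    )

# Columnar shape: one list per field, so an aggregate touches only one column.
columns = {
    "id": [o["id"] for o in orders],
    "customer": [o["customer"] for o in orders],
    "total": [o["total"] for o in orders],
}

order_2 = kv_store["order:2"]            # point lookup by key
ada_orders = documents["ada"]["orders"]  # whole customer aggregate in one read
revenue = sum(columns["total"])          # scan of a single column
```

Each shape answers one question cheaply and the others less so: the key-value layout has no way to sum totals without visiting every value, and the columnar layout has to stitch several lists back together to return one full order.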
Theory
- ACID — transaction guarantees: atomicity, consistency, isolation, durability.
- Postgres transactions — how transactions work in practice.
- Postgres indexes — data structures for faster reads.
- Postgres query planner — how SQL gets turned into an execution plan.
- Postgres WAL — write-ahead logging and durability.
- Postgres replication — copying database changes to other nodes.
- Postgres vacuum — cleaning old row versions in MVCC.
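Atomicity, the first letter of ACID, can be tried out at small scale. The sketch below is an assumption-laden stand-in: it uses Python's built-in `sqlite3` module with an in-memory database instead of Postgres, and the `accounts` table, the `transfer` helper, and the `balances` helper are hypothetical. It credits one account, then debits another; if the debit violates the `CHECK` constraint, the whole transaction rolls back and the credit disappears with it.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for Postgres; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 20)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst atomically; return True if committed."""
    try:
        # `with conn` commits on success and rolls back if an exception escapes.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit broke the CHECK constraint; the credit was rolled back too.
        return False


def balances(conn):
    """Return the current balance of every account as a dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


ok = transfer(conn, "alice", "bob", 30)         # succeeds: alice 70, bob 50
rejected = transfer(conn, "bob", "alice", 500)  # would overdraw bob, so it is undone
final = balances(conn)
```

The credit is deliberately issued before the debit, so a failed transfer leaves a partial write that only a rollback can remove. Postgres offers the same all-or-nothing behavior, though its isolation levels, WAL, and MVCC machinery differ from SQLite's.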
Systems
- Postgres — relational database.
- Redis — in-memory data store often used for caching, queues, and coordination.
- Qdrant — vector database for similarity search.
- Pinecone — managed vector database.
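Similarity search, the operation vector databases are built around, means ranking stored vectors by how close they are to a query vector. The sketch below shows the idea with an exhaustive cosine-similarity scan in plain Python. It is not the Qdrant or Pinecone API; the function names and the tiny three-dimensional vectors are made up for illustration, and real systems typically rely on approximate indexes to avoid scanning every vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, vectors, k=1):
    """Return the k ids whose vectors are most similar to query (exhaustive scan)."""
    ranked = sorted(
        vectors,
        key=lambda vid: cosine_similarity(query, vectors[vid]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical 3-dimensional embeddings; real ones usually have far more dimensions.
vectors = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.3, 0.0],
    "car": [0.0, 0.2, 0.9],
}

top = nearest([1.0, 0.0, 0.0], vectors, k=2)  # ["cat", "dog"]
```

The scan costs one similarity computation per stored vector, which is the work a dedicated vector index is meant to cut down as the collection grows.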
Questions to ask
- What access pattern is this database optimized for?
- Does the system need transactions, search, analytics, caching, or relationship traversal?
- Which invariants belong in the database, and which belong in the application?
- What becomes expensive as the dataset grows?
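One way to probe the last question is to ask the query planner how it intends to run a query, before and after adding an index. The sketch below again assumes in-memory SQLite via Python's `sqlite3` module as a stand-in, with a hypothetical `events` table and a hypothetical `plan` helper. In Postgres the comparable tool is `EXPLAIN`, whose output format is different.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, kind TEXT)"
)
conn.executemany(
    "INSERT INTO events (user_id, kind) VALUES (?, ?)",
    [(i % 100, "click") for i in range(1000)],
)


def plan(conn, sql, params=()):
    """Return the planner's description of each step for the given query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column is the readable detail


query = "SELECT id, kind FROM events WHERE user_id = ?"

before = plan(conn, query, (42,))  # no index on user_id: a full table scan
conn.execute("CREATE INDEX idx_events_user_id ON events (user_id)")
after = plan(conn, query, (42,))   # the planner can now search the index

print(before)
print(after)
```

A plan that scans the whole table costs more with every row added, while an index search grows far more slowly. The index is not free, though: each insert and update now has to maintain it as well.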