Introduction
Choosing the right database is one of the most important architectural decisions you'll make. SQL (relational) and NoSQL (non-relational) databases each have strengths, and modern applications often use both.
SQL Databases: Structured and ACID
SQL databases organize data in tables with rows and columns. They use Structured Query Language (SQL) for queries and enforce ACID properties.
ACID Properties:
- Atomicity — Transactions are all-or-nothing
- Consistency — Data always valid according to rules
- Isolation — Concurrent transactions don't interfere
- Durability — Committed data survives failures
SQL Database Characteristics:
- Fixed schema (columns defined upfront)
- Relationships via foreign keys
- Complex queries with JOINs
- Strong consistency
- Vertical scaling (bigger servers)
Popular SQL Databases:
- PostgreSQL — Feature-rich, extensible, open source
- MySQL — Popular, widely supported
- SQL Server — Microsoft ecosystem
- Oracle — Enterprise, feature-rich
- Cloud: Azure SQL, Amazon RDS, Cloud SQL
When to Use SQL:
- Complex queries and reporting
- Transactions requiring ACID (financial, inventory)
- Well-defined, stable schema
- Relationships between entities matter
- Data integrity is critical
NoSQL Databases: Flexible and Scalable
NoSQL databases don't use traditional table structures. They're designed for scalability, flexibility, and specific access patterns.
BASE Properties (NoSQL Philosophy):
- Basically Available — System always responds
- Soft state — State may change over time
- Eventually consistent — Data converges to consistency
NoSQL Categories:
Document Databases:
- Store JSON/BSON documents
- Flexible schema, nested structures
- Query by any field
- Examples: MongoDB, Cosmos DB, Firestore, Couchbase
Key-Value Stores:
- Simple key → value pairs
- Extremely fast for known-key lookups
- Limited querying capabilities
- Examples: Redis, DynamoDB, Memcached
Column-Family Stores:
- Store data in column families
- Optimized for writes and time-series
- Examples: Cassandra, HBase, Bigtable
Graph Databases:
- Nodes and edges (relationships)
- Optimized for relationship queries
- Examples: Neo4j, Amazon Neptune, Cosmos DB (Gremlin)
Side-by-Side Comparison
Schema:
- SQL: Fixed schema, ALTER TABLE to change
- NoSQL: Flexible/dynamic schema
Scaling:
- SQL: Vertical (scale up)
- NoSQL: Horizontal (scale out)
Queries:
- SQL: Complex JOINs, aggregations
- NoSQL: Optimized for specific access patterns
Consistency:
- SQL: Strong (ACID)
- NoSQL: Configurable (eventual to strong)
Transactions:
- SQL: Multi-row, multi-table
- NoSQL: Often single-document/item
Choosing the Right Database
Choose SQL When:
- Data has clear relationships
- You need complex queries/reporting
- Transactions span multiple entities
- Schema is stable and well-understood
- ACID compliance is required
Choose Document DB When:
- Data is hierarchical/nested
- Schema evolves frequently
- You query by various attributes
- Scaling reads is important
Choose Key-Value When:
- Simple get/put by known key
- Caching, session storage
- Extreme performance needed
- Data structure is simple
Choose Column-Family When:
- Write-heavy workloads
- Time-series data
- Need massive scale
- Query patterns are known upfront
Choose Graph DB When:
- Relationships ARE the data
- Social networks, recommendations
- Fraud detection, knowledge graphs
- Deep traversal queries
Real-World Architecture Patterns
Polyglot Persistence: Many applications use multiple databases:
User Profiles → MongoDB (flexible schema)
Transactions → PostgreSQL (ACID)
Session Cache → Redis (speed)
Analytics → ClickHouse (columnar)
Recommendations → Neo4j (relationships)
Event Sourcing:
Events → Kafka → Event Store (append-only)
→ Read Models (NoSQL projections)
CQRS (Command Query Responsibility Segregation):
Writes → SQL (normalized, ACID)
Reads → NoSQL (denormalized, fast)
Cloud Database Services
Azure:
- Azure SQL Database (managed SQL Server)
- Cosmos DB (multi-model NoSQL)
- Azure Database for PostgreSQL/MySQL
AWS:
- RDS (managed SQL)
- DynamoDB (key-value/document)
- Aurora (MySQL/PostgreSQL compatible)
- DocumentDB (MongoDB compatible)
- Neptune (graph)
Google Cloud:
- Cloud SQL (managed SQL)
- Cloud Spanner (globally distributed SQL)
- Firestore (document)
- Bigtable (column-family)
Performance Considerations
SQL Optimization:
- Proper indexing
- Query optimization
- Connection pooling
- Read replicas
- Caching layer
NoSQL Optimization:
- Partition key design (critical!)
- Denormalization
- Avoid hot partitions
- Use TTL for expiring data
- Proper capacity planning
Common Mistakes
SQL Mistakes:
- Over-normalization (too many JOINs)
- Missing indexes
- N+1 query problem
- Not using connection pooling
NoSQL Mistakes:
- Using NoSQL for relational data
- Poor partition key choice
- Expecting SQL-like queries
- Ignoring consistency requirements
- Over-fetching data
Migration Considerations
SQL to NoSQL:
- Denormalize data (embed related data)
- Design for access patterns
- Handle eventual consistency in app
- Migrate incrementally
NoSQL to SQL:
- Normalize data
- Define schema and relationships
- Handle schema evolution
- May need data cleanup
Exam Tips
For Azure (AZ-900, AZ-204, AZ-305):
- Cosmos DB APIs and consistency levels
- Azure SQL vs Cosmos DB use cases
- Know DTU vs vCore models
For AWS (Cloud Practitioner, SAA):
- DynamoDB partition keys and capacity
- RDS vs Aurora vs DynamoDB
- DAX for DynamoDB caching
For Database Certifications:
- Deep SQL knowledge
- Indexing strategies
- Query optimization
- ACID vs BASE trade-offs
Key Takeaway
SQL databases excel at complex queries, transactions, and data integrity. NoSQL databases shine for scalability, flexibility, and specific access patterns. Modern architectures often use both—choose based on your data model, query patterns, consistency needs, and scale requirements. There's no universal "best" choice; it depends on your use case.
