SQL Data Types
A practical reference for system design -- which types to use, how much storage they cost, and the gotchas.
Integer Types
| Type | Size | Range | Use case |
|---|---|---|---|
TINYINT |
1 byte | -128 to 127 (0–255 unsigned) | Boolean flags, status codes |
SMALLINT |
2 bytes | -32K to 32K | Port numbers, small counters |
MEDIUMINT |
3 bytes | -8M to 8M | Medium counters (MySQL only) |
INT |
4 bytes | ±2.1 billion | User IDs, most counters |
BIGINT |
8 bytes | ±9.2 quintillion | Epoch timestamps, snowflake IDs |
Rule of thumb:
- Counting normal things (users, products)? →
INT - Timestamps, distributed IDs, massive scale? →
BIGINT
String Types
| Type | Max Size | Storage | Use case |
|---|---|---|---|
CHAR(n) |
255 bytes | Always n bytes (padded) |
Fixed-length codes ("US", "USD") |
VARCHAR(n) |
65,535 bytes* | Actual length + 1–2 bytes | Keys, emails, usernames |
TEXT |
65 KB | Actual length + 2 bytes | Blog posts, descriptions |
MEDIUMTEXT |
16 MB | Actual length + 3 bytes | Large documents |
LONGTEXT |
4 GB | Actual length + 4 bytes | Huge blobs (rarely used) |
*VARCHAR max depends on row size limit
Key insight: VARCHAR(255) storing "hello" uses 7 bytes (5 + 2 length bytes), not 255. Only stores what it needs.
graph LR
subgraph "CHAR(5) storing 'hi'"
C1["h"] --- C2["i"] --- C3["_"] --- C4["_"] --- C5["_"]
end
subgraph "VARCHAR(5) storing 'hi'"
V0["len=2"] --- V1["h"] --- V2["i"]
end
style C3 fill:#4a3a1a,stroke:#ffaa44
style C4 fill:#4a3a1a,stroke:#ffaa44
style C5 fill:#4a3a1a,stroke:#ffaa44
style V0 fill:#1a3a4a,stroke:#44aaff
Rule of thumb: VARCHAR is the default. Use CHAR only for truly fixed-width data. Use TEXT when you can't predict length.
Date/Time Types
| Type | Size | Stores | Example |
|---|---|---|---|
DATE |
3 bytes | Date only | 2025-03-15 |
TIME |
3 bytes | Time only | 14:30:00 |
DATETIME |
8 bytes | Date + time, no timezone | 2025-03-15 14:30:00 |
TIMESTAMP |
4 bytes | Date + time, stored as UTC epoch | 1970 – 2038 |
The 2038 Problem
TIMESTAMP is stored as a 32-bit integer (seconds since 1970-01-01). It overflows on January 19, 2038.
graph LR
A["1970-01-01<br/>epoch = 0"] --> B["2025-03-15<br/>epoch = 1,741,034,400"] --> C["2038-01-19<br/>epoch = 2,147,483,647<br/>💥 OVERFLOW"]
style C fill:#4a1a1a,stroke:#ff4444
Workaround: Store epoch as BIGINT — good for billions of years.
Getting Epoch in SQL
| Database | Function |
|---|---|
| MySQL | UNIX_TIMESTAMP() |
| PostgreSQL | EXTRACT(EPOCH FROM NOW())::BIGINT |
NOW() returns a datetime, not an integer. Can't compare directly with a BIGINT column.
Floating Point Types
| Type | Size | Precision | Use case |
|---|---|---|---|
FLOAT |
4 bytes | ~7 digits | Approximate values (rarely used) |
DOUBLE |
8 bytes | ~15 digits | Scientific, coordinates |
DECIMAL(m,d) |
Variable | Exact | Money! DECIMAL(10,2) → 99999999.99 |
Golden rule: Never use FLOAT/DOUBLE for money. Floating point rounding will bite you.
Binary & Blob Types
| Type | Max Size | Use case |
|---|---|---|
BINARY(n) |
255 bytes | Fixed-length hashes, UUIDs |
VARBINARY(n) |
65 KB | Variable binary data |
BLOB |
65 KB | Small files (usually use S3 instead) |
Boolean
BOOLEAN is TINYINT(1) under the hood. 0 = false, 1 = true.
Quick Decision Guide
flowchart TD
A[What are you storing?] --> B{Numbers?}
A --> C{Text?}
A --> D{Date/Time?}
A --> E{Money?}
B -->|"Counting things"| B1["INT (4 bytes)"]
B -->|"Timestamps/IDs"| B2["BIGINT (8 bytes)"]
B -->|"Flags"| B3["TINYINT (1 byte)"]
C -->|"Fixed width<br/>(country codes)"| C1["CHAR(n)"]
C -->|"Variable, bounded<br/>(keys, emails)"| C2["VARCHAR(n)"]
C -->|"Unbounded text"| C3["TEXT"]
D -->|"Need timezone"| D1["BIGINT epoch<br/>(avoids 2038)"]
D -->|"Just a date"| D2["DATE"]
E --> E1["DECIMAL(m,d)<br/>NEVER FLOAT"]
style E1 fill:#4a1a1a,stroke:#ff4444
style D1 fill:#1a3a4a,stroke:#44aaff