Understanding video formats
Containers, codecs, and three numbers that matter.
The dozen common video formats organised by what they actually are, and the codec decisions that determine file size and playback compatibility.
The two-layer model.
Every video file is a container holding one or more streams. The container (MP4, MKV, MOV, WebM, AVI) is the file format wrapper; it carries metadata, indexes, and the multiplexed streams. The video codec inside it (H.264, H.265, AV1, VP9, ProRes) is the algorithm that compresses the pixels. The audio codec is separate. "MP4" doesn't tell you everything; "MP4 with H.264 video + AAC audio" tells you the full picture.
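To see both layers of a real file, one option is to ask ffprobe (the inspection tool that ships with FFmpeg) for a JSON report and read the container and codec names out of it. The sketch below assumes ffprobe is installed and on the PATH; the helper names `probe` and `describe` are illustrative, and the sample report is hand-written in the shape ffprobe produces so the example runs without a video file.

```python
import json
import subprocess


def probe(path: str) -> dict:
    """Run ffprobe on a file and return its JSON report (needs ffprobe on PATH)."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_format", "-show_streams",
         "-of", "json", path],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def describe(report: dict) -> str:
    """Summarise a report as 'container: codec (type) + codec (type)'."""
    container = report.get("format", {}).get("format_name", "unknown")
    codecs = [
        f"{s.get('codec_name', 'unknown')} ({s.get('codec_type', 'unknown')})"
        for s in report.get("streams", [])
    ]
    return f"{container}: " + " + ".join(codecs)


# Hand-written sample, not output captured from a real file.
sample = {
    "format": {"format_name": "mov,mp4,m4a,3gp,3g2,mj2"},
    "streams": [
        {"codec_type": "video", "codec_name": "h264"},
        {"codec_type": "audio", "codec_name": "aac"},
    ],
}
print(describe(sample))
```

With a real file, `describe(probe("clip.mp4"))` would give the same kind of summary. The combined `mov,mp4,...` container name in the sample reflects how FFmpeg groups MOV and MP4 under one family.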
The four codecs to know.
H.264 (2003): the universal default. Plays everywhere: every browser, every phone, every TV. Less efficient than the newer codecs, but still reasonable. H.265 / HEVC (2013): ~50 % smaller than H.264 at equivalent quality, but encumbered with patents that made adoption uneven. VP9 (2013): Google's royalty-free contribution, used by YouTube, ~50 % smaller than H.264, which puts it roughly on par with H.265. AV1 (2018): royalty-free, ~30 % smaller than H.265, slowly replacing H.265 for streaming.
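The relative savings quoted above can be turned into a rough "equivalent bitrate" calculator. This is only a sketch built on those rule-of-thumb percentages; the table `RELATIVE_SIZE` and the function `equivalent_bitrate` are illustrative names, and real savings vary with content, encoder, and settings.

```python
# Relative size at equal quality, with H.264 as the baseline (1.0).
# Values are the rough percentages from the text, not measurements.
RELATIVE_SIZE = {
    "h264": 1.0,
    "h265": 0.5,        # ~50 % smaller than H.264
    "vp9": 0.5,         # ~50 % smaller than H.264
    "av1": 0.5 * 0.7,   # ~30 % smaller than H.265
}


def equivalent_bitrate(h264_mbps: float, codec: str) -> float:
    """Bitrate (Mbps) at which `codec` should roughly match H.264 at `h264_mbps`."""
    return h264_mbps * RELATIVE_SIZE[codec]


for codec in RELATIVE_SIZE:
    print(f"{codec}: {equivalent_bitrate(5.0, codec):.2f} Mbps")
```

Taken at face value, a 5 Mbps H.264 stream corresponds to about 2.5 Mbps in H.265 or VP9 and about 1.75 Mbps in AV1. In practice it is common to leave some headroom above these figures.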
Three numbers govern file size.
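Resolution (pixels per frame: 1920×1080 vs 3840×2160), frame rate (frames per second: 30 vs 60), and bitrate (bits per second: 5 Mbps vs 50 Mbps). Strictly, file size is bitrate × duration; resolution and frame rate matter because more pixels per second need more bits to reach the same quality, so raising either one pushes the required bitrate, and with it the file size, up. Bitrate is the dial you actually turn; resolution and frame rate are usually fixed by the source. A 1080p H.264 video at 5 Mbps is roughly YouTube quality; at 10 Mbps it approaches Blu-ray quality for most content (actual discs use higher bitrates, up to 40 Mbps); at 1 Mbps it's tolerable for tutorials but ugly for fast motion.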
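One way to relate the three numbers is bits per pixel: the bitrate divided by the number of pixels shown per second. The sketch below is a back-of-the-envelope aid, not an established quality metric; the function name `bits_per_pixel` is illustrative.

```python
def bits_per_pixel(bitrate_bps: float, width: int, height: int, fps: float) -> float:
    """Average number of compressed bits spent on each displayed pixel."""
    return bitrate_bps / (width * height * fps)


# The same 5 Mbps budget spread over different resolutions and frame rates.
cases = [
    ("1080p30 @ 5 Mbps", 5_000_000, 1920, 1080, 30),
    ("1080p60 @ 5 Mbps", 5_000_000, 1920, 1080, 60),
    ("2160p30 @ 5 Mbps", 5_000_000, 3840, 2160, 30),
    ("1080p30 @ 1 Mbps", 1_000_000, 1920, 1080, 30),
]
for label, bps, w, h, fps in cases:
    print(f"{label}: {bits_per_pixel(bps, w, h, fps):.3f} bits/pixel")
```

Doubling the frame rate halves the bits available per pixel, and going from 1080p to 2160p quarters them. That is why higher resolutions and frame rates need a higher bitrate to look equally good.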
A worked conversion.
Take a 10-minute 1080p screen recording, captured at 60 fps and saved as H.264 MOV at 50 Mbps (about 3.5 GB). Convert it for the web: H.264 MP4 at 5 Mbps, same resolution and frame rate. Result: ~350 MB, 10× smaller, visually almost identical for screen-recording content (low motion, lots of flat colour). The same source converted to AV1 at 3 Mbps: ~210 MB, looks just as good, but takes several times longer to encode (5× or more, depending on the encoder and preset).
[Figure: same resolution, different bitrate. 50 Mbps source → 5 Mbps H.264 for web → 3 Mbps AV1; 3.5 GB → 350 MB (H.264) or 210 MB (AV1). Bitrate is the dial; codec efficiency picks the right value.]
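The sizes in the worked conversion follow from bitrate × duration. A minimal sketch of that arithmetic, assuming the quoted bitrates cover the video stream only and ignoring audio and container overhead (the function name `file_size_bytes` is illustrative):

```python
def file_size_bytes(bitrate_mbps: float, duration_s: float) -> float:
    """Size of a stream in bytes: megabits per second x seconds, 8 bits per byte."""
    return bitrate_mbps * 1_000_000 * duration_s / 8


DURATION_S = 10 * 60  # the 10-minute recording

for label, mbps in [("50 Mbps source", 50), ("5 Mbps H.264", 5), ("3 Mbps AV1", 3)]:
    size = file_size_bytes(mbps, DURATION_S)
    print(f"{label}: {size / 1e6:.0f} MB ({size / 2**20:.0f} MiB)")
```

The exact results are 3,750 MB, 375 MB, and 225 MB in decimal units. The rounded figures in the text are close to the same values expressed in binary units (about 3.5 GiB, 358 MiB, and 215 MiB).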
Constant vs variable bitrate.
CBR (constant bitrate) allocates the same bits per second across the entire video — predictable file size, used in broadcast. VBR (variable bitrate) allocates more bits to complex scenes (fast motion, scene changes) and fewer to easy scenes (static shots). VBR produces better visual quality for the same average bitrate, at the cost of less predictable size. For files you'll deliver over a network, VBR with a target average is the right default.
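The difference between the two modes can be illustrated with a toy allocation over scenes of equal length. This is a deliberately simplified model: real encoders use lookahead and buffer constraints rather than a single complexity score, and the scores below are made up for illustration.

```python
def cbr_allocation(avg_mbps: float, complexities: list[float]) -> list[float]:
    """CBR: every scene gets the same bitrate, whatever its complexity."""
    return [avg_mbps for _ in complexities]


def vbr_allocation(avg_mbps: float, complexities: list[float]) -> list[float]:
    """VBR (toy model): bitrate proportional to complexity, same overall average."""
    mean = sum(complexities) / len(complexities)
    return [avg_mbps * c / mean for c in complexities]


# Hypothetical scores: static shot, fast motion, static shot, moderate motion.
scenes = [1.0, 4.0, 1.0, 2.0]

cbr = cbr_allocation(5.0, scenes)
vbr = vbr_allocation(5.0, scenes)
print("CBR:", cbr, "average", sum(cbr) / len(cbr))
print("VBR:", vbr, "average", sum(vbr) / len(vbr))
```

Both allocations average 5 Mbps, so the file size is the same. The VBR version spends four times as much on the fast-motion scene as on the static shots, which is where the extra quality comes from.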
Compatibility map.
MP4 + H.264 + AAC: plays everywhere, the safe default for anything you're sharing. WebM + VP9 / AV1 + Opus: smaller and browser-native, but support in iOS Safari has historically lagged and is still catching up. MOV: Apple's container, internally very close to MP4, used in ProRes professional workflows. MKV: open, supports almost any codec, standard for archives and torrents; not natively supported in iOS Safari or QuickTime.
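As a sketch of how these combinations translate into an actual conversion, the helper below builds an ffmpeg command line for a chosen profile. The profile names and the function `ffmpeg_command` are illustrative. The encoder names (`libx264`, `libvpx-vp9`, `libsvtav1`, `libopus`, `aac`) are common FFmpeg encoders, but which ones are available depends on how your FFmpeg was built, so treat the table as an assumption to check against your own installation.

```python
from pathlib import Path

# Container extension, video encoder, audio encoder, and extra flags per profile.
PROFILES = {
    "mp4-h264": ("mp4", "libx264", "aac", ["-movflags", "+faststart"]),
    "webm-vp9": ("webm", "libvpx-vp9", "libopus", []),
    "webm-av1": ("webm", "libsvtav1", "libopus", []),
}


def ffmpeg_command(src: str, profile: str, video_bitrate: str = "5M") -> list[str]:
    """Build (but do not run) an ffmpeg command for the given delivery profile."""
    ext, video_codec, audio_codec, extra = PROFILES[profile]
    dst = str(Path(src).with_suffix(f".{ext}"))
    return [
        "ffmpeg", "-i", src,
        "-c:v", video_codec, "-b:v", video_bitrate,
        "-c:a", audio_codec,
        *extra,
        dst,
    ]


print(" ".join(ffmpeg_command("recording.mov", "mp4-h264")))
print(" ".join(ffmpeg_command("recording.mov", "webm-av1", video_bitrate="3M")))
```

The list can be passed to `subprocess.run` once you have confirmed the encoders exist in your build. The `+faststart` flag moves the MP4 index to the front of the file so playback can begin before the download finishes.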
Encoding takes real time.
Video encoding is CPU-expensive. A 1-minute 1080p H.264 encode takes ~30 seconds on a modern laptop CPU; AV1 at the same settings takes 5-10 minutes; H.265 sits between them. Hardware encoders (NVENC, QuickSync, VideoToolbox) are 5-10× faster at a modest quality cost. For browser-based encoding via ffmpeg.wasm, expect markedly slower than native: WebAssembly builds cannot use the encoders' hand-written native assembly, and WebAssembly SIMD, where it is enabled at all, is limited to 128-bit vectors.
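Those per-minute figures scale linearly with duration, so a rough planning estimate is easy to sketch. The numbers below are copied from the paragraph above and are ballpark values for 1080p on a laptop CPU. Applying the 5-10× hardware speed-up uniformly to every codec is a simplifying assumption, and the names are illustrative.

```python
# (low, high) seconds of software encoding per minute of 1080p video.
SOFTWARE_SECONDS_PER_MINUTE = {
    "h264": (30, 30),
    "av1": (300, 600),   # 5-10 minutes per minute of video
}
HARDWARE_SPEEDUP = (5, 10)  # hardware encoders: 5-10x faster


def estimate_seconds(codec: str, minutes: float, hardware: bool = False) -> tuple[float, float]:
    """Return a (best case, worst case) encode time in seconds."""
    low, high = SOFTWARE_SECONDS_PER_MINUTE[codec]
    low, high = low * minutes, high * minutes
    if hardware:
        # Best case uses the largest speed-up, worst case the smallest.
        low, high = low / HARDWARE_SPEEDUP[1], high / HARDWARE_SPEEDUP[0]
    return low, high


for codec in ("h264", "av1"):
    low, high = estimate_seconds(codec, minutes=10)
    print(f"{codec}, 10 min of video: {low / 60:.0f}-{high / 60:.0f} min of encoding")
```

For the 10-minute recording this suggests about 5 minutes for software H.264 and 50-100 minutes for software AV1. With a hardware H.264 encoder the estimate drops to somewhere between 30 and 60 seconds.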