Understanding video compression
The CRF dial, not the bitrate.
Why "set the bitrate" is the wrong way to compress, what CRF actually means, and the presets that trade encoding time for file size.
CRF — Constant Rate Factor.
CRF is the right way to compress with H.264 / H.265 / AV1. Instead of "use this bitrate", you say "maintain this perceived quality across the whole file" and the encoder allocates bits where they are needed. In x264 and x265 the CRF dial runs from 0 (lossless in 8-bit x264, huge file) to 51 (terrible, tiny file); AV1 encoders such as libaom use a 0-63 scale instead. Lower numbers mean higher quality and bigger files. For H.264, visually lossless is around CRF 18, broadcast quality is around CRF 21, web quality is CRF 23 (the libx264 default), and aggressive web compression is around CRF 26.
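As a minimal sketch, the H.264 tiers above can be kept in a small lookup table that produces the matching ffmpeg arguments. The tier keys and the function name are hypothetical labels chosen for this example; only the CRF numbers come from the text.

```python
# CRF values for libx264, taken from the tiers named in the text.
# The tier keys are hypothetical labels chosen for this sketch.
X264_CRF_TIERS = {
    "visually_lossless": 18,
    "broadcast": 21,
    "web": 23,
    "aggressive_web": 26,
}


def x264_quality_args(tier: str) -> list[str]:
    """Return ffmpeg video arguments for one of the named tiers."""
    crf = X264_CRF_TIERS[tier]  # raises KeyError for an unknown tier
    return ["-c:v", "libx264", "-crf", str(crf)]


for name in X264_CRF_TIERS:
    print(name, " ".join(x264_quality_args(name)))
```

The numbers are starting points, not guarantees: the same CRF can look different on grainy film than on a clean screen recording, so check the output by eye.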
Presets — speed vs efficiency.
The same CRF at different presets produces different files. ultrafast encodes 5-10× faster than the default medium but produces files 20-30 % larger at roughly the same quality. slow and slower take 2-5× longer than medium and produce files 10-20 % smaller. The sweet spot for offline encoding is usually medium or slow; for real-time work (streaming, screen capture), use fast or veryfast.
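The easiest way to see the preset trade-off on your own footage is to encode the same source at a fixed CRF with a few presets and compare file sizes and encode times. The sketch below only builds the commands; the function name and the out_<preset>.mp4 naming scheme are assumptions made for this example.

```python
import shlex

# libx264 preset names, fastest to slowest (placebo omitted).
X264_PRESETS = [
    "ultrafast", "superfast", "veryfast", "faster", "fast",
    "medium", "slow", "slower", "veryslow",
]


def preset_sweep(src: str, crf: int, presets: list[str]) -> list[list[str]]:
    """Build one ffmpeg command per preset, all at the same CRF."""
    commands = []
    for preset in presets:
        if preset not in X264_PRESETS:
            raise ValueError(f"unknown libx264 preset: {preset}")
        out = f"out_{preset}.mp4"  # hypothetical naming scheme
        commands.append([
            "ffmpeg", "-i", src,
            "-c:v", "libx264", "-crf", str(crf), "-preset", preset,
            "-an",  # drop audio so only the video stream is compared
            out,
        ])
    return commands


for cmd in preset_sweep("in.mp4", 23, ["ultrafast", "medium", "slow"]):
    print(shlex.join(cmd))
```

Run the printed commands on a short, representative clip rather than the full source; the relative differences usually show up within a minute of footage.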
A worked compression.
Take a 1080p screen recording, 500 MB original, and target a small file for sharing. With H.264: ffmpeg -i in.mp4 -c:v libx264 -crf 23 -preset medium -c:a aac -b:a 128k out.mp4 gives about 80 MB for screen-recording content (low motion compresses well). Compare with H.265: ffmpeg -i in.mp4 -c:v libx265 -crf 26 -preset medium ... gives about 50 MB. AV1 is smaller again, but encoding can take around 10× longer.
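The two commands differ only in codec and CRF, so a small helper can build both. This is a sketch under stated assumptions: the helper name is made up, the H.265 command is assumed to keep the same audio settings as the H.264 one, and out_h265.mp4 is a hypothetical output name.

```python
import shlex


def compress_command(src: str, dst: str, codec: str, crf: int,
                     preset: str = "medium") -> list[str]:
    """Build a CRF-mode ffmpeg command with 128 kbps AAC audio."""
    return [
        "ffmpeg", "-i", src,
        "-c:v", codec, "-crf", str(crf), "-preset", preset,
        "-c:a", "aac", "-b:a", "128k",
        dst,
    ]


h264 = compress_command("in.mp4", "out.mp4", "libx264", 23)
# Assumption: the H.265 command keeps the same audio settings;
# "out_h265.mp4" is a hypothetical output name.
h265 = compress_command("in.mp4", "out_h265.mp4", "libx265", 26)

print(shlex.join(h264))
print(shlex.join(h265))

# Size reduction for the approximate figures quoted in the text.
for label, out_mb in [("H.264", 80), ("H.265", 50)]:
    print(f"{label}: 500 MB -> {out_mb} MB, {500 / out_mb:.1f}x smaller")
```

Passing the command as a list (for example to subprocess.run) avoids shell-quoting problems with file names that contain spaces.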
H.264 web preset
-crf 23 -preset medium
The defaults most YouTube tutorials suggest, and also libx264's own defaults.
500 MB → ~80 MB on screen-recording content
= 6× smaller, visually fine
H.265 web preset
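-crf 26 -preset medium (note: the same CRF number does not mean the same quality as in H.264!)
For equivalent quality, H.265 CRF is roughly 3 higher than H.264. Treat this as a rule of thumb and check the result by eye.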
500 MB → ~50 MB
= 10× smaller, similar visual quality
The audio is rarely the problem.
For a typical 10-minute 1080p video, the video stream is roughly 99 % of the file size and the audio about 1 %. Aggressive audio compression (64 kbps mono) shaves a few MB; aggressive video compression saves hundreds. Don't bother optimising audio for file size until the video is already compressed as far as you are willing to go; the time is better spent on the video codec settings.
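The arithmetic behind this is short enough to check directly. The sketch below assumes constant bitrates and an illustrative 8000 kbps video stream; that figure is an assumption for the example, not a measurement.

```python
def stream_size_mb(bitrate_kbps: float, duration_s: float) -> float:
    """Size in megabytes (10**6 bytes) of a constant-bitrate stream."""
    bits = bitrate_kbps * 1000 * duration_s
    return bits / 8 / 1_000_000


DURATION_S = 10 * 60   # the 10-minute video from the text
VIDEO_KBPS = 8000      # assumption: illustrative 1080p video bitrate

audio_128 = stream_size_mb(128, DURATION_S)
audio_64 = stream_size_mb(64, DURATION_S)
video = stream_size_mb(VIDEO_KBPS, DURATION_S)

print(f"audio at 128 kbps: {audio_128:.1f} MB")
print(f"audio at  64 kbps: {audio_64:.1f} MB")
print(f"saved by halving audio: {audio_128 - audio_64:.1f} MB")
print(f"video at {VIDEO_KBPS} kbps: {video:.0f} MB")
print(f"audio share of file: {audio_128 / (audio_128 + video):.1%}")
```

Halving the audio bitrate saves under 5 MB here, while the video stream weighs in at 600 MB, so even a modest CRF change moves the total far more.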
Two-pass encoding.
For "I need exactly this file size" (target streaming bitrate, hard cap on deliverable), two-pass encoding gets closer than CRF. First pass analyses the source; second pass encodes with the bits allocated where they matter. CRF is better for "I want this quality"; two-pass average bitrate is better for "I want this size". Most use cases are the former.
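For a size target, the video bitrate follows from the target size and the duration, minus whatever the audio uses. The sketch below computes it and builds the two libx264 passes; the function names are hypothetical, sizes are in megabytes of 10**6 bytes, and container overhead is ignored, so leave a little headroom under a hard cap.

```python
import os
import shlex


def video_kbps_for_size(target_mb: float, duration_s: float,
                        audio_kbps: int = 128) -> int:
    """Video bitrate (kbps) that fits target_mb, ignoring container overhead."""
    total_kbps = target_mb * 8 * 1000 / duration_s  # MB -> kilobits per second
    video_kbps = int(total_kbps - audio_kbps)
    if video_kbps <= 0:
        raise ValueError("target size is too small for this duration")
    return video_kbps


def two_pass_commands(src: str, dst: str, video_kbps: int,
                      audio_kbps: int = 128) -> list[list[str]]:
    """Build the analysis pass and the encoding pass for libx264."""
    common = ["-c:v", "libx264", "-b:v", f"{video_kbps}k"]
    # Pass 1 only gathers statistics: no audio, output discarded.
    first = ["ffmpeg", "-y", "-i", src, *common,
             "-pass", "1", "-an", "-f", "null", os.devnull]
    # Pass 2 uses those statistics to spend the bit budget.
    second = ["ffmpeg", "-i", src, *common,
              "-pass", "2", "-c:a", "aac", "-b:a", f"{audio_kbps}k", dst]
    return [first, second]


kbps = video_kbps_for_size(target_mb=50, duration_s=600)
for cmd in two_pass_commands("in.mp4", "out.mp4", kbps):
    print(shlex.join(cmd))
```

For a 50 MB target on a 10-minute clip with 128 kbps audio this works out to a video bitrate of 538 kbps. Both passes must be run from the same directory, because the first pass writes a log file that the second one reads.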
Don't transcode if you don't have to.
A 50 MB H.264 video doesn't need re-encoding to be uploaded; the file is already efficient. The case for compression is a clearly oversized source: a 2 GB screen recording for a 5-minute clip, a 4K source you'll deliver at 1080p, a raw ProRes export. If the source is already reasonably sized for the medium, leave it alone.
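One rough way to judge "clearly oversized" is the average bitrate: file size divided by duration. The sketch below is only a heuristic; the function names are made up, and the 10 Mbps cutoff is a hypothetical threshold you would pick for your own medium, not a standard figure.

```python
def average_mbps(size_mb: float, duration_s: float) -> float:
    """Average overall bitrate in megabits per second."""
    return size_mb * 8 / duration_s


def looks_oversized(size_mb: float, duration_s: float,
                    threshold_mbps: float) -> bool:
    """True when the average bitrate exceeds a caller-chosen threshold."""
    return average_mbps(size_mb, duration_s) > threshold_mbps


# The 2 GB, 5-minute screen recording from the text.
print(f"{average_mbps(2000, 5 * 60):.1f} Mbps")

# Assumption: 10 Mbps is a hypothetical cutoff, not a standard figure.
print(looks_oversized(2000, 5 * 60, threshold_mbps=10))
```

The 2 GB recording averages roughly 53 Mbps, far more than a screen recording needs, which is why it is a good candidate for re-encoding.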