Understanding checksum verification
Bytes in, bytes match, done.
Why the .sha256 file next to every download exists, when a checksum is enough, and when you actually need a signature.
What a checksum is for.
People compute and check checksums for two reasons. First, integrity: did the download complete without corruption? Network glitches and disk errors can flip bytes; the hash catches it. Second, authenticity: is this the file the publisher intended? That's a stronger claim, and a checksum alone can't guarantee it, because the same channel that delivers a tampered file can deliver a tampered hash. For authenticity you need a signature: a hash bound to the publisher's private key, which anyone can check with the matching public key.
The published-hash workflow.
A serious project ships its release as release-1.0.tar.gz plus a companion file like release-1.0.tar.gz.sha256 containing the expected hash. You download both, compute the hash of the archive locally, compare. They match: download succeeded. They don't: redownload. The hashing algorithm is usually SHA-256 or SHA-512 these days; MD5 is still common in legacy contexts but shouldn't be trusted for anything adversarial.
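The workflow above can be sketched in Python. This is a minimal sketch, not any project's official tooling: the function names sha256_of_file and matches_companion are hypothetical, and it assumes the companion file holds either a bare hex digest or a sha256sum-style line where the digest is the first whitespace-separated token.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path) -> str:
    """Return the SHA-256 of a file as lowercase hex, reading 1 MiB at a time."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        while chunk := handle.read(1024 * 1024):
            digest.update(chunk)
    return digest.hexdigest()


def matches_companion(archive: Path, companion: Path) -> bool:
    """Compare the archive's hash with the digest stored in the companion file.

    Assumption: the digest is the first whitespace-separated token, which
    covers both a bare hex string and a "digest  filename" line.
    """
    tokens = companion.read_text().split()
    if not tokens:
        return False  # an empty companion file cannot confirm anything
    return sha256_of_file(archive) == tokens[0].lower()
```

Calling matches_companion(Path("release-1.0.tar.gz"), Path("release-1.0.tar.gz.sha256")) returns True when the two hashes agree; on False, redownload.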
A worked verification.
Linux ISO download: the project publishes the line 9a1b...c2d3 *ubuntu-24.04.iso in a signed SHA256SUMS file. You run sha256sum ubuntu-24.04.iso on your machine and compare. The hash at the start of the command's output is what your machine computed from the file you actually downloaded; the published hash is what the file should be. A mismatch means the download is corrupt or has been tampered with at some hop along the way.
Local + published: the computed hash matches the published hash. Compare the two hex strings byte-for-byte: 9a1b...c2d3 == 9a1b...c2d3, so the file is verified.
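A checksum list in this style is straightforward to parse. The sketch below assumes the sha256sum line format: a hex digest, a space, then a mode marker (a space for text mode or * for binary mode) directly before the file name. The helper names parse_sums and is_verified are hypothetical, the digest in the usage lines is computed from stand-in bytes rather than taken from any real release, and the sketch does not check the signature on the list itself.

```python
import hashlib


def parse_sums(text: str) -> dict[str, str]:
    """Map each file name in a checksum list to its lowercase hex digest."""
    expected = {}
    for line in text.splitlines():
        digest, _, rest = line.partition(" ")
        # rest starts with the mode marker (" " or "*"); the name follows it.
        name = rest[1:]
        if digest and name:
            expected[name] = digest.lower()
    return expected


def is_verified(name: str, computed_hex: str, expected: dict[str, str]) -> bool:
    """True only if the list has an entry for name and the digests are equal."""
    published = expected.get(name)
    return published is not None and computed_hex.lower() == published


# Stand-in data: hash some bytes and build a one-line list that contains them.
computed = hashlib.sha256(b"stand-in for the ISO contents").hexdigest()
listing = f"{computed} *ubuntu-24.04.iso\n"
print(is_verified("ubuntu-24.04.iso", computed, parse_sums(listing)))
```

A file that is missing from the list is treated as unverified rather than as an error, which keeps the caller's decision down to a single boolean.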
Why a CRC isn't enough.
CRC32 is fast and small but only 32 bits — collisions are common, and it's trivial to engineer a file with any target CRC. CRC is the right tool for detecting random bit-flips on a noisy channel (Ethernet frames, ZIP entry validation). It's the wrong tool for anything where someone might be deliberately producing a file. Use SHA-256 the moment adversaries enter the picture.
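How small 32 bits is can be seen with a short experiment. The sketch below looks for an accidental CRC32 collision by brute force; by the birthday bound one is expected after roughly a hundred thousand random-looking inputs. It shows accidental collisions only, not the targeted forgery described above, and the function name find_crc32_collision and its limit parameter are hypothetical.

```python
import hashlib
import zlib


def find_crc32_collision(limit: int = 2_000_000):
    """Return two different byte strings with the same CRC32, or None.

    Inputs are derived from a counter via SHA-256 so they look random.
    Plain counters would not work: CRC32 never collides for same-length
    inputs that differ only inside a window of 32 bits or fewer.
    """
    seen = {}
    for counter in range(limit):
        message = hashlib.sha256(counter.to_bytes(8, "big")).digest()[:16]
        crc = zlib.crc32(message)
        earlier = seen.get(crc)
        if earlier is not None and earlier != message:
            return earlier, message
        seen[crc] = message
    return None


pair = find_crc32_collision()
if pair is not None:
    first, second = pair
    print("same CRC32:", zlib.crc32(first) == zlib.crc32(second))
    print("same SHA-256:", hashlib.sha256(first).digest() == hashlib.sha256(second).digest())
```

The same search against SHA-256 would need on the order of 2^128 inputs, which is why the two tools suit different jobs.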
Streaming, not loading.
Hashing a 10 GB file does not require holding 10 GB in memory. Hash functions are streamable: read a chunk, feed it to the hash state, repeat, finalise at the end. Nearly every mainstream implementation exposes this; the browser's Web Crypto API is the notable exception (crypto.subtle.digest expects the whole buffer at once), which forces streaming workarounds for large files in the browser.
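The read, feed, repeat, finalise loop looks like this with Python's hashlib. It is a minimal sketch: hash_chunks is a hypothetical helper name, and the chunks can come from any iterable, such as a generator that reads a file or a network response piece by piece.

```python
import hashlib
from typing import Iterable


def hash_chunks(chunks: Iterable[bytes], algorithm: str = "sha256") -> str:
    """Feed chunks into one hash state and finalise once at the end."""
    state = hashlib.new(algorithm)
    for chunk in chunks:
        state.update(chunk)  # only the current chunk is held in memory
    return state.hexdigest()


data = b"0123456789" * 1000
one_shot = hashlib.sha256(data).hexdigest()

# Chunk boundaries do not affect the result: 7-byte and 4096-byte pieces
# both produce the same digest as hashing the whole buffer at once.
small_pieces = (data[i:i + 7] for i in range(0, len(data), 7))
large_pieces = (data[i:i + 4096] for i in range(0, len(data), 4096))
print(hash_chunks(small_pieces) == one_shot)
print(hash_chunks(large_pieces) == one_shot)
```

Because the digest depends only on the concatenated bytes, the chunk size is a performance knob rather than a correctness concern.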
From checksum to signature.
A signature says, in effect: "this is the SHA-256 of the file, and here is proof that the holder of my private key signed it". A verifier with the public key can confirm both that the hash matches and that whoever signed it controlled the key. PGP signatures (.asc files) and Sigstore / cosign are the common forms. If the publisher posts a signed hash, the chain of trust extends beyond the download server: even if their CDN is compromised, the signature still proves origin, provided you obtained the public key through a channel the attacker does not control.
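The hash-then-sign idea can be illustrated with textbook RSA using only the standard library. This is a toy under loud assumptions: the key is built from two small Mersenne primes, there is no padding, and the digest is reduced modulo the key, so it is not secure and is not how PGP or cosign work internally. The names sign, verify and digest_as_int are hypothetical.

```python
import hashlib

# Toy key material, NOT secure: two Mersenne primes give a ~150-bit modulus.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (Python 3.8+)


def digest_as_int(data: bytes) -> int:
    """SHA-256 of the data, reduced mod N so it fits this toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data: bytes) -> int:
    """Publisher side: raise the digest to the private exponent."""
    return pow(digest_as_int(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """Verifier side: undo the signature with the public exponent and
    compare against a digest computed locally from the received data."""
    return pow(signature, E, N) == digest_as_int(data)


signature = sign(b"release contents")
print(verify(b"release contents", signature))
print(verify(b"tampered contents", signature))
```

The verifier never needs the private exponent D: an attacker who controls the download server can swap the file and the hash, but cannot produce a signature that passes verify without it.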