How does image compression work?

December 2025 – Lee Robinson

We use images every day but rarely stop to think about how they work.

Reality has a lot of complexity. Obviously there are files, but what's inside those files? How do we compress them down to smaller files without losing quality?

Compressing data (text, images, video, audio) is a core part of how computers work. It's the foundation of the internet, the file system, and most of the software we use every day. It's a fascinating topic!

There are two approaches. You can throw away the parts humans won't notice. That's lossy compression. Or you can be clever about how you store every pixel. That's lossless compression. These two ideas power JPEG, PNG, ZIP files, web traffic, and most of the internet.

What is an image, really?

Images look smooth and continuous, but zoom in far enough and you'll see they're made of tiny colored squares called pixels.

[Image: a beautiful sunset]

Each pixel is three numbers^¹: how much red, green, and blue to mix together. Each value ranges from 0 (none) to 255 (maximum). For example, red 255, green 147, and blue 41 mix to an orange:

rgb(255, 147, 41)

If you have an HD screenshot (1920 × 1080 pixels), it contains over 2 million pixels. Each pixel needs 3 bytes (one each for R, G, and B). That's 6.2 megabytes for a single image.

1,920 × 1,080 = 2,073,600 pixels
2,073,600 × 3 bytes = 6,220,800 bytes ≈ 6.2 MB
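Here is the same arithmetic as a tiny Python sketch. The function name raw_size_bytes is made up for illustration, and it assumes one byte per channel with no headers or padding.

```python
def raw_size_bytes(width: int, height: int, channels: int = 3) -> int:
    """Uncompressed size, assuming one byte per channel per pixel."""
    return width * height * channels


hd_rgb = raw_size_bytes(1920, 1080)       # red, green, blue
hd_rgba = raw_size_bytes(1920, 1080, 4)   # plus an alpha channel

print(f"{hd_rgb:,} bytes = {hd_rgb / 1_000_000:.1f} MB")   # 6,220,800 bytes = 6.2 MB
print(f"{hd_rgba:,} bytes with alpha")
```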

You need a way to shrink this to a more manageable size. There are two solutions: throw away data or be clever about how you store every pixel.

The simple approach: throw away data

The fastest way to shrink a file is to delete parts of it.

Lossy compression (used by JPEG) exploits how human vision works. Your eyes resolve brightness in fine detail but blur color together. You notice a blurry edge, but you won't notice if colors are slightly off.

JPEG takes advantage of this in a clever way. First, it converts pixels from RGB to a format called YCbCr that separates brightness from color. Next, it reduces the color resolution (chroma subsampling), since your eyes won't notice. Then it divides the image into 8×8 pixel blocks and transforms each block into "frequency components." Low frequencies represent smooth gradients. High frequencies represent fine details like sharp edges.
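To make the first two steps concrete, here is a rough Python sketch. It assumes the full-range YCbCr conversion used by JFIF (the common JPEG file wrapper) and plain 2×2 averaging for the color planes. Real encoders offer several subsampling modes, and the function names here are hypothetical.

```python
def rgb_to_ycbcr(r: float, g: float, b: float):
    """Split a pixel into brightness (Y) and two color-difference values (Cb, Cr)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def subsample_2x2(plane):
    """Average each 2x2 block of a color plane (assumes even width and height)."""
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[y][x] + plane[y][x + 1] + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


# The orange pixel from earlier, split into brightness and color.
y, cb, cr = rgb_to_ycbcr(255, 147, 41)
print(round(y), round(cb), round(cr))

# A 2x4 color plane shrinks to 1x2: a quarter of the samples.
print(subsample_2x2([[0, 0, 8, 8], [0, 0, 8, 8]]))
```

The brightness plane stays at full resolution. Only the two color planes get averaged down, which is where the savings come from.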

Since photographs are mostly smooth gradients, most of the information ends up in just a few low-frequency components. JPEG throws away the high-frequency details you won't miss, keeping the smooth parts intact^².
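Here is a minimal sketch of that idea on a single 8×8 block, assuming a naive (slow) 2D DCT and one uniform quantization step. Real JPEG encoders shift values by 128 first and use a table with a different step per frequency, so treat the step value of 16 and the function names as illustrative assumptions.

```python
import math

N = 8  # JPEG block size


def dct_2d(block):
    """Naive 2D DCT-II of an N x N block, with orthonormal scaling."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency
        for v in range(N):      # horizontal frequency
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (
                        block[y][x]
                        * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                        * math.cos((2 * x + 1) * v * math.pi / (2 * N))
                    )
            out[u][v] = scale(u) * scale(v) * total
    return out


def quantize(coeffs, step):
    """Divide and round: small (mostly high-frequency) values collapse to 0."""
    return [[round(c / step) for c in row] for row in coeffs]


# A smooth horizontal gradient: 100, 102, ..., 114 on every row.
block = [[100 + 2 * x for x in range(N)] for _ in range(N)]
coeffs = dct_2d(block)
quantized = quantize(coeffs, step=16)
nonzero = sum(1 for row in quantized for value in row if value != 0)
print(f"{nonzero} of {N * N} coefficients survive quantization")
```

For a smooth block like this one, only a couple of low-frequency values survive. The long run of zeros that remains is cheap to store.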

At high quality, you barely notice the difference. But crank up the compression and you'll see blocking artifacts where fine details get lost.

With lossy compression, you can't get the original pixels back. But the tradeoff is often worth it: 10-20x smaller files with minimal visible difference.

The clever approach: keep everything

What if you need every pixel perfect? For example, screenshots with text or graphics with sharp edges.

Lossless compression (used by PNG) preserves every single pixel while still shrinking the file. It relies on three tricks.

Trick 1: predict your neighbors

Adjacent pixels are usually similar. Look at a row of pixels in a blue sky: they might be 128, 130, 132, 134... all close to each other.

Instead of storing each value, we can store how much each pixel differs from its neighbor:

Original pixel values (blue sky): 128, 130, 132, 134, 136, 138, 140, 142
Stored as differences from the left neighbor: 128, 2, 2, 2, 2, 2, 2, 2

Each number represents a blue channel value. The first value has no neighbor, so it is stored as-is.

This is called filtering or prediction. The differences are usually small numbers, and small numbers are easier to compress.
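The blue-sky row can be run through a small Python sketch of this idea. It assumes one byte per pixel (a single channel), and the function names are made up for illustration. In a real PNG, the "left neighbor" is the matching byte of the previous pixel, so with RGB you look 3 bytes back.

```python
def sub_filter(row: bytes) -> bytes:
    """Store each byte as its difference from the left neighbor, modulo 256."""
    out = bytearray()
    prev = 0  # the first pixel has no left neighbor, so predict 0
    for value in row:
        out.append((value - prev) % 256)
        prev = value
    return bytes(out)


def sub_unfilter(filtered: bytes) -> bytes:
    """Undo sub_filter by adding the differences back up."""
    out = bytearray()
    prev = 0
    for delta in filtered:
        prev = (prev + delta) % 256
        out.append(prev)
    return bytes(out)


sky = bytes([128, 130, 132, 134, 136, 138, 140, 142])
deltas = sub_filter(sky)
print(list(deltas))                  # [128, 2, 2, 2, 2, 2, 2, 2]
print(sub_unfilter(deltas) == sky)   # True
```

Because the arithmetic wraps at 256, every difference still fits in one byte and the step is fully reversible.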

PNG supports several prediction methods:

Sub: Predict from the left neighbor (best for horizontal gradients)
Up: Predict from the pixel above (best for vertical gradients)
Average: Predict from the average of left and above
Paeth: Pick whichever of left, above, or upper-left is closest to left + above − upper-left

PNG encoders can pick a different method for each row. A common strategy is to try them all and keep the one that looks like it will compress best.
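As a rough sketch of how that per-row choice could work, the Python below implements the Paeth predictor as the PNG specification describes it, then scores each filter by the sum of its output magnitudes. That scoring rule is a common heuristic, not the only option, and the helper names (cost, choose_filter) are hypothetical. It assumes one byte per pixel.

```python
def paeth_predictor(left: int, above: int, upper_left: int) -> int:
    """Return whichever neighbor is closest to left + above - upper_left."""
    estimate = left + above - upper_left
    d_left = abs(estimate - left)
    d_above = abs(estimate - above)
    d_upper_left = abs(estimate - upper_left)
    if d_left <= d_above and d_left <= d_upper_left:
        return left
    if d_above <= d_upper_left:
        return above
    return upper_left


def cost(filtered) -> int:
    """Heuristic score: read each byte as a signed value and sum the magnitudes."""
    return sum(min(b, 256 - b) for b in filtered)


def choose_filter(row, prev_row):
    """Try Sub, Up, Average, and Paeth on one row; keep the cheapest."""
    def left(i):
        return row[i - 1] if i else 0

    def upper_left(i):
        return prev_row[i - 1] if i else 0

    idx = range(len(row))
    candidates = {
        "Sub": [(row[i] - left(i)) % 256 for i in idx],
        "Up": [(row[i] - prev_row[i]) % 256 for i in idx],
        "Average": [(row[i] - (left(i) + prev_row[i]) // 2) % 256 for i in idx],
        "Paeth": [
            (row[i] - paeth_predictor(left(i), prev_row[i], upper_left(i))) % 256
            for i in idx
        ],
    }
    best = min(candidates, key=lambda name: cost(candidates[name]))
    return best, candidates[best]


# A row identical to the one above it: predicting from above leaves all zeros.
print(choose_filter([10, 20, 30, 40], [10, 20, 30, 40]))
```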

Trick 2: find patterns

After filtering, our data might look like: 0, 2, 0, 0, 0, 2, 0, 0, 0, 2...

Notice the repetition? A clever algorithm called LZ77 spots these patterns and replaces them with back-references.

Instead of storing the same sequence twice, we say "go back 5 values and copy 5." This tiny reference is smaller than repeating the values.

LZ77 maintains a "sliding window" of recent data. In DEFLATE, the format PNG uses, that window covers up to the last 32 KB. Any pattern within that window can be referenced instead of repeated. For images with lots of similar regions (like solid backgrounds or gradients), this helps reduce the file size significantly.
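Here is a deliberately simple Python sketch of the back-reference idea. It does a brute-force, greedy search for the longest earlier match. The token format ("lit" and "copy" tuples) is invented for illustration, and the minimum match length of 3 mirrors DEFLATE. Real encoders use hash tables and smarter matching.

```python
def lz77_compress(data, window=32 * 1024, min_match=3):
    """Greedy LZ77: emit literals, or (offset, length) copies from earlier data."""
    tokens = []
    i = 0
    while i < len(data):
        best_len, best_off = 0, 0
        for j in range(max(0, i - window), i):
            length = 0
            # A match may run past position i and overlap the data being copied.
            while i + length < len(data) and data[j + length] == data[i + length]:
                length += 1
            if length > best_len:
                best_len, best_off = length, i - j
        if best_len >= min_match:
            tokens.append(("copy", best_off, best_len))
            i += best_len
        else:
            tokens.append(("lit", data[i]))
            i += 1
    return tokens


def lz77_decompress(tokens):
    """Replay literals and copies to rebuild the original values."""
    out = []
    for token in tokens:
        if token[0] == "lit":
            out.append(token[1])
        else:
            _, offset, length = token
            for _ in range(length):
                out.append(out[-offset])
    return out


data = [0, 2, 0, 0, 0, 2, 0, 0, 0, 2]
tokens = lz77_compress(data)
print(tokens)
# [('lit', 0), ('lit', 2), ('lit', 0), ('lit', 0), ('copy', 4, 6)]
print(lz77_decompress(tokens) == data)   # True
```

On the sample sequence, the sketch stores four literals and then a single "go back 4, copy 6" reference. The copy is longer than the distance it reaches back, which works because the decoder writes values one at a time.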

Trick 3: shorter codes for common things

After filtering and pattern-finding, some values appear more often than others. The value 0 (no difference) might appear 60% of the time, while 127 might appear only 5%.

Huffman coding gives common values shorter binary codes:

Normally, every value uses 8 bits:

0: 00000000 (8 bits)
2: 00000010 (8 bits)
255: 11111111 (8 bits)
127: 01111111 (8 bits)

With Huffman coding:

0 (appears 60%) gets just 1 bit: 0
2 (appears 25%) gets 2 bits: 10
255 (appears 10%) gets 3 bits: 110
127 (appears 5%) gets 3 bits: 111

For 100 values, the math^³ works out to about 5x smaller: 60×1 + 25×2 + 10×3 + 5×3 = 155 bits instead of 100×8 = 800 bits.

This is the same idea behind Morse code. The letter "E" (common) is just "•" while "Q" (rare) is "— — • —".
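The code lengths in the Huffman example can be reproduced with a short Python sketch. It only computes how many bits each value gets, not the actual bit patterns, and it uses the textbook "merge the two rarest" construction with a heap. The function name is made up, and DEFLATE adds extra rules (such as a maximum code length) that are skipped here.

```python
import heapq
from itertools import count


def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a frequency table."""
    tiebreak = count()  # keeps heap entries comparable when frequencies tie
    heap = [(freq, next(tiebreak), [sym]) for sym, freq in freqs.items()]
    heapq.heapify(heap)
    lengths = {sym: 0 for sym in freqs}
    while len(heap) > 1:
        f1, _, syms1 = heapq.heappop(heap)
        f2, _, syms2 = heapq.heappop(heap)
        # Every symbol under the merged node moves one level deeper in the tree.
        for sym in syms1 + syms2:
            lengths[sym] += 1
        heapq.heappush(heap, (f1 + f2, next(tiebreak), syms1 + syms2))
    return lengths


freqs = {0: 60, 2: 25, 255: 10, 127: 5}   # occurrences per 100 values
lengths = huffman_code_lengths(freqs)
total_bits = sum(freqs[sym] * lengths[sym] for sym in freqs)

print(lengths)      # {0: 1, 2: 2, 255: 3, 127: 3}
print(total_bits)   # 155
```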

The full pipeline

PNG combines all three tricks in sequence:

Original → Filtering → DEFLATE (LZ77 → Huffman) → PNG

File size: 6.2 MB (raw RGB pixels) → 2.1 MB

Filtering: Store differences, not values
LZ77: Replace repeated patterns with references
Huffman: Give common values shorter codes
Wrap in PNG format: Add headers and checksums

For a photograph like our sunset, a 6.2 MB image becomes around 2.1 MB. That's about 3x smaller, with zero quality loss. Screenshots with large solid color regions can achieve 10-50x compression.
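You can try the filter-then-DEFLATE combination with Python's built-in zlib module, which implements DEFLATE (LZ77 plus Huffman). This is a sketch under loud assumptions: the "image" is a seeded random walk standing in for one long row of smooth pixel values, there is a single channel, and only the Sub filter is used. The sizes it prints say nothing about real photographs.

```python
import random
import zlib


def sub_filter(row: bytes) -> bytes:
    """Difference from the left neighbor, wrapped to an unsigned byte."""
    return bytes((row[i] - (row[i - 1] if i else 0)) % 256 for i in range(len(row)))


def sub_unfilter(filtered: bytes) -> bytes:
    """Undo sub_filter by adding the differences back up."""
    out = bytearray()
    prev = 0
    for delta in filtered:
        prev = (prev + delta) % 256
        out.append(prev)
    return bytes(out)


# Synthetic "smooth" data: each value is within 1 of its neighbor (clamped to 0..255).
random.seed(0)
value, pixels = 128, bytearray()
for _ in range(4096):
    value = max(0, min(255, value + random.choice((-1, 0, 1))))
    pixels.append(value)
raw = bytes(pixels)

deflate_only = zlib.compress(raw, 9)
filter_then_deflate = zlib.compress(sub_filter(raw), 9)
restored = sub_unfilter(zlib.decompress(filter_then_deflate))

print("raw:", len(raw), "bytes")
print("DEFLATE only:", len(deflate_only), "bytes")
print("filter + DEFLATE:", len(filter_then_deflate), "bytes")
print("lossless:", restored == raw)
```

Filtering turns slowly drifting values into a stream of mostly 0, 1, and 255, which is exactly the kind of data LZ77 and Huffman handle well.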

The PNG file format

All this compressed data gets wrapped in a structured format. A PNG file is organized into chunks, each with a specific purpose:

PNG signature (bytes 0-7, shown in hex): 89 50 4E 47 0D 0A 1A 0A

The first 8 bytes identify this as a PNG file:

89: High bit set, which detects 7-bit transfer corruption
50 4E 47: The letters "PNG" in ASCII
0D 0A: CR-LF, which detects DOS/Unix line-ending conversion
1A: Ctrl+Z, which stops the DOS type command
0A: LF, which detects Unix/DOS line-ending conversion

PNG files include some other clever details. Many things can corrupt a file, such as a bad network connection or a buggy program. PNG detects this with checksums: each chunk ends with a CRC-32, a mathematical fingerprint of its contents. When you open a PNG, your computer recalculates this fingerprint. If it doesn't match, the file is damaged.
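Here is a hedged Python sketch of the signature and checksum checks. It builds a tiny 1×1 PNG in memory, then flips one byte to show the fingerprint mismatch. The chunk layout (length, type, data, CRC-32) and the chunk names IHDR, IDAT, and IEND come from the PNG specification rather than from this article, and the helper names are made up.

```python
import struct
import zlib

PNG_SIGNATURE = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Length (4 bytes) + type (4 bytes) + data + CRC-32 of type and data."""
    crc = zlib.crc32(chunk_type + data)
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def read_chunks(png: bytes):
    """Yield (type, data, crc_ok) for each chunk after the signature."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        chunk_type = png[pos + 4:pos + 8]
        data = png[pos + 8:pos + 8 + length]
        (stored_crc,) = struct.unpack(">I", png[pos + 8 + length:pos + 12 + length])
        yield chunk_type, data, zlib.crc32(chunk_type + data) == stored_crc
        pos += 12 + length


# A 1x1 image: width, height, 8-bit depth, color type 2 (RGB), then three zero flags.
ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 2, 0, 0, 0)
scanline = b"\x00" + bytes([255, 147, 41])  # filter type 0 (none) + one orange pixel
png = (
    PNG_SIGNATURE
    + make_chunk(b"IHDR", ihdr)
    + make_chunk(b"IDAT", zlib.compress(scanline))
    + make_chunk(b"IEND", b"")
)

damaged = bytearray(png)
damaged[16] ^= 0xFF  # corrupt the first data byte of the IHDR chunk

for label, blob in (("intact", png), ("damaged", bytes(damaged))):
    print(label, [(ctype.decode(), ok) for ctype, _, ok in read_chunks(blob)])
```

Only the chunk that was tampered with fails its check, so a reader can tell which part of the file went bad.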

Why this matters

Without compression, the internet as we know it couldn't exist.

Ideas from the 1940s-1970s (predict from context, reference what you've seen, short codes for common things) still power everything we do on computers:

ZIP files: LZ77 + Huffman (same as PNG)
Web traffic (gzip/Brotli): gzip uses the same DEFLATE algorithm; Brotli builds on the same LZ77 and Huffman ideas
Video (H.264): Block transforms + prediction (like JPEG, but across time)
Audio (MP3, AAC): Transform coding + quantization (same idea as JPEG)

Want to learn more? Check out pixo, a Rust library I built that implements these algorithms from scratch.

It's a fun project to play around with and learn more about image compression. I also made guides which go deeper on these topics!

There are cathedrals everywhere for those with the eyes to see.

¹: Images can also have a fourth channel: alpha (transparency). RGBA uses 4 bytes per pixel instead of 3. PNG supports alpha; JPEG doesn't, which is one reason PNG is preferred for graphics with transparency.

²: Real JPEG artifacts look different from simple color averaging because JPEG uses a mathematical transform called the discrete cosine transform (DCT). But the intuition is the same: nearby pixels get grouped together.

³: Filter differences wrap to unsigned bytes, so -1 becomes 255. PNG's DEFLATE format also Huffman-encodes the LZ77 back-references. See the Huffman guide for technical details.