How does image compression work?

December 2025 – Lee Robinson

We use images every day but rarely stop to think about how they work.

Reality has a lot of complexity. Obviously there are files, but what's inside those files? And how do we shrink them into smaller files without losing quality?

Compressing data (text, images, video, audio) is a core part of how computers work. It's the foundation of the internet, the file system, and most of the software we use every day. It's a fascinating topic!

There are two approaches. You can throw away the parts humans won't notice. That's lossy compression. Or you can be clever about how you store every pixel. That's lossless compression. These two ideas power JPEG, PNG, ZIP files, web traffic, and most of the internet.

What is an image, really?

Images look smooth and continuous, but zoom in far enough and you'll see they're made of tiny colored squares called pixels.

Image: A beautiful sunset

Each pixel is three numbers¹: how much red, green, and blue to mix together. Each value ranges from 0 (none) to 255 (maximum). For example, red 255, green 147, and blue 41 mix to rgb(255, 147, 41), a warm orange.

An HD screenshot (1920 × 1080 pixels) contains just over 2 million pixels. Each pixel needs 3 bytes (one each for R, G, and B). That's about 6.2 megabytes for a single uncompressed image.
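Here is that arithmetic as a quick Python sketch. It assumes plain 8-bit RGB with no alpha channel and counts a megabyte as 1,000,000 bytes.

```python
# Raw (uncompressed) size of an HD screenshot, assuming 3 bytes per pixel.
width, height = 1920, 1080
bytes_per_pixel = 3  # one byte each for red, green, and blue

pixel_count = width * height
raw_bytes = pixel_count * bytes_per_pixel
raw_megabytes = raw_bytes / 1_000_000  # decimal megabytes

print(pixel_count)              # 2073600
print(raw_bytes)                # 6220800
print(round(raw_megabytes, 1))  # 6.2
```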

You need a way to shrink this to a more manageable size. There are two solutions: throw away data or be clever about how you store every pixel.

The simple approach: throw away data

The fastest way to shrink a file is to delete parts of it.

Lossy compression (used by JPEG) exploits how human vision works. Your eyes resolve brightness in fine detail but blur color together. You notice a blurry edge, but you won't notice if colors are slightly off.

JPEG takes advantage of this in a clever way. First, it converts pixels from RGB to a different format (YCbCr) that separates brightness from color. It then stores the color channels at a lower resolution, since your eyes won't notice. Next, it divides the image into 8×8 pixel blocks and transforms each block into "frequency components." Low frequencies represent smooth gradients. High frequencies represent fine details like sharp edges.
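The color conversion step can be sketched in a few lines of Python. This is a rough illustration, not a real encoder: it uses the conversion weights commonly used in JPEG files (JFIF), and the helper names `rgb_to_ycbcr` and `subsample_2x2` are made up for this example. Averaging 2×2 groups is one common way to reduce color resolution; real encoders offer several modes.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to 8-bit YCbCr (JFIF weights)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b                 # brightness
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b      # blue-difference color
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b      # red-difference color

    def clamp(value):
        return max(0, min(255, round(value)))

    return clamp(y), clamp(cb), clamp(cr)


def subsample_2x2(plane):
    """Average each 2x2 group of a color plane (assumes even width and height)."""
    result = []
    for row in range(0, len(plane), 2):
        out_row = []
        for col in range(0, len(plane[0]), 2):
            total = (plane[row][col] + plane[row][col + 1]
                     + plane[row + 1][col] + plane[row + 1][col + 1])
            out_row.append(round(total / 4))
        result.append(out_row)
    return result


# A gray pixel has no color information, so both color channels sit at the midpoint.
print(rgb_to_ycbcr(128, 128, 128))  # (128, 128, 128)

# Four color samples collapse into one: a quarter of the data for that channel.
print(subsample_2x2([[100, 102], [104, 106]]))  # [[103]]
```

The brightness channel keeps its full resolution, which is why edges stay sharp even after the color planes shrink.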

Since photographs are mostly smooth gradients, most of the information ends up in just a few low-frequency components. JPEG throws away the high-frequency details you won't miss, keeping the smooth parts intact².
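To see why smooth blocks end up in just a few low-frequency components, here is a minimal pure-Python sketch of the 8×8 transform (a DCT, see footnote 2). The gradient block is made up for illustration, and the sketch stops before the step where JPEG actually discards precision.

```python
import math

N = 8


def dct_1d(values):
    """Orthonormal DCT-II of a length-8 sequence."""
    out = []
    for k in range(N):
        scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
        total = sum(v * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                    for n, v in enumerate(values))
        out.append(scale * total)
    return out


def dct_2d(block):
    """Apply the 1D transform to every row, then to every column."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d([rows[r][c] for r in range(N)]) for c in range(N)]
    # cols[c][r] holds vertical frequency r and horizontal frequency c.
    return [[cols[c][r] for c in range(N)] for r in range(N)]


# A smooth horizontal gradient, shifted to center around zero (value - 128).
block = [[(100 + 4 * col) - 128 for col in range(N)] for _ in range(N)]
coeffs = dct_2d(block)

total_energy = sum(c * c for row in coeffs for c in row)
low_energy = sum(coeffs[r][c] ** 2 for r in range(2) for c in range(2))

# Share of the block's energy held by the 4 lowest-frequency components (out of 64).
print(round(low_energy / total_energy, 3))
```

For this block, the four lowest-frequency components hold more than 99% of the energy, so the other 60 can be stored very coarsely with little visible change.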

At high quality, you barely notice the difference. But crank up the compression and you'll see blocking artifacts where fine details get lost.

With lossy compression, you can't get the original pixels back. But the tradeoff is often worth it: 10-20x smaller files with minimal visible difference.

The clever approach: keep everything

What if you need every pixel perfect? For example, screenshots with text or graphics with sharp edges.

Lossless compression (used by PNG) preserves every single pixel while still shrinking the file. It uses three tricks to shrink the file size.

Trick 1: predict your neighbors

Adjacent pixels are usually similar. Look at a row of pixels in a blue sky: they might be 128, 130, 132, 134... all close to each other.

Instead of storing each value, we can store how much each pixel differs from its neighbor:

Blue channel values: 128, 130, 132, 134, 136, 138, 140, 142

Stored as differences: 128, 2, 2, 2, 2, 2, 2, 2

The original values are all similar, so after the first one, every stored difference is just 2.

This is called filtering or prediction. The differences are usually small numbers, and small numbers are easier to compress.

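PNG supports five prediction methods, called filter types:

- None: store the value unchanged
- Sub: predict from the pixel to the left
- Up: predict from the pixel above
- Average: predict from the mean of the left and above pixels
- Paeth: predict from whichever of left, above, or upper-left is closest to left + above minus upper-left

Encoders can pick a different method for each row. Many try all five and keep the one they estimate will compress best.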
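Here is a minimal sketch of "predict from the left neighbor" on the blue-sky row from earlier. The function names are made up for this example, and it works on a single channel; a real PNG filter compares each byte with the matching byte of the previous pixel. The `paeth_predictor` function follows the definition in the PNG specification.

```python
def sub_filter(row):
    """Store each value minus the value to its left, wrapped to 0-255."""
    filtered = []
    previous = 0  # nothing exists left of the first value, so treat it as 0
    for value in row:
        filtered.append((value - previous) % 256)
        previous = value
    return filtered


def sub_unfilter(filtered):
    """Reverse sub_filter by adding the differences back up."""
    row = []
    previous = 0
    for delta in filtered:
        previous = (previous + delta) % 256
        row.append(previous)
    return row


def paeth_predictor(left, above, upper_left):
    """Pick the neighbor closest to left + above - upper_left."""
    estimate = left + above - upper_left
    d_left = abs(estimate - left)
    d_above = abs(estimate - above)
    d_upper_left = abs(estimate - upper_left)
    if d_left <= d_above and d_left <= d_upper_left:
        return left
    if d_above <= d_upper_left:
        return above
    return upper_left


row = [128, 130, 132, 134, 136, 138, 140, 142]
filtered = sub_filter(row)
print(filtered)                       # [128, 2, 2, 2, 2, 2, 2, 2]
print(sub_unfilter(filtered) == row)  # True

# Negative differences wrap around: 128 - 130 = -2 is stored as 254.
print(sub_filter([130, 128]))         # [130, 254]
```

Because the wraparound is reversible, no information is lost: the decoder adds the differences back and gets the exact original row.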

Trick 2: find patterns

After filtering, our data might look like: 0, 2, 0, 0, 0, 0, 2, 0, 0, 0...

Notice the repetition? A clever algorithm called LZ77 spots these patterns and replaces them with back-references.

The second group of five values (0, 2, 0, 0, 0) repeats the first. Instead of storing the same sequence twice, we say "go back 5 values and copy 5." This tiny reference is smaller than repeating the values.

In DEFLATE (the compression format PNG uses), LZ77 maintains a "sliding window" of the last 32 KB of data. Any pattern within that window can be referenced instead of repeated. For images with lots of similar regions (like solid backgrounds or gradients), this reduces the file size significantly.
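A toy version of the idea fits in a short Python sketch. This is a slow, greedy illustration with made-up token names (`"literal"` and `"copy"`), not how DEFLATE is actually implemented; real encoders use hash tables to find matches quickly and pack the tokens into bits.

```python
def lz77_encode(data, window=32 * 1024, min_match=3):
    """Greedy toy LZ77: emit literals or ("copy", distance, length) tokens."""
    tokens = []
    i = 0
    while i < len(data):
        best_length, best_distance = 0, 0
        for start in range(max(0, i - window), i):
            length = 0
            # A match may run past position i and overlap the data being written.
            while i + length < len(data) and data[start + length] == data[i + length]:
                length += 1
            if length > best_length:
                best_length, best_distance = length, i - start
        if best_length >= min_match:
            tokens.append(("copy", best_distance, best_length))
            i += best_length
        else:
            tokens.append(("literal", data[i]))
            i += 1
    return tokens


def lz77_decode(tokens):
    """Rebuild the original values from literals and back-references."""
    out = []
    for token in tokens:
        if token[0] == "literal":
            out.append(token[1])
        else:
            _, distance, length = token
            for _ in range(length):
                out.append(out[-distance])
    return out


data = [0, 2, 0, 0, 0, 0, 2, 0, 0, 0]

# "Go back 5 values and copy 5," written out by hand:
by_hand = [("literal", v) for v in [0, 2, 0, 0, 0]] + [("copy", 5, 5)]
print(lz77_decode(by_hand) == data)  # True

# The greedy encoder may choose a different split, but decoding is always exact.
print(lz77_decode(lz77_encode(data)) == data)  # True
```

Different encoders can produce different token sequences for the same input. All of them are valid as long as the decoder reproduces the original data.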

Trick 3: shorter codes for common things

After filtering and pattern-finding, some values appear more often than others. The value 0 (no difference) might appear 60% of the time, while 127 might appear only 5%.

Huffman coding gives common values shorter binary codes:

Value 0: 00000000 (8 bits)
Value 2: 00000010 (8 bits)
Value 255: 11111111 (8 bits)
Value 127: 01111111 (8 bits)

Normally, every value uses 8 bits. With Huffman coding:

- 0 (60% of values): 1 bit
- 2 (25%): 2 bits
- 255 (10%): 3 bits
- 127 (5%): 3 bits

The math³ works out to about 5x smaller: for every 100 values, 60×1 + 25×2 + 10×3 + 5×3 = 155 bits instead of 100×8 = 800 bits.

This is the same idea behind Morse code. The letter "E" (common) is just "•" while "Q" (rare) is "— — • —".
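The code lengths in the example above can be derived with a short Python sketch of Huffman's algorithm: repeatedly merge the two least common entries until one tree remains. The function name `huffman_code_lengths` is made up for this example, and it only computes how long each code is, not the actual bit patterns.

```python
import heapq
from itertools import count


def huffman_code_lengths(frequencies):
    """Return {symbol: code length in bits} for a Huffman code."""
    tie_breaker = count()  # keeps heap entries comparable when weights are equal
    heap = [(weight, next(tie_breaker), {symbol: 0})
            for symbol, weight in frequencies.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        weight_a, _, group_a = heapq.heappop(heap)
        weight_b, _, group_b = heapq.heappop(heap)
        # Merging two groups pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**group_a, **group_b}.items()}
        heapq.heappush(heap, (weight_a + weight_b, next(tie_breaker), merged))
    return heap[0][2]


# Occurrences per 100 values, taken from the example above.
frequencies = {0: 60, 2: 25, 255: 10, 127: 5}
lengths = huffman_code_lengths(frequencies)
total_bits = sum(frequencies[s] * lengths[s] for s in frequencies)

print(lengths[0], lengths[2], lengths[255], lengths[127])  # 1 2 3 3
print(total_bits)                                          # 155
```

One set of codes with these lengths would be 0, 10, 110, and 111. No code is the start of another, so the decoder always knows where one value ends and the next begins.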

The full pipeline

PNG combines all three tricks in sequence:

Original (raw RGB pixels, 6.2 MB) → Filtering → DEFLATE (LZ77 → Huffman) → PNG

File size: 6.2 MB → 2.1 MB

  1. Filtering: Store differences, not values
  2. LZ77: Replace repeated patterns with references
  3. Huffman: Give common values shorter codes
  4. Wrap in PNG format: Add headers and checksums

For a photograph like our sunset, a 6.2 MB image becomes around 2.1 MB. That's about 3x smaller, with zero quality loss. Screenshots with large solid-color regions can achieve 10-50x compression.
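You can try the first three steps with Python's built-in `zlib` module, which implements DEFLATE. This sketch uses a synthetic gradient rather than a real photo, so the sizes it prints are not comparable to the sunset numbers above. It only shows that filtering first helps DEFLATE, and that the round trip is exact.

```python
import zlib

# Synthetic smooth gradient: each value is 2 more than its left neighbor,
# wrapping around at 256. This stands in for one channel of a smooth image.
raw = bytes((2 * i) % 256 for i in range(3000))

# Step 1: filtering. Store the difference from the left neighbor, modulo 256.
filtered = bytes((raw[i] - (raw[i - 1] if i else 0)) % 256 for i in range(len(raw)))

# Steps 2 and 3: DEFLATE, which combines LZ77 and Huffman coding.
compressed_raw = zlib.compress(raw, 9)
compressed_filtered = zlib.compress(filtered, 9)

print(len(raw), len(compressed_raw), len(compressed_filtered))

# Lossless check: decompress, then undo the filter.
restored = bytearray()
previous = 0
for delta in zlib.decompress(compressed_filtered):
    previous = (previous + delta) % 256
    restored.append(previous)

print(bytes(restored) == raw)  # True
```

After filtering, the gradient becomes a long run of identical differences, which is about the easiest input DEFLATE can get.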

The PNG file format

All this compressed data gets wrapped in a structured format. A PNG file is organized into chunks, each with a specific purpose:

Raw bytes (hex), one chunk per line:

89 50 4E 47 0D 0A 1A 0A
00 00 00 0D 49 48 44 52 00 00 00 80 00 00 00 60 08 02 00 00 00 5A 18 ED 1E
00 00 00 0C 49 44 41 54 78 9C 63 60 60 60 60 00 00 00 04 00 EA 81 6A 41
00 00 00 00 49 45 4E 44 AE 42 60 82

PNG Signature (bytes 0-7, 8 bytes)

The first 8 bytes identify this as a PNG file:

- 89: High bit detects 7-bit transfer corruption
- 50 4E 47: The letters "PNG" in ASCII
- 0D 0A: CR-LF detects DOS/Unix conversion
- 1A: Ctrl+Z stops the DOS type command
- 0A: LF detects Unix/DOS conversion

PNG files include some other clever details. Many things can corrupt a file, such as a bad network connection or a buggy program. PNG detects this with checksums: each chunk ends with a mathematical fingerprint (a CRC-32) of its type and data. When you open a PNG, your computer recalculates this fingerprint. If it doesn't match, the file is damaged.
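Here is a minimal sketch of the chunk layout and checksum check, using Python's `struct` and `zlib` modules. The helpers `make_chunk` and `read_chunks` are made up for this example, and the 1×1 image it builds is only there to have something to verify. It is not a full PNG reader.

```python
import struct
import zlib

PNG_SIGNATURE = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])


def make_chunk(chunk_type, data):
    """Build a chunk: 4-byte length, 4-byte type, data, CRC-32 of type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def read_chunks(png_bytes):
    """Yield (type, data, crc_ok) for every chunk after the signature."""
    if png_bytes[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    offset = 8
    while offset < len(png_bytes):
        (length,) = struct.unpack(">I", png_bytes[offset:offset + 4])
        chunk_type = png_bytes[offset + 4:offset + 8]
        data = png_bytes[offset + 8:offset + 8 + length]
        (stored_crc,) = struct.unpack(
            ">I", png_bytes[offset + 8 + length:offset + 12 + length])
        crc_ok = (zlib.crc32(chunk_type + data) & 0xFFFFFFFF) == stored_crc
        yield chunk_type, data, crc_ok
        offset += 12 + length


# A 1x1 RGB image: width 1, height 1, bit depth 8, color type 2 (RGB).
ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 2, 0, 0, 0)
scanline = bytes([0, 255, 147, 41])  # filter type 0 (None), then one RGB pixel
png = (PNG_SIGNATURE
       + make_chunk(b"IHDR", ihdr)
       + make_chunk(b"IDAT", zlib.compress(scanline))
       + make_chunk(b"IEND", b""))

for chunk_type, data, crc_ok in read_chunks(png):
    print(chunk_type.decode("ascii"), len(data), crc_ok)

# Flip a single bit inside the IHDR data and the fingerprint no longer matches.
damaged = bytearray(png)
damaged[16] ^= 0x01
print([crc_ok for _, _, crc_ok in read_chunks(bytes(damaged))])  # [False, True, True]
```

The empty IEND chunk always produces the same 12 bytes, which is why every PNG file ends with 49 45 4E 44 AE 42 60 82.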

Why this matters

Without compression, the internet as we know it couldn't exist.

Ideas from the 1940s-1970s (predict from context, reference what you've seen, short codes for common things) still power everything we do on computers.

Want to learn more? Check out pixo, a Rust library I built that implements these algorithms from scratch.

It's a fun project to play around with and learn more about image compression. I also made guides which go deeper on these topics!

There are cathedrals everywhere for those with the eyes to see.

¹: Images can also have a fourth channel: alpha (transparency). RGBA uses 4 bytes per pixel instead of 3. PNG supports alpha; JPEG doesn't, which is one reason PNG is preferred for graphics with transparency.

²: Real JPEG artifacts look different because JPEG uses a mathematical transform called the discrete cosine transform (DCT), not color averaging. But the intuition is the same: nearby pixels get grouped together.

³: Filter differences wrap to unsigned bytes, so -1 becomes 255. PNG's DEFLATE format also Huffman-encodes the LZ77 back-references. See the Huffman guide for technical details.