Image Compression with Clustering: What I learned

TL;DR: I used K-means to shrink images by grouping similar colors, and it turns out you don’t need many colors to keep an image looking good (and some choices run a lot faster than others).

This post focuses on intuition and observations rather than implementation details, out of respect for academic integrity and future students.

Why This Was Interesting

In one of my Computational Data Analysis (CDA) homework assignments, I explored how clustering, a classic idea from data science, can be used for image compression.

A digital color image is made up of millions of pixels, and each pixel has three numbers describing its Red, Green, and Blue (RGB) values.

The key idea behind this homework is simple: Instead of storing every exact color, we can group similar colors together and represent them with a smaller set of “representative” colors.

This is exactly what K-means clustering does.

  • Each pixel is treated as a point in color space

  • Similar colors are grouped into clusters

  • Each cluster is replaced by its average color

The result is an image with fewer distinct colors, smaller storage size, and often surprisingly good visual quality.
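The three steps above can be sketched as a generic, from-scratch K-means in NumPy. To be clear, this is a textbook illustration rather than my assignment code, and the toy "image" is just random pixels:

```python
import numpy as np

def kmeans_quantize(pixels, k, iters=20, seed=0):
    """Cluster (n, 3) RGB pixels into k colors; return the quantized pixels."""
    rng = np.random.default_rng(seed)
    # Initialize centroids by picking k distinct random pixels.
    centroids = pixels[rng.choice(len(pixels), k, replace=False)]
    for _ in range(iters):
        # Assignment step: nearest centroid by squared L2 distance.
        dists = ((pixels[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each centroid becomes the mean color of its cluster.
        for j in range(k):
            members = pixels[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return centroids[labels], labels

# Toy "image": 1000 random RGB pixels in [0, 1).
pixels = np.random.default_rng(1).random((1000, 3))
quantized, labels = kmeans_quantize(pixels, k=6)
print(len(np.unique(labels)))  # at most 6 distinct colors remain
```

For a real image you would first reshape the (height, width, 3) array to (height × width, 3), run the clustering, then reshape back.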

What I Did at a High Level

I implemented K-means from scratch (no library shortcuts) and tested it on:

  • parrots.png

  • football.bmp

  • One image of my own choosing

For each image, I tried different numbers of color clusters:

K = 3, 6, 12, 24, 48

I also compared two ways of measuring color similarity:

  • L1 (Manhattan) distance

  • L2 (Euclidean) distance

For each experiment, I recorded:

  • The reconstructed (compressed) image

  • How long the algorithm took to converge

  • How many iterations it needed
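Tracking runtime and iteration count is straightforward: stop when the centroids stop moving, and count how many passes that took. Here is a hedged sketch of that bookkeeping; the `tol` threshold and the timing approach are my illustration, not necessarily what the assignment specified:

```python
import time
import numpy as np

def kmeans_with_stats(pixels, k, tol=1e-4, max_iters=100, seed=0):
    """Basic K-means that also reports iteration count and wall-clock time."""
    rng = np.random.default_rng(seed)
    centroids = pixels[rng.choice(len(pixels), k, replace=False)]
    start = time.perf_counter()
    for it in range(1, max_iters + 1):
        d = ((pixels[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        new = centroids.copy()
        for j in range(k):
            if (labels == j).any():
                new[j] = pixels[labels == j].mean(axis=0)
        shift = np.abs(new - centroids).max()
        centroids = new
        if shift < tol:  # converged: centroids have effectively stopped moving
            break
    return centroids, it, time.perf_counter() - start

pixels = np.random.default_rng(2).random((500, 3))
_, iters, secs = kmeans_with_stats(pixels, k=6)
print(iters, round(secs, 4))
```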

What the Images Show

Effect of Increasing K (Number of Clusters)

Across all images, the pattern was consistent:

  • Small K (3, 6)

    • Very few colors

    • Strong “posterized” look

    • Loss of fine detail

  • Medium K (12, 24)

    • Major structures preserved

    • Much better visual quality

    • Good balance between compression and clarity

  • Large K (48)

    • Very close to the original image

    • Diminishing visual improvement

    • Much higher computation cost

In practice, most of the visual improvement happens early. After a point, increasing K mostly increases runtime rather than noticeable quality.
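One way to see why large K pays off less on the storage side: with a K-color palette, each pixel index needs only about ⌈log₂ K⌉ bits instead of 24, so the theoretical savings shrink slowly as K grows. A back-of-the-envelope estimate (my own simplification, ignoring real file-format overhead and any entropy coding):

```python
import math

def compression_ratio(n_pixels, k):
    """Naive estimate: 24-bit RGB vs. ceil(log2(k)) bits/pixel + a k-color palette."""
    original_bits = n_pixels * 24
    index_bits = n_pixels * math.ceil(math.log2(k))
    palette_bits = k * 24
    return original_bits / (index_bits + palette_bits)

# For a 1-megapixel image, the ratio shrinks as K grows.
for k in (3, 6, 12, 24, 48):
    print(k, round(compression_ratio(1_000_000, k), 1))
```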

L1 vs. L2: Do They Look Different?

Visually, L1 and L2 produce very similar images when using the same K.

However, behind the scenes there is a difference:

  • L2 took more iterations and longer runtime

  • L1 generally converged faster and more stably

For real-world applications where speed matters, this difference is important—even if the final images look almost the same.
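The only place the two variants differ is the distance used in the assignment step. A minimal sketch of the two metrics (generic NumPy, not assignment code):

```python
import numpy as np

def assign_l2(pixels, centroids):
    # Squared Euclidean (L2) distance to each centroid.
    d = ((pixels[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def assign_l1(pixels, centroids):
    # Manhattan (L1) distance: sum of absolute channel differences.
    d = np.abs(pixels[:, None, :] - centroids[None, :, :]).sum(axis=2)
    return d.argmin(axis=1)

pixels = np.array([[0.0, 0.0, 0.0], [0.9, 0.9, 0.9]])
centroids = np.array([[0.1, 0.1, 0.1], [0.8, 0.8, 0.8]])
print(assign_l1(pixels, centroids), assign_l2(pixels, centroids))
```

One side note: strictly speaking, the centroid that minimizes L1 distortion is the componentwise median rather than the mean (the k-medians variant), though many implementations keep the mean update and swap only the distance.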

A Surprise Finding: My Cat Image 🐱

One of the most interesting lessons came from an image I didn’t end up using in the final submission: a photo of my cat, Rika.


At first glance, it looks like a normal color photo. But when I ran clustering on it, things broke.

Why?

The image was almost monochromatic:

  • Dominated by light beige

  • Very little color diversity

  • Only small regions of white, black, and light orange

When I increased K:

  • Some clusters became empty

  • The algorithm failed

  • Errors occurred during execution

This made me realize why the homework explicitly required full-color images.

Lesson Learned

Clustering works best when the data actually has diversity.

Images with very limited color variation can cause instability, especially when asking the algorithm to find too many clusters. This experience also pushed me to add exception-handling logic in my code—an unexpected but valuable takeaway.
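A common guard for the empty-cluster failure is to re-seed any cluster that loses all its members, instead of letting the mean of an empty set produce NaNs or an exception. A sketch of that guard; this is my illustration of one standard fix, not the handling the assignment required:

```python
import numpy as np

def update_centroids(pixels, labels, centroids, rng):
    """Recompute centroids; re-seed any cluster that ended up empty."""
    new = centroids.copy()
    for j in range(len(centroids)):
        members = pixels[labels == j]
        if len(members) == 0:
            # Empty cluster: re-seed it at a random pixel instead of crashing.
            new[j] = pixels[rng.integers(len(pixels))]
        else:
            new[j] = members.mean(axis=0)
    return new

rng = np.random.default_rng(0)
pixels = np.full((100, 3), 0.8)            # nearly monochromatic "cat photo"
centroids = np.array([[0.8, 0.8, 0.8], [0.0, 0.0, 0.0]])
labels = np.zeros(len(pixels), dtype=int)  # every pixel lands in cluster 0
centroids = update_centroids(pixels, labels, centroids, rng)
print(centroids[1])  # cluster 1 was empty; it got re-seeded to a real pixel
```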

How to Choose “The Best” K?

There is no single “correct” K.

To guide the choice, I used the Elbow Method:

  • Measure how compact clusters are as K increases

  • Plot the result

  • Look for the point where improvements slow down

Across all three images, the elbow appeared around:

K ≈ 4

Beyond that, improvements became marginal.
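The quantity behind the elbow plot is the within-cluster sum of squares (WCSS): compute it for several K and watch where the curve flattens. A toy sketch with four artificial color blobs, so the elbow lands near K = 4 by construction (again, an illustration rather than my assignment code):

```python
import numpy as np

def kmeans_cost(pixels, k, iters=30, seed=0):
    """Run basic K-means and return the within-cluster sum of squares (WCSS)."""
    rng = np.random.default_rng(seed)
    centroids = pixels[rng.choice(len(pixels), k, replace=False)]
    for _ in range(iters):
        d = ((pixels[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centroids[j] = pixels[labels == j].mean(axis=0)
    return float(((pixels - centroids[labels]) ** 2).sum())

# Toy data: four tight color blobs, so the "true" elbow is near k = 4.
rng = np.random.default_rng(1)
blobs = np.concatenate([c + 0.02 * rng.standard_normal((200, 3))
                        for c in ([0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 1])])
costs = {k: kmeans_cost(blobs, k) for k in (1, 2, 4, 8)}
print(costs)  # the drop flattens sharply after k = 4
```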

This reinforced an important idea:

More complexity is not always better.



Final Takeaways

Here’s what I took away from this assignment:

  • Image compression via clustering is intuitive and visual

  • Most image quality gains happen at low to moderate K

  • L1 and L2 norms look similar visually, but L1 is more efficient

  • Data characteristics matter—a lot (thank you, cat photo)

  • Writing robust code requires handling edge cases, not just happy paths

Despite the math and theory behind it, this homework was deeply satisfying because the results were immediately visible. It’s one of those moments where theory, code, and intuition all come together. 
