Ben Gorman

In this course, you'll learn the fundamentals of computer vision. The course is motivated by and centered around a set of tasks that we wish to solve, including

  • Image segmentation
  • Image classification
  • Image compression
  • Image deblurring
  • Edge and contour detection
  • Automatic colorization
  • Structure from Motion

Prerequisites

This course has a number of practice problems that expect you to have some familiarity with NumPy and Matplotlib.

If you need to brush up on NumPy or Matplotlib, you might enjoy the following courses & problem sets 👇

We'll use some other Python tools in this course (OpenCV, Pillow, SciPy, ...), but you don't need to know them ahead of time.

A note about the human vision system 👀

I spend a decent amount of time and effort in this course discussing the human vision system.

Why?

Because human vision is still considered state of the art for many vision tasks, particularly with regard to the Holy Grail of vision problems: understanding images and videos. If your goal is to improve on existing machine vision strategies, you'd be smart to study the human vision system for inspiration.

Marvel at your own vision

Take a moment to look at a tree with thousands of leaves blowing in the wind. Think about the sheer complexity of what you're viewing - thousands of interconnected objects in continuous motion. Whatever arrangement you see right now - you've never seen it before. Yet, within a fraction of a second, you recognize this as a tree. You know it has thousands of leaves even though you're not counting them. If a leaf falls off the tree and blows upward in the wind, your brain accepts it as reality even though you probably won't notice it and you probably didn't predict it would happen. If a branch were to fall off the tree and make the same upward movements into the sky, it'd immediately catch your attention. Why?

Think about the fact that you could spend all day classifying trees, squirrels, grass, and flowers, segmenting them, and predicting their behavior. You can imagine them changing color or changing shape in ways you've never observed. All of this is powered by a scrambled egg and some orange juice costing around $0.50.

It's a miracle that we can see at all, but the speed, efficiency, and accuracy of the human vision system are truly mind-blowing 🤯. If there's one thing I hope you take from this course, it's to actively think about your own vision system. Notice the amazing things it does. Notice its mistakes. Think about how it operates in different environments and from different perspectives. Think about how it recognizes faces. Think about how it recalls people's names. Think about how it stores memories. It's an amazing machine that we take for granted far too often.

References & Resources

In my research for this course, I was pleasantly surprised to find a ton of high-quality resources. Much of my content is just information I've gathered from these resources and distilled into my own words.