Computer vision – part 1/3: Understanding computer vision and its real-world applications

Imagine a computer capable of recognising a face, identifying a car in the street, or telling an apple from a pear. That is exactly what computer vision, also known as machine vision, makes possible.

At the crossroads of artificial intelligence, mathematics and digital imaging, this field teaches machines to “see” and understand the visual world, in much the same way we do with our eyes and brain.

In this article, we will explore the fundamentals of computer vision, from how it works to its real-world applications, before diving (in a second article) into more technical and advanced aspects.

What is computer vision?

Computer vision (CV) encompasses all the methods that enable a computer to “see”, that is, to draw conclusions from an image or, by extension, from a video. Whether it is counting objects in a photo, detecting faces, or tracking a person’s movement in a security video, the range of uses is wide.

This field is a sub-discipline of signal processing and artificial intelligence.

Typical use cases of computer vision

Detection
Identify the presence of an object in an image, determine how many there are, and where they are located.

Classification
Recognise the objects present and describe their characteristics.

Tracking
Detect and follow a moving object in a video.

3D vision
Estimate a human’s 3D pose or reconstruct a complete scene from 2D images.
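
To make the detection use case concrete, here is a minimal sketch in plain Python. It assumes a tiny greyscale image stored as a list of rows, treats every pixel at or above a brightness threshold as “object”, and groups touching pixels into blobs. The function name find_objects, the threshold of 128 and the toy image are illustrative assumptions; a real system would work on camera images with a dedicated library and a trained detector.

```python
from collections import deque


def find_objects(image, threshold=128):
    """Return one bounding box (top, left, bottom, right) per bright blob.

    `image` is a list of rows of grey levels (0-255). Pixels at or above
    `threshold` count as "object"; object pixels that touch horizontally
    or vertically are grouped into the same blob.
    """
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if seen[y][x] or image[y][x] < threshold:
                continue
            # Breadth-first flood fill starting from this unvisited object pixel.
            top, left, bottom, right = y, x, y, x
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and not seen[ny][nx] and image[ny][nx] >= threshold:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Toy 6x8 image (an assumption for illustration): two bright rectangles
# on a dark background.
toy_image = [
    [0,   0,   0, 0, 0,   0,   0, 0],
    [0, 200, 200, 0, 0,   0,   0, 0],
    [0, 200, 200, 0, 0, 255, 255, 0],
    [0,   0,   0, 0, 0, 255, 255, 0],
    [0,   0,   0, 0, 0, 255, 255, 0],
    [0,   0,   0, 0, 0,   0,   0, 0],
]

boxes = find_objects(toy_image)
print(len(boxes), "objects found:", boxes)
```

The length of the returned list answers “how many”, and each box answers “where”. Telling what each object is would be the job of classification.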

An old discipline, but constantly evolving

Computer vision has its roots in the 1960s and 1970s. At the time, artificial intelligence researchers wanted robots to be able to perceive their environment in order to act appropriately. The classical approaches from that era still form the foundation of what we do today.

The major challenges of computer vision

The field aims to tackle several major challenges:

Classification
Recognise the content of an image (e.g., “this is a dog”).

Object detection
Locate each object in the image, typically with a bounding box, and distinguish it from the others.

Segmentation
Assign each pixel to a region or class in order to separate the different areas of the image (useful in medicine or for autonomous driving).

Motion recognition and tracking
Follow an object from frame to frame in a video.
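
As an illustration of tracking, one simple idea is to link the boxes found in one video frame to those found in the next by how much they overlap. The sketch below assumes boxes are given as (x1, y1, x2, y2) tuples and uses intersection over union (IoU) with a greedy matching. The function names, the 0.3 overlap threshold and the example boxes are assumptions chosen for illustration, not a description of any particular tracker.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def match_tracks(previous, current, min_iou=0.3):
    """Greedily link each known object to its best-overlapping new box.

    `previous` maps a track id to the object's box in the last frame;
    `current` is the list of boxes detected in the new frame.
    Returns {track_id: index into `current`}.
    """
    candidates = sorted(
        ((iou(box, cur), track_id, idx)
         for track_id, box in previous.items()
         for idx, cur in enumerate(current)),
        key=lambda candidate: candidate[0],
        reverse=True,
    )
    matches, used = {}, set()
    for score, track_id, idx in candidates:
        if score < min_iou:
            break  # remaining pairs overlap too little to be the same object
        if track_id in matches or idx in used:
            continue  # this track or this box is already linked
        matches[track_id] = idx
        used.add(idx)
    return matches


# Hypothetical detections in two consecutive frames.
previous = {"car": (10, 10, 50, 40), "person": (100, 20, 120, 80)}
current = [(102, 22, 122, 82), (14, 10, 54, 40)]

print(match_tracks(previous, current))
```

Here the car is linked to the second box and the person to the first, because each has moved only a few pixels. Objects that move quickly or cross each other would need something more robust than this overlap test.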

Classical methods and limitations

Historically, computer vision relied on fairly simple techniques:

Edge and shape detection
Analysis of colours and textures
Search for key points

These methods work in simple cases but often fail when faced with variations in light, angle, or shadow, because their rules and thresholds are set by hand. This is where deep learning takes over: it learns visual representations directly from data instead of relying on hand-crafted rules. This topic will be covered in the second article.
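
To illustrate both a classical technique and its fragility, here is a minimal sketch of edge detection with the Sobel operator in plain Python. It is applied to two versions of the same toy scene: one well lit, one dim. The fixed threshold of 100, the helper step_image and the grey levels are assumptions chosen for the example.

```python
def sobel_edges(image, threshold=100):
    """Mark pixels whose Sobel gradient magnitude reaches `threshold`.

    `image` is a list of rows of grey levels (0-255). Border pixels stay
    at 0 because the 3x3 kernels do not fit there.
    """
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Horizontal gradient: right column minus left column.
            gx = (image[y - 1][x + 1] + 2 * image[y][x + 1] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y][x - 1] - image[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (image[y + 1][x - 1] + 2 * image[y + 1][x] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y - 1][x] - image[y - 1][x + 1])
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                edges[y][x] = 1
    return edges


def step_image(dark, bright, size=6):
    """Square test image: left half `dark`, right half `bright`."""
    return [[dark if x < size // 2 else bright for x in range(size)]
            for _ in range(size)]


def count_edges(edges):
    """Number of pixels marked as edge."""
    return sum(sum(row) for row in edges)


well_lit = step_image(20, 220)  # strong contrast between the two halves
dim = step_image(20, 40)        # same scene, much weaker contrast

print(count_edges(sobel_edges(well_lit)), count_edges(sobel_edges(dim)))
```

With the hand-picked threshold, the boundary is found in the well-lit image and missed entirely in the dim one, even though the scene is the same. Lowering the threshold would recover it here but would also let more noise through on real images, which is the kind of trade-off that learned approaches try to avoid.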

Real-world applications of computer vision

The strength of computer vision lies in its ability to adapt to a wide range of fields. It is already transforming several sectors:

Healthcare: tumour detection, diagnostic support.
Mobility: driving assistance, autonomous vehicles capable of understanding their environment.
Security: intelligent surveillance, facial recognition, and video analysis.
Industry: automated quality control on production lines.
Everyday life: augmented reality filters, automatic photo sorting on smartphones.

Computer vision also enables optical character recognition (OCR), which reads text in images or videos. Google Lens is a consumer example, but OCR also powers professional solutions, such as the automatic recognition of race bib numbers in KUVA, developed by Apptitude.

Current challenges and limitations

Even though the progress is impressive, computer vision faces several challenges:

Robustness
A model that performs well in the lab may fail under real-world conditions (rain, fog, changes in lighting).

Explainability
Understanding why a model sees one thing rather than another remains difficult.

Ethics and privacy
How can these technologies be used without infringing on individuals’ rights? How can we avoid amplifying biases present in the data?

These issues are at the heart of current research and will shape how computer vision is adopted in our daily lives.

Conclusion

Computer vision is a fascinating field that already allows our machines to perceive and interpret the visual world. From smartphone filters to autonomous vehicles, its applications are everywhere.

This article has introduced the basics of computer vision. The next article will dive into advanced techniques: convolutional neural networks, modern architectures, and deep learning.