Computer Vision: how we teach machines to read and interpret the world

Computer vision lets machines read and interpret the world, recognizing people and objects much as humans do.

For humans, vision accounts for about 70% of all the information we extract from the surrounding environment. Computer vision has therefore emerged as an interdisciplinary science focused on recreating this sense without being bound by its limits: for example, it can “work” beyond visible light.

What is computer vision and how does it work?

Computer vision is a field of study and application that draws on other sciences, such as physics, mathematics, computer science and biology. It translates into technology for building artificial systems that extract content from multidimensional data, such as images, by capturing and interpreting patterns of light and other electromagnetic radiation.

Its aim is to discover, through images, what is happening in the world, including patterns, objects, and actions.

It can be divided into two large modules:

  1. Image Acquisition – how to capture the world’s information.
  2. Image Processing – how to process this information into data that is actually useful for the task at hand.


Example of automatic recognition of people and objects at CCG through computer vision

How is it made?

The first part of computer vision is image acquisition. At this stage you must select:

  •     Camera (Sensor);
  •     Lighting type;
  •     Lens;
  •     Filters.

The selection of each element is highly dependent on the purpose and context for which the computer vision solution is being designed.
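
As an illustration of how purpose and context drive these choices, the sketch below estimates which lens a given setup would need. It assumes the simple pinhole approximation (working distance much larger than the focal length). The function names, the sensor size and the scene dimensions are hypothetical values chosen for the example, not figures from a CCG project.

```python
def focal_length_mm(sensor_width_mm, working_distance_mm, field_of_view_mm):
    """Approximate focal length needed for the scene width to fill the sensor.

    Pinhole approximation: f ~ sensor_width * distance / field_of_view.
    It is only reasonable when the working distance is much larger than f.
    """
    if field_of_view_mm <= 0:
        raise ValueError("field of view must be positive")
    return sensor_width_mm * working_distance_mm / field_of_view_mm


def mm_per_pixel(field_of_view_mm, horizontal_pixels):
    """Width of the scene covered by a single pixel (spatial resolution)."""
    return field_of_view_mm / horizontal_pixels


# Hypothetical setup: a sensor 8 mm wide with 2000 pixels across,
# looking at a 400 mm wide area from a distance of 1000 mm.
focal = focal_length_mm(8.0, 1000.0, 400.0)   # 20.0 mm
resolution = mm_per_pixel(400.0, 2000)        # 0.2 mm per pixel
print(f"focal length ~ {focal} mm, resolution ~ {resolution} mm/pixel")
```

With these assumed numbers, a defect smaller than roughly 0.2 mm would cover less than one pixel, which is the kind of constraint that pushes the design toward a different sensor, lens or working distance.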

The second part is the processing of the acquired data.

Information is extracted from the raw data in software, using mathematics, especially algebra, to exploit spatial and temporal relations.
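
To make "spatial and temporal relations" concrete, here is a minimal sketch in plain Python. A spatial relation compares a pixel with its neighbours in the same image (here, a horizontal gradient that responds to edges). A temporal relation compares the same pixel across two frames (here, a thresholded frame difference that responds to change). The tiny frames, the function names and the threshold are assumptions made for the example; real systems would typically use an optimized image library.

```python
def horizontal_gradient(image):
    """Spatial relation: right neighbour minus left neighbour for each pixel.

    Border columns are left at 0 because they lack one of the neighbours.
    """
    height, width = len(image), len(image[0])
    grad = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(1, width - 1):
            grad[y][x] = image[y][x + 1] - image[y][x - 1]
    return grad


def frame_difference(previous, current, threshold):
    """Temporal relation: 1 where a pixel changed by more than threshold."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(previous, current)
    ]


# Two hypothetical 2x4 grayscale frames: dark on the left, bright on the right.
frame_a = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
frame_b = [
    [10, 10, 200, 200],
    [10, 90, 200, 200],   # one pixel became brighter between frames
]

edges = horizontal_gradient(frame_a)
motion = frame_difference(frame_a, frame_b, threshold=30)
print(edges)   # [[0, 190, 190, 0], [0, 190, 190, 0]]
print(motion)  # [[0, 0, 0, 0], [0, 1, 0, 0]]
```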

Many applications of computer vision aim to detect or classify something. For this reason, computer vision often relies on machine-learning algorithms.
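
As a rough sketch of what "classify" means in practice, the example below trains a nearest-centroid classifier, one of the simplest machine-learning methods. Each image is assumed to have already been reduced to two numeric features. The feature choice (mean brightness and edge density), the labels and all values are hypothetical.

```python
import math


def centroid(samples):
    """Average feature vector of a list of equally sized feature vectors."""
    count = len(samples)
    return tuple(sum(s[i] for s in samples) / count for i in range(len(samples[0])))


def train(labelled_samples):
    """Learn one centroid per label from {label: [feature vectors]}."""
    return {label: centroid(samples) for label, samples in labelled_samples.items()}


def classify(model, features):
    """Return the label whose centroid is closest (Euclidean distance)."""
    return min(model, key=lambda label: math.dist(model[label], features))


# Hypothetical training data: (mean brightness, edge density) per image.
training = {
    "ok": [(0.9, 0.1), (0.8, 0.2)],
    "defect": [(0.3, 0.7), (0.2, 0.8)],
}
model = train(training)

print(classify(model, (0.8, 0.3)))  # ok
print(classify(model, (0.3, 0.6)))  # defect
```

The learning step here is just averaging. More capable algorithms follow the same pattern of fitting a model to labelled examples and then applying it to new images.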

Nowadays, thanks to the availability of both high processing capacity and large amounts of data, deep learning (a specific area of machine learning) stands out in some application scenarios. It uses neural networks with numerous layers to extract high-level information and patterns directly from the data.
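
The idea of stacked layers can be sketched with a deliberately tiny network in plain Python. Each layer computes weighted sums of the previous layer's outputs, and a non-linearity (ReLU) sits between layers. The weights below are hand-picked for the example so that the result can be checked by hand; a real deep network has many more layers and learns its weights from data.

```python
def relu(values):
    """Non-linearity: negative values become 0, positive values pass through."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum of all inputs per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Run inputs through all layers, applying ReLU after every layer but the last."""
    activations = inputs
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index < len(layers) - 1:
            activations = relu(activations)
    return activations


# Hypothetical hand-picked weights: together the two layers compute |a - b|.
layers = [
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),  # 2 inputs -> 2 hidden units
    ([[1.0, 1.0]], [0.0]),                     # 2 hidden units -> 1 output
]

print(forward([3.0, 1.0], layers))  # [2.0]
print(forward([1.0, 3.0], layers))  # [2.0]
```

Neither layer can compute an absolute difference on its own. It is the combination of layers and the non-linearity between them that produces it, which is a small-scale view of why depth helps.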

Application examples

Example of automatic optical inspection in the MaxCut4Fish project at CCG

Computer vision is applicable in multiple sectors, from health to industry, from entertainment to the military field.

Some examples of computer vision applications at CCG are:

  • automatic optical inspection;
  • medical image analysis;
  • electronic surveillance;
  • object recognition;
  • facial recognition;
  • detection of defects;
  • semantic recognition of scenes.

Examples of current reference projects:

Computer vision: from photon to visual inference

In this presentation, you can see what computer vision is and how it is done. Get to know its applications and many of its specificities and curiosities, from the perspective of someone who works in the area.

“Computer Vision – from the photon to the Visual Inference” is a presentation by Nelson Alves, CCG researcher, from the applied research domain CVIG – Computer Vision Interaction and Graphics.

This is one of the main scientific areas that support the research and development activities carried out in the CVIG.