What is Computer Vision?

Jasmin Kurtanović, Machine Learning Engineer

After digital pictures were introduced in the 1950s, computer scientists asked the question: can a computer that analyzes a picture understand the world?

In 1966, an American professor asked a graduate student to write a program that would print a description of what is in a picture. This was impossible with traditional programming. How was the student supposed to solve the problem when the tools developed up to then were not up to the task?

With the emergence of computer vision and the continued development of deep learning, that problem can now be solved, which creates the impression that computers can understand a picture as people do. One example is an analysis of the well-known picture “Dogs Playing Poker”: computer vision can recognize the objects in the picture, identify the dogs, and determine where in the scene they are located.

Computer vision is one of the most attractive areas of study in computing. It seeks to replicate the complexities of human vision and cognition so that computers can process pictures and identify the patterns and objects in them.

Computer vision is an academic and technical discipline that builds systems to retrieve information from pictures, videos, or medical devices. Such systems recognize objects, track them, detect specified events, and reconstruct pictures, among other things. The field is closely related to artificial intelligence, photography, and physics, and its use cuts across disciplines.

Computer vision is especially applicable in manufacturing, where it helps robots position and operate their robotic hands and is an effective tool for identifying inefficiencies and inconsistencies in production. In healthcare, it is used in many medical diagnostic devices, such as CT scanners. In the transportation industry, it is an integral component in the race to produce the first commercially available autonomous vehicles. Computer vision is also used for security: face scanners in airports can flag possible terrorist threats and help police intervene. Additionally, the military uses computer vision in rocket guidance systems, in reconstructing pictures from projections (as in CT), and in long-distance photography. Other applications include geology, farming, product measurement, fingerprint scanners, and blood analysis. Over the years, many breakthroughs in the implementation and use of computer vision have reached many aspects of our lives.

What is Computer Vision?

Computer vision is a field of artificial intelligence (AI) that enables computers and systems to extract meaningful information from digital pictures, videos, and other visual inputs, and then to take actions or make recommendations based on that information.

The central aim of computer vision is for computers to see and understand pictures, replicating and eventually surpassing human cognition. Computers have better tools for seeing than people do, but less advanced tools for understanding what they see. Well-known hardware that helps computer vision capture more visual information includes high-definition cameras, sensors, radars, and thermal cameras. Analyzing those visual inputs, however, is the part more closely associated with artificial intelligence.

Figure: computer vision on a street scene. Source: Google

Deep learning techniques and neural networks allow computer vision to understand what it is seeing, bringing it closer to the capabilities of human vision and cognition. In some respects, such as pattern recognition, computer vision surpasses human vision. It can even be credited with saving lives: according to researchers, computer vision and AI can recognize neurological diseases on CT scans of the brain more effectively and accurately.

The tasks of computer vision

Image classification- the system looks at a picture and classifies its contents (e.g., a dog, an apple, a person’s face). More precisely, it predicts which class a given picture belongs to.
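To make "predicting which class a picture belongs to" concrete, here is a minimal sketch in Python. It assumes a model has already produced one raw score per class; the class names and scores below are invented for illustration, and no real model or library is involved.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)
    # Subtracting the highest score keeps exp() numerically stable.
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(scores, class_names):
    """Return the most likely class name and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]

# Hypothetical classes and raw scores for a single picture.
CLASS_NAMES = ["dog", "apple", "face"]
label, confidence = classify([2.0, 0.5, 0.1], CLASS_NAMES)
print(label, round(confidence, 2))
```

The point of the sketch is that a classifier does not return a bare answer: it returns a probability for every class, and the prediction is simply the class with the highest one.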

Object detection- the system classifies the objects of a specific class in a picture or video and also locates each one, typically by drawing a bounding box around it, so that the objects can be counted and their properties recorded. Examples include finding damage on an assembly line or recognizing machines that require maintenance.
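As an illustration of what an object detector typically hands back, the sketch below represents each detection as a label, a confidence, and a bounding box, then shows two common follow-up steps: dropping low-confidence detections and measuring how much two boxes overlap (intersection over union). The `Detection` type, the labels, and the 0.5 threshold are assumptions made for this example, not the output format of any particular library.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    label: str
    confidence: float
    box: tuple  # (x_min, y_min, x_max, y_max) in pixels

def iou(a, b):
    """Intersection over union of two boxes: 0.0 = no overlap, 1.0 = identical."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def keep_confident(detections, threshold=0.5):
    """Keep only the detections at or above the confidence threshold."""
    return [d for d in detections if d.confidence >= threshold]

# Hypothetical detections from one assembly-line photo.
detections = [
    Detection("scratch", 0.91, (12, 30, 48, 60)),
    Detection("scratch", 0.32, (200, 40, 220, 55)),
    Detection("dent", 0.77, (90, 10, 140, 70)),
]
kept = keep_confident(detections)
print([d.label for d in kept])
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))
```

Overlap scores like this are commonly used to decide whether two boxes refer to the same object or whether a predicted box matches a hand-labeled one.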

Semantic segmentation- the process of assigning a class label to every pixel in a picture. It does not differentiate between different instances of the same class. For example, if there are two cats in the picture, semantic segmentation gives the same label to all the pixels of both cats.
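The two-cat example can be pictured as a small grid of per-pixel labels. The sketch below uses a hand-written 3x6 mask with made-up label ids (0 for background, 1 for cat); a real system would predict such a mask from the picture, which is not shown here.

```python
from collections import Counter

# Hypothetical label ids: 0 = background, 1 = cat.
BACKGROUND, CAT = 0, 1

# A tiny semantic mask: two separate cats, both marked with the same label.
semantic_mask = [
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 1, 1],
]

def pixels_per_class(mask):
    """Count how many pixels carry each class label."""
    return Counter(label for row in mask for label in row)

counts = pixels_per_class(semantic_mask)
print(counts[CAT], "cat pixels,", counts[BACKGROUND], "background pixels")
```

Nothing in this mask says how many cats there are: both animals contribute to the same pool of "cat" pixels, which is exactly the limitation described above.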

Instance segmentation- unlike semantic segmentation, it gives a separate label to every instance of an object in the picture. In the accompanying picture, all three dogs are shown in different colors, meaning they have different labels. With semantic segmentation, all of them would be given the same color.
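To show the difference in output, the sketch below starts from a small per-pixel class mask and gives each separate object its own id. It uses a simple flood fill over touching pixels as a stand-in; this is an assumption made purely for illustration, since real instance segmentation models predict instances directly and can also separate objects that touch. The mask and label ids are invented.

```python
from collections import deque

def label_instances(mask, target):
    """Give every connected group of `target` pixels its own id (1, 2, ...)."""
    height, width = len(mask), len(mask[0])
    instances = [[0] * width for _ in range(height)]
    next_id = 0
    for row in range(height):
        for col in range(width):
            if mask[row][col] != target or instances[row][col]:
                continue
            # Found an unvisited object pixel: start a new instance here.
            next_id += 1
            instances[row][col] = next_id
            queue = deque([(row, col)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and mask[ny][nx] == target and not instances[ny][nx]:
                        instances[ny][nx] = next_id
                        queue.append((ny, nx))
    return instances, next_id

# Hypothetical class mask: 0 = background, 1 = animal.
class_mask = [
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 1, 1],
]
instance_mask, count = label_instances(class_mask, target=1)
for row in instance_mask:
    print(row)
print("instances found:", count)
```

In the result, the pixels of the first animal carry id 1 and those of the second carry id 2, which corresponds to drawing each one in a different color.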

Figure: computer vision tasks


Serengeti has developed an application that implements computer vision and the principles of deep learning. RedAI is a solution that enables FMCG/CPG companies to detect, classify, and control their inventory and that of their competitors using a series of photographs taken with a mobile phone or tablet.

RedAI automates everyday tasks, tracking a store’s activities in real time and for every individual location.

So, how does the app work?

  • A sales representative uses a mobile phone or tablet to take pictures of the inventory on the store’s shelves and in its refrigerators.
  • The AI algorithm detects, processes, and classifies every product that the client wants to track.
  • The results appear in the mobile and web applications in near real time, within 30-60 seconds.
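The last two steps can be pictured as turning a list of detected product labels into a per-product count. The sketch below is purely illustrative and is not RedAI's implementation: the function name, the product names, and the idea of reporting a zero for tracked products that were not seen are all assumptions made for this example.

```python
from collections import Counter

def summarize_shelf(detected_labels, tracked_products):
    """Count the detected items, keeping only the products the client tracks."""
    counts = Counter(label for label in detected_labels if label in tracked_products)
    # Report tracked products that were not seen as 0 so they stay visible.
    return {product: counts.get(product, 0) for product in sorted(tracked_products)}

# Hypothetical labels produced by a detection model for one shelf photo.
detected = ["cola_330ml", "cola_330ml", "water_500ml", "rival_cola_330ml", "snack_bar"]
# Hypothetical set of products the client wants to track (own and competitor).
tracked = {"cola_330ml", "rival_cola_330ml", "juice_1l"}

report = summarize_shelf(detected, tracked)
print(report)
```

A report of this shape, one count per tracked product per photo, is the kind of result that could then be shown per store and per location.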

RedAI significantly benefits FMCG companies. It helps sales representatives optimize their time so that they can focus on other tasks in the store, and it helps with planning the long-term distribution of inventory in a store. The company also gets a better idea of what is in its inventory. RedAI helps eliminate human error and makes decisions easier for the company.

The project was co-financed by the European Union from the European Regional Development Fund. The content of the site is the sole responsibility of Serengeti ltd.