Updated: 11 Nov 2019

Machine learning: What is it and how does it work?

Machines’ ability to learn is already present in many aspects of everyday life. Machine learning is behind the movie recommendations we receive on digital platforms, virtual assistants’ ability to recognize speech, and self-driving cars’ ability to see the road. But its origins as a branch of artificial intelligence date back several decades. Why is this technology so important now, and what makes it so revolutionary?


Machine learning, or automated learning, is a branch of artificial intelligence that allows machines to learn without being explicitly programmed for a specific task. It is an essential skill for building systems that are not only smart but also autonomous, capable of identifying patterns in data and converting them into predictions. This technology is currently present in countless applications, such as Netflix and Spotify recommendations, Gmail’s smart replies, and Alexa and Siri’s natural speech.

“Ultimately, machine learning is a master at pattern recognition, and is able to convert a data sample into a computer program that can extract inferences from new data sets it has not been previously trained for,” explains José Luis Espinoza, data scientist at BBVA Mexico. This ability to learn is also used to improve search engines, robotics, medical diagnosis, and even credit card fraud detection.

Although this discipline is only now making headlines thanks to its ability to beat Go players or solve Rubik’s Cubes, its origins date back to the last century. “Without a doubt, statistics is the foundation of automated learning, which basically consists of a series of algorithms capable of analyzing large amounts of data to deduce the best result for a certain problem,” adds Espinoza.

Old math, new computing

We have to go back to the 19th century to find some of the mathematical challenges that set the stage for this technology. For example, Bayes’ theorem, formalized by Laplace in 1812, defines the probability of an event occurring based on knowledge of prior conditions that could be related to that event. More than a century later, in the 1940s, another group of scientists laid the foundations of computer programming, which translates a series of instructions into actions that a computer can execute. These precedents made it possible for the mathematician Alan Turing to ask, in 1950, whether machines can think. This planted the seed for computers with artificial intelligence that can autonomously replicate tasks typically performed by humans, such as writing or image recognition.
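As a rough illustration of the idea behind Bayes’ theorem, the sketch below updates the probability that a card payment is fraudulent once an alert has fired. The function name, its parameters and all the figures are assumptions made up for this example; they do not come from any real system.

```python
def posterior(prior, hit_rate, false_alarm_rate):
    """Probability of the event given the evidence, via Bayes' theorem."""
    # Total probability of seeing the evidence, whether or not the event occurred.
    evidence = hit_rate * prior + false_alarm_rate * (1 - prior)
    return hit_rate * prior / evidence


# Hypothetical figures: 1% of card payments are fraudulent; an alert fires on
# 90% of fraudulent payments and on 5% of legitimate ones.
p_fraud_given_alert = posterior(prior=0.01, hit_rate=0.90, false_alarm_rate=0.05)
print(round(p_fraud_given_alert, 3))  # 0.154
```

Under these assumed numbers, only about 15% of alerts would point to real fraud, because fraud is rare to begin with. That is the kind of reasoning from prior conditions the theorem captures.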

A little later, in the 1950s and 1960s, scientists started to investigate how to apply the biology of the human brain’s neural networks to create the first smart machines. This idea led to artificial neural networks, a computing model inspired by the way neurons transmit information to each other through a network of interconnected nodes. One of the first experiments in this regard was conducted by Marvin Minsky and Dean Edmonds, scientists from the Massachusetts Institute of Technology (MIT), who managed to create a computer program capable of learning from experience to find its way out of a maze.

"Machine learning is a master at pattern recognition"

This was the first machine capable of learning to accomplish a task on its own, without being explicitly programmed for this purpose. Instead, it did so by learning from examples provided at the outset. The accomplishment represented a paradigm shift from the broader concept of artificial intelligence. “Machine learning’s great milestone was that it made it possible to go from programming through rules to allowing the model to make these rules emerge unassisted thanks to data,” explains Juan Murillo, BBVA’s Data Strategy Manager.
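The shift from hand-written rules to rules that emerge from data can be sketched in a few lines. In this toy example, nobody tells the program which payment amount counts as suspicious; it picks the cut-off that best separates a handful of labeled examples. The function name and the data are hypothetical and chosen only for illustration.

```python
def learn_threshold(examples):
    """Pick the cut-off that misclassifies the fewest labeled examples."""
    values = sorted(v for v, _ in examples)
    # Candidate cut-offs: midpoints between neighboring values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(t):
        # Count examples where the rule "value > t" disagrees with the label.
        return sum((v > t) != label for v, label in examples)

    return min(candidates, key=errors)


# Hypothetical data: (payment amount, was it labeled as suspicious?)
examples = [(12, False), (30, False), (45, False),
            (400, True), (520, True), (950, True)]
threshold = learn_threshold(examples)
print(threshold)  # 222.5
```

A programmer writing the rule by hand would have had to choose that number; here it is derived from the examples, and it would change if the examples changed.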

Despite its success, the experiment also demonstrated the limits of the technology at the time. The lack of available data and computing power meant that these systems did not have sufficient capacity to solve complex problems. This led to the so-called “first artificial intelligence winter”: several decades in which the lack of results and advances led scholars to lose hope in the discipline.

The rebirth of AI

The picture started to change at the end of the 20th century with the arrival of the Internet, the massive volumes of data available to train models, and computers’ growing computing power. “Now we can do the same thing as before, but a billion times faster. The algorithms can test the same combination of data 500 billion times to give us the optimal result in a matter of hours or minutes, when it used to take weeks or months,” says Espinoza.

In 1997, a famous milestone marked the rebirth of automated learning: the IBM Deep Blue system, which was trained on thousands of successful chess matches, managed to beat the world champion, Garry Kasparov. This accomplishment was possible thanks to deep learning, a subcategory of machine learning described for the first time in 1960, which allows systems not only to learn from experience, but also to train themselves to do so better and better using data. This milestone was possible then, and not 30 years before, thanks to the growing availability of data to train the model: “What this system did was statistically calculate which move had the highest probability of winning the game based on thousands of examples of matches previously watched,” adds Espinoza.

“The ability to adapt to changes in the data as they occur in the system was missing from previous techniques”

This technology has advanced exponentially in the past 20 years, and is also responsible for AlphaGo, the program capable of beating any human player at the game of Go. Even more importantly, it is capable of training itself by constantly playing against itself to keep improving.

The technique that AlphaGo uses to do this is reinforcement learning, one of the three major approaches currently used to train these models:

  • Reinforcement learning takes place when a machine learns through trial and error until it finds the best way to complete a given task. For example, Microsoft uses this technique in game environments like Minecraft to see how “software agents” improve their work. Through them, the system learns to modify its behavior based on “rewards” for completing the assigned task, without being specifically programmed to do it in a certain way.
  • Supervised learning occurs when machines are trained with labeled data, for example photos with descriptions of the things that appear in them. The algorithm can then apply these labels to new data. If a group of images showing dogs has been labeled, the machine can identify similar images.
  • Finally, in unsupervised learning, machines do not work from labeled data. Instead, they look for similarities. The algorithms are not programmed to detect a specific type of data, such as images of dogs, but to look for examples that are similar and can be grouped together. This is what occurs, for example, in facial recognition, where the algorithm does not look for specific features but for a series of common patterns that “tell” it that it’s the same face.
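
To make the three approaches more concrete, here is a minimal sketch of each on toy data. The techniques chosen (nearest neighbor, a two-group clustering, and an epsilon-greedy “slot machine” learner) are simple stand-ins picked for illustration, not the methods used by the systems mentioned in this article. All function names, data and parameters are hypothetical.

```python
import random


# Supervised: labeled examples go in, a label for a new example comes out.
def nearest_label(labeled, x):
    """Return the label of the labeled example closest to x (1-nearest neighbor)."""
    return min(labeled, key=lambda pair: abs(pair[0] - x))[1]


# Unsupervised: no labels; split numbers into two groups by similarity.
def two_groups(values, rounds=10):
    """Two-means clustering on a list with at least two distinct numbers."""
    lo, hi = min(values), max(values)  # initial group centers
    for _ in range(rounds):
        a = [v for v in values if abs(v - lo) <= abs(v - hi)]
        b = [v for v in values if abs(v - lo) > abs(v - hi)]
        lo, hi = sum(a) / len(a), sum(b) / len(b)  # move centers to group means
    return sorted(a), sorted(b)


# Reinforcement: learn by trial and error which action earns the most reward.
def learn_best_arm(payout_probs, steps=3000, epsilon=0.1, seed=0):
    """Epsilon-greedy learner; returns the index of the action it rates highest."""
    rng = random.Random(seed)
    pulls = [0] * len(payout_probs)
    value = [0.0] * len(payout_probs)  # running average reward per action
    for _ in range(steps):
        if rng.random() < epsilon:  # explore: try a random action
            arm = rng.randrange(len(payout_probs))
        else:  # exploit: repeat the best action found so far
            arm = max(range(len(value)), key=value.__getitem__)
        reward = 1.0 if rng.random() < payout_probs[arm] else 0.0
        pulls[arm] += 1
        value[arm] += (reward - value[arm]) / pulls[arm]
    return max(range(len(value)), key=value.__getitem__)


# Hypothetical data: animal weights in kg, some of them labeled.
labeled = [(4, "cat"), (5, "cat"), (30, "dog"), (35, "dog")]
print(nearest_label(labeled, 28))            # dog
print(two_groups([4, 5, 6, 30, 32, 35]))     # ([4, 5, 6], [30, 32, 35])
print(learn_best_arm([0.2, 0.5, 0.9]))       # index of the action rated best
```

The supervised function needs the “cat” and “dog” labels to answer, while the clustering function finds the same two groups without ever seeing them. The reinforcement learner is told neither. It only receives a reward of 1 or 0 after each attempt, and from those rewards it should come to favor the action that pays out most often.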

Flexibility, adaptation and creativity

Machine learning models, and specifically reinforcement learning, have a characteristic that makes them especially useful for the corporate world. “It’s their flexibility and ability to adapt to changes in the data as they occur in the system and learn from the model’s own actions. Therein lies the learning and momentum that was missing from previous techniques,” adds Juan Murillo.

In the case of AlphaGo, this means that the machine adapts based on the opponent's moves and uses this new information to constantly improve the model. The latest version of this program, AlphaGo Zero, is capable of accumulating thousands of years of human knowledge after working for just a few days. Furthermore, "AlphaGo Zero also discovered new knowledge, developing unconventional strategies and creative new moves," explains DeepMind, the Google subsidiary that is responsible for its development, in an article.

This unprecedented ability to adapt has enormous potential to enhance scientific disciplines as diverse as the creation of synthetic proteins or the design of more efficient antennas. “The industrial applications of this technique include continuously optimizing any type of ‘system’,” explains José Antonio Rodríguez, Senior Data Scientist at BBVA’s AI Factory. In the banking world, deep learning also makes it possible to “create algorithms that can adjust to changes in market and customer behavior in order to balance supply and demand, for example, offering personalized prices,” concludes Rodríguez.

Another example is the improvement in systems like those in self-driving cars, which have made great strides in recent years thanks to deep learning. It allows them to progressively enhance their precision; the more they drive, the more data they can analyze. The possibilities of machine learning are virtually infinite as long as there is data available to learn from. Some researchers are even testing the limits of what we call creativity, using this technology to create art or write articles.