It’s very common for people who jump into the world of AI to drown in terms and abbreviations. Artificial Intelligence? Machine Learning? Deep Learning? AI? ML? DL? … and many more. In this short article, I give my overview of those keywords.
Artificial Intelligence (AI)
Let’s start with artificial intelligence (AI). I always prefer to describe AI as an umbrella term that covers everything in this field. It is not a technical method or the name of a specific algorithm. AI is a research field in computer science that focuses on developing methods that can perform tasks a human can accomplish.
The term artificial intelligence was coined by John McCarthy for the 1956 Dartmouth workshop, and he defined it as “the science and engineering of making intelligent machines”. Although McCarthy coined the term, the concept had been discussed before, including in Alan Turing’s 1950 paper “Computing Machinery and Intelligence”. In that paper, Turing discusses whether machines could imitate human beings and do intelligent things. Now that we have discussed what AI is, we can continue with the other keywords.
Machine Learning (ML)
The term machine learning (ML) was coined by Arthur Samuel, who described it in 1959 as the “field of study that gives computers the ability to learn without being explicitly programmed”. Although this is a very high-level definition, it roughly explains what machine learning is. Before digging into it, I want to provide a more technical definition, this one from Tom Mitchell: “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.” Yes, as I said, it is way more technical ;)
Let’s start with the first definition to understand what machine learning is. Samuel says that if a computer has the ability to learn without being explicitly programmed, that is machine learning. Explicit programming means telling the computer what to do by providing exact rules. If you are responsible for writing a piece of software, you can’t leave any vague areas; you need to give precise commands. Let’s say you have to implement the software for a robotic arm, and you want it to move items from one bucket to another. You have to provide the exact coordinates of the item so the arm can go there, then the exact grip pressure so the arm can hold the item. After that, you have to provide the exact destination coordinates so the arm can move there, and lastly, you have to tell it to release the item. The goal of machine learning is to complete such tasks without being explicitly programmed.
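To make “explicitly programmed” concrete, here is a minimal sketch of what such a hand-written program could look like. The `Command` type, the action names and the numbers are all hypothetical, made up for illustration; a real robotic arm would have its own control API.

```python
from dataclasses import dataclass


@dataclass
class Command:
    """One precise instruction for the (hypothetical) robotic arm."""
    action: str
    value: tuple


def explicit_move_program(pick_xy, drop_xy, grip_pressure):
    """Return the exact, hand-written command sequence for moving one item.

    Every detail (where to go, how hard to grip, where to drop) is
    supplied by the programmer. Nothing is learned.
    """
    return [
        Command("move_to", pick_xy),          # go to the item
        Command("grip", (grip_pressure,)),    # hold it with a fixed pressure
        Command("move_to", drop_xy),          # go to the destination bucket
        Command("release", ()),               # let the item go
    ]


# Assumed example values: coordinates and pressure are arbitrary.
program = explicit_move_program(pick_xy=(10, 20), drop_xy=(50, 20), grip_pressure=3.5)
for cmd in program:
    print(cmd.action, cmd.value)
```

If the items change position or need a different grip, somebody has to edit these numbers by hand. That is the part machine learning tries to take over.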
When we move on to the second definition, we see that Mitchell describes a program that gets better at a task by learning, as judged by some performance metric. Let’s go back to the robotic arm example. If we give the robotic arm a metric like “number of items successfully moved from one bucket to another” and ask the program to increase that metric by learning from its own experience in each trial-and-error session, we can call it a machine learning based program.
I believe the previous two definitions make the meaning of machine learning clear. But if I had to explain it in one simple sentence, I would say: AI is an umbrella term that covers everything related to this field, and ML is the technical, mathematical part of it.
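Mitchell’s T, P and E can be mapped onto a toy version of the robotic arm. The sketch below is only an illustration under invented assumptions: the task T is moving ten items, the performance measure P is how many items get moved in one session, and the experience E is a series of trial-and-error sessions in which the program adjusts its grip pressure. The pressure values, the tolerance and the simple “try a bit weaker, try a bit stronger” strategy are mine, not a standard algorithm from any library.

```python
# Hypothetical environment: each item can only be moved if the grip
# pressure is within TOLERANCE of that item's ideal pressure.
ITEM_IDEAL_PRESSURES = list(range(40, 50))  # arbitrary integer units
TOLERANCE = 5


def moved_items(pressure):
    """Performance measure P: items moved successfully in one session."""
    return sum(
        1 for ideal in ITEM_IDEAL_PRESSURES
        if abs(pressure - ideal) <= TOLERANCE
    )


def learn_by_trial_and_error(start_pressure, sessions=20):
    """Experience E: repeated sessions that nudge the pressure toward higher P."""
    pressure = start_pressure
    history = [moved_items(pressure)]
    for _ in range(sessions):
        # Try the current grip, a slightly weaker one and a slightly stronger one.
        # Keep whichever moves the most items (ties keep the current pressure).
        candidates = (pressure, pressure - 1, pressure + 1)
        pressure = max(candidates, key=moved_items)
        history.append(moved_items(pressure))
    return pressure, history


final_pressure, history = learn_by_trial_and_error(start_pressure=36)
print("items moved per session:", history)
print("learned pressure:", final_pressure)
```

Nobody told the program which pressure to use. It started with a poor guess and its score on P went up as it gathered experience, which is exactly what the definition asks for.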
Deep Learning (DL)
Over the last couple of years, a new term has become one of the best-known in the AI world: deep learning. Deep learning is a subset of machine learning methods. More precisely, deep learning methods are based on neural networks (which are themselves a machine learning method), and those have been around since the 1960s. In very basic terms, deep learning means building neural networks with many layers stacked on top of each other. This was not practical in the 1960s because of the processing power and the huge amounts of data it requires. With GPUs, far more processing power and huge amounts of data available, it is now possible to build deep learning architectures.
Deep learning methods started attracting attention in 2012, when a deep learning architecture named AlexNet won the ImageNet competition. Its error rate was about 15%, while the runner-up’s was roughly 10 percentage points higher. This was one of the milestones in deep learning history. The goal of the ImageNet competition was to classify images: this is a car, this is a cat, ... What seems normal to us today was a giant step in 2012.
After this accomplishment, many others followed. Google improved its translation service by replacing its statistical methods with deep learning methods. Microsoft built a deep learning based speech recognition system that reached accuracy similar to that of human transcribers.
Although I gave only two example applications (translation and speech recognition), deep learning is used in dozens of application fields, including computer vision, natural language processing, audio recognition, social network filtering, bioinformatics, drug design, medical image analysis, material inspection and board game programs.
In this article, I tried to provide definitions of artificial intelligence, machine learning and deep learning. AI can be considered an umbrella term for this field, ML is its technical part, and DL is the subset of ML that helped the progress of AI jump to another level.
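To show what “multiple layers” means in practice, here is a minimal sketch of a small network in plain Python. The layer sizes, the random weights and the ReLU activation are my own assumptions for illustration. The network is not trained, so its output is meaningless; the point is only to show how data flows through one layer after another. Real deep learning models have far more layers and parameters and are built with dedicated libraries that run on GPUs.

```python
import random


def dense_layer(inputs, weights, biases):
    """One fully connected layer followed by a ReLU non-linearity."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(max(0.0, total))  # ReLU: negative values become 0
    return outputs


def make_layer(n_inputs, n_outputs, rng):
    """Create random (untrained) weights and zero biases for one layer."""
    weights = [
        [rng.uniform(-1, 1) for _ in range(n_inputs)]
        for _ in range(n_outputs)
    ]
    biases = [0.0] * n_outputs
    return weights, biases


def forward(inputs, layers):
    """Pass the input through every layer in order: this stacking is the 'deep' part."""
    activations = inputs
    for weights, biases in layers:
        activations = dense_layer(activations, weights, biases)
    return activations


rng = random.Random(0)
layer_sizes = [4, 8, 8, 3]  # assumed sizes: 4 inputs, two hidden layers of 8, 3 outputs
layers = [
    make_layer(n_in, n_out, rng)
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
]

output = forward([0.5, -1.2, 3.0, 0.1], layers)
print(len(layers), "layers, output size:", len(output))
```

Adding depth is just a matter of extending `layer_sizes`. The hard part, which this sketch leaves out, is training all those weights, and that is where the processing power and the data come in.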