Our explainer series tackles complex Machine Learning concepts in five levels of escalating difficulty, starting from a kindergartener and moving up to an ML expert.
We’re going to talk about something that’s a core part of how Alectio works: Active Learning.
00:00 – Intro
00:38 – Level 1: Kindergartener
02:25 – Level 2: Teenager
03:56 – Level 3: Non-expert adult
05:30 – Level 4: Computer Science major
07:42 – Level 5: Machine Learning expert
Contact us to learn how we can help your team build better models with less data. We’d love to show you how it works!
A 5-year-old
As you are watching this video, you might notice that the way I speak to you is a lot different from the way other people speak to you. That’s because I’m assuming you’re a five-year-old, and I should adjust the way I speak so that you can understand me even if you’re not in first grade yet.
But if you were 10, I might speak very differently. See! That’s a very important part of active learning.
Now, I can’t see you, of course, because you’re behind the camera. But if I could see you, and I saw a big smile on your face, saw that you were happy, nodding along, and getting excited, then I’d go a lot faster. I would make things a little more difficult and keep teaching you new, interesting topics, and you could learn lots of other stuff in a very short time.
But if you were scratching your little head, getting confused and unhappy, or even furrowing your eyebrows, then I’d go very slowly, or even repeat from the very beginning, so that you could understand every single word that I’m saying.
Now, I guess you have a better idea of Active Learning.
A teenager

Imagine being in a math classroom where the teacher keeps explaining the same things over and over again because some students don’t get it. You might feel justifiably annoyed, because they’re not giving you any new information that you can learn from.
In the machine learning community, this is equivalent to supervised learning, where the practitioner keeps giving the model the same data points and the same information, hoping that something will change and the model will learn something new. But that does not happen.
Compare this to another situation. Let’s say you have a tutor. You go and ask your tutor really specific questions about the things you don’t understand. They give you relevant information that helps you learn the topics you don’t have much experience with.
Now, the good news is that something like this exists for Machine Learning: it’s called Active Learning. Here, the model can actually ask the practitioner for the information it’s having trouble with. The practitioner comes back with those relevant examples, and the model can actually learn something new. This process is much faster, reduces compute time, and makes the model a much better learner.
A non-expert adult
Did you know that Active Learning actually comes from the fields of psychology and education? Many teachers today have realized that Active Learning can be used in their classrooms, having noticed that when they teach every student the same way, many students feel left out or simply don’t understand the concepts.
Using Active Learning, teachers can incorporate the feedback that students provide during the teaching process and create individualized lesson plans for each student.
However, it’s one thing for teachers to understand that each student has different learning capabilities; it’s a lot different, and a lot harder, for teachers to actually use the feedback students provide the right way in order to teach them more effectively.
In Machine Learning, we’re dealing with the same types of problems, except instead of students we have models. Machine Learning scientists have yet to realize the benefits of Active Learning the way teachers have in the classroom. The models we train all learn in different ways, which means we have to utilize the feedback each model gives us during the training process in order to train it more effectively, meaning with less time and fewer resources.
However, just as it’s difficult for a teacher to use student feedback the right way and revise the lesson plan so it’s individualized to each student’s needs, it’s just as difficult for machine learning scientists to take in feedback from the model during training in order to increase the efficiency with which the model is trained.
A CS Student
I don’t know you personally, but one of the very first things you learn in an introductory machine learning class is a concept called supervised learning: you take some data, you label the data, you build a model, and you train the model on a training dataset.
Although the implementation details of a Machine Learning model may be really complicated, this process and overall flow is really simple, and it’s a common approach to solving Machine Learning problems today. In this approach, it’s desirable to get a large training dataset.
However, getting a large training dataset in a supervised learning problem often comes with large costs: data labeling costs, data storage costs, and compute costs. Inevitably, that makes it a very expensive process. But it really doesn’t have to be, if you think about it. It’s a lot like human learning: humans can learn from less data and still extract more information. For example, your machine learning model might be able to reach the same level of performance it gets with 100% of the data using only 20% of the data. Now, executing this process and finding the right logic for doing it can be very hard, but a common approach is something called Active Learning.
Active Learning is a way of breaking the training process down into incremental steps, where at each step you apply a training and inference process: you train the model again on whatever data it currently has, and you run inference to see what the model needs to see next in order to improve its performance. At its core, this process is something that can really maximize the potential you get out of a training dataset. It means you’re able to find all the bits and pieces that are critical to improving the model’s performance. That way, you can use much less data and still get the same level of performance, if not better.
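The incremental train-and-query loop described above can be sketched in a few lines. This is a minimal pool-based example, not Alectio’s actual method: the synthetic dataset, the logistic-regression model, the seed-set size, and the batch size of 20 are all illustrative choices, and the labels are pretended to arrive instantly rather than from a real annotator.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic pool; in practice, labels for queried points come from an annotator.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

rng = np.random.default_rng(0)
labeled = list(rng.choice(len(X), size=20, replace=False))  # small seed set
unlabeled = [i for i in range(len(X)) if i not in labeled]

model = LogisticRegression(max_iter=1000)
for step in range(5):                      # incremental training steps
    model.fit(X[labeled], y[labeled])      # train on the current labeled set
    probs = model.predict_proba(X[unlabeled])
    # Least-confidence query: pick the points the model is most unsure about.
    uncertainty = 1.0 - probs.max(axis=1)
    query = np.argsort(uncertainty)[-20:]  # positions of the 20 most uncertain
    for q in sorted(query, reverse=True):  # pop high positions first
        labeled.append(unlabeled.pop(q))   # "annotate" and move to labeled set

print(f"labeled {len(labeled)} of {len(X)} points")
```

The loop ends having labeled only 120 of 1,000 points; the intent is that those 120 were chosen for being informative, rather than at random.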
Now, Active Learning isn’t yet widely used in either academia or industry. Even so, it has a wide range of applications, and there are a great number of querying strategies to look into. If you haven’t tried Active Learning, I highly recommend you try it out right now.
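A querying strategy decides which unlabeled points the model asks about. Here is a sketch of three common ones (least-confidence, margin, and entropy sampling) scoring a toy batch of predicted class probabilities; the example probabilities are made up, and higher scores mean a point is a better candidate to query:

```python
import numpy as np

def least_confidence(probs):
    # Higher score = model is less sure of its top prediction.
    return 1.0 - probs.max(axis=1)

def margin(probs):
    # Small gap between the top two class probabilities = high uncertainty.
    part = np.sort(probs, axis=1)
    return 1.0 - (part[:, -1] - part[:, -2])

def entropy(probs):
    # Shannon entropy of the predicted distribution (epsilon avoids log(0)).
    return -(probs * np.log(probs + 1e-12)).sum(axis=1)

probs = np.array([[0.90, 0.05, 0.05],   # a confident prediction
                  [0.40, 0.35, 0.25]])  # an uncertain prediction
for score in (least_confidence, margin, entropy):
    print(score.__name__, score(probs))
```

All three strategies rank the second point above the first; they differ in how much of the probability distribution they consider, which is exactly the kind of knob worth experimenting with.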
An ML Expert
When I started my career as a Machine Learning scientist, collecting data was extremely painful and extremely time consuming. Usually the very best you could do was set up a data collection process and then sit idle for three months until you had enough historical data to work with. And if you then found out that the data wasn’t quite right, or that you’d missed a feature, that was it: the data you had was the data you had to work with.
Fast forward to today: we now have much better hardware, which means the data collection process is much cheaper and much easier. So technically we now have the opportunity to be much more agile with the entire process than we used to be. But because we got used to thinking of data collection as something that has to happen before we build our machine learning models, it seems that nobody is really seeing the opportunity for a paradigm shift here.
It’s interesting, because I personally see Active Learning as way more than a process designed to reduce labeling costs. In fact, I think this entire paradigm shift could happen thanks to Active Learning. Unfortunately, even today we don’t see a lot of companies experimenting with Active Learning, and many of those who try usually give up really quickly because they’re disappointed with the results.
For me, this is somewhat reminiscent of the early days of Deep Learning, when many of us got excited, tried it out, and then gave up when we didn’t see the results we were hoping for. We know today that the reason we didn’t see the great results we see now was that none of us machine learning scientists had been trained in how to do hyperparameter tuning the right way. So today I want to encourage people who are already doing Active Learning and are about to give up, as well as those who want to try it out: don’t give up. Keep going, and try different querying strategies and different loop sizes, even when the results seem disappointing.
It’s a complex process that needs to be tuned, just the same way a Deep Learning process needs to be tuned. The good news, though, is that I foresee a lot more research in this field to help us over the next few years: there have never been as many papers on Active Learning as in 2019, and I would bet that 2020 is set to be even better.