Today I will be writing about machine learning. I consider myself an enthusiast, not an expert, so keep that in mind. With that said, I want to explain why machine learning matters. To be quite frank, machine learning and the eventual advent of artificial intelligence is probably humanity’s greatest invention since fire. Why? Many anthropologists believe that the use of fire gave our bodies the fuel to grow bigger, smarter brains and become the dominant species on Earth. Machine learning has the potential to do the same and allow our level of intelligence to grow, but this time there do not appear to be any limits on how big our new brains can grow. Most of our energy still comes from burning things, yet it took thousands of years to get from the first use of fire to the invention of the car. We are only now discovering the fire of machine learning, and it is hard to say where this new fire will lead us or what new inventions await us in 10, 25, or 50 years. What I do expect is that the field will keep growing at an accelerating pace. So long as we are smart about it and do not allow this new fire to burn us, I believe machine learning will help solve many of the major problems facing us right now. Now that I have your attention, let’s get into the details of how it works.
How machine learning works is a tough question to answer, because it works in more than one way. Machine learning is a subset of artificial intelligence, and it can be divided into several main methods: supervised learning, unsupervised learning, semi-supervised learning (which combines the first two), and Reinforcement Learning.
Supervised learning is when you have a labeled data set, let’s say pictures that are each marked as a cat or a dog, and you let the machine learn the differences between a cat and a dog. You can then show it a new picture of a cat and be fairly certain that the machine will guess that the picture is in fact a cat and not a dog. In short, you give the machine a set of data and tell it what each piece of data means, and it is then able to guess what a new thing is based on that data.
Semi-supervised learning is a lot like how a child learns to speak. A parent might help by pointing to a cat and saying “cat” or to a dog and saying “dog”, but the child has to learn other words and concepts on their own, without any direct help, as there are too many words and concepts to teach directly. A parent can only do their best to teach the child as much as possible, but at the end of the day the child still has to learn on their own.
Then there is unsupervised learning. Here the machine is given unlabeled pictures of cats and dogs and is left to categorize them itself and come to the conclusion that the two are different. It might do this by comparing sizes, given that most dogs are bigger than most cats. But what about small dogs? This is one of the limitations of unsupervised learning: it can miscategorize things. On the other hand, it can also find patterns that a human would be unlikely to notice.
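Going back to supervised learning for a moment, here is a minimal sketch of the cat and dog idea. The measurements are made up for illustration (weight and shoulder height instead of pictures), and the method is a simple nearest-neighbor lookup that I picked because it is short, not because it is what a real image classifier would use.

```python
# Hypothetical labeled training data: (weight in kg, shoulder height in cm) -> label.
training_data = [
    ((4.0, 24.0), "cat"),
    ((5.0, 25.0), "cat"),
    ((3.5, 23.0), "cat"),
    ((30.0, 60.0), "dog"),
    ((22.0, 55.0), "dog"),
    ((8.0, 30.0), "dog"),  # a small dog
]


def squared_distance(a, b):
    """Squared straight-line distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(features):
    """Return the label of the closest labeled example (1-nearest-neighbor)."""
    _, label = min(
        training_data,
        key=lambda example: squared_distance(example[0], features),
    )
    return label


print(predict((4.5, 24.5)))   # an animal close to the cat examples
print(predict((25.0, 58.0)))  # an animal close to the dog examples
```

The labels are what make this supervised: the machine never has to work out on its own that there are two kinds of animal, it only has to decide which labeled example a new animal most resembles.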
Finally there is Reinforcement Learning. Think of it as trying to teach a car to go around a race track: for every foot the car moves in the right direction you give it a thumbs up, and for every foot in the wrong direction you give it a thumbs down. This can be used to teach the machine that moving the car forward is good and going backwards is bad. You could add a time element to this and teach the machine that the faster it goes forwards, the bigger the reward it gets. There are probably more ways to categorize the algorithms of machine learning, but these are the ones that I am aware of. I will now go on to explain some types of algorithms that use these methods.
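The thumbs up and thumbs down idea can be sketched in a few lines. This is a deliberately stripped-down version, closer to a two-choice bandit than to a full reinforcement learning algorithm: the car has no sense of where it is on the track, it only keeps a running estimate of how good each action is. The reward values, learning rate, and exploration probability are all numbers I made up.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

ACTIONS = ["forward", "backward"]
REWARD = {"forward": 1.0, "backward": -1.0}  # thumbs up / thumbs down

# The car's current estimate of how good each action is.
value = {action: 0.0 for action in ACTIONS}
LEARNING_RATE = 0.1
EXPLORE_PROBABILITY = 0.2

position = 0
for step in range(200):
    # Mostly pick the action that currently looks best, but sometimes explore.
    if random.random() < EXPLORE_PROBABILITY:
        action = random.choice(ACTIONS)
    else:
        action = max(ACTIONS, key=lambda a: value[a])

    reward = REWARD[action]
    position += 1 if action == "forward" else -1

    # Nudge the estimate for this action toward the reward it just received.
    value[action] += LEARNING_RATE * (reward - value[action])

best_action = max(ACTIONS, key=lambda a: value[a])
print(value, best_action, position)
```

Nobody ever tells the car that forward is correct. It only sees rewards, and its estimate for going forward ends up higher than its estimate for going backward.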
Let us start with supervised learning algorithms. There are several of these, and the problems they solve fall into categories too. The first is Classification, where the output is a label chosen from a fixed set. With two possible labels, as in the cat and dog example above, this is called Binary Classification. However, we can have as many labels as we want, and this is called Multi-Class Classification. We could have a large brown dog or a small yellow cat; the point is that there are more than two possible labels. The second is Regression, where the output is a real number, such as a price in dollars or a score for how much you are likely to enjoy a cat video, which a site could then use to recommend what to watch next. Finally there is Forecasting, which is perhaps the easiest to explain as it is what it sounds like. Forecasting takes past and present data and makes predictions about the future, such as what the stock market will look like in an hour or what tomorrow’s weather will be like. It does this by analyzing the trends in the past and then projecting them into the future. Now that those explanations are out of the way, let’s finally explain some algorithms.
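To show what a real value output looks like in practice, here is a small sketch of Regression using an ordinary least-squares straight line. The data (number of rooms and price in thousands of dollars) is invented for the example, and a straight line is just the simplest model I could pick.

```python
# Hypothetical training data: number of rooms -> price in thousands of dollars.
rooms = [1.0, 2.0, 3.0, 4.0]
prices = [100.0, 150.0, 200.0, 250.0]

mean_rooms = sum(rooms) / len(rooms)
mean_price = sum(prices) / len(prices)

# Ordinary least squares for a single input: slope = covariance / variance.
covariance = sum((r - mean_rooms) * (p - mean_price) for r, p in zip(rooms, prices))
variance = sum((r - mean_rooms) ** 2 for r in rooms)
slope = covariance / variance
intercept = mean_price - slope * mean_rooms


def predict_price(room_count):
    """Regression output: a number, not a label."""
    return intercept + slope * room_count


print(slope, intercept, predict_price(5.0))
```

A classifier would answer with one label out of a fixed set, while this model answers with a number that can fall anywhere on a scale.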
I am going to start with perhaps the simplest algorithm, the Decision Tree. It is a Supervised Learning algorithm that can do Classification or Regression, depending on the problem. The Decision Tree works by making a series of decisions. Let’s say we have a Decision Tree to determine what shape an object is. We could have a branch in that tree that looks for curved or straight lines. If we have curved lines, we could then decide between an oval and a circle; if we have straight lines, we could decide between a square and a rectangle. If our tree is large enough and all our data is properly labeled, we could determine what any basic shape is. The Decision Tree is fast but not always accurate. If you want more accuracy you can use a Random Forest.
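The shape example can be written out as a tiny tree. Note that I wrote this tree by hand to show its structure; a real Decision Tree algorithm would learn which questions to ask from the labeled data. The inputs (a curved-lines flag plus a width and height) are my own assumption about how a shape might be described.

```python
def classify_shape(has_curved_lines: bool, width: float, height: float) -> str:
    """Walk a two-level, hand-written decision tree and return a shape label."""
    sides_equal = abs(width - height) < 1e-9

    # First branch: curved lines or straight lines?
    if has_curved_lines:
        # Second branch on the curved side: equal proportions or not?
        return "circle" if sides_equal else "oval"

    # Second branch on the straight side: equal proportions or not?
    return "square" if sides_equal else "rectangle"


print(classify_shape(True, 2.0, 2.0))
print(classify_shape(False, 4.0, 2.0))
```

Each `if` is one branch of the tree, and each `return` is a leaf holding the final answer.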
A Random Forest is a bunch of Decision Trees, because it takes several trees to make a forest. This algorithm is usually more accurate than a single Decision Tree, but due to the size of the forest it can be slower. It works by training multiple Decision Trees, each on a slightly different random sample of the data, and giving each one a vote. In the example above some might vote circle while others vote oval; whichever gets the most votes wins, and that is what the input is classified as. This can reduce bias, as one tree might like to vote oval more than circle and vice versa, and those quirks tend to cancel out. The larger the forest, the more accurate it tends to be, although the gains get smaller as more trees are added.
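Here is a sketch of just the voting step, using the circle versus oval example. The three "trees" are hand-written stand-ins with different tolerances that I invented for illustration. A real Random Forest would build its trees from random samples of the training data rather than have them written by hand.

```python
from collections import Counter


# Three hypothetical stand-in "trees". Each decides circle vs oval from the
# width/height ratio, but each has its own idea of how round is round enough.
def tree_strict(width, height):
    return "circle" if abs(width / height - 1.0) < 0.02 else "oval"


def tree_medium(width, height):
    return "circle" if abs(width / height - 1.0) < 0.10 else "oval"


def tree_loose(width, height):
    return "circle" if abs(width / height - 1.0) < 0.30 else "oval"


FOREST = [tree_strict, tree_medium, tree_loose]


def forest_predict(width, height):
    """Let every tree vote and return the winning label plus the vote counts."""
    votes = Counter(tree(width, height) for tree in FOREST)
    label, _ = votes.most_common(1)[0]
    return label, dict(votes)


print(forest_predict(10.0, 9.5))  # the strict tree is outvoted
print(forest_predict(10.0, 6.0))  # all trees agree
```

In the first call the strict tree says oval, but the other two say circle, so the forest answers circle. That is the sense in which one tree’s quirk gets outvoted.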
Next is my personal favorite, the Neural Network. The idea behind this algorithm is to loosely imitate the structure of the human neuron and then make a network of them, a brain-like structure. I am not a biologist, so I cannot explain how a neuron works biologically, but what I can do is explain how it works algorithmically. So with that in mind, let’s look at how a virtual neuron works. First you have your input neurons, which could be considered the eyes or ears of the network. Then you have your hidden layers, and you can have as many of these as necessary. Finally you have your output neurons. Think of these as the mouth of the network that communicates the answer to the outside world. Signals start at the input neurons and propagate through the network layer by layer, with each neuron making a small decision based on the signals it receives from the layer before it. Let us use the shapes example from above. Let us say that the shape is a picture of a certain size, and we divide the picture into equal squares, or pixels, and each pixel can be on or off. Then we have one layer determine if the pixels form straight lines or curvy lines, and another layer look at the proportions of the shape to determine if it is a square, a rectangle, an oval or a circle. This is a gross oversimplification of how they work; if you would like to learn more you can watch this video. A Neural Network is very flexible: it can be used for Supervised or Semi-Supervised learning, and for Classification, Regression, or Forecasting. The flexibility of the Neural Network should be no surprise, as it is loosely based on what may be the most sophisticated thing in the universe, the human brain.
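To make the virtual neuron a bit more concrete, here is a minimal sketch of a network with two input pixels, one hidden layer, and one output neuron. I picked the weights by hand so the network answers "is exactly one of the two pixels on?"; a real network would learn its weights from data, and would usually use a smoother activation than the simple on/off step used here.

```python
def step(total):
    """On/off activation: the neuron fires (1) only if its total input is positive."""
    return 1 if total > 0 else 0


def neuron(inputs, weights, bias):
    """Weigh each incoming signal, add them up with a bias, then decide to fire or not."""
    return step(sum(i * w for i, w in zip(inputs, weights)) + bias)


def network(pixel_a, pixel_b):
    """Forward pass: input layer -> hidden layer -> output neuron."""
    inputs = [pixel_a, pixel_b]

    # Hidden layer (hand-picked weights, not learned):
    at_least_one_on = neuron(inputs, [1, 1], -0.5)
    both_on = neuron(inputs, [1, 1], -1.5)

    # Output neuron: fires when at least one pixel is on, but not both.
    return neuron([at_least_one_on, both_on], [1, -1], -0.5)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, network(a, b))
```

No single neuron can answer this question by itself, which is why the hidden layer matters: each hidden neuron answers a simpler question, and the output neuron combines them.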
There are many other forms of Supervised Learning, but I believe I have covered the basics. Let us now focus on Unsupervised Learning, which is a little more abstract in my opinion. So let’s give a practical example. I am a bit of a space nerd, so we are going to use the example of stars. Some stars are big and some are small, some are hot and some are cold, some are red, some yellow and some blue, but there is a pattern of stars called the Main Sequence (see picture). This picture has a lot of data points, and there are even more stars in the sky, so how can a human possibly hope to categorize all the stars in the night sky? This is a perfect task for machine learning, and since it is impractical to label all the stars by hand, it is definitely a job for Unsupervised Learning. The method used to group data points such as stars is called Clustering. There are many methods of Clustering, such as Hierarchical Clustering, K-Means Clustering, Gaussian Mixture Models, Self-Organizing Maps, and Hidden Markov Models, to name a few. Clustering allows us to see how the data is organized into groups, and helps us see the natural patterns. There is also another major method of Unsupervised Learning, and that is Dimension Reduction, which means cutting down the number of measurements (dimensions) we keep for each data point. Let’s continue with the star example and say we only want to look at stars that could have Earth-like planets around them. Well, what do we know about the sun? It’s not too hot or too cold, it’s of medium size, and it’s of a certain age. We can keep these relevant measurements for each star and ignore the others, which is a simple form of Dimension Reduction. With fewer measurements to look at, it becomes easier to narrow the list down, for example to main sequence stars the same size as the sun or smaller, until we are left with habitable stars where we might find Earth-like planets. Techniques like these can help in the hunt for Earth-like planets around other stars. There are many complicated and multi-method approaches of Machine Learning being used to process astronomical data.
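As a rough illustration of Clustering, here is a minimal K-Means sketch on a handful of stars. The star values are invented (temperature in thousands of kelvin and brightness in arbitrary units), and I fix the two starting centers by hand so the run is repeatable; real implementations usually pick the starting centers randomly and take more care over scaling the measurements.

```python
def distance_sq(a, b):
    """Squared straight-line distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Average position of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def k_means(points, centroids, iterations=10):
    """Repeat: assign each point to its nearest center, then move each center."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: distance_sq(point, centroids[i]),
            )
            clusters[nearest].append(point)
        # Move each center to the middle of its cluster (keep it if the cluster is empty).
        centroids = [
            mean_point(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical, unlabeled stars: (temperature in thousands of K, brightness).
stars = [
    (3.0, 0.10), (3.5, 0.20), (3.2, 0.15),    # cool and dim
    (20.0, 9.0), (22.0, 10.0), (25.0, 11.0),  # hot and bright
]

centroids, clusters = k_means(stars, centroids=[stars[0], stars[-1]])
print(centroids)
print(clusters)
```

The algorithm is never told which stars are cool and dim or hot and bright. It ends up with those two groups purely from how close the points are to each other.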
Machine Learning has many practical applications, and there are many methods available to use. Above is a flow chart to help decide what algorithm to use. In the future there will hardly be an industry untouched by machine learning. Its applications seem limitless: health care, scientific research, education, government, you name it, machine learning will change it. However, machine learning is not perfect. It is prone to biases just like humans are, and it is important to be aware of those biases. Knowing what algorithm to use is key. There is not always a perfect fit for each problem, but there is usually a best method to use, and it is possible to use multiple methods together. I have not mentioned all the possible methods available, just the major ones that I know of. This is a vast topic and there is a lot to cover, so I hope I have been able to explain some of the complexities of machine learning.
Sources:
Which machine learning algorithm should I use?:
Supervised and Unsupervised Machine Learning Algorithms:
Unsupervised Machine Learning: Crash Course Statistics #37
Fundamental Parameters of Main-Sequence Stars in an Instant with Machine Learning:
https://iopscience.iop.org/article/10.3847/0004-637X/830/1/31
https://upload.wikimedia.org/wikipedia/commons/6/6b/HRDiagram.png
Machine learning technique for finding hidden patterns or intrinsic structures in data:
How Deep Neural Networks Work:
Code Bullet (warning lots of swears):