By: Christian Metzger
Artificial intelligence is easily the hottest trend in tech at the moment. Businesses have spent years collecting data they could not yet use, hoping that someday they would have the means to analyze it effectively and glean valuable insights. Now they are finally gaining access to artificial intelligence that can see trends other technologies could not, and all that previously useless data finally has a purpose.
But what exactly is artificial intelligence? What does it do? The easy answer is simple: it's an algorithm that learns. But how exactly does it do that? Like anything a computer does, artificial intelligence is rooted in mathematics. Artificial intelligence, and deep learning in particular, is largely linear algebra and regression analysis. Those terms might sound intimidating to someone not trained in mathematics or statistics, but the purpose of this article is to make them a little more manageable for the average person.
Let’s start with the linear algebra. Here’s a basic augmented matrix:
It’s not important to know what an augmented matrix is, but now let’s look at the same matrix as a system of equations:
If you've taken a lower-level algebra class, you're probably familiar with the concept of a system of equations. Notice that the entries in the matrix are just the coefficients of the variables in the corresponding system of equations. This is the main notational difference between ordinary algebra and basic linear algebra, so for simplicity we'll continue to talk in terms of a system of equations.
If you've ever solved a system of equations, you know they take longer and longer to solve as you add variables. A system of three equations with three variables takes significantly longer to solve than a system of two. Four takes even longer, and five is pushing it in terms of what humans can do with paper and pencil, though it's obviously not a problem for a computer. Now imagine a system of ten equations with ten variables. Or a hundred. Or a million. Solving a huge system of equations like this is, in a nutshell, a part of what deep learning does. Each data point in a data set has many quantifiable characteristics, and the algorithm boils them down to a single equation that can be used to predict the outcome given the characteristics of a new data point.
It should be noted, though, that this isn't always perfect. Recall the system of equations from before: if the unsolved example was bothering you, its solution is x = 2, y = -1, and z = 1, and that is the only solution. But now let's consider a new system of equations, three equations but only one variable:
What is x? If you've taken algebra and studied systems of equations before, you'll remember that sometimes they don't have a solution, and this is one of those times. It's also the same problem you'll run into with real-world data: nothing fits perfectly. We call these imperfections error vectors in linear algebra, and the goal of any deep learning algorithm is to minimize them when coming up with the single equation I mentioned before. If having the least error possible sounds like a familiar concept to you, you're probably on the right track. Quite literally, the algorithm creates a line of best fit.
But how does the algorithm choose which variables to use when solving this massive system of equations? The answer is simple to name but hard to explain if it's unfamiliar: regression. Essentially, just as the algorithm created a line of best fit with its final equation, it tests different combinations of the variables it has data for to see which are most representative of the outcome. Drawing everyday parallels for regression is especially difficult because it's not something humans can realistically do by hand in the first place.
Nonetheless, I hope this convinced you that artificial intelligence is not magic, just math, and that it shed some light on how deep learning algorithms work under the hood. If you're interested in hearing more about deep learning, check out this talk by Dr. David Silver, a researcher at DeepMind, on the applications of deep learning to games like Go and Chess.
This piece was inspired by CGP Grey's How Machines Learn; give that a watch as well.