This post is designed so that you can walk away with a fundamental understanding of Artificial Intelligence in under 10 minutes of reading, no matter your background. I have spent almost 3 years of my life studying neural networks (the technology behind AI) and it brings me great joy to arm you with this knowledge, which I believe has enabled me to make some of the best (investment) decisions of my life.
Let's begin.
How Does AI Work?
Firstly, let me start by saying that you can dismiss any fear you may have of AI being too hard for you - it is likely not. As we move forward, you will see that AI is nothing but a bit of high school math with lots of computation thrown at it. The good news is that to understand AI you do not even have to be great at math: you just have to develop an intuition of what is happening inside a neural network. Once you do, it will stick forever.
What is a neural network then? It is just a function like any other that converts an input into an output, but with a peculiarity. The job of a neural network (NN) is to output predictions (inferences). For instance, we can train an NN so it learns to spot pictures of cats. A well-trained NN will output a 1 (100% chance) if there is a cat in the picture and a 0 (0% chance) if not. In this case, it is effectively predicting whether there is a cat in the picture.
How can we teach an NN to do this? By simply showing it lots of pictures, with and without cats, and telling it which is which. If the NN outputs a 0 (no cat) for a picture with a cat in it, we tell it (with a label, which is also a form of data) that it should be outputting a 1, and vice versa. The neural network then goes back to “reconfigure” and tweak itself to account for this mistake. Once it does so, the NN has taken a step forward in its learning process.
If we repeat this thousands of times, with pictures of different cats and no cats, eventually the NN learns to spot cats at a superhuman level.
Now, AI is not some mystical creature. NNs are basically statistics on steroids. We feed them lots of data and tell them what the truth is.
In turn, they learn to associate the input data with ground truths so that, after training, when we feed them data from the real world, they are able to make accurate predictions. Today, an NN's ability to make predictions is limited by the quality and volume of the data we feed it.
The more and better data we feed it, the better the NN works.
How exactly does an NN learn, though? The learning process is simple and has two key steps, which the network performs many times, which is why it requires so much computation:
Forward propagation
Back propagation
In #1, data moves forward through the NN as the network applies lots of multiplication and addition operations to the data (high school linear algebra). It gives some operations more importance than others by assigning a weight (also known as a parameter) to each one.
The higher the weight, the more importance a given operation has and vice versa. The output of the network is the sum of all these operations, accounting for their relative importance.
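The weighted-sum idea above can be sketched in a few lines of code. This is a minimal, illustrative sketch of forward propagation for a single neuron; the inputs and weights are invented numbers, not from any real trained network.

```python
# Minimal sketch of forward propagation for one neuron (illustrative only).
# Inputs and weights below are made-up numbers.
inputs = [0.5, 0.8, 0.1]     # numerical features extracted from a picture
weights = [0.4, -0.2, 0.9]   # one weight per input; bigger magnitude = more importance

# Multiply each input by its weight and sum them up:
# this is the core of forward propagation.
output = sum(x * w for x, w in zip(inputs, weights))

print(round(output, 2))  # 0.5*0.4 + 0.8*(-0.2) + 0.1*0.9 = 0.13
```

That really is all the "magic" of step #1: multiply, add, repeat across every neuron.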
In #2, the NN calculates the magnitude of its mistake (comparing the output to the label) and works backwards to change the weights it assigns to each operation in the forward propagation process.
It may increase some weights and decrease others, so as to minimize the mistake it just made (high school calculus). If this sounds somewhat abstract and mysterious, that is because it is: we cannot predict how an NN will tweak its weights.
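The weight-tweaking in step #2 can be sketched with a toy example. Everything here is invented for illustration: one input, one weight, and the simplest possible error measure (squared error), nudged downhill with a fixed learning rate.

```python
# Toy sketch of back propagation for a single weight (all numbers invented).
x = 0.8       # input value
w = 0.5       # the network's current weight
label = 1.0   # ground truth ("there is a cat")
lr = 0.1      # learning rate: how big a step we take each time

for step in range(30):
    prediction = x * w            # forward propagation (step #1)
    error = prediction - label    # how wrong the network is
    gradient = 2 * error * x      # d(error^2)/dw, by the chain rule
    w -= lr * gradient            # tweak the weight to shrink the mistake (step #2)

print(round(x * w, 2))  # the prediction ends up close to the label, 1.0
```

Repeat this over thousands of examples instead of one, and you have the training loop.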
By doing steps 1 and 2 many times, the NN ends up with a bunch of weights that capture the reality we have presented it with (data + labels). When the NN is done going through the dataset, it is ready to make predictions. To do so, in this case we feed the NN a picture after it is trained and let it propagate the data forward (step #1) and produce the output.
As you may be glimpsing, even after training, the NN is quite dumb at tasks other than spotting cats: it does not think for itself, but rather learns to make associations within a narrow domain. Now you understand why general AI (human-like, broad reasoning) is quite a challenge.
Further, a NN does not see the world as we do. It does not see cats or not cats, but simply numbers. In order to break down images into numbers, we use a series of techniques that make up the Computer Vision domain, which I explain briefly in the next section.
In essence, however, we break down an image into a bunch of numbers that the NN can understand: variables #1 to #4 in the example NN above.
The NN takes these numbers in and performs step #1. The data goes through the layers of the network (in the case of the graph above there is only one layer in between the input and the output) until it comes out the other side in the form of an output.
In practice, an NN can have many layers and as many inputs and outputs as you like, depending on the use case in question. In the example below, the network has two layers between the input and the output.
As you can see, the output of each operation (a) is multiplied by its weight W and the outputs are just a result of the forward propagation process.
Let's dissect Layer 3, for clarity. As you can see, each neuron in Layer 2 is connected to each neuron in Layer 3. How does the NN calculate a(3,3), for instance? It simply takes all the a values in Layer 2, multiplies each one by a weight it has come up with, and adds them up. It does the same with each node in each layer until it produces the output.
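That layer-to-layer computation can be sketched as code. All values and weights below are invented: four Layer 2 activations, and one row of weights per Layer 3 node.

```python
# Sketch of computing a whole layer from the previous one (numbers invented).
layer2 = [0.7, 0.2, 0.9, 0.4]   # the "a" values coming out of Layer 2

# One row of weights per Layer 3 node - a small weight matrix.
weights = [
    [0.1, -0.5, 0.3, 0.8],    # weights into a(3,1)
    [0.6,  0.2, -0.4, 0.0],   # weights into a(3,2)
    [-0.3, 0.7, 0.5, 0.1],    # weights into a(3,3)
]

# Each Layer 3 node = weighted sum of ALL Layer 2 values (fully connected).
layer3 = [sum(a * w for a, w in zip(layer2, row)) for row in weights]

print([round(v, 2) for v in layer3])  # [0.56, 0.1, 0.42]
```

So a(3,3) is just the last entry: each Layer 2 value times its weight, summed. Stack enough of these and you have a deep network.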
At first, the output of the NN is going to be really far off. Then, as it repeats steps #1 and #2 time and time again, the weights in the network will eventually be optimized so that the output will be good.
If you get into the details of how a NN works and into the latest advances, it quickly turns into one of the most cerebrally demanding fields out there, if not the most. Still, you now understand the fundamental building blocks and if I have explained it well, you may be surprised with how relatively simple it is.
From a more linguistic perspective, a NN is simply a device that encapsulates reality into a set of weights. We give it data and we let it do its own work, so that it comes back to us with its own understanding of the little slice of the world we have shown it. Like a genie in the bottle, but for narrow applications.
Note: In practice, the output of an NN is passed through an activation function: a mathematical operation that translates the output into the desired form. For instance, if we wish to express a probability, as in cat (100%) versus no cat (0%), we use an activation function known as softmax. But do not get lost in the weeds: all an NN outputs is a number (or several).
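Softmax itself is just a couple of lines. Here is a sketch: the two raw scores are invented, standing in for the network's outputs for "cat" and "no cat".

```python
import math

# Sketch of the softmax activation: it turns raw NN outputs
# into probabilities that sum to 1.
def softmax(scores):
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, -1.0])  # invented raw scores for [cat, no cat]

print([round(p, 2) for p in probs])  # [0.95, 0.05] - 95% chance of a cat
```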
How Does AI Apply to the World?
You now have the knowledge to begin to tackle the fields of Computer Vision, Natural Language Processing and Reinforcement Learning (the tech that created the first AI to defeat a human Go player). They are all based on the type of NN you have learned about above: linear neural networks.
In essence, linear neural networks are always used in these fields to convert a compressed lower dimensional representation of a given slice of reality into insights. No matter how we wish to apply AI across any field, we need to distill reality into a bunch of numbers that a linear NN can work with, to ultimately output predictions.
Take computer vision, for instance. As I explained, we do not feed an image directly into an NN: we feed it a bunch of numbers that it can understand. To obtain these numbers, we pass images through a convolutional network. For now, forget about how it works - the point is that it produces the little chunk of data that you see circled in red below.
This is a numerical representation of the original image and it comes as a “box”, with dimensions 7 by 7 by 512. It is referred to as “compressed” and “lower dimensional” because it is just a bunch of numbers, compared to the image which is a much richer data type to our eyes.
We then flatten it so as to produce a column of numbers that we can plug into a linear NN comfortably. The NN then does forward propagation and back propagation to learn from its mistakes and tweak its weights, as usual.
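Flattening sounds fancier than it is. This sketch builds a 7 by 7 by 512 "box" of placeholder zeros (standing in for the convolutional network's output) and unrolls it into a single column:

```python
# Sketch of flattening a 7x7x512 "box" of numbers into one long column,
# so a linear NN can take it as input. The values are placeholder zeros.
box = [[[0.0 for _ in range(512)] for _ in range(7)] for _ in range(7)]

# Unroll the box, value by value, into a flat list.
flat = [value for plane in box for row in plane for value in row]

print(len(flat))  # 7 * 7 * 512 = 25088 numbers in the column
```

That column of 25,088 numbers is what the fully connected layers actually see.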
It always comes down to plugging data into a linear NN.
Note: If you are wondering, the “green bars” in the graph above are just a way of representing the linear NN that is attached to the Convolutional Network VGG16. The layers inside a linear NN are also known as “fully connected”, which the layers in a Convolutional Network are not.
Now, how does this apply to investments and business in general?
Right now, business problems are increasingly becoming networking problems. The top companies in the world have figured out a way to lay down infrastructure that turns their particular scope of business into an electron management game - just moving information around and processing it.
Amazon took over books first and then commerce overall by reducing it to electron management. Google took over the world's information in just the same way.
Going forward, this trend will intensify, and a growing share of business will be about capturing and interpreting data via infrastructures of this type. Linear NNs will effectively be the key enabler that makes sense of this data and drives insight and automation. This will account for much of the wealth creation we are going to see in the coming decades.
You are now equipped with the necessary knowledge to capitalize on this shift.
⚡ If you enjoyed the post, please feel free to share with friends, drop a like and leave me a comment.
You can also reach me at:
Twitter: @alc2022
LinkedIn: antoniolinaresc