Neural networks are a crucial component of artificial intelligence (AI) models. Their architecture loosely imitates the way neurons in the human brain pass signals to one another.
A neural network, or artificial neural network, is a type of computing architecture that is based on a model of how a human brain functions (hence the name "neural"). Neural networks are made up of a collection of processing units called "nodes." These nodes pass data to one another, much as neurons in a brain pass electrical impulses to one another.
Neural networks are used in machine learning, which refers to a category of computer programs that learn without explicit instructions. Specifically, neural networks are used in deep learning, an advanced type of machine learning that can draw conclusions from unlabeled data without human intervention. For instance, a deep learning model built on a neural network and fed sufficient training data could identify items in a photo it has never seen before.
Neural networks make many types of artificial intelligence (AI) possible. Large language models (LLMs) such as ChatGPT, AI image generators like DALL-E, and predictive AI models all rely to some extent on neural networks.
Neural networks are composed of a collection of nodes. The nodes are spread out across at least three layers. The three layers are:

- An input layer, where data enters the network
- A hidden layer, where the data is processed
- An output layer, where the network produces its results
These three layers are the minimum. Neural networks can have more than one hidden layer, in addition to the input layer and output layer.
No matter which layer it is part of, each node performs some sort of processing task or function on whatever input it receives from the nodes in the previous layer (or, in the case of the input layer, from the original data). Essentially, each node contains a mathematical formula, with each variable within the formula weighted differently. If the output of applying that mathematical formula to the input exceeds a certain threshold, the node passes data to the next layer in the neural network. If the output is below the threshold, no data is passed to the next layer.
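In code, a single node can be sketched in just a few lines. The example below is a minimal illustration, not a production implementation; the inputs, weights, and threshold values are arbitrary placeholders chosen for demonstration.

```python
# A minimal sketch of a single node: it computes a weighted sum of its
# inputs and passes data onward only if the result exceeds a threshold.
# (All values here are arbitrary, for illustration only.)

def node_output(inputs, weights, threshold):
    """Return the weighted sum if it exceeds the threshold, else None."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    if weighted_sum > threshold:
        return weighted_sum  # data is passed to the next layer
    return None              # below the threshold: nothing is passed on

print(node_output([0.5, 0.9, 0.2], [0.8, 0.1, 0.4], threshold=0.5))  # about 0.57
print(node_output([0.1, 0.2, 0.1], [0.8, 0.1, 0.4], threshold=0.5))  # None
```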
Imagine that the Acme Corporation has an accounting department with a strict hierarchy. Acme accounting department employees at the manager level approve expenses below $1,000, directors approve expenses between $1,000 and $10,000, and the CFO approves any expenses that exceed $10,000. When employees from other departments of Acme Corp. submit their expenses, they first go to the accounting managers. Any expense over $1,000 gets passed to a director, while expenses below $1,000 stay at the managerial level, and so on.
The accounting department of the Acme Corp. functions somewhat like a neural network. When employees submit their expense reports, this is like a neural network's input layer. Each manager and director is like a node within the neural network.
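The analogy maps neatly to code. The toy sketch below (with made-up names and amounts) routes each expense by threshold, much as a node's threshold determines whether data moves on to the next layer.

```python
# A toy version of the Acme analogy: thresholds decide how far up the
# hierarchy an expense travels, just as a node's threshold decides
# whether data advances to the next layer. (Illustrative values only.)

def approver(expense):
    if expense < 1_000:
        return "manager"
    elif expense < 10_000:
        return "director"
    return "CFO"

for amount in (400, 2_500, 25_000):
    print(f"${amount:,} -> approved by the {approver(amount)}")
# $400 -> approved by the manager
# $2,500 -> approved by the director
# $25,000 -> approved by the CFO
```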
And, just as one accounting manager may ask another manager for assistance in interpreting an expense report before passing it along to an accounting director, neural networks can be architected in a variety of ways: nodes can communicate in multiple directions, not just forward.
There is no limit on how many nodes and layers a neural network can have, and these nodes can interact in almost any way. Because of this, the list of types of neural networks is ever-expanding. But, they can roughly be sorted into these two categories:

- Shallow neural networks, which have one hidden layer between the input and output layers
- Deep neural networks, which have multiple hidden layers
Shallow neural networks are fast and require less processing power than deep neural networks, but they cannot perform tasks as complex as deep neural networks can.
Below is an incomplete list of the types of neural networks that may be used today:
Perceptron neural networks are simple, shallow networks with an input layer and an output layer.
Multilayer perceptron neural networks add complexity to perceptron networks and include a hidden layer.
Feed-forward neural networks only allow their nodes to pass information forward, to nodes in the next layer (see the sketch after this list).
Recurrent neural networks allow information to flow backwards as well, so that the output from some nodes can influence the input to preceding nodes.
Modular neural networks combine two or more neural networks in order to arrive at the output.
Radial basis function neural networks have nodes that use a specific kind of mathematical function called a radial basis function.
Liquid state machine neural networks feature nodes that are randomly connected to each other.
Residual neural networks allow data to skip ahead via a process called identity mapping, combining the output from early layers with the output of later layers.
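To make the feed-forward idea concrete, here is a minimal sketch of a forward pass through a small multilayer perceptron. The layer sizes, random weights, and ReLU activation are arbitrary choices for illustration; a real network would learn its weights from training data.

```python
import numpy as np

# A minimal feed-forward pass through a small multilayer perceptron:
# an input layer (3 values), one hidden layer (4 nodes), and an output
# layer (2 nodes). Weights are random placeholders; a trained network
# would have learned them from data.

rng = np.random.default_rng(0)

x = np.array([0.5, -1.2, 3.0])       # input layer: 3 features
W1 = rng.normal(size=(3, 4))         # input -> hidden weights
b1 = np.zeros(4)                     # hidden-layer biases
W2 = rng.normal(size=(4, 2))         # hidden -> output weights
b2 = np.zeros(2)                     # output-layer biases

hidden = np.maximum(0, x @ W1 + b1)  # ReLU activation in the hidden layer
output = hidden @ W2 + b2            # raw scores from the output layer

print(output)                        # two values, one per output node
```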
Transformer neural networks are worth highlighting because they have assumed a place of outsized importance in the AI models in widespread use today.
First proposed in 2017, transformer models are neural networks that use a technique called "self-attention" to take into account the context of elements in a sequence, not just the elements themselves. Via self-attention, they can detect even subtle ways that parts of a data set relate to each other.
This ability makes them ideal for analyzing (for example) sentences and paragraphs of text, as opposed to just individual words and phrases. Before transformer models were developed, AI models that processed text would often "forget" the beginning of a sentence by the time they got to the end of it, with the result that they would combine phrases and ideas in ways that did not make sense to human readers. Transformer models, however, can process and generate human language in a much more natural way.
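At the heart of self-attention is a simple matrix computation. The sketch below shows a stripped-down version of scaled dot-product attention; real transformers add learned query, key, and value projections, multiple attention heads, and far larger dimensions, all omitted here for clarity.

```python
import numpy as np

# A stripped-down sketch of scaled dot-product self-attention: each
# element of a sequence is re-expressed as a weighted mix of every
# element, with weights based on pairwise similarity. (Learned
# projections and multiple heads are omitted for simplicity.)

def self_attention(X):
    """X: (sequence_length, d) array, one row per sequence element."""
    d = X.shape[1]
    scores = X @ X.T / np.sqrt(d)     # pairwise similarity scores
    # row-wise softmax turns scores into attention weights
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ X                # context-weighted mix of all inputs

X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # a toy 3-element sequence
print(self_attention(X))
```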
Transformer models are an integral component of generative AI, in particular LLMs that can produce text in response to arbitrary human prompts.
Neural networks are actually quite old. The concept of neural networks can be dated to a 1943 mathematical paper that modeled how the brain could work. Computer scientists began attempting to construct simple neural networks in the 1950s and 1960s, but eventually the concept fell out of favor. In the 1980s the concept was revived, and by the 1990s neural networks were in widespread use in AI research.
However, only with the advent of hyper-fast processing, massive data storage capabilities, and widely accessible computing resources were neural networks able to advance to the point they have reached today, where they can imitate or even exceed human cognitive abilities. Developments are still being made in this field; one of the most important types of neural networks in use today, the transformer, dates only to 2017.
With locations in more than 330 cities around the world, Cloudflare is in a unique position to offer computational power to AI developers anywhere with minimal latency. Cloudflare for AI lets developers run AI tasks on a global network of graphics processing units (GPUs) with no extra setup. Cloudflare also offers cost-effective cloud storage options for the vast amounts of data required to train neural networks. Learn more about Cloudflare for AI.