What is machine learning?

Machine learning is a type of computer program that can learn how to perform tasks without explicit instructions.

Learning Objectives

After reading this article you will be able to:

  • Define machine learning
  • Explain how machine learning works
  • Differentiate between machine learning models and algorithms

What is machine learning?

Machine learning refers to a type of statistical algorithm that can learn without explicit instructions. This enables it to perform certain tasks, such as identifying patterns, on its own by generalizing from examples. Machine learning is a part of artificial intelligence (AI), which refers to a computer's ability to imitate human cognitive activity.

Machine learning has a wide range of uses, including:

  • Identifying email spam
  • Detecting bot activity
  • Recommending content to users on streaming platforms and social media apps
  • Providing search engine results
  • Recognizing voices and images
  • Powering chatbots and language translation
  • Supporting medical research

Machine learning vs. AI

Machine learning and AI are not exactly the same thing; rather, machine learning as a discipline falls under the umbrella of AI. Not all AI involves machine learning, however: AI also includes other approaches, such as rule-based systems whose behavior is programmed by hand rather than learned from data.

How does machine learning work?

Machine learning is based on inputs and outputs. A machine learning algorithm is fed data (input) that it uses to produce a result (output). A machine learning model "learns" what kind of outputs to produce, and it can do so through three main methods:

1. Supervised learning

In supervised learning, the most basic kind of machine learning, the programmer curates a set of example inputs, each paired with its correct output. The machine learning algorithm attempts to generalize from these examples so that, when fed an input by itself, it can produce the desired output.

Imagine a chef who is given a kitchen full of ingredients (the input) and a large number of examples of finished dishes from a menu (the output). By combining the ingredients in different ways and comparing the finished product to the example dishes, the chef can eventually develop the necessary recipes to create the menu items. Similarly, supervised learning enables an algorithm to learn how to produce the correct results without programmed instructions (or a recipe).
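
As a rough illustration of learning from labeled examples, the sketch below uses a nearest-neighbor rule, which is only one of many supervised techniques. The email features, the example values, and the function name are all hypothetical and chosen for readability, not taken from any real spam filter.

```python
import math

# Hypothetical labeled examples: (links in email, exclamation marks) -> label.
TRAINING_EXAMPLES = [
    ((0, 0), "not spam"),
    ((1, 0), "not spam"),
    ((0, 1), "not spam"),
    ((7, 5), "spam"),
    ((9, 8), "spam"),
    ((6, 9), "spam"),
]


def predict(features, examples=TRAINING_EXAMPLES):
    """Return the label of the training example closest to `features`."""
    _, nearest_label = min(
        examples, key=lambda example: math.dist(example[0], features)
    )
    return nearest_label


# New inputs that never appeared in the examples still get a label.
print(predict((8, 6)))
print(predict((1, 1)))
```

No rule such as "more than five links means spam" is written anywhere in the code; the label for a new email comes entirely from the examples it most resembles.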

2. Unsupervised learning

In unsupervised learning, the machine learning algorithm is fed raw data with no example outputs, and it identifies patterns in that data on its own. Think of a chef who is skilled enough to simply look over a menu and come up with recipes to make those items.
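
One way to picture pattern-finding in raw data is clustering. The sketch below is a simplified one-dimensional k-means loop, offered as an illustration under assumptions: the viewing times are made-up numbers, and the evenly spaced starting centers are a shortcut that requires k to be at least 2.

```python
def k_means_1d(values, k=2, iterations=10):
    """Group numbers into k clusters with a basic k-means loop (k >= 2)."""
    # Start with centers spread evenly between the smallest and largest value.
    low, high = min(values), max(values)
    centers = [low + (high - low) * i / (k - 1) for i in range(k)]
    clusters = []
    for _ in range(iterations):
        # Assign every value to its nearest center.
        clusters = [[] for _ in range(k)]
        for value in values:
            nearest = min(range(k), key=lambda i: abs(value - centers[i]))
            clusters[nearest].append(value)
        # Move each center to the mean of its cluster (unchanged if empty).
        centers = [
            sum(cluster) / len(cluster) if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
    return centers, clusters


# Hypothetical minutes watched per session; no labels are provided.
minutes_watched = [2, 3, 4, 50, 55, 60]
centers, clusters = k_means_1d(minutes_watched)
print(centers)
print(clusters)
```

The algorithm is never told that "short" and "long" sessions exist; the two groups emerge from the data alone.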

3. Reinforcement learning

In this style of learning, the machine learning algorithm is trained through feedback. There are "good" outputs and "bad" outputs, and it learns over time how to avoid the bad outputs.

Reinforcement learning is a process of trial and error. Imagine that the chef has no menu to start with; instead, everything they cook is evaluated by a food critic. Eventually, the chef is able to curate a list of dishes that the food critic likes, after ruling out all the items that the critic dislikes.
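
The trial-and-error loop can be sketched with a simple "explore or exploit" strategy (often called epsilon-greedy). This is a toy version under assumptions: the critic's scores are invented and fixed, the dish names are placeholders, and real reinforcement learning problems usually involve noisy feedback and sequences of decisions.

```python
import random

# Hypothetical critic: a fixed score between 0 and 1 for each dish.
CRITIC_SCORES = {"soup": 0.2, "pasta": 0.9, "salad": 0.5}


def train_chef(rounds=100, explore_rate=0.1, seed=0):
    """Learn which dish the critic prefers using only the critic's feedback."""
    rng = random.Random(seed)
    dishes = list(CRITIC_SCORES)
    estimates = {dish: 0.0 for dish in dishes}
    times_cooked = {dish: 0 for dish in dishes}
    for _ in range(rounds):
        untried = [dish for dish in dishes if times_cooked[dish] == 0]
        if untried:
            dish = untried[0]  # try every dish at least once
        elif rng.random() < explore_rate:
            dish = rng.choice(dishes)  # occasionally experiment
        else:
            dish = max(dishes, key=estimates.get)  # cook the best-rated dish
        reward = CRITIC_SCORES[dish]  # feedback from the critic
        times_cooked[dish] += 1
        # Update the running average score for this dish.
        estimates[dish] += (reward - estimates[dish]) / times_cooked[dish]
    return estimates, times_cooked


estimates, times_cooked = train_chef()
print(max(estimates, key=estimates.get))
```

The chef never sees the critic's score table directly. It only observes the reward for each dish it cooks and gradually favors the dishes that scored well.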

What is a machine learning model?

An algorithm is a set of preprogrammed steps; a machine learning model is the result of applying an algorithm to a collection of data. Despite this distinction, the terms "machine learning model" and "machine learning algorithm" are sometimes used interchangeably. But the difference is important: two machine learning models can produce different results even if they use the same algorithm, because each model has been fed different data as a starting point.
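
To make the distinction concrete, the sketch below applies one algorithm (an ordinary least-squares line fit) to two made-up data sets. The function names and data points are illustrative assumptions; the point is only that the same steps yield two different models.

```python
def fit_line(points):
    """Algorithm: least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in points) / sum(
        (x - mean_x) ** 2 for x, _ in points
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept  # the learned parameters are the "model"


def predict(model, x):
    """Use a fitted model to produce an output for a new input."""
    slope, intercept = model
    return slope * x + intercept


# Two hypothetical data sets fed to the same algorithm.
data_a = [(1, 2), (2, 4), (3, 6)]
data_b = [(1, 5), (2, 6), (3, 7)]

model_a = fit_line(data_a)
model_b = fit_line(data_b)

print(model_a, predict(model_a, 10))
print(model_b, predict(model_b, 10))
```

Here `fit_line` plays the role of the algorithm, while `model_a` and `model_b` are the models. Given the same input, the two models give different answers because they were fitted to different data.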

What is deep learning?

Deep learning is a type of machine learning. It uses neural networks with many layers to learn to recognize patterns and make associations in raw, unstructured data. Deep learning models can be trained with supervised, unsupervised, or reinforcement learning, and they can perform extremely complex tasks. Deep learning is often used for speech recognition, automated driving, and other advanced applications.

What is a neural network?

A neural network is a method of machine learning that is loosely modeled on the structure of the human brain. Neural networks are composed of nodes that connect to each other. These nodes are spread across at least three layers: an input layer, an output layer, and one or more hidden layers.

Each layer contains several nodes, and the nodes in one layer connect to the nodes in the next. Each node weighs the data it receives, and if the result is significant enough, the node passes that data on to the next layer.

Think back to the chef making dishes in the kitchen:

  • If the chef has to make a cake, they might start by examining the ingredients in the pantry; this is like the input layer of a neural network.
  • The chef selects ingredients like flour, eggs, sugar, and cocoa powder. Ingredients like chicken stock or rice, meanwhile, are not selected. This is like passing statistically significant data on to the next node.
  • The chef combines ingredients in various ways: mixing cake batter, making frosting, and so on. Think of this as the hidden layers of a neural network, with nodes passing data to each other.
  • Finally, the chef bakes, frosts, and serves the cake; this is like the output layer. Along the way, irrelevant or incorrect data (like unnecessary ingredients and incorrectly mixed combinations) was eliminated.
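
The flow of data through the layers can be sketched in a few lines. The weights below are hand-picked, hypothetical values (a trained network would learn them from data), and the "pass it on only if it is significant" rule is modeled here with a common choice called ReLU, which forwards positive values and blocks the rest.

```python
def relu(value):
    """Pass positive signals along; block everything else."""
    return max(0.0, value)


def layer(inputs, weights, biases):
    """Each node takes a weighted sum of all inputs, adds a bias, applies ReLU."""
    return [
        relu(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]


# Hypothetical weights: 2 input values -> 2 hidden nodes -> 1 output node.
HIDDEN_WEIGHTS = [[1.0, -1.0], [-1.0, 1.0]]
HIDDEN_BIASES = [0.0, 0.0]
OUTPUT_WEIGHTS = [[1.0, 1.0]]
OUTPUT_BIASES = [0.0]


def forward(inputs):
    """Run the inputs through the hidden layer, then the output layer."""
    hidden = layer(inputs, HIDDEN_WEIGHTS, HIDDEN_BIASES)
    return layer(hidden, OUTPUT_WEIGHTS, OUTPUT_BIASES)


print(forward([3.0, 1.0]))
print(forward([1.0, 4.0]))
```

With these particular weights, only one hidden node fires for any given input, and the network ends up computing the absolute difference between its two inputs.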

What is a vector database?

A vector database is a method for storing data that enhances machine learning. Vector databases allow for similarity searches and identifying related items, as opposed to exact match queries. Storing data in this way helps machine learning models understand the context for the inputs they receive.

A vector database stores each item as a vector: a list of numbers that specifies the item's position along many dimensions. Items with similar characteristics end up close to each other, which allows machine learning models to find data in relation to other data. For example, a streaming platform can pair machine learning with a vector database in order to identify which movies to recommend to a viewer, based on their past viewing history.
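
A similarity search can be sketched with plain Python, without a real vector database. Everything in this example is an assumption made for illustration: the movie titles are invented, the three dimensions are hand-labeled, and production systems typically use vectors with hundreds of dimensions produced by a machine learning model, along with indexes that avoid comparing against every item.

```python
import math

# Hypothetical 3-dimensional vectors: (action, comedy, romance).
MOVIES = {
    "Space Chase": (0.9, 0.1, 0.0),
    "Laugh Riot": (0.1, 0.9, 0.2),
    "Star Raiders": (0.8, 0.2, 0.1),
    "Paris in Spring": (0.0, 0.3, 0.9),
}


def cosine_similarity(a, b):
    """Measure how closely two vectors point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(title, count=1):
    """Return the `count` titles whose vectors are closest to `title`."""
    query = MOVIES[title]
    others = [other for other in MOVIES if other != title]
    others.sort(key=lambda other: cosine_similarity(query, MOVIES[other]), reverse=True)
    return others[:count]


print(most_similar("Space Chase"))
```

The query does not look for an exact match on a title or genre field. It ranks items by how near their vectors are, which is what lets a recommendation system surface related movies.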

What are some of the challenges to building machine learning models?

Data egress: Even the most advanced deep learning models require access to massive data sets to obtain accurate results. Cloud storage is well suited to holding these large data sets, since it can scale to store very large amounts of data. However, accessing that data often results in egress fees: charges from cloud providers for transferring data out of storage.

Compute power and infrastructure: Machine learning, and especially deep learning, requires a lot of computational power. Machine learning models often require specialized, and expensive, hardware or cloud services, such as multiple fast, GPU-powered servers. (A GPU, or graphics processing unit, can perform many calculations in parallel, which makes it much faster than a traditional CPU for machine learning workloads.)

How does Cloudflare help developers build machine learning?

Cloudflare offers a collection of services to make it easy for anyone to use machine learning. Cloudflare Workers AI is a global network of GPUs that developers can use for running generative AI tasks. Cloudflare Vectorize enables developers to use a globally distributed vector database. Cloudflare R2 is object storage with no egress fees, enabling developers to store large data sets in the cloud and transfer that data for free. Learn more about Cloudflare for AI.