What is object storage?

Object storage is a method for saving large amounts of data, especially unstructured data, in the cloud. Much of the data generated by business activities is unstructured — including logs, video and photo content, sensor data, and webpages, among many other examples. Object storage maintains this data across multiple cloud servers, with each file or segment of data as its own object, complete with metadata and a unique name or identifier for data retrieval.

Object storage does not store these objects in folders, as in a traditional file-based hierarchy — instead all objects are stored together in a single "data lake" (also called a "data pool"). For this reason, object storage can store vast amounts of data very quickly — just as tossing clothes into a bag is a faster way to pack for a trip than folding and sorting clothes carefully into a suitcase.

Object storage can hold so much data that its capacity is practically unlimited. It is also more cost-effective than some of the other cloud storage methods available. However, the fees charged for transferring stored data back out of storage (known as "data egress" fees) are sometimes prohibitive, depending on the vendor.

How does object storage work?

Cloud computing in general involves renting computing power and storage space from cloud providers, rather than using on-premises servers and computers. Cloud storage simply means storing data on a cloud provider's infrastructure, which may exist in one or more remote physical locations.

Objects

In a cloud storage context, an object is a unit of data. An object can be in any format and of any size. Photos, audio files, network logs, and emails alike can all be stored as objects.

No file hierarchy

Unlike typical desktop computer local storage, or cloud-based file storage, object storage is not sorted into folders. There is no single hierarchical path to each object; objects can be reached through a variety of paths. If Jerry saved a picture of a squid on his computer in his C: drive, he might save it in a folder called "Photos" and in a subfolder called "Squid Pictures." To reach this photo later, Jerry opens C:, then "Photos," then "Squid Pictures," then the photo itself. Jerry's path to the photo looks like this:

Desktop computer --> C: --> "Photos" --> "Squid Pictures" --> open photo

But if Jerry's computer worked more like object storage, he would instead use metadata about the squid picture (perhaps the file's name, the date it was taken, or its precise dimensions) to find it later. And instead of following a structured path like the one above, he would simply find and open the file:

Desktop computer --> search for photo --> open photo

This is more like how object storage works. Objects are accessed directly, and instead of being stored in a series of subfolders, they are all stored together in a data lake (defined below).
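The contrast between the two paths above can be sketched in a few lines of Python. This is only an illustration using in-memory dictionaries, not how any real storage system is implemented; the file name, the keys such as "squid-0001", and both helper functions are made up for the example.

```python
# Hierarchical storage: nested folders, reached by walking a path.
# All names and contents here are hypothetical placeholders.
file_system = {
    "C:": {
        "Photos": {
            "Squid Pictures": {"squid.jpg": b"fake image bytes"},
        },
    },
}


def open_by_path(tree, path):
    """Follow each folder in turn until the file is reached."""
    node = tree
    for part in path:
        node = node[part]
    return node


# Flat storage: every object sits at the same level, keyed by an identifier.
object_store = {
    "squid-0001": b"fake image bytes",
    "network-log-0001": b"fake log bytes",
}


def open_by_key(store, key):
    """Fetch the object directly; there are no folders to walk."""
    return store[key]


photo_via_path = open_by_path(
    file_system, ["C:", "Photos", "Squid Pictures", "squid.jpg"]
)
photo_via_key = open_by_key(object_store, "squid-0001")
```

Both calls return the same bytes. The difference is that the first needs every folder name in the right order, while the second needs only the object's key.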

Other critical components of object storage include:

Metadata: Metadata is information about a file, like its name, its type, or its size. The use of metadata helps set object storage apart from block storage, another cloud storage method. Because object storage is unstructured, metadata can be as extensive as desired and can take any form. For example, Jerry could assign any number of metadata labels to his squid photo to ensure he could find it quickly later. He could even assign it a unique number, or a "unique identifier."

Unique identifier: The unique identifier is a string (a sequence of characters) that is assigned to each object in object storage. This enables faster lookup and retrieval of that object later.

Data egress: When an object storage customer requests to load or access an object, the storage provider must transfer it to them over a network. This process is called data egress. Many object storage providers charge high fees for reading stored data, which can make object storage less cost-effective for many businesses.
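To show how these three components fit together, here is a minimal toy sketch in Python. It assumes an in-memory store, random UUIDs as the unique identifiers, and a simple byte counter standing in for data egress; the class names, method names, and metadata labels are all invented for illustration and do not correspond to any provider's API.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """An object is its raw data plus whatever metadata was attached."""
    data: bytes
    metadata: dict = field(default_factory=dict)


class ObjectStore:
    """Toy in-memory store (hypothetical): one flat dict, no folders."""

    def __init__(self):
        self._objects = {}
        self.bytes_egressed = 0  # running total of data read out of the store

    def put(self, data, **metadata):
        """Store data with free-form metadata; return its unique identifier."""
        object_id = str(uuid.uuid4())
        self._objects[object_id] = StoredObject(data, dict(metadata))
        return object_id

    def get(self, object_id):
        """Direct lookup by identifier; every read counts as egress."""
        obj = self._objects[object_id]
        self.bytes_egressed += len(obj.data)
        return obj.data

    def find(self, **criteria):
        """Return IDs of objects whose metadata matches every criterion."""
        return [
            object_id
            for object_id, obj in self._objects.items()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        ]


store = ObjectStore()
squid_id = store.put(b"fake image bytes", name="squid.jpg", subject="squid", width=1024)
store.put(b"fake log bytes", name="router.log", kind="network-log")

matches = store.find(subject="squid")  # search by metadata
photo = store.get(matches[0])          # retrieve by unique identifier
```

Jerry's squid photo is found through its metadata label and then fetched by its identifier, and the store's egress counter grows by the size of the photo each time it is read.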

What is a data lake or data pool?

A data lake, or data pool, is a collection of unstructured data that can grow as large as needed. Data does not need to be put into a structure, reformatted, compressed, or otherwise processed before it goes into the lake, just as water can enter a real lake from multiple rivers and streams, and in both solid and liquid form.

How does object storage differ from blob storage?

Blob storage is a type of object storage. It stores Binary Large Objects (known colloquially as "blobs"), just as object storage does. Blobs do not have to follow a given format or have any metadata associated with them. A blob is a series of bytes, with each byte made up of 8 bits (each bit is a 1 or a 0, hence the "binary" descriptor), and any type of data can go in a blob.
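The idea that a blob is just bytes, and that each byte is 8 bits, can be seen directly in Python. The short sketch below uses the word "squid" as a stand-in for real blob content; the variable names are arbitrary.

```python
# Turn a piece of text into a blob: a plain sequence of bytes.
blob = "squid".encode("utf-8")

# Each byte written out as its 8 bits (each bit is a 1 or a 0).
bits = [format(byte, "08b") for byte in blob]

# A blob carries no format of its own: the same bytes can be read
# back as numbers or decoded as text, depending on the reader.
as_numbers = list(blob)
as_text = blob.decode("utf-8")
```

Nothing in the blob itself says whether it holds text, an image, or a log; that interpretation is left to whoever reads it.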

What are the best use cases for object storage?

Any activity that generates large amounts of data may work well with object storage. This is particularly the case if the data does not need to be accessed frequently. Some examples include:

Application assets: All the images, JavaScript, CSS, docs, and files for an application can easily be stored via object storage.
Backup and recovery: Object storage is ideal for storing system backups (regular backups are a best practice for ransomware recovery).
Analytics: Network events and application activity generate vast quantities of data. This data can be logged for later analysis in object storage.
Data archiving: Data that is not needed regularly but cannot yet be erased can go into object storage.
Media: Video, audio, and photographic files can be quite large, especially if they are high quality. The scalable nature of object storage makes it a good fit for such files.
Data for machine learning: Machine learning algorithms require large quantities of data in order to be trained effectively. Object storage can act as a repository for these large quantities of training data.

What are the benefits of object storage?

Object storage tends to be:

Scalable to any amount of data
Searchable via metadata and unique identifiers
Not complex, with no file hierarchies and no need for data to be reformatted or structured
Resilient, as cloud storage providers have many server pools for failover
Low-cost since customers pay only for the storage they need

What are the downsides of object storage?

Data egress fees can counteract the cost benefit of object storage. Object storage providers sometimes charge large markups for accessing stored objects.
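A rough calculation shows how egress fees can outweigh a low storage price. The sketch below is only arithmetic under assumed numbers: the prices (2 cents per GB stored, 9 cents per GB read out) and the volumes are hypothetical and are not taken from any vendor's price list. Prices are kept in whole cents to avoid rounding surprises.

```python
def monthly_cost_cents(stored_gb, egress_gb, storage_cents_per_gb, egress_cents_per_gb):
    """Return (storage, egress, total) monthly charges in cents."""
    storage = stored_gb * storage_cents_per_gb
    egress = egress_gb * egress_cents_per_gb
    return storage, egress, storage + egress


# Hypothetical workload: 1,000 GB stored, 500 GB read back out in a month.
# Hypothetical prices: 2 cents per GB stored, 9 cents per GB of egress.
storage, egress, total = monthly_cost_cents(
    stored_gb=1000,
    egress_gb=500,
    storage_cents_per_gb=2,
    egress_cents_per_gb=9,
)

# What the same workload would cost if egress were free.
_, _, total_without_egress = monthly_cost_cents(1000, 500, 2, 0)
```

With these made-up numbers, storage comes to 2,000 cents and egress to 4,500 cents, so reading back only half of the stored data costs more than twice as much as storing all of it.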

Performance can be slower with object storage, particularly for data retrieval. Block storage is designed to load requested data faster, but it often costs correspondingly more.

Does Cloudflare provide object storage?

Cloudflare R2 is an object storage solution with no egress fees, making it much more affordable than many other cloud storage options. Cloudflare R2 integrates with the Cloudflare global CDN for maximum performance. It also integrates with Cloudflare Workers for enhanced decisions and customized request routing.

FAQs

What is object storage?

Object storage is a cloud-based method for storing massive amounts of unstructured data, such as videos, photos, and system logs. Instead of using a traditional folder hierarchy, it stores data in a data lake where each file is treated as a distinct object with its own metadata and a unique identifier.

How does object storage differ from file-based storage?

File storage uses a hierarchical structure that sorts files into folders and subfolders. Object storage keeps all objects together in a single flat structure; data is retrieved directly using its unique identifier or metadata rather than by following a file directory path.

What are the main benefits of using object storage?

Object storage is highly scalable and cost-effective, allowing businesses to store virtually unlimited amounts of data and pay only for what they use. It is also highly searchable because it allows for extensive and customizable metadata, and it is resilient because cloud providers typically maintain multiple server pools for failover.

What are some common use cases for object storage?

Because it handles large volumes of data well, object storage is ideal for system backups, data archiving, and storing media assets like video and audio. It is also frequently used to house large datasets for machine learning or to store application assets like JavaScript and CSS files.

What are data egress fees, and why do they matter?

Data egress refers to the process of transferring data out of a cloud storage environment. Many providers charge high fees for this transfer, which can significantly increase the total cost of object storage even if the initial hosting price is low.

How does Cloudflare R2 improve upon typical object storage models?

Cloudflare R2 provides a more affordable object storage solution by eliminating data egress fees entirely. It also integrates with Cloudflare’s global network and Workers platform.