Object storage is a flexible and scalable cloud storage model for unstructured data.
Object storage is a method for saving large amounts of data, especially unstructured data, in the cloud. Much of the data generated by business activities is unstructured — including logs, video and photo content, sensor data, and webpages, among many other examples. Object storage maintains this data across multiple cloud servers, with each file or segment of data as its own object, complete with metadata and a unique name or identifier for data retrieval.
Object storage does not store these objects in folders, as in a traditional file-based hierarchy — instead all objects are stored together in a single "data lake" (also called a "data pool"). For this reason, object storage can store vast amounts of data very quickly — just as tossing clothes into a bag is a faster way to pack for a trip than folding and sorting clothes carefully into a suitcase.
Object storage scales to hold a practically unlimited amount of data. It is also more cost-effective than some of the other cloud storage methods available. However, the fees charged for transferring data back out after it is stored (a transfer known as "data egress") are sometimes prohibitive, depending on the vendor.
Cloud computing in general involves renting computing power and storage space from cloud providers, rather than using on-premise servers and computers. Cloud storage simply means storing data on a cloud provider's infrastructure — which may exist in one or more remote physical locations.
In a cloud storage context, an object is a unit of data. An object can be in any format and of any size. Photos, audio files, network logs, and emails alike can all be stored as an object.
Unlike typical local storage on a desktop computer, or cloud-based file storage, object storage is not sorted into folders. There is no single hierarchical path to each object; objects can be reached through a variety of paths. If Jerry saves a picture of a squid on his computer's C: drive, he might put it in a folder called "Photos" and in a subfolder called "Squid Pictures." To reach this photo later, Jerry opens C:, then "Photos," then "Squid Pictures," then the photo itself. Jerry's path to the photo looks like this:
Desktop computer --> C: --> "Photos" --> "Squid Pictures" --> open photo
But if Jerry's computer worked more like object storage, he would instead use metadata about the squid picture (perhaps the file's name, the date it was taken, or its precise dimensions) to find it later. And instead of following a structured path like the one above, he would simply find and open the file:
Desktop computer --> search for photo --> open photo
This is more like how object storage works. Objects are accessed directly, and instead of being stored in a series of subfolders, they are all stored together in a data lake (defined below).
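The contrast between the two access patterns can be sketched in a few lines of Python. This is only an illustration, not how any real storage system is implemented: the nested dictionary, the flat list, the file names, the date, and the helper functions `open_by_path` and `find_by_metadata` are all made up for this example.

```python
# Hierarchical file storage: nested folders, one fixed path per file.
# (All names and values here are hypothetical.)
file_system = {
    "C:": {
        "Photos": {
            "Squid Pictures": {
                "squid.jpg": b"<image bytes>",
            }
        }
    }
}


def open_by_path(root, path):
    """Walk the folder tree one level at a time."""
    node = root
    for part in path:
        node = node[part]
    return node


# Object-style storage: one flat collection, each entry carrying metadata.
flat_store = [
    {"name": "squid.jpg", "taken": "2021-06-01", "subject": "squid",
     "data": b"<image bytes>"},
    {"name": "notes.txt", "taken": None, "subject": "meeting",
     "data": b"<text bytes>"},
]


def find_by_metadata(store, **criteria):
    """Return every entry whose metadata matches all given criteria."""
    return [
        obj for obj in store
        if all(obj.get(key) == value for key, value in criteria.items())
    ]


# The path-based lookup needs every folder name, in order.
photo_via_path = open_by_path(
    file_system, ["C:", "Photos", "Squid Pictures", "squid.jpg"]
)

# The metadata-based lookup needs only something Jerry remembers about it.
photo_via_search = find_by_metadata(flat_store, subject="squid")[0]["data"]
```

Both lookups return the same bytes, but only the first depends on knowing where the file was filed.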
Other critical components of object storage include:
Metadata: Metadata is information about a file, like its name, its type, or its size. The use of metadata helps set object storage apart from block storage, another cloud storage method. Because object storage is unstructured, metadata can be as extensive as desired and can take any form. For example, Jerry could assign any number of metadata labels to his squid photo to ensure he could find it quickly later. He could even assign it a unique number, or a "unique identifier."
Unique identifier: The unique identifier is a string (a sequence of characters) that is assigned to each object in object storage. This enables faster lookup and retrieval of that object later.
Data egress: When an object storage customer requests to load or access an object, the storage provider must transfer it to them over a network. This process is called data egress. Many object storage providers charge high fees for reading stored data, which can make object storage less cost-effective for many businesses.
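The three components above can be tied together in a toy, in-memory model. This is a minimal sketch under stated assumptions: `ToyObjectStore`, `StoredObject`, and their methods are hypothetical names invented here, not any provider's API, and using a random UUID as the unique identifier is just one possible choice.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """One object: raw bytes plus free-form metadata."""
    data: bytes
    metadata: dict = field(default_factory=dict)


class ToyObjectStore:
    """Hypothetical in-memory stand-in for an object storage service."""

    def __init__(self):
        self._objects = {}     # flat namespace: identifier -> object
        self.egress_bytes = 0  # running total of bytes read back out

    def put(self, data, **metadata):
        """Store bytes with any metadata and return a unique identifier."""
        object_id = str(uuid.uuid4())
        self._objects[object_id] = StoredObject(data, dict(metadata))
        return object_id

    def get(self, object_id):
        """Look up an object directly by identifier; count it as egress."""
        obj = self._objects[object_id]
        self.egress_bytes += len(obj.data)
        return obj.data

    def search(self, **criteria):
        """Return identifiers of objects whose metadata matches all criteria."""
        return [
            object_id
            for object_id, obj in self._objects.items()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        ]


store = ToyObjectStore()
squid_id = store.put(b"squid-photo-bytes", subject="squid", kind="photo")
store.put(b"example log line", kind="log")

matches = store.search(subject="squid")  # find by metadata
photo = store.get(squid_id)              # retrieve by unique identifier
```

Note that `get` is the only operation that adds to `egress_bytes`. Storing and searching do not move object data back to the customer, which mirrors why egress is billed separately from storage.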
A data lake, or data pool, is a collection of unstructured data that can grow as large as needed. Data does not need to be put into a structure, reformatted, compressed, or otherwise processed before it goes into the lake, just as water can enter a real lake from multiple rivers and streams, and in both solid and liquid form.
Blob storage is a type of object storage that stores Binary Large Objects (known colloquially as "blobs"). Blobs do not have to follow a given format or have any metadata associated with them. They are a series of bytes, with each byte made up of 8 bits (each bit is a 1 or a 0, hence the "binary" descriptor), and any type of data can go in a blob.
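To make the "series of bytes" idea concrete, the short Python sketch below turns a piece of text into a blob and prints the bits inside each byte. The sample string is arbitrary; the same applies to image, audio, or log data once it is read as bytes.

```python
# Any data can be reduced to bytes; here a short text string is used.
blob = "squid".encode("utf-8")

# Render each byte as its 8 individual bits.
bits = [format(byte, "08b") for byte in blob]

for byte, pattern in zip(blob, bits):
    print(byte, pattern)

# The bit patterns carry everything needed to rebuild the original blob.
rebuilt = bytes(int(pattern, 2) for pattern in bits)
```

The blob itself records nothing about being text. Whoever reads it back has to know, or be told through metadata, how to interpret the bytes.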
Any activity that generates large amounts of data may work well with object storage. This is particularly the case if the data does not need to be accessed frequently. Some examples include:
Object storage tends to be:
Data egress fees can counteract the cost benefit of object storage. This blog post has a good breakdown of how object storage providers sometimes charge large markups for accessing stored objects.
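A rough calculation shows how egress can come to dominate a bill. The prices below are placeholders chosen only to make the arithmetic easy to follow; they are not quotes from any vendor, and the function `monthly_cost_cents` is a hypothetical helper. Amounts are kept in whole cents to avoid floating-point rounding.

```python
def monthly_cost_cents(stored_gb, egress_gb,
                       storage_cents_per_gb, egress_cents_per_gb):
    """Return (storage, egress, total) cost in cents for one month."""
    storage = stored_gb * storage_cents_per_gb
    egress = egress_gb * egress_cents_per_gb
    return storage, egress, storage + egress


# Placeholder prices (assumptions, not real vendor pricing).
STORAGE_CENTS_PER_GB = 2
EGRESS_CENTS_PER_GB = 9

# 1,000 GB stored, of which 500 GB is read back out during the month.
storage, egress, total = monthly_cost_cents(
    stored_gb=1000,
    egress_gb=500,
    storage_cents_per_gb=STORAGE_CENTS_PER_GB,
    egress_cents_per_gb=EGRESS_CENTS_PER_GB,
)

# The same workload with no egress fee.
_, _, total_without_egress_fee = monthly_cost_cents(1000, 500, 2, 0)

print(f"storage: ${storage / 100:.2f}")
print(f"egress:  ${egress / 100:.2f}")
print(f"total:   ${total / 100:.2f}")
```

Under these assumed prices, reading back half of the stored data costs more than twice as much as storing all of it, which is why workloads that are read rarely tend to suit object storage better.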
Performance can be slower with object storage, particularly for data retrieval. Block storage is designed to load requested data faster, but it often costs correspondingly more.
Cloudflare R2 is object storage with no egress fees, making it much more affordable than many other cloud storage options. Cloudflare R2 integrates with the Cloudflare global CDN for maximum performance. It also integrates with Cloudflare Workers for enhanced decisions and customized request routing. Learn more about R2 object storage.