The Internet is borderless and decentralised. It allows information to cross the globe in a matter of milliseconds, making services and business models commonplace that would have been unimaginable a few decades ago.
But the legal and regulatory reality that organisations face concerning information and data is far more complicated. Data privacy concerns have prompted the creation and enforcement of stringent privacy regulations, such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA). Well over 100 countries have passed their own data regulation laws, each implementing its own framework for how data can cross its borders.
Data regulations aimed at protecting consumer privacy are sometimes hard to interpret, constantly changing, and difficult to comply with given that the globally interconnected network of the Internet does not recognise national borders. Data regulations vary across the world, and at times, by industry, making it difficult for organisations to adhere to the latest standards, necessary certifications, and physical data storage requirements.
Caught between these two realities, many organisations lean on data localisation: the practice of keeping data within a given region, rather than allowing it to cross the world or leave a certain cloud region for processing and storage.
Yet data localisation introduces its own set of challenges.
Modern organisations have four main choices for where to run their applications: an on-premise data centre, a public cloud, a private cloud, or hybrid infrastructure.
Which model they choose has a major impact on both how their business will scale and on how data localisation can be implemented.
1. On-premise data centre: Storing data from in-region customers in an on-premise data centre makes localisation relatively simple. As long as the on-premise infrastructure is adequately protected, the data within remains local.
But for out-of-region customers, an on-premise approach makes data localisation all but impossible. To serve those customers, their data must be brought into the internal data centre, and out of the region of the data's origin.
2. Public cloud: In many ways, public cloud computing makes serving a global audience simpler than on-premise computing does, since cloud-based applications can run on servers in a wide range of global regions. However, cloud computing also offers less visibility into where data is processed, creating a challenge for organisations that want control over where data goes.
Organisations that use public cloud computing and want to localise their data should consider where their public cloud vendor's cloud regions are located. A "cloud region" is the area where a cloud provider's servers are physically located in data centres. Restricting data to a given cloud region should make localisation possible. However, not all public cloud providers will have data centres in the required regions, and not all can guarantee that the data will not leave the region.
3. Private cloud: Like the on-premise data centre model, a private cloud model partially solves the problem of data localisation: if the cloud is located within the required region, then the data within is localised as a matter of course. But customers outside the cloud region cannot have their data localised unless additional private clouds are configured within their region as well. Running a private cloud in every region where an organisation’s customers reside can become expensive to maintain. (Private clouds cost more than public clouds since the cost of the physical infrastructure is not shared among multiple cloud customers.)
4. Hybrid infrastructure: The data localisation challenges described above apply to hybrid models as well. Organisations often struggle to ensure data goes to the right place in a hybrid cloud model, especially when synchronising data across multiple cloud platforms and types of infrastructure.
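The constraint that runs through all four models can be sketched in a few lines of Python. This is an illustration only: the region names, the `RESIDENCY_RULES` table, and the `pick_storage_region` helper are assumptions made for this example and do not correspond to any cloud provider's API. The sketch shows that data can be localised only when a region permitted for the customer is also a region the infrastructure actually offers.

```python
from typing import Optional

# Hypothetical mapping: customer jurisdiction -> regions where that
# customer's data may be stored and processed.
RESIDENCY_RULES = {
    "EU": {"eu-west", "eu-central"},
    "US": {"us-east", "us-west"},
    "IN": {"in-south"},
}


def pick_storage_region(jurisdiction: str, available_regions: set[str]) -> Optional[str]:
    """Return an allowed region that the infrastructure offers, or None."""
    allowed = RESIDENCY_RULES.get(jurisdiction, set())
    # Sort so the choice is deterministic when several regions qualify.
    candidates = sorted(allowed & available_regions)
    return candidates[0] if candidates else None


# A single on-premise data centre: only in-region customers can be localised.
on_premise = {"eu-west"}
# A hypothetical public cloud vendor with several regions, but none in India.
public_cloud = {"eu-west", "eu-central", "us-east"}

print(pick_storage_region("EU", on_premise))    # eu-west
print(pick_storage_region("US", on_premise))    # None
print(pick_storage_region("US", public_cloud))  # us-east
print(pick_storage_region("IN", public_cloud))  # None
```

A `None` result is the case the list above warns about: the customer cannot be served without their data leaving its region of origin.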
Keeping all infrastructure within one region inhibits the ability to reach a global audience; conversely, maintaining infrastructure all around the world is untenable for most organisations.
The best approach is to partner with a global edge network — either a CDN vendor or a vendor that offers additional services along with CDN caching — that is infrastructure-agnostic. This allows websites and applications to scale up to global audiences, no matter whether they use a hybrid cloud, public cloud, private cloud, or on-premise model.
Without granular control over where data is processed, data localisation is not possible. But without a widely distributed non-local presence, serving a global audience is not possible. Localising and globalising are opposite abilities, but ideally a data localisation partner will be able to offer both simultaneously.
For organisations that need to localise data, the end goal is controlling where data is processed and stored. Organisations must evaluate edge network vendors to make sure they allow for localised control of where data goes and how it is processed.
Organisations that collect user data use encryption to protect that data both in transit and at rest, so only authorised parties can view, process, or alter it. For data that crosses networks, the encryption protocol in widest use today is Transport Layer Security (TLS). TLS relies on asymmetric cryptography to establish a secure session, which requires two keys: a public key and a private key. While the public key is made available to the entire Internet, the private key is kept secret.
Where the private key is stored determines where encrypted data, including potentially sensitive data, is decrypted. This is important for localisation because once data is decrypted, it becomes visible to any parties with access to the decrypted data.
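The relationship between key placement and decryption location can be modelled with a small Python sketch. Everything named here is hypothetical: the `DataCentre` class, the data centre names, and the two helper functions are invented for illustration and do not describe how any particular vendor implements key management. The sketch assumes a simple rule: a data centre that does not hold the private key cannot decrypt, so all it can do with a connection is pass it on still encrypted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCentre:
    name: str
    region: str


def centres_holding_key(centres: list[DataCentre], key_regions: set[str]) -> list[DataCentre]:
    """Data centres that are permitted to store the private key."""
    return [c for c in centres if c.region in key_regions]


def handle_connection(centre: DataCentre, key_regions: set[str]) -> str:
    """Decide what a data centre can do with an incoming TLS connection.

    Without the private key a data centre cannot decrypt, so its only
    option is to forward the traffic in encrypted form.
    """
    if centre.region in key_regions:
        return "decrypt"
    return "forward-encrypted"


# Hypothetical network with two in-region sites and one out-of-region site.
centres = [
    DataCentre("frankfurt", "EU"),
    DataCentre("paris", "EU"),
    DataCentre("singapore", "APAC"),
]
key_regions = {"EU"}  # the private key is restricted to the EU

print([c.name for c in centres_holding_key(centres, key_regions)])  # ['frankfurt', 'paris']
print(handle_connection(centres[2], key_regions))  # forward-encrypted
print(handle_connection(centres[0], key_regions))  # decrypt
```

Restricting `key_regions` is therefore what restricts where plaintext can ever exist in this model.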
TLS encryption is strong enough to stand up to encryption-breaking attempts from almost anyone. This means that data encrypted with TLS can safely traverse areas outside of the localised region, as long as it remains encrypted. To ensure that decryption only takes place within a designated region, organisations require two crucial capabilities: the ability to keep private keys stored within that region, and the ability to pass traffic through locations outside the region without decrypting it.
Once data is localised, organisations must take precautions to ensure it remains in its localised region. Internal access control is extremely important for keeping data localised, especially for organisations with an international presence. If an employee outside the localised region accesses data from within the region, this counteracts all that was done to keep the data local.
Unfortunately, many organisations today have legacy authorisation systems in place that trust anyone within the corporate network, regardless of their location. This setup, known as the castle-and-moat model (with the network perimeter being the moat), does not easily map onto a data localisation approach. If anyone in the organisation can access data, regardless of location, the data might as well not be localised.
Organisations can solve this by treating location as an authorisation factor for accessing data.
This is easier to implement when organisations adopt a Zero Trust model rather than a castle-and-moat model. In a Zero Trust model, no user or device is trusted by default, even from inside the corporate network. Several factors can be evaluated by the Zero Trust solution before it grants access: device posture, user identity and privileges, location, and more.
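A location-aware access decision of this kind might look like the following Python sketch. The `AccessRequest` and `Resource` types, their fields, and the `is_access_allowed` function are assumptions made for this example rather than the interface of any real Zero Trust product. The point it illustrates is that every factor is checked on every request, and that the user's location must match the region of the data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    user_authenticated: bool
    user_roles: frozenset[str]
    device_compliant: bool
    user_region: str


@dataclass(frozen=True)
class Resource:
    required_role: str
    data_region: str


def is_access_allowed(request: AccessRequest, resource: Resource) -> bool:
    """Grant access only if every factor passes; nothing is trusted by default."""
    return (
        request.user_authenticated
        and request.device_compliant
        and resource.required_role in request.user_roles
        # Location is treated as one more authorisation factor.
        and request.user_region == resource.data_region
    )


eu_records = Resource(required_role="support", data_region="EU")
in_region = AccessRequest(True, frozenset({"support"}), True, "EU")
out_of_region = AccessRequest(True, frozenset({"support"}), True, "US")

print(is_access_allowed(in_region, eu_records))      # True
print(is_access_allowed(out_of_region, eu_records))  # False
```

The second request comes from a fully authenticated employee on a compliant device with the right role, and is still denied because of where it originates. A castle-and-moat system would have allowed it.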
In edge computing, applications run on an edge network with many points of presence, rather than in a few isolated data centres. This offers the advantage of running code all around the world, simultaneously serving users more efficiently and processing data as close to those users as possible. This aspect of edge computing makes localisation more feasible.
Another advantage of edge computing, from the localisation and regulatory compliance standpoint, is that different code can run at different parts of the edge. This makes for both effective data localisation and localised regulatory compliance: slightly different application functions can be deployed in different regions, depending on the regulations within those regions.
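Region-specific behaviour at the edge can be pictured as a dispatch table keyed by region. The Python sketch below is purely illustrative: the handler names, the `REGION_HANDLERS` table, and the rule that the EU variant drops the `ip_address` field are all invented for this example and are not a statement of what any regulation requires. It shows only the mechanism of running slightly different application logic depending on where the code executes.

```python
from typing import Callable

Record = dict[str, str]


def handle_default(record: Record) -> Record:
    """Fallback behaviour: keep the record as submitted."""
    return dict(record)


def handle_eu(record: Record) -> Record:
    """Hypothetical EU variant: drop a field this example chooses not to keep."""
    return {k: v for k, v in record.items() if k != "ip_address"}


# Regions without an entry fall back to the default handler.
REGION_HANDLERS: dict[str, Callable[[Record], Record]] = {
    "EU": handle_eu,
}


def process_at_edge(region: str, record: Record) -> Record:
    """Run the handler deployed for the region where the request arrived."""
    handler = REGION_HANDLERS.get(region, handle_default)
    return handler(record)


record = {"user_id": "u-123", "ip_address": "203.0.113.7"}
print(process_at_edge("EU", record))  # {'user_id': 'u-123'}
print(process_at_edge("US", record))  # {'user_id': 'u-123', 'ip_address': '203.0.113.7'}
```

Adding a new regional rule in this design means registering one more handler, without touching the logic that serves other regions.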
Privacy on the Internet is critical to the safety and security of our personal and professional lives, yet the Internet was not built with privacy in mind. As a result, fear around how Internet-based technology companies handle data and privacy abounds.
Cloudflare’s mission to help build a better Internet includes a focus on fixing this fundamental design flaw by building privacy-enhancing products and technologies.
The Cloudflare Data Localization Suite ingests traffic at over 250 locations around the globe, then forwards all traffic for localisation customers to data centres within the localised region. Traffic is neither inspected nor decrypted until it reaches an in-region data centre. With Geo Key Manager, customers can keep their TLS keys within a specified region. This enables Cloudflare customers to combine the benefits of relying on a global network for performance, security, and availability with the need for localisation.
This article is part of a series on the latest trends and topics impacting today’s technology decision-makers.