Harry Tan

Going Global: A Guide to Multi-Region Deployment with Snowflake

Many tech and SaaS companies are increasingly building data centers across the globe to better serve their diverse customer base. By localizing data centers, these companies not only improve service speed and reliability but also address region-specific compliance and data sovereignty requirements. This global expansion strategy enhances customer trust and provides a competitive edge in today’s data-driven market.


AC, the company I used to work for, is a SaaS company with clients in 170 countries and about 50% of its revenue coming from the US. Last year, the company took a significant step to better serve our European clients: the launch of a new data center in the EU. The push for this expansion came directly from our EU clients. Their needs go beyond speed and performance; they’re also about compliance and trust. EU regulations like GDPR make data sovereignty a non-negotiable aspect of doing business in Europe. A local data center simplifies compliance and reduces legal complexity. Our clients want their data within the EU. By establishing a local data center, we showed our commitment to meeting their needs and building long-lasting relationships.

As part of our company’s effort to deploy the entire service in the EU data center (EUDC), we, the data team, needed to recreate the data infrastructure that supports our platform reporting. We use Snowflake as the backend data warehouse that powers this reporting. Snowflake’s multi-region capabilities made this expansion straightforward, offering global reach while maintaining local compliance.

In the upcoming sections, we’ll dive into why Snowflake is the perfect tool for managing data across different regions. We’ll cover how easy it is to set up a new Snowflake account in a new region, the global data availability that Snowflake offers, how to automate Snowflake setup through CI/CD, and the ins and outs of moving data between regions.

Easy to start a New Snowflake instance in the new region

Creating a new Snowflake account is a quick and easy process that can be done in 10 minutes or less. To create an account, you will need to have the ORGADMIN role. If you do not have this role, you will need to contact your Snowflake account executive or support team.

To create a new account, follow these steps:

  1. Log in to the Snowflake web interface.

  2. Click on the Admin tab, then click on the Accounts tab, then click on the Create New Account button.

  3. Enter the following information:

  • Account name: A unique name for your account.

  • Cloud platform: The cloud platform that you want to use for your account.

  • Region: The region where you want to create your account.

  • Edition: The edition of Snowflake that you want to use.

  • Initial administrative user: The name and password of the initial administrative user for your account.

  4. Click on the Create Account button.

Once your account is created, you will be able to log in to it and start using Snowflake.

  • You can create up to 10 accounts per organization.

  • The initial administrative user of an account is granted the ACCOUNTADMIN role. This role has full administrative privileges over the account.

  • You can create additional users for your account later.
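If you would rather script account creation than click through the UI, the same fields map onto Snowflake’s CREATE ACCOUNT statement, which is also run with the ORGADMIN role. The following is a minimal sketch rather than our exact setup: the helper name render_create_account and the example account name, email, and region ID are hypothetical, and the function only renders the SQL text without executing it.

```python
import re

# Conservative pattern for unquoted identifiers (account name, admin name, region ID).
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
_EDITIONS = {"STANDARD", "ENTERPRISE", "BUSINESS_CRITICAL"}


def _identifier(value: str) -> str:
    """Return value unchanged if it is a plain identifier, else raise."""
    if not _IDENTIFIER.fullmatch(value):
        raise ValueError(f"not a plain identifier: {value!r}")
    return value


def _literal(value: str) -> str:
    """Render a SQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def render_create_account(name, admin_name, admin_password, email, edition, region):
    """Render a CREATE ACCOUNT statement from the fields listed in the steps above."""
    edition = edition.upper()
    if edition not in _EDITIONS:
        raise ValueError(f"unknown edition: {edition!r}")
    return (
        f"CREATE ACCOUNT {_identifier(name)}\n"
        f"  ADMIN_NAME = {_identifier(admin_name)}\n"
        f"  ADMIN_PASSWORD = {_literal(admin_password)}\n"
        f"  EMAIL = {_literal(email)}\n"
        f"  EDITION = {edition}\n"
        f"  REGION = {_identifier(region)};"
    )


if __name__ == "__main__":
    # All values below are placeholders, not real credentials or accounts.
    print(render_create_account(
        "eu_reporting", "initial_admin", "change-me", "admin@example.com",
        "standard", "aws_eu_central_1",
    ))
```

Run SHOW REGIONS in your own organization to confirm the region ID before using the rendered statement, since the ID above is only an example.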

Snowflake’s Global Data Availability

Snowflake is a cloud-based data warehouse that offers global data availability. This means that you can store your data in any of the cloud regions Snowflake supports, regardless of your location. As of September 2023, Snowflake has regions in the following parts of the world:

  • North America: United States (13 regions), Canada (2 regions)

  • South America: Brazil (1 region)

  • Europe: United Kingdom (2 regions), Germany (2 regions), France (1 region), Switzerland (1 region), Netherlands (1 region), Spain (1 region), Ireland (1 region), Sweden (1 region), Finland (1 region)

  • Asia Pacific: Australia (1 region), India (1 region), Japan (2 regions), Singapore (1 region), Hong Kong (1 region), China (1 region)

  • Middle East: UAE (1 region)

  • Africa: South Africa (1 region)

This global data availability gives us the flexibility to store our data where our company wants to build our next data center, while still ensuring that our data is secure and accessible.

CI/CD for Snowflake Initialization

Continuous integration and continuous delivery (CI/CD) is a set of practices that automates the process of building, testing, and deploying software. CI/CD can be used to automate the initialization of a Snowflake environment, which can help to ensure that the environment is consistent and repeatable across different deployments.

There are a few different ways to implement CI/CD for Snowflake initialization. We chose to use Terraform, an open-source infrastructure as code (IaC) tool. Terraform can be used to create and manage all of the resources that are needed for a Snowflake environment, such as users, roles, warehouses, and network policies. At the time we started, we were using CZI’s Terraform provider. Now Snowflake has its own official implementation, which is built on top of CZI’s original repo.

Once the resources have been defined in Terraform, a CI/CD pipeline can apply them to Snowflake. In addition to using Terraform, you can also use SQL scripts to manage the data-related objects in a Snowflake environment. For example, you can use SQL scripts to create tables and views and to load data.

The combination of Terraform and SQL scripts can be used to create a comprehensive CI/CD solution for Snowflake initialization. This solution can help to ensure that the Snowflake environment is consistent, repeatable, and secure.
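One way a pipeline step could decide which SQL scripts still need to run against a fresh account is sketched below. This is an illustration under stated assumptions, not a description of our pipeline: the V001__name.sql naming convention and the pending_scripts helper are hypothetical, and the set of already-applied versions would in practice come from a tracking table in Snowflake.

```python
import re

# Hypothetical naming convention: V<number>__<description>.sql
_SCRIPT_NAME = re.compile(r"V(\d+)__(.+)\.sql")


def pending_scripts(script_names, applied_versions):
    """Return the scripts not yet applied, ordered by version number."""
    parsed = []
    for name in script_names:
        match = _SCRIPT_NAME.fullmatch(name)
        if match is None:
            raise ValueError(f"unexpected script name: {name!r}")
        parsed.append((int(match.group(1)), name))

    versions = [version for version, _ in parsed]
    if len(set(versions)) != len(versions):
        raise ValueError("two scripts share the same version number")

    return [name for version, name in sorted(parsed)
            if version not in applied_versions]


if __name__ == "__main__":
    scripts = ["V002__create_views.sql", "V001__create_tables.sql", "V003__load_data.sql"]
    # Pretend version 1 has already been applied to the target account.
    for script in pending_scripts(scripts, applied_versions={1}):
        print("would run:", script)
```

Because the function is pure, the same check can run in a pull-request build (to catch naming mistakes) and in the deployment job (to pick the scripts to execute), which keeps a new region’s account in step with the existing one.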

Data Replication across regions

Unlike data sharing, which mandates that associated accounts be in the same geographical region, data replication can occur across multiple regions. On the backend, this process essentially involves copying S3 files from one AWS region to another. Snowflake transparently passes on the S3 copy costs to us without any additional markup.

We employ data replication for two specific use-cases:

User Data Migration: Several of our existing European clients initially stored their data in our U.S. data centers. Now, they wish to transition this data to our EU facilities. We facilitate this migration using Snowflake’s replication features, following these steps:

  • Initialize replication tables in both the U.S. and EU Snowflake instances.

  • Set up the corresponding tables in the U.S. replication database.

  • Transfer client data into the U.S. replication tables.

  • Execute the replication command, which moves the data to the EU replication database.

  • Migrate the data from the EU replication tables to the primary tables.

  • Delete the data from the U.S. tables, including both primary and replication tables.
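The replication part of these steps can be sketched with Snowflake’s database replication commands. The snippet below only renders the statements so the order is visible; the organization, account, and database names are hypothetical, and the surrounding steps (copying client rows into and out of the replication tables, then deleting them) are ordinary INSERT and DELETE statements that are not shown.

```python
import re

_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def _identifier(value: str) -> str:
    """Return value unchanged if it is a plain identifier, else raise."""
    if not _IDENTIFIER.fullmatch(value):
        raise ValueError(f"not a plain identifier: {value!r}")
    return value


def render_replication_statements(org, source_account, target_account, database):
    """Render the statements to run on the source and target accounts, in order."""
    org = _identifier(org)
    source_account = _identifier(source_account)
    target_account = _identifier(target_account)
    database = _identifier(database)
    return {
        # Run once on the source (U.S.) account to allow replication to the target.
        "source": [
            f"ALTER DATABASE {database} ENABLE REPLICATION TO ACCOUNTS {org}.{target_account};",
        ],
        # Run on the target (EU) account: create the secondary database once,
        # then refresh it each time new data should be pulled across.
        "target": [
            f"CREATE DATABASE {database} AS REPLICA OF {org}.{source_account}.{database};",
            f"ALTER DATABASE {database} REFRESH;",
        ],
    }


if __name__ == "__main__":
    statements = render_replication_statements(
        "myorg", "us_account", "eu_account", "migration_db"
    )
    for side, sqls in statements.items():
        for sql in sqls:
            print(f"[{side}] {sql}")
```

The secondary database on the EU side is read-only, which is why the data is copied from the replication tables into the primary tables as a separate step.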

Aggregated Data Replication to the US: In our EU Snowflake instance, we collect user activity data at the most granular level and aggregate it to derive metrics, statistics, and trends. To manage costs, we only replicate these aggregated metrics back to the US, where they are merged into the main metrics tables. This approach allows us to maintain a comprehensive, globally visible metrics and reporting system while keeping expenses in check.
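Merging the replicated aggregates into the main metrics tables could be done with a MERGE statement along these lines. This is a sketch under assumptions: the table and column names are hypothetical, the helper render_metrics_merge is not part of any library, and it assumes each replicated row carries the complete aggregate for its key, so matched rows are overwritten rather than summed.

```python
def render_metrics_merge(target, source, key_columns, value_columns):
    """Render a MERGE that upserts rows from source into target.

    Rows are matched on key_columns; value_columns are overwritten on a
    match and inserted together with the keys otherwise.
    """
    if not key_columns or not value_columns:
        raise ValueError("need at least one key column and one value column")

    on_clause = " AND ".join(f"t.{col} = s.{col}" for col in key_columns)
    set_clause = ", ".join(f"t.{col} = s.{col}" for col in value_columns)
    all_columns = list(key_columns) + list(value_columns)
    insert_columns = ", ".join(all_columns)
    insert_values = ", ".join(f"s.{col}" for col in all_columns)

    return (
        f"MERGE INTO {target} t\n"
        f"USING {source} s\n"
        f"  ON {on_clause}\n"
        f"WHEN MATCHED THEN UPDATE SET {set_clause}\n"
        f"WHEN NOT MATCHED THEN INSERT ({insert_columns}) VALUES ({insert_values});"
    )


if __name__ == "__main__":
    # Hypothetical names: a replicated EU aggregate table merged into the US metrics table.
    print(render_metrics_merge(
        target="reporting.daily_activity",
        source="eu_replica.daily_activity",
        key_columns=["metric_date", "region"],
        value_columns=["active_users"],
    ))
```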

Other Learnings: Considering the Standard Edition

While our U.S. account operates on the Enterprise edition, it’s not mandatory for all our accounts to follow suit. The Enterprise edition offers valuable features like Time Travel of up to 90 days, key rotation, and multi-cluster warehouses. The latter is particularly crucial for us, as it allows a warehouse to automatically add and remove clusters as concurrent query load changes. This is invaluable during high-traffic periods but may not be essential for a newly launched instance. The Standard Edition still lets you resize a warehouse, albeit without the multi-cluster option.

In terms of pricing, the Standard Edition is considerably more affordable than the Enterprise Edition. For our AWS EU region, the Standard Edition costs $2.60 per credit, compared to $3.90 per credit for the Enterprise Edition. In other words, Enterprise costs 50% more per credit than Standard, or equivalently, Standard is about 33% cheaper than Enterprise. You can find pricing details for your specific cloud provider and region on Snowflake’s pricing page.
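The two ways of stating the gap between the per-credit prices use different baselines, which is easy to mix up. The short sketch below makes the arithmetic explicit using the two prices quoted above; the function name is hypothetical.

```python
def price_gap(standard_price: float, enterprise_price: float):
    """Return (premium, saving) as fractions.

    premium: how much more Enterprise costs, relative to Standard.
    saving:  how much less Standard costs, relative to Enterprise.
    """
    difference = enterprise_price - standard_price
    premium = difference / standard_price
    saving = difference / enterprise_price
    return premium, saving


if __name__ == "__main__":
    premium, saving = price_gap(2.60, 3.90)
    print(f"Enterprise premium over Standard: {premium:.0%}")
    print(f"Standard saving versus Enterprise: {saving:.0%}")
```

With $2.60 and $3.90 the difference is $1.30 per credit, which is half of the Standard price and one third of the Enterprise price.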

Keeping an Eye on Expenses

As we ventured into a new region, it was challenging to predict the level of traffic and the time required to reach full-scale operations. Initially, we over-allocated compute resources, which in hindsight reflected an overly optimistic view of EU user engagement. After a month, we began fine-tuning our settings to optimize Snowflake’s performance and align resources with actual query loads. This included a potential downgrade to the Standard Edition, as previously discussed. Ultimately, these adjustments led to a 90% reduction in costs.

At the same time, be aware of the three parts of the replication cost (data transfer, compute, and storage), especially the data transfer costs. Snowflake charges for the data that is transferred between the source and target accounts. This includes the data transferred during the initial replication, as well as the data transferred each time the secondary database is refreshed.
