Snowflake Data Masking

Snowflake Data Cloud

by Venkat A. 30th June 2023 0 Comments

Snowflake Dynamic Data Masking (DDM) is a data security feature that allows you to alter sections of data (from a table or a view) to keep their anonymity using a predefined masking strategy. Data owners can decide how much sensitive data to reveal to different data consumers or data requestors using Snowflake’s Dynamic Data Masking function, which helps prevent accidental and intentional threats. It’s a policy-based security feature that keeps the data in the database unchanged while hiding sensitive data (i.e. PII, PHI, PCI-DSS),  in the query result set over specific database fields.

Static vs Dynamic Data Masking:

There are two types of Snowflake data masking: static and dynamic. By modifying data at rest, Static Data Masking (SDM) permanently replaces sensitive data. Dynamic Data Masking (DDM) strives to replace sensitive data in transit while keeping the original data at rest intact and unchanged. The unmasked data will remain visible in the actual database. DDM is primarily used to apply role-based (object-level) security for databases. 

Benefits of Dynamic data masking

How can we create Dynamic Data Masking Policy?

Below are the steps to create the Dynamic Data Masking policy:

The masking policy name, “ddr_Lname_Mask” is the unique identifier within the schema and the signature for the masking policy specifies the input columns in this example “last_name” alongside data type(string) to evaluate at query runtime. The return data type must match the input data type followed by the SQL expression that transforms or mask the data which is last_name in this example. The SQL expression can include a built-in function or UDF or conditional expression functions (like CASE in this example).

In this example, the last_name is masked if the current role of the user is Call_Center. Once the masking policy is created, it needs to be applied to a table or view column. This can be done during the table or view creation or using an alter statement.

Once the masking policy is applied, and a user (with a specific role) queries the table, the user(call center agent) will see the masked result .

Things to remember

Operating on a masking policy also requires the USAGE privilege on the parent database and schema.
Snowflake records the original query run by the user on the History page (in the web interface). The query is found in the SQL Text column.
The masking policy names that were used in a specific query can be found in the Query Profile.
The query history is specific to the Account Usage QUERY_HISTORY view only. In this view, the Query Text column contains the text of the SQL statement. Masking policy names are not included in the QUERY_HISTORY view.
If you want to update an existing masking policy and need to see the current definition of the policy, call the GET_DDL function or run the DESCRIBE MASKING POLICY command.
Currently, Snowflake does not support different input and output data types in a masking policy, such as defining the masking policy to target a timestamp and return a string (e.g. ***MASKED***); the input and output data types must match.
Dynamic Data Masking Privilege can be seen here.

Conclusion

Snowflake’s Dynamic Data Masking is a very powerful feature that allows you to bring all kinds of sensitive data into your data platform and manage it at scale. Snowflake’s policy-based approach, along with role-based access control (RBAC), allows you to prevent sensitive data from being viewed by table/view owners and users with privileged responsibilities.

If you’re looking to take advantage of Snowflake’s Data Masking feature, the data experts at 64-squares would love to help make this a reality. Feel free to reach out today for more information.

Author

Venkat A.

Venkata S is a Solution Architect working on Cloud projects. He has profound skills on Azure and Snowflake and has a 12+ years of experience in leading the projects.
View all posts

Social Share

18th October 2024

Snowflake Data Masking