Data within Salesforce must be kept secure. Efforts often focus on Production (live), but any system holding Personally Identifiable Information (PII) needs care. This includes sandboxes, where Data Mask can help.
Here, I’ll give an overview of what a sandbox is, then explore how Salesforce Data Mask can help with data security in sandboxes.
What is a Sandbox?
A sandbox is a copy of your Salesforce Production system. Sandboxes have various use cases, such as development, testing or training. Importantly, they are isolated from Production. This allows you to try things out, without breaking the live system!
There are several types of sandbox. Each fulfils a different use case. The types available depend on your licensing. Whilst they all copy metadata (i.e. the structure of the system), only two copy data:
- Developer: Metadata only
- Developer Pro: Metadata only
- Partial Copy: Metadata and a sample of data
- Full Copy: Metadata and all data
High quality data is essential for Production. However, it is important in sandboxes too. Sandboxes are often used for testing and training, and poor data quality can cause issues there. For example, automation can fail and reports can show spurious results. Both take time to troubleshoot and fix.
This poses a challenge. Data quality needs to be sufficient to be useful. At the same time, PII and confidential data must be kept secure. For instance, it may not be appropriate to share PII with third parties working in a sandbox. Equally, users may be granted higher permissions in a sandbox than they have in Production. These situations can cause confidentiality issues, compliance concerns (e.g. GDPR) and the potential for reputational harm.
Anonymising data manually can be time intensive. This is where Data Mask can help.
Salesforce Data Mask
Data Mask is a paid add-on provided by Salesforce. It is compatible with specific editions and products.
The product is provided as a Managed Package. Once set up, Admins and Developers can obfuscate data within a sandbox. In other words, field values can be:
- Replaced with random characters (e.g. John could be updated to u4tF1)
- Replaced with random characters, following a pattern (e.g. UK postal codes)
- Replaced with random, comparable values from a library (e.g. Jane could be replaced with Barry)
- Deleted entirely (e.g. deleting comments or PII)
This can be applied to standard or custom objects. However, there are limitations. For example, picklist, formula, checkbox and roll-up summary fields are not supported. Click here for more information.
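To make the four masking approaches listed above more concrete, here is a minimal Python sketch. It is purely illustrative and is not Data Mask's own implementation or API. The function names, the `FIRST_NAMES` library, the pattern syntax (`A` for a letter, `9` for a digit) and the sample record are all assumptions made up for this example.

```python
import random
import string
from typing import Optional, Sequence

# Hypothetical library of replacement first names (illustrative only).
FIRST_NAMES = ["Barry", "Priya", "Chen", "Amara", "Luca"]


def random_characters(length: int, rng: random.Random) -> str:
    """Return `length` random letters and digits."""
    alphabet = string.ascii_letters + string.digits
    return "".join(rng.choice(alphabet) for _ in range(length))


def random_pattern(pattern: str, rng: random.Random) -> str:
    """Fill a pattern: 'A' -> random uppercase letter, '9' -> random digit.

    Any other character (such as a space) is copied through unchanged.
    """
    out = []
    for ch in pattern:
        if ch == "A":
            out.append(rng.choice(string.ascii_uppercase))
        elif ch == "9":
            out.append(rng.choice(string.digits))
        else:
            out.append(ch)
    return "".join(out)


def from_library(library: Sequence[str], rng: random.Random) -> str:
    """Pick a comparable replacement value from a library."""
    return rng.choice(library)


def delete_value(_value: object) -> Optional[str]:
    """Remove the value entirely."""
    return None


# An OS-backed random source, so results are not reproducible from a seed.
rng = random.SystemRandom()

# Hypothetical contact record; the keys are chosen for illustration.
contact = {
    "FirstName": "John",
    "Nickname": "Johnny",
    "MailingPostalCode": "SW1A 1AA",
    "Description": "Called about an overdue invoice",
}

masked = {
    "FirstName": from_library(FIRST_NAMES, rng),        # library value
    "Nickname": random_characters(5, rng),              # random characters
    "MailingPostalCode": random_pattern("AA9A 9AA", rng),  # patterned
    "Description": delete_value(contact["Description"]),   # deleted
}
print(masked)
```

Because nothing in `masked` is derived from the original values, there is no mapping to reverse. That is the property the "random" options rely on.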
Whilst I have not had a chance to be hands-on with the product, there appear to be various benefits. From what I’ve seen, these include:
- Easily assign access via Permission Set Licensing
- Native obfuscation methods are used. No external services are required (e.g. no API calls)
- Obfuscation is irreversible and non-deterministic (i.e. cannot be systematically reversed)
- Select whether to anonymise Case Comments or delete emails and/or Chatter comments. This helps to remove potentially sensitive data easily
- Masking rules can be created in Production and cloned into sandboxes during creation and refreshes. This prevents recreating rules multiple times
- Masking rules can only be run in sandboxes. This prevents accidents in Production!
- Flexibility to decide whether to replace field values with random characters, patterns of random characters or values from a library, or to delete them entirely
- Filter records to limit where the Masking rule is applied. This is helpful where there are large data sets or complex requirements which could be time intensive
- Automatically turns off processes, workflows, triggers, validation rules, flows, field history and field tracking during the process and re-enables them after completion
- Shipped by Salesforce as a Managed Package. Any updates are automatically applied
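Two of the points above, filtering records and only ever running in a sandbox, can be pictured with a short Python sketch. Again, this is an illustration under my own assumptions rather than how Data Mask works internally. `run_masking_rule`, the `is_sandbox` flag and the sample records are hypothetical names invented for the example.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]


def run_masking_rule(
    records: List[Record],
    is_sandbox: bool,
    record_filter: Callable[[Record], bool],
    mask: Callable[[Record], Record],
) -> List[Record]:
    """Apply `mask` only to records matching `record_filter`.

    Refuses to run at all unless the environment is flagged as a sandbox.
    Records that do not match the filter are copied through unchanged.
    """
    if not is_sandbox:
        raise RuntimeError("Masking rules must only be run in a sandbox")
    return [mask(r) if record_filter(r) else dict(r) for r in records]


# Hypothetical sample data.
records = [
    {"LastName": "Smith", "Country": "UK"},
    {"LastName": "Jones", "Country": "US"},
]

# Mask only UK records; the placeholder value stands in for a real masking step.
masked_records = run_masking_rule(
    records,
    is_sandbox=True,
    record_filter=lambda r: r["Country"] == "UK",
    mask=lambda r: {**r, "LastName": "XXXXX"},
)
print(masked_records)
```

Keeping the filter separate from the masking step means a large data set can be processed in targeted slices, which is the appeal of the filtering option.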
There are various considerations for Data Mask, summarised here.
Beyond this, there are other factors worth bearing in mind:
- How complex are your needs? Not every org will need Data Mask. Consider how complex your org is. For example, how many sandboxes do you use, and how frequently are they requested? Can you manage the process manually or create your own solution?
- What types of sandbox are you using? The documentation I have read so far refers to Data Mask being used in Partial and Full Copy sandboxes. I am unsure if Developer and Developer Pro sandboxes are supported (if seeded manually). I will update once I know more
- How are you currently seeding your sandbox? Are you already using a tool for data seeding? Does this tool already support data masking? For example, there are a number of AppExchange products which support sandbox seeding and anonymising data!
In short, it is important to select the option which suits your needs. If in doubt, it is worth a chat with your Salesforce Account Executive to learn more. Having said that, if you are interested in Data Mask, I would probably recommend checking out Salesforce Backup and Restore too.
Where can I learn more?
If you wish to learn more, the best resources I have found on Data Mask are:
- Salesforce Platform Demos: Data Mask: 2 minute video summarising key features
- Sandboxes and Data Masks: ~1.25 hour webinar by Salesforce Support
- Secure Your sandbox Data with Salesforce Data Mask: Help and Training Documentation
- Trailhead – Salesforce Data Mask: 30 minute Trailhead Module
Data Mask is a Salesforce add-on, allowing you to anonymise sandbox data with ease. This has the potential to simplify data management and compliance challenges in sandboxes. Be sure to consider your options carefully and select the right approach for your business’s needs.
Bonus Penguin Fact
This week in Penguin news, Greensboro Science Center in the US are celebrating the arrival of an African Penguin chick!
They are asking for help in selecting a name. If you want to help name a penguin, check out their blog here.