Bernard K

How to mask sensitive data in Tableau

Introduction

Data masking is a crucial technique used to protect sensitive information by creating a version of the data that retains its structure but conceals the actual values. This method is particularly significant for organizations that need to use data for purposes such as software testing, user training, or development while ensuring compliance with data privacy regulations.

It involves replacing sensitive data with fictitious but realistic equivalents, so that the original data cannot be reconstructed or reverse-engineered. This allows organizations to use the data without exposing personally identifiable information (PII) or other sensitive details, thereby mitigating the risks associated with data breaches and compliance violations.

In this article, I will be demonstrating two ways in which you can mask data in your Tableau reports.

Scenario 1: masking data that falls below a certain threshold

Think of a scenario where you need to mask data that falls below a certain threshold. A good case is the example below, which shows the number of respondents by Gender and Age Group for different ratings. To conceal the identity of respondents, you’re required to mask responses wherever the total number of respondents falls below a certain number (let’s say below 35 for this case).

masking data that falls below certain threshold

(In this case, I want to mask the highlighted sections – which have the total number of respondents below 35).

To achieve this in the survey scenario, I’ll first compute the masked values using the following calculation.

calculation for masking data that is below certain number

Note: The above calculation returns the Respondent IDs of respondents that fall into the highlighted sections, where the total count is below 35.

Note: Later I’ll aggregate the above calculation using the ATTR() (Attribute) function, which returns an asterisk (*) whenever a cell contains more than one distinct value. The highlighted sections therefore show asterisks, assuming the number of responses per cell is more than one.
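To make the role of the Attribute aggregation concrete, here is a minimal Python sketch that mimics its behavior: it returns the value when every row in a cell agrees, and an asterisk otherwise. The function name `attr` and the sample respondent IDs are hypothetical, and this is an illustration of the idea rather than Tableau code.

```python
def attr(values):
    """Mimic the Attribute aggregation: the shared value, or "*" if values differ."""
    values = list(values)
    if not values:
        return None  # an empty cell has nothing to show
    first = values[0]
    return first if all(v == first for v in values) else "*"


print(attr(["R017", "R102", "R230"]))  # several respondent IDs in one cell -> *
print(attr(["R017"]))                  # a single respondent ID -> R017
```

The second call shows why the note above assumes more than one response per cell: with a single response, the ID itself would come through instead of an asterisk.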

Next, I’ll compute the unmasked values using the following calculation.

To create the final view:

  • Add the Measure Values to the text/label shelf.

  • Filter the measures so that only Masked and Unmasked remain in the view.

  • Aggregate the Masked measure using the ATTR() (Attribute) function and the Unmasked measure using SUM() to get the final view below.

a sample view showing masked data

Note: In the final view above, sections where the total number of respondents is below 35 are masked with asterisks (*), as intended.
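The logic of this scenario can also be sketched outside Tableau. The Python below is an illustration under stated assumptions, not the calculation from the screenshots: it assumes the threshold applies to the total number of respondents per Gender and Age Group combination (counted across all ratings), and the row layout, the `mask_view` helper, and the sample data are all hypothetical.

```python
from collections import Counter

THRESHOLD = 35  # groups with fewer respondents than this are masked


def mask_view(rows, threshold=THRESHOLD):
    """Return {(gender, age_group, rating): count or "*"}.

    rows: iterable of (respondent_id, gender, age_group, rating) tuples,
    one row per respondent (assumed layout).
    """
    rows = list(rows)
    # Total respondents per (gender, age_group), across all ratings.
    group_totals = Counter((g, a) for _, g, a, _ in rows)
    # Respondents per cell of the crosstab.
    cell_counts = Counter((g, a, r) for _, g, a, r in rows)

    view = {}
    for (g, a, r), n in cell_counts.items():
        if group_totals[(g, a)] < threshold:
            view[(g, a, r)] = "*"  # masked: group total is below the threshold
        else:
            view[(g, a, r)] = n    # unmasked: show the real count
    return view


# Hypothetical data: 40 respondents in one group, 3 in another.
rows = [(i, "Female", "18-24", "Good" if i % 2 else "Poor") for i in range(40)]
rows += [(100 + i, "Male", "65+", "Good") for i in range(3)]

for cell, shown in sorted(mask_view(rows).items()):
    print(cell, shown)
```

Because this sketch writes the asterisk explicitly instead of relying on an aggregation, a cell with a single response is masked as well.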

Scenario 2: masking personal data such as phone numbers or email addresses

Think of a reporting project where you’re required to mask personal data such as phone numbers and email addresses. For example, a phone number such as ‘12345642345’ could be presented as ‘1*********5’ in a visualization, or an email such as ‘ben.kilonzo@gmail.com’ could be presented as ‘b*********o@gmail.com’ in a report that masks sensitive personal data.

To demonstrate this, I will be using the following dummy data generated for learning purposes.

practice data (sample dataset)

To mask the phone details as per the example above, all I need is to write a simple calculation using the MID function as shown below.

a calculation masking phone numbers

(The above calculation simply replaces the 10 digits starting at position 2 with *, hiding the original values.)
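For readers who want to try the idea outside Tableau, here is a minimal Python sketch of the same kind of masking. It follows the ‘1*********5’ example from earlier, keeping the first and last digits, and it derives the number of asterisks from the length of the value instead of using a fixed count. The function name `mask_phone` is hypothetical.

```python
def mask_phone(phone, mask_char="*"):
    """Keep the first and last character and mask everything in between."""
    phone = str(phone)
    if len(phone) <= 2:
        # Too short to keep both ends safely, so mask the whole value.
        return mask_char * len(phone)
    return phone[0] + mask_char * (len(phone) - 2) + phone[-1]


print(mask_phone("12345642345"))  # 1*********5
```

Deriving the mask length from the value keeps the output the same length as the input, which matters if the phone numbers in a dataset are not all equally long.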

Adding this calculation to the view results in the following.

Another calculation can be used to mask the email data, showing the first and last characters of the username while replacing the other characters with ‘*’. A simple calculation to accomplish this is as follows.

a calculation masking email addresses
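The same email masking can be sketched in Python. This is an illustration of the logic under the assumption that everything before the last ‘@’ is the username, not the Tableau calculation itself, and the function name `mask_email` is hypothetical.

```python
def mask_email(email, mask_char="*"):
    """Mask the username, keeping its first and last character and the domain."""
    username, sep, domain = email.rpartition("@")
    if not sep:
        raise ValueError(f"not an email address: {email!r}")
    if len(username) <= 2:
        # Too short to keep both ends safely, so mask the whole username.
        masked = mask_char * len(username)
    else:
        masked = username[0] + mask_char * (len(username) - 2) + username[-1]
    return masked + "@" + domain


print(mask_email("ben.kilonzo@gmail.com"))  # b*********o@gmail.com
```

The domain is left readable here, as in the example above; if the domain itself is sensitive (for instance a small company's own domain), it would need masking too.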

Adding this calculation to the view, we have the following.

Note: With original fields hidden, you can build reports (using the masked fields) that guarantee sensitive personal information isn’t exposed to unintended data consumers.

final view showing masked phone numbers and email addresses

Conclusion

Data masking in reporting is a critical practice: it protects sensitive information and facilitates safe data sharing, while maintaining data usability so that meaningful reports can still be generated.

If this post was helpful and you would like to receive more of these Tableau tips and tricks, kindly subscribe to our mailing list below.

If you like the work we do and would like to work with us, drop us an email on our contacts page and we’ll reach out!

Thank you for reading!

About Me

More About the Author

Bernard K

Analytics Consultant | 3X Tableau Certified

Bernard is a data analytics consultant helping businesses reveal the true power of their data and bring clarity to their reporting dashboards. He loves building things and sharing knowledge on how to build dashboards that drive better outcomes.
