Anonymize relational data Sql Scripts
Sometimes we need to anonymize sensitive data that was used as an id of rows - consider the following tables:
create table Person(
social_security_number int,
name varchar(50)
)
create table transactions(
person_social_security_number int,
month int,
amount decimal
)
you can find a script to create this test data on [github] (gist.github.com/noam-honig/1f83d152b31b8272..)
The social_security_number
is sensitive and needs to be anonymized but it's also used in the transactions table, and the relation needs to be maintained.
- Create a temp table with the original value, and an anonymized
newId
select social_security_number as old_Id, ROW_NUMBER() OVER (order by name) as new_Id into id_map from Person
Update the original table with the generated
new_id
update Person set social_security_number = new_id from id_map where social_security_number = old_id
Update any other table that uses the
social_security_number
columnupdate transactions set person_social_security_number = new_id from id_map where person_social_security_number = old_id
This process will anonymize the social_security_number
while still maintaining the cross table relationships
You can find more relevant SQL scripts at https://gist.github.com/Tyriar/d3635c6b6e32ac406623