Your Old Data is Not a Family Heirloom

Picture a house with a big attic. Every time someone doesn’t know where to put something, they toss it upstairs. Old cables. Mystery boxes. Papers from a job someone left three years ago. A treadmill that became a coat rack.

This human moment of uncertainty happens in organizations the same way, except the attic is a data attic. Instead of the treadmill, we end up with boxes nobody wants to open because everyone is vaguely afraid of what’s inside them. When teams aren’t sure what to keep, where it belongs, or how long it should live, data moves into the attic. But retention fails in the attic because the attic is invisible. The data there is unused for a reason, and it is quickly forgotten. It stays invisible and harmless until it’s the problem, like when an incident happens or when someone asks, “Shouldn’t this have already been deleted?” Suddenly, the answer is simple.

We have ROT (redundant, outdated, or trivial data)! And we’ve had it longer than we intended in more places than we can name, and no one takes responsibility for it. And that’s how “We just never bothered to delete it” turns into “Please hold while we go spelunking through the attic with a flashlight.”

The Practical Reason Data Retention Gets Skipped

Simple. It’s hard to operationalize data retention and data deletion efforts in a way that doesn’t give your workforce ulcers. Data retention spans multiple owners, touches multiple systems, and most of all requires decisions people are nervous about. On top of the nerves are the alarm bells reminding everyone that deleting data is irreversible.

So, we do the understandable thing. We postpone, and our data attics grow.

Deletion Reduces Incident Costs, Blast Radius & Hard Costs

There are a few things at play when you follow your data retention schedule. This might be preaching to the choir for the privacy professionals out there, but reducing ROT isn’t a paper victory or an organizational victory. It’s measured in positive consequences and risk mitigations.

One of my old partners in big law used to say, “You can’t steal what isn’t there.” I think this is a great starting point, but I would expand it because there’s so much more:

You don’t have to pay storage fees for what isn’t there.
You don’t have to pay to protect what isn’t there or monitor the dark web for it.
You don’t have to search what isn’t there during investigations.
You don’t have to produce what isn’t there during legal disputes.
You don’t have to decide which “version” is the right one when that extra shadow copy isn’t there.
You don’t have to notify a long-stale customer or employee because someone took their data when you should have deleted it already.
You don’t have to try to sift through it when your business merges with a buyer, and they decide they don’t want any ROT introduced into their datasets.

Sometimes, as they say (whoever they are), more is just more. It’s not better, just more. Good retention (and importantly deletion) practices streamline your organization on multiple fronts.

The Three Layers of Data Retention (Policy, Schedule, and Deletion)

I like to think of data retention/deletion as three separate layers. If any of the layers is mucked up, it can make the whole thing lumpy. Maybe that’s not eloquent, but hopefully you’ll give me some grace on the analogy. Anyway, the layers are:

Data Retention policy (e.g., governance): Who owns the retention domain, how decisions are made, review schedule, and how exceptions work.
Data Retention schedule (also known as the map or the trigger for the mechanism): What data types exist in which systems, how long they are supposed to be retained, and who owns each record.
Data Deletion reality (the mechanism): The action of actually deleting the data, including some of the stuff that doesn’t always get thought about, such as what “delete” means in each system (e.g., hard delete, soft delete, anonymize), how vendors are expected to handle deletion versus how they actually handle it, and what happens with data in backups.

Most organizations have a policy, often because they are required to show it to a client or partner or investor. Some have a schedule. It’s when we get to the deletion reality that things get dicey. It’s often overwhelming to develop a mechanism that is documented and implemented in a way that engineers, IT, and vendors can execute consistently.
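
If it helps to see the schedule and the deletion reality side by side, here is a minimal sketch of one schedule row written as data rather than prose. Everything in it is an assumption made for illustration: the names ScheduleEntry, DeleteMode, and find_gaps are hypothetical, and the fields are simply the items listed in the layers above (system, data type, retention period, owner, what “delete” means, vendor handling, and backups). Treat it as a starting shape, not a finished tool.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class DeleteMode(Enum):
    """What "delete" actually means in a given system."""
    HARD = "hard delete"
    SOFT = "soft delete"
    ANONYMIZE = "anonymize"


@dataclass
class ScheduleEntry:
    """One row of a retention schedule (hypothetical shape)."""
    system: str
    data_type: str
    retention_days: int
    owner: str                          # who owns this record type
    delete_mode: DeleteMode
    vendor_confirmed: bool              # has the vendor confirmed how it deletes?
    backup_expiry_days: Optional[int]   # None means nobody has checked yet


def find_gaps(entry: ScheduleEntry) -> List[str]:
    """List the deletion-reality questions this row still leaves open."""
    gaps = []
    if not entry.owner.strip():
        gaps.append("no owner")
    if not entry.vendor_confirmed:
        gaps.append("vendor deletion not confirmed")
    if entry.backup_expiry_days is None:
        gaps.append("backup expiry unknown")
    return gaps


# Example row with placeholder values (not a recommended retention period).
example = ScheduleEntry(
    system="support ticketing",
    data_type="closed tickets",
    retention_days=730,
    owner="",
    delete_mode=DeleteMode.SOFT,
    vendor_confirmed=False,
    backup_expiry_days=None,
)
print(find_gaps(example))
```

A row that comes back with an empty list is one that engineers, IT, and vendors have enough detail to execute. A row with gaps is still a policy on paper.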

Practical Implementation Without Overwhelm

If retention (or deletion) feels too big, which is probably the case for most of us, I like to recommend starting with a sensible pilot. Pilot is a fancy way of saying just start somewhere, and sensible is a diplomatic way of saying pick your poison.

Practically speaking, pick 3–5 high-impact systems (e.g., customer CRM, support ticketing, HRIS, analytics warehouse, core product database) that should have been dealt with yesterday. Lucky for you, the next best day to deal with those systems is today. Identify your top 10 data types in each system and establish a default retention rule for everything else.
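
For the engineers in the room, the “specific rules plus a default” idea can be sketched in a few lines. This is a rough illustration under stated assumptions: the names RETENTION_RULES, DEFAULT_RETENTION_DAYS, retention_days, and is_past_retention are hypothetical, and every number is a placeholder, not a suggested retention period. The trigger date is whatever event starts the clock for that data type in your schedule.

```python
from datetime import date, timedelta

# Placeholder numbers for illustration only, not recommended retention periods.
DEFAULT_RETENTION_DAYS = 365

# Specific rules for the data types you have explicitly decided on.
RETENTION_RULES = {
    ("support ticketing", "closed tickets"): 730,
    ("customer CRM", "inactive leads"): 180,
}


def retention_days(system: str, data_type: str) -> int:
    """Return the specific rule if one exists, otherwise the default."""
    return RETENTION_RULES.get((system, data_type), DEFAULT_RETENTION_DAYS)


def is_past_retention(system: str, data_type: str,
                      trigger_date: date, today: date) -> bool:
    """True once the retention period counted from trigger_date has elapsed."""
    limit = trigger_date + timedelta(days=retention_days(system, data_type))
    return today > limit
```

The point of the default is that nothing falls through the cracks: a data type nobody has classified yet still gets an answer instead of living in the attic forever.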

Note: This is not legal advice, and I’m not giving any regulatory retention advice in this article. If you are a highly regulated entity and have statutory retention requirements, your job may be a little harder. Luckily, you can also bring this project to your legal team or outside counsel and get help establishing those default retention rules.

However you start this journey, perfection isn’t the point. You want to start creating a defensible process that you can repeat for all your systems. So, start with your sensible pilot systems and decide how long each data type in those systems needs to stay. Then take the next step and figure out how you plan to delete the old data, both immediately and going forward.

Not every system is equal. Deletion may be harder or easier depending on the configuration, the nature of the data, and so on. You might be able to automatically purge some data and know that it will also disappear from backups 60 days later. You might have to manually delete other types of data from shared drives and similar places.
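
That backup lag is worth writing down, because “purged” and “gone” are not the same date. Here is a small sketch of the arithmetic, assuming a fixed backup window. The function name deletion_status is hypothetical, and the 60-day default only mirrors the example above. Your own backup window may be different.

```python
from datetime import date, timedelta


def deletion_status(purge_date: date, as_of: date,
                    backup_window_days: int = 60) -> str:
    """Describe where data stands on a given day, assuming backups
    age out a fixed number of days after the live purge."""
    backups_clear = purge_date + timedelta(days=backup_window_days)
    if as_of < purge_date:
        return "still in live system"
    if as_of < backups_clear:
        return "purged from live system, still in backups"
    return "gone from live system and backups"
```

When someone asks “Shouldn’t this have already been deleted?”, this is the difference between answering “yes, and it was” and “yes, but give the backups a few more weeks.”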

One recommendation I make is to have a quarterly deletion day inside your company, where the entire workforce is prepped, gets a refresher on what needs to happen, and then spends the day going through systems and purging their own data and the data within their sphere of ownership.

Fast forward 6 months. You’ve gotten your pilot systems under control, and you’ve moved on to less critical systems. You feel like you’re finally getting your arms around your data attic. How do you maintain a clean house? Well, that’s where PDL’s practical privacy tool for the day comes into play.

Become a paid subscriber to get access to all of the mini tools that we publish with each post. For instance, this post includes a Data Retention Trigger Map that you can deploy this week!

Finally, a reminder that the opinions expressed in this article are those of The Privacy Design Lab. They are not legal advice, and no attorney-client relationship is formed by reading this article or downloading the Data Retention Trigger Map. If you need to consult legal counsel, you can book a consult with ARLA Strategies or other legal counsel you trust!

If you’re tired of privacy advice that only works in theory, you’re in the right place.

The Privacy Design Lab exists for people who want to practice privacy, not just talk about it. It focuses on practical, repeatable ways teams actually learn. We offer hands-on workshops, downloadable systems, and the Privacy Design Lab Studio community where teams and practitioners can go deeper. Paid newsletter subscribers get access to all of our micro tools.

If that sounds like your kind of work, we’d love to have you.
