Why current tech stacks will fail DPDP (And what needs to change)

Organizations must decide how they want to handle data going forward—and that decision touches every corner of the business.

Last Updated: Feb 05, 2026, 16:59 IST | 6 min read
The new DPDP law is explicit that data fiduciaries must provably delete personal data when the purpose is fulfilled or when a data principal requests erasure. Photo by Shutterstock

For more than a decade, businesses have invested heavily in building systems that capture everything, store everything, and remember everything. Whether you run a bank, a marketplace, or a consumer app, the philosophy has been the same: more data means more insight, personalization, and competitive advantage. Teams were encouraged to collect broadly and store indefinitely because the trade-off seemed obvious: the more you knew about your users, the better you could serve them.


Then the DPDP Act arrives and essentially says: “All that data you’ve been accumulating? You now need to erase it completely, provably, and on demand.” What looked like a strategic asset suddenly becomes a compliance liability. This is a fundamental mismatch between how modern enterprises have been engineered and what modern regulation now requires. The “Right to Erasure” may read like a simple obligation, but operationally it is anything but. And unless business leaders grasp why, they will underestimate the true cost and complexity of implementation.

The Cloud Was Built for Immortality, Not Amnesia

Executives believe they can comply with deletion requirements because their teams assure them: “We can delete records from the database.” But what no one says out loud is that the primary user table is only the first stop; everything after that is guesswork. Over time, data flows into analytics dashboards, customer support tools, third-party SaaS vendors, and AI training datasets. It also lives in the shadow copies that different microservices created in their own databases for performance or ownership reasons.

The new DPDP law is explicit that data fiduciaries must provably delete personal data when the purpose is fulfilled or when a data principal requests erasure. Failure to do so can trigger some of the Act’s highest penalties, including fines up to ₹250 crore for non-compliance with data deletion and retention obligations.

If this “Ghost Data” lingers anywhere in the stack, the organization is automatically out of compliance, even if the primary database was scrubbed. And it often remains hidden in places teams rarely think to check: Snowflake or BigQuery partitions, even Slack messages shared between developers. Few organizations can reliably map all this out, and it’s never for lack of trying.

The Futility of “Search and Destroy”

In the face of regulation, enterprises may attempt a seemingly rational response: “Let’s find every place personal data exists and wipe it.” In practice, it quickly collapses under its own weight. You can’t delete what you can’t find, and you can’t find what you didn’t know was copied. With every new tool your teams adopt, and every new AI workflow you introduce, the number of places your data travels multiplies. And the faster an organization scales, the more unmanageable this approach becomes. At some point, “search and destroy” stops being a strategy and starts looking like a perpetual game of regulatory whack-a-mole.

The Real Fix Requires an Inversion

If you can’t confidently erase data everywhere it has traveled, the only sustainable solution is to prevent it from traveling in the first place. Modern privacy requires an inversion of the traditional data flow: instead of distributing raw personal identifiers across dozens of systems, you replace them with tokens, format- and function-preserving references that reveal nothing about the underlying identity.

In this model, analytics tools, support platforms, AI pipelines, and internal services all operate without holding any real personal data. Tokenization by itself isn’t enough though; what makes this durable is pairing it with encryption and centralized key control, so that identity resolution happens only in one place: a data privacy vault. The enterprise shifts from a copy-based architecture to a reference-based one, reducing exposure and simplifying compliance.
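The reference-based pattern described above can be sketched in a few lines. This is a deliberately minimal, in-memory illustration, not a real vault product's API: the class name `PrivacyVault` and its `tokenize`/`detokenize` methods are illustrative assumptions. A production vault would add encryption at rest, centralized key management, access policies, and audit logging; the point here is only the contract — downstream systems hold tokens, and identity resolution happens in exactly one place.

```python
import secrets

class PrivacyVault:
    """Minimal in-memory sketch of a data privacy vault.

    Downstream systems never see the raw value; they store and
    pass around opaque tokens, and only the vault can resolve
    a token back to the original identifier.
    """
    def __init__(self):
        # token -> real value, held only inside the vault
        self._store = {}

    def tokenize(self, value: str) -> str:
        # Hand out an opaque reference; the raw value never leaves.
        token = "tok_" + secrets.token_hex(8)
        self._store[token] = value
        return token

    def detokenize(self, token: str) -> str:
        # Identity resolution happens only here, behind the
        # vault's access controls.
        return self._store[token]

vault = PrivacyVault()
token = vault.tokenize("+91-98765-43210")

# Analytics, support tools, and AI pipelines operate on `token`,
# which carries no personal data; only the vault can reverse it.
assert vault.detokenize(token) == "+91-98765-43210"
```

In practice the token would also preserve the format of the original (so a tokenized phone number still validates as a phone number), which is what lets existing systems consume it without modification.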

A Simple Analogy for a Complex Problem

Imagine every piece of personal data (phone numbers, emails, Aadhaar IDs) as an entry in a diary. In the old model, every team, tool, and vendor needed a line or two from that diary, so they each got their own photocopy in plain text. Over time those copies multiply, and even if you tear out a page from the original diary, dozens of photocopies may still be lying around the organization.

Tokenization flips that entirely. Instead of giving everyone a copy of the diary, you give them a reference tag: a token that behaves like the original entry for their workflows, but doesn’t contain a single word of the actual text, encrypted or otherwise. The real diary stays in one place (the aforementioned data privacy vault), encrypted and tightly access-controlled. Teams and systems get what they need to operate, but no one outside the vault ever holds the real writing.


Importantly for AI systems, this protects downstream models as well. Training an LLM or a RAG pipeline on raw PII is a permanent mistake: once a model internalizes someone’s name or identifier, you can’t meaningfully “delete” it. With a reference-based architecture, the model only ever sees tokens. If a user invokes their Right to Erasure, you don’t chase every photocopy; you simply destroy the relevant entries within the vault. The AI’s inputs become instantly privacy-safe, and every outstanding token becomes functionally useless, no matter how many systems still hold one.
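The erase-at-the-vault mechanism is often implemented as "crypto-shredding": each data principal's records are encrypted under a per-user key held only in the vault, and honoring an erasure request means destroying that key rather than hunting down every copy of the ciphertext. The sketch below assumes a toy SHA-256-based stream cipher purely for illustration (a real system would use a vetted AEAD such as AES-GCM); the `user_keys` dictionary stands in for the vault's key store.

```python
import os
from hashlib import sha256

def keystream(key: bytes, n: int) -> bytes:
    # Toy keystream for illustration only; use a vetted AEAD
    # (e.g. AES-GCM) in any real system.
    out = b""
    counter = 0
    while len(out) < n:
        out += sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]

def encrypt(key: bytes, data: bytes) -> bytes:
    # XOR against the keystream; applying it twice round-trips.
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

decrypt = encrypt  # an XOR stream cipher is its own inverse

# Per-user key material lives only inside the vault.
user_keys = {"user42": os.urandom(32)}

pii = b"name=Asha;phone=+91-98765-43210"
record = encrypt(user_keys["user42"], pii)

# While the key exists, the vault can resolve the record.
assert decrypt(user_keys["user42"], record) == pii

# Erasure request: crypto-shred by destroying the key. Stray
# copies of `record` may survive anywhere in the stack, but
# without the key they are functionally useless.
del user_keys["user42"]
```

The design choice matters: deletion becomes a single, auditable operation at the vault instead of a best-effort sweep across every downstream system.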

It’s Not About Deleting Better, It’s About Architecting Better

There’s a temptation to treat any new compliance law as a checklist: update consent flows, revise retention policies, add dashboards, train teams. Those steps are important, but they don’t address the core issue.

True compliance requires architectural change. Organizations must decide how they want to handle data going forward—and that decision touches every corner of the business. The ability to personalize and build AI responsibly depends on controlling data lifecycles. Operational costs rise or fall based on how much data sprawl exists. Vendor ecosystems must be evaluated through a privacy-first lens. And risk posture shifts dramatically when deletion becomes provable rather than “we tried our best.”

And importantly, leaders must recognize that privacy-first architecture isn’t a slowdown—it’s how you future-proof the business. As AI becomes more deeply embedded in every workflow, the companies that can trust the integrity, lineage, and legality of their data will be the only ones able to scale confidently. Clean, well-governed data is great for compliance, but it’s also the raw material every AI-driven enterprise depends on. Those who treat it that way will pull ahead.

What Now?

DPDP arrives at a moment when most companies don’t even have a current, honest picture of how their data estate functions. And that’s the real starting point. Not a compliance checklist, but a recognition that modern digital systems have grown too sprawling to manage on instinct. Once organizations take stock of where data actually moves, the next question becomes unavoidable: Is this flow even defensible anymore?


That’s where architecture does the heavy lifting. The only way to shrink exposure, simplify governance, and make deletion real rather than aspirational is to redesign how identity travels inside the business. Centralizing sensitive data in a vault, pushing tokens outward, and allowing every downstream system, including AI, to operate on reversible references creates an environment where change is possible without incurring permanent risk. It’s not just about solving for today’s deletion requests; it’s about ensuring tomorrow’s models, workflows, and automations don’t inherit yesterday’s mistakes.

Privacy is finally acting as a forcing function for engineering discipline. The shift isn’t cosmetic. It requires rethinking the foundation so the business can move faster, scale AI safely, and evolve without dragging a decade of accumulated data debt behind it. The systems we built were designed to remember. The systems we need now are the ones that can let go.


Roshmik Saha is Co-founder & CTO of Skyflow.

First Published: Feb 05, 2026, 17:10
