If you’re reading this, no doubt you take data security seriously. Your company might even have an industrial-strength firewall, rock-solid data governance rules, a professional security team, and a roster of data protections like encryption. You feel good about it. You sleep well at night.
But should you? Sorry to ruin your sleep here, but the truth is that most companies overlook three major data protection risks that can be the difference between kinda-sorta secure (maybe!) and virtually airtight.
Those three security challenges are untrusted computation programs, unvalidated input, and unauthorized data movement. Let’s break them down.
1) Untrusted Computation Programs
So, you’ve given people, teams, and companies access to your first-party data. Can somebody just come in and start processing it?
How do you know, whether they’re internal or external, that they’re using the program and algorithm they said they would, and not something riskier? If the code isn’t registered, and you have no proof that the running copy is the same one, you have no idea what’s really happening. It’s purely trust, and we live in a zero-trust world when it comes to data.
Every network, every device, every person, and every service must be treated as untrusted until it proves itself. What you need are zero-trust technologies built for a zero-trust world.
What are these? How do you know that somebody’s computer program, algorithm, or analytic routine is safe? Zero-trust technologies can validate, with high confidence, that the code actually executing matches the agreed-upon code in real time, for example by comparing cryptographic “fingerprints” of each. (Failure to do this compromised the computers in the recent Ukraine cyberattack, by the way.) Otherwise, you can’t look your compliance team or your legal team in the eye and say, “Yes, I know — authoritatively! — that our data is always being used the right way.”
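To make the “fingerprint” idea concrete, here’s a minimal sketch in Python of one common approach, a cryptographic hash, checked against a pre-registered value before any code runs. The registry, program name, and (truncated) hash below are hypothetical, purely for illustration:

```python
import hashlib

def fingerprint(path: str) -> str:
    """Compute a SHA-256 fingerprint of a program file."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large binaries don't exhaust memory.
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Hypothetical registry of code that was agreed upon in advance.
APPROVED_PROGRAMS = {
    "monthly_report.py": "9f2c1e...",  # registered fingerprint (truncated here)
}

def verify_before_run(path: str, name: str) -> None:
    """Refuse to run anything that doesn't match its registered fingerprint."""
    expected = APPROVED_PROGRAMS.get(name)
    if expected is None or fingerprint(path) != expected:
        raise PermissionError(f"{name} does not match its registered fingerprint")
```

A production system would re-verify continuously, and attest the runtime itself, rather than checking only at launch, but the comparison step is the same.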
2) Unvalidated Input
When you’re dealing with first-party data (or any sensitive data), you need absolute certainty that only the data you permit to be accessed is accessed.
There are tools for that. For example, access-control-based tools, whether on-premises or in cloud-based warehouses like Snowflake, can help. But in many cases, even with the best legacy tools, you’re granting access to an entire dataset, not just to the specific data that’s needed. The innovation really required for safety in a big-data world is the ability to constrain access to just certain data within a dataset, such as specific columns or rows.
You need a validation technique that is absolutely reliable, not just initially, but all the way through the processing and use of the data — so you can understand not just how someone got it, but also what happened to it.
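Here’s a minimal sketch of what column-level constraint can look like in practice, in Python with pandas. The policy table and consumer names are hypothetical, purely for illustration:

```python
import pandas as pd

# Hypothetical policy: each consumer is permitted specific columns only.
COLUMN_POLICY = {
    "marketing_team": {"region", "purchase_total", "signup_date"},
}

def constrained_view(df: pd.DataFrame, consumer: str) -> pd.DataFrame:
    """Return only the columns this consumer is authorized to access."""
    allowed = COLUMN_POLICY.get(consumer, set())  # fail closed: unknown consumer gets nothing
    denied = set(df.columns) - allowed
    if denied:
        # Strip unauthorized columns instead of exposing the whole dataset.
        df = df.drop(columns=sorted(denied))
    return df
```

The point is the failure mode: an unrecognized consumer or an unlisted column yields less data, never more.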
For example, what if one of your data scientists wants to try cross-indexing or sorting on another column? That might seem fine in a world where everyone has good intentions, but what if that data scientist isn’t quite so benign? What if super-sensitive data such as Social Security numbers, credit card numbers, race, gender, household income, and sexual orientation is suddenly used, deliberately or inadvertently, for more nefarious purposes?
These things really happen, and in a protection-first world it’s critical that you not just reduce the likelihood of such misuse but eliminate the possibility of it ever happening.
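One way to eliminate the possibility, rather than merely shrink it, is to make sure sensitive values never reach the analyst at all. A minimal sketch, with hypothetical column names, that drops protected attributes and pseudonymizes identifiers before any analysis runs:

```python
import hashlib
import pandas as pd

# Hypothetical column names for illustration.
SENSITIVE_DROP = ["ssn", "credit_card", "race", "sexual_orientation"]
PSEUDONYMIZE = ["customer_id"]

def sanitize(df: pd.DataFrame) -> pd.DataFrame:
    """Remove protected attributes; replace identifiers with stable hashes."""
    df = df.drop(columns=[c for c in SENSITIVE_DROP if c in df.columns])
    for col in PSEUDONYMIZE:
        if col in df.columns:
            # A keyed hash (HMAC) would be stronger; plain SHA-256 shown for brevity.
            df[col] = df[col].astype(str).map(
                lambda v: hashlib.sha256(v.encode()).hexdigest()[:16]
            )
    return df
```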
3) Unauthorized Data Movement or Mining
Data scientists get creative with data sets all the time; it is part of the job description. But without the right controls in place, things get messy.
It’s like someone in financial services saying, “I’m just going to move this money to Dave’s account for a few days because I did something wrong and need to cover it. I promise I’ll move it back to Diana’s account later. She’ll never know.” That never works, and in today’s regulatory environment it creates opportunities for people to go to jail and for companies to get in hot water on CNN. And that fast and loose behavior happens all the time in data science when there are no controls in place.
What you need is the ability to control access even for informal calls and queries against the data. It’s an extra step for the data analysts, but it’s also the protection you need, and it brings more discipline, arguably a better process, and more accountability. If you run a company, whether in ad tech or another sector, you need to address this issue definitively, consistently, and forever.
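In practice, that extra step can be a checkpoint every ad-hoc query passes through before it executes. Here’s a deliberately naive sketch (a real gatekeeper would parse the SQL properly; the user, table names, and policy below are made up):

```python
# Hypothetical catalog and per-user policy, for illustration only.
KNOWN_TABLES = {"transactions_summary", "regions", "customers_raw"}
ALLOWED_TABLES = {"analyst_jane": {"transactions_summary", "regions"}}

def run_query(user: str, sql: str, execute):
    """Gate an ad-hoc query: check it against policy, log it, then run it."""
    tokens = {t.strip(",;()") for t in sql.lower().split()}
    referenced = tokens & KNOWN_TABLES  # naive table extraction
    unauthorized = referenced - ALLOWED_TABLES.get(user, set())
    if unauthorized:
        raise PermissionError(f"{user} may not query: {sorted(unauthorized)}")
    print(f"AUDIT: {user} ran: {sql}")  # every informal query leaves a record
    return execute(sql)  # 'execute' is whatever runs SQL in your warehouse
```

Even this toy version changes behavior: the fast-and-loose query either gets blocked or gets logged, and either outcome is better than silence.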
A Virtual Magic Bullet
The good news is that technology can solve all three of these problems. There are now ways to defuse these security risks without disrupting your business, while still letting your data uses grow. New technologies like machine learning can fingerprint computation programs. Trust-enabling secure enclaves, also called “confidential computing,” can keep data processing safe.
And new control and surveillance technologies like digital contracts can combine the best of legal contracts with technical parameters, including the “fingerprint” of an algorithm, making input validation and data movement far more secure: they surveil processing 24×7, report anomalies, and stop improper data use in real time.
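This article doesn’t describe any vendor’s actual implementation, but the general shape of a digital contract might look like the sketch below: one object binding the approved algorithm fingerprint, the permitted inputs, and the permitted destinations, with an enforcement check that stops processing the moment any term is violated. All names are illustrative:

```python
from dataclasses import dataclass

@dataclass
class DigitalContract:
    """Illustrative sketch: technical terms extracted from a legal agreement."""
    approved_fingerprint: str       # registered hash of the agreed-upon algorithm
    allowed_columns: set            # input validation: what may be read
    allowed_destinations: set       # data movement: where results may go

    def enforce(self, fingerprint: str, columns: set, destination: str) -> None:
        """Check all three risks at once; halt processing on any violation."""
        violations = []
        if fingerprint != self.approved_fingerprint:
            violations.append("unregistered code")
        if not columns <= self.allowed_columns:
            violations.append("unvalidated input")
        if destination not in self.allowed_destinations:
            violations.append("unauthorized data movement")
        if violations:
            # Report the anomaly and stop data use in real time.
            raise RuntimeError(f"contract violation: {', '.join(violations)}")
```

Notice that this one check covers all three risks from above: untrusted code, unvalidated input, and unauthorized movement.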
Add the capability to produce audit-quality data, and you can finally close these data risks and prove it to your lawyers and compliance overseers. Better yet, all of these technologies eliminate the risks that come with putting your highly curated, highly secure data to work.
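“Audit quality” implies records that can be shown not to have been altered after the fact. One common technique is a hash chain, where each entry commits to the previous one, so any tampering breaks the chain. A minimal sketch:

```python
import hashlib
import json
import time

def append_audit_record(log: list, event: dict) -> None:
    """Append a tamper-evident record: each entry hashes the previous one."""
    prev_hash = log[-1]["hash"] if log else "genesis"
    record = {"ts": time.time(), "event": event, "prev": prev_hash}
    record["hash"] = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()
    ).hexdigest()
    log.append(record)

def verify_chain(log: list) -> bool:
    """Recompute every hash; any edit to history makes verification fail."""
    prev = "genesis"
    for rec in log:
        body = {k: rec[k] for k in ("ts", "event", "prev")}
        expected = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        if rec["prev"] != prev or rec["hash"] != expected:
            return False
        prev = rec["hash"]
    return True
```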
And then you can really sleep well at night.
About the Author:
David Blaszkowsky is Head of Strategy and Development at Helios Data, and a former senior U.S. financial regulatory official and bank data executive.