I spoke with Patrick Barch, Sr. Director of Product Management at Capital One Software, about best practices for cloud data cost optimization; he also highlighted the benefits of Capital One Software’s data management solution, Slingshot.
See below for an edited transcript of our conversation.
Why is cloud data cost optimization so crucial for companies these days?
Well, the cloud is really a story of more: more power, more flexibility, more speed of adoption; it puts more data into the hands of more people to make better business decisions.
But if you lean too heavily into that story of more, you risk spending way more than you intended, way faster than you thought. And so by putting the right governance, the right controls, the right practices in place, as you scale up your usage of the cloud, you can really achieve all of the benefits – the promise of the cloud. And do it in a way where you’re wringing every last ounce of efficiency out of every last dollar of spend.
What are some key best practices for cloud data cost optimization?
The cloud has really introduced a new model of paying for data. In the past, your data spend was a capital expense. You spent some time at each budget cycle deciding how much data capacity you needed for the following year, and that was what you bought.
Today’s model is more pay-as-you-go. And so your bill could be a surprise at the end of the month. So the first thing that you really have to do is think differently about how you allocate your spend and the processes by which you engage your business teams.
At Capital One we call this a federated approach. We’ve engaged our teams, and what we do is we say, “Hey, business team, we don’t want our shared services group, we don’t want our central IT team to be a bottleneck, to stand in the way of you doing your job. The promise of the cloud is to do more faster. But we need you to adhere to some best practices.”
And so we built some centralized tools in a centralized platform with some centralized governance applied that really enable our teams to move at their own pace while ensuring that they’re doing it in a cost-controlled, effective, governed, well-managed way. So that’s number one: engage your teams.
Second, while you no longer have to manage the physical servers in a data center, you do have to right-size and optimize your infrastructure. There are very few reasons why a dev or a QA environment needs a Large or a 2X-Large compute. Testing probably doesn’t require that kind of horsepower.
Make sure that you’re spinning up the right infrastructure for the job to be done at the moment. And this is where some of the policies that we’ve created internally come into play. We build all of those rules into our central platform so that our teams don’t have to keep all of that straight. But piece of advice number two is: make sure that you’re provisioning the right compute at the right time and at the right size.
And your teams are always going to bias toward getting their jobs done faster. Their first priority is not creating a well-managed data environment. Their first priority is driving more value for your business, serving their customers. And so it’s on you as a shared services team or a central IT group to enable them to do that at their own pace, but do it in a well-managed way.
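Editor's note: the idea of building right-sizing rules into a central platform can be illustrated with a minimal sketch. It is not Capital One's or Slingshot's implementation. The size names follow Snowflake's warehouse sizes (the list is truncated at 2X-Large), while the per-environment ceilings and the function names are hypothetical, chosen only to show the shape of such a policy.

```python
# Warehouse sizes in ascending order (truncated list for illustration).
SIZES = ["X-Small", "Small", "Medium", "Large", "X-Large", "2X-Large"]

# Hypothetical policy: the largest size each environment may provision.
MAX_SIZE_BY_ENV = {
    "dev": "Small",
    "qa": "Medium",
    "prod": "2X-Large",
}


def is_allowed(env: str, requested_size: str) -> bool:
    """Return True if the requested size is at or below the ceiling for env."""
    ceiling = MAX_SIZE_BY_ENV[env]
    return SIZES.index(requested_size) <= SIZES.index(ceiling)


def right_size(env: str, requested_size: str) -> str:
    """Return the requested size, clamped to the environment's ceiling."""
    if is_allowed(env, requested_size):
        return requested_size
    return MAX_SIZE_BY_ENV[env]
```

With this hypothetical policy, a QA team that requests a 2X-Large warehouse would be given a Medium instead, without a meeting with central IT.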
The third thing is keep a close eye on your queries and your workloads that are running on your infrastructure. In the old days, when you had a fixed set of compute capacity on-prem, the consequences of a poorly written or bad query were really just taking up so much of that fixed capacity that other people also using the system noticed performance degradation.
In the cloud, that poorly written query can keep a warehouse up, it can keep a cluster up, it can make a cluster scale in ways that you don’t expect it to. That incurs more cost. And so now for the first time, bad queries, bad workloads mean unoptimized spend. And so look for ways to identify problem queries, identify potentially problematic users, and look for ways to optimize how they’re using the tools.
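Editor's note: as a rough illustration of identifying problem queries and the users behind them, the sketch below filters a query log by runtime and counts offenders per user. The record fields, the threshold, and the function names are assumptions made for the example. A real setup would read from the warehouse's own query history and would likely weigh cost as well as runtime.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class QueryRecord:
    # Hypothetical fields; a real query log would carry many more.
    query_id: str
    user: str
    runtime_seconds: float


def find_problem_queries(records, max_runtime_seconds):
    """Return the records whose runtime exceeds the given threshold."""
    return [r for r in records if r.runtime_seconds > max_runtime_seconds]


def users_by_problem_count(problem_queries):
    """Return (user, count) pairs, most frequent offenders first."""
    return Counter(r.user for r in problem_queries).most_common()


# Hypothetical sample log.
LOG = [
    QueryRecord("q1", "alice", 12.0),
    QueryRecord("q2", "bob", 4000.0),
    QueryRecord("q3", "bob", 5200.0),
    QueryRecord("q4", "carol", 3700.0),
]

slow = find_problem_queries(LOG, max_runtime_seconds=3600)
offenders = users_by_problem_count(slow)
```

Here `slow` holds the three queries that ran longer than an hour, and `offenders` lists bob first because two of them are his.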
Let’s talk about Slingshot — the enterprise software created by Capital One Software. How does Slingshot help companies?
Capital One uses Slingshot to do all of the things that I just described. We’ve been using the tool internally for a number of years, and it really helps companies put in place the right kind of governance from the start so that you can create a well-managed Snowflake environment down the line.
And so we enable you to, through a single pane of glass, manage all of your Snowflake accounts – regardless of what region they’re in, what account they’re a part of, what business unit they’re under. You can apply a common set of standards to all of those accounts, and enable the users of those accounts to operate within a shared framework of governance.
Internally, we’ve seen a reduction in our overall Snowflake spend. We’ve seen improved efficiency as measured by cost per query. And we’ve really seen a reduction in the amount of manual effort that our shared services teams spend in one-off meetings with our business groups, provisioning infrastructure, answering questions, that normal back and forth that goes into provisioning new platforms.
The key is really making sure that if you’re spending more, you’re driving more business value, you’re not spending more because there’s wastage in the system. And so really rather than looking at this as a cost reduction play, you should be thinking about it as: how do you wring every last ounce of efficiency out of every dollar of spend?
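Editor's note: cost per query, the efficiency measure mentioned above, is total spend divided by the number of queries run. The sketch below uses made-up figures (not Capital One's numbers) to show how total spend can rise while cost per query falls.

```python
def cost_per_query(total_spend_usd: float, query_count: int) -> float:
    """Total spend divided by the number of queries run in the period."""
    if query_count <= 0:
        raise ValueError("query_count must be positive")
    return total_spend_usd / query_count


def percent_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100


# Hypothetical months: spend rises, but query volume rises faster.
before = cost_per_query(1000.0, 4000)
after = cost_per_query(1200.0, 6000)
change = percent_change(before, after)
```

In this made-up case spend grows by 20 percent while cost per query drops by roughly 20 percent, which is the "spending more and driving more value" situation described above.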
News: What’s happening with Slingshot currently?
So this week we’re super excited to announce all of the new functionality that’s being released as part of the product. We spent the last year really working with our customer base, talking to prospects, getting feedback, learning how we can add more value. And that’s really taking shape in three major themes.
One is that the product is way easier to use and has way more customization and flexibility. Our dashboards are easier to navigate and drill into. The insights where we surface potentially problematic inefficiencies in your ecosystem are bubbling up to the top of the experience.
You can now allocate spend for all Snowflake cost drivers by whatever custom entity makes sense for your business, whether it’s line of business, whether it’s project, whether you want to charge back to your own customers. So just the ease of use, the flexibility – we’ve made a big push there.
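Editor's note: allocating spend by a custom entity amounts to grouping cost line items by a tag. The sketch below is a generic illustration, not Slingshot's data model. The `tags` and `cost_usd` fields, the tag keys, and the "unallocated" bucket are all assumptions.

```python
from collections import defaultdict


def allocate_spend(line_items, entity_key):
    """Sum cost per value of the chosen tag; untagged items go to 'unallocated'."""
    totals = defaultdict(float)
    for item in line_items:
        entity = item["tags"].get(entity_key, "unallocated")
        totals[entity] += item["cost_usd"]
    return dict(totals)


# Hypothetical line items for one billing period.
ITEMS = [
    {"cost_usd": 120.0, "tags": {"line_of_business": "cards", "project": "fraud"}},
    {"cost_usd": 30.0, "tags": {"line_of_business": "cards", "project": "rewards"}},
    {"cost_usd": 50.0, "tags": {"line_of_business": "auto"}},
]

by_lob = allocate_spend(ITEMS, "line_of_business")
by_project = allocate_spend(ITEMS, "project")
```

The same line items roll up differently depending on the key: by line of business everything is allocated, while by project the untagged item lands in the "unallocated" bucket, which is itself a useful signal that tagging is incomplete.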
The second big announcement we’re making this week is around increasing the breadth of our recommendation engine. Up till this point, our recommendations have really focused on helping you reduce spend and save money.
But there are times when you have a really important business process or a really important workload, and you want to spend a little bit more to make sure that things finish faster or within an SLA. And so rather than just telling you when you can save money, we’re also highlighting opportunities where you might want to spend a little bit more to give your users increased performance. And then we’re giving you all the information that you need to make the right decision for your business.
We’ve made it really easy to find and apply the highest value recommendations, whether that’s in terms of reducing cost or boosting performance. And so we’re excited about the additional value we can add there.
And then third, as I mentioned before, you really want to keep an eye on your queries. And so we’re launching the first version of what we call our query advisor. It’s really cool technology backed by a couple of patents. And what it enables you to do is take some of your more inefficient problematic queries, run them through some technology, and get recommendations for how those queries could be improved.
And so we have a long roadmap, we have a lot of runway left to go with how we optimize workloads. We’re excited to get the first version of this tool into customers’ hands, start getting some feedback, and really start driving some value.
What is it that really differentiates Slingshot in the market?
This is the only product that I know of that enables companies to implement that federated model of data ecosystem management that I mentioned before. We built it that way because we had some really aggressive deadlines for when we needed to move to the cloud. And so if you’re looking to operate in a way that the cloud demands, you need Slingshot.
Second, the breadth of our recommendation engine is something that I am particularly proud of. Not only do we look at where you can save money, but we’re getting smarter about where you may have an important process, where it might make sense for your business to actually spend more, because efficiency is all about running at the right performance at the right cost.
And so we’re the only company, at least that I know of at the moment, that’s looking to help you strike that perfect balance and wring all of the efficiency out of your spend.
And then third, our query advisor tech is pretty cool. It’s generated a ton of savings internally. We’ve reduced our cost per query by about 43%.
There’s a core challenge that companies face with cloud data cost optimization: maximizing efficiency. And there’s a number of strategies for that. How do you see this issue?
One thing that we’ve learned from talking to lots of different customers and prospects is the word efficiency can have different meanings. So for some companies, efficiency just means running at the lowest possible cost at all times — user experience, forget about it.
For some companies, efficiency means make sure my really important jobs with really important SLAs finish by, let’s just say nine o’clock in the morning, come hell or high water. I don’t care how much I have to spend, these jobs have to finish on time because I get charged for some reason if they don’t. And I would say that maybe 40% of cases fall into those two groups.
The bulk is really about striking that perfect balance between user experience and cost optimization: running with the best performance possible at the lowest possible cost. And that’s a tricky one, because what does that mean? I can’t answer that question for you. A product can help you get to that answer, but really it’s about how you can leverage our tool to strike the right balance for your company, because that situation is unique to you.