50 percent of all corporate data is stored in the cloud, according to Statista.
That's a lot of data in the cloud, given how much data is collected and produced daily. Most of this data is stored in Amazon S3 buckets, Google Cloud Storage, Azure Blob, and a host of different storage options available on cloud platforms.
The question, then, is how do we secure the data stored in these storage options, and in particular, how do we secure the data stored in Amazon S3 buckets?
Vinoo, who oversees critical data pipelines as Head of Business Engineering at Citadel Investment Group, shares some advice on securing your S3 buckets and also talks about how IAM audits and the culture surrounding IAM audits should change.
Tune in to this episode of Ask A CISO to hear:
- What S3 Buckets are
- What are the various ways to provide permissions to S3 Buckets and the vulnerabilities associated with each
- What could have happened in a real-life case study of data leaked from S3 Buckets
- What IAM audits are and how they should be conducted
- What needs to be changed about how IAM audits are conducted and the culture surrounding IAM audits
About The Guest: Vinoo Ganesh
Vinoo is a Distributed Systems Enthusiast, Speaker, and Technologist who is currently the Head of Business Engineering at Citadel Investment Group. In this capacity, Vinoo oversees critical data pipelines as well as investment platforms.
Previously, Vinoo was CTO of a geospatial intelligence data-as-a-service startup where he built systems that processed over 2 TB of geospatial data on a daily basis and enabled over 80 customers to operationalize this data.
Prior to that, Vinoo led the Compute team at Palantir Technologies (managing Spark and its interaction with HDFS, S3, Parquet, YARN, and K8s) and worked across a number of commercial and government verticals.
Vinoo is also an experienced startup advisor who has advised Databand.ai’s development of tools to solve data observability problems across the stack, and Horangi’s development of their best-in-class cybersecurity product, Warden.
About The Host: Paul Hadjy
Paul Hadjy is co-founder and CEO of Horangi Cyber Security.
Paul leads a team of cybersecurity specialists who create software to solve challenging cybersecurity problems. Horangi brings world-class solutions to provide clients in the Asian market with the right, actionable data to make critical cybersecurity decisions.
Prior to Horangi, Paul worked at Palantir Technologies, where he was instrumental in expanding Palantir’s footprint in the Asia Pacific.
He worked across Singapore, Korea, and New Zealand to build Palantir's business in both the commercial and government space and grow its regional teams.
He has over a decade of experience and expertise in Anti-Money Laundering, Insider Threats, Cyber Security, Government, and Commercial Banking.
Welcome, everyone. I just want to introduce Vinoo who's here with us today. Vinoo is a distributed systems enthusiast, speaker, and technologist who is currently the Head of Business Engineering at Citadel Investment Group. In this capacity, Vinoo oversees critical data pipelines as well as investment platforms.
Previously, Vinoo was CTO of a geospatial intelligence Data-as-a-Service startup where he built systems that processed over two terabytes of geospatial data on a daily basis to enable over 80 customers to operationalize this data. Prior to that, Vinoo led the compute team at Palantir Technologies, where he managed Spark and its interaction with HDFS, S3, Parquet, YARN, and K8s, and worked across a number of commercial and government verticals.
Vinoo is also an experienced startup advisor who has advised Databand.ai's development of tools to solve data observability problems across the stack and Horangi's development of their best-in-class cybersecurity product Warden.
Welcome, Vinoo, and good morning!
Hey, Paul. Happy to be here.
Yeah, thanks for joining us from New York, so we're 13 hours away and, I think, almost exactly on the opposite side of the globe.
That's crazy! Actually, yeah, I just realized that.
How's your morning going? How are things there?
Things are going well. You know New York's in a much better place post some of the big waves of COVID. Weather's been a bit temperamental but you know it's New York.
City's generally looking up so it's exciting to have a good reopening period.
Yeah, definitely, I imagine Singapore is going through the same so it's good to be getting close to reopening.
Yes, so maybe a first question: kind of just tell us a bit more about your career and kind of how you came to do what you do now and maybe even a bit of what you enjoy doing outside of work?
So I can start off by saying that I became obsessed with data at a young age but I mean the reality is I kind of ended up in a world where I developed an obsession around data.
Starting off at Palantir, I actually worked on Palantir's large-scale distributed structure and storage data tool for a few years. It's actually where I met Paul first in Singapore, so that kind of grew, and as I took on more and more responsibility, ended up touching more of the mission-critical data sets in our customers' workstreams, whether that was government customers, healthcare customers, financial customers, and eventually culminated in me leading what we call the compute team, so all open-source distributed computation products that Palantir uses to actually power data analytics fell under my purview.
And that time was pretty interesting. It taught me that the analytical landscape had developed so fast that we just had these immense tools that were able to do incredibly complicated analysis, computations, and so on and so forth, but we kind of lacked the scale of data needed to make these tools meaningful. In other words, you'd have a one-terabyte Spark cluster and someone would put a 10-megabyte CSV in it.
It's not just wasteful. It's that we don't have the data to warrant tools this powerful.
So that was a problem I wanted to solve. So I switched gears, jumped over to Veraset, which was only focused on the data solution. So Veraset is a Data-as-a-Service company, meaning we take in data, we process it, we clean it, we prepare it, and then we actually distribute it to our customers, all in AWS, GCP, and just make sure that our customers can actually consume things properly in the cloud.
It was a hugely educational experience. Taught me a lot about the selling side and then I was interested in the buying side. So I'm at Citadel now where I oversee our data engineering teams that are procuring large-scale data for our investment teams.
Awesome. Exciting stuff and exciting career, so maybe you can tell us kind of one of your most favorite memories of Palantir since it's, you know, everyone sort of knows them at this point?
Absolutely. I think one of the most interesting memories is realizing that software can have bugs in a visceral scale, not just like your program in school or something else.
So I think there was a moment working on this distributed storage tool where we had a use case for a full-text type of search functionality. Previously you'd do an exact-match search and the exact string would come back, but for some of our use cases we wanted these full-text search capabilities.
We built a system where we'd actually pre-index things in memory to make speed of lookup faster, and the way I built it was not optimal, to say the least, and so we would actually start up our server and the whole thing would crash just immediately going out of memory.
So I think it was one of my favorite memories mostly because it was a huge learning experience of actually understanding the nuances of distributed systems and computation, but also really understanding at a visceral level the teamwork and just, you know, the excitability, passion, and drive of some of my coworkers. They were all willing to help out, all willing to jump in, and you know, it kind of changed my approach to education from this "oh my god, everything has to be perfect all the time" to "hey, things happen." I just, you know, push through and set up best practices moving forward, but it's okay if things go wrong here and there.
Yeah, for sure. Definitely had lots of things go wrong. I learned the hard way never to change the hostname on an Oracle box which I think I've actually mentioned in a previous podcast, but, yeah, was fun times back in the early days, and I think I know the specific technology you're talking about from Palantir as well. I don't know if Paul actually likes me or hates me over this technology, but we'll see.
I can't remember correctly but, yes, so let's talk a bit about buckets, more specifically S3 buckets and maybe for the sake of listeners who aren't familiar with AWS which is probably not that many, but maybe you can explain what S3 buckets are and why they're called S3?
Absolutely. So a lot of this knowledge comes from just having to broker data. I think the first thing that people need to know about S3 is it's not a file system, and I want to say that right at the outset. A lot of what I say following this is really going to address that, but S3 stands for Simple Storage Service and that's what it's actually intended to be: a very simple place to put data. When you hear people talk about storing data in the cloud, if they're using Amazon, they're generally talking about S3. Now, because it's not a file system, it comes with a bunch of complications and a bunch of nuances that are, I guess, a little bit difficult to understand and work through.
So the first one is really, again, that it's not a file system. A file system has these wonderful delimiters, you know, like A/B/C. S3 doesn't really have those. It has them from a more UI perspective, but in practice, everything stored in S3 is just an object that's dumped there, meaning you can't really infer easy relationships from one object to another unless you prepare your data correctly and prepare your access model correctly, but it's a very, very flexible tool for storing data.
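To make the "not a file system" point concrete, here is a minimal Python sketch, not something shown in the episode. It assumes a handful of hypothetical object keys and imitates, in plain Python, how a listing with a prefix and a delimiter groups flat keys into folder-like entries. The function name `list_with_delimiter` and the sample keys are illustrative assumptions; it does not call AWS.

```python
def list_with_delimiter(keys, prefix="", delimiter="/"):
    """Group flat object keys the way a prefix + delimiter listing does."""
    objects = []
    common_prefixes = set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # The key continues past the delimiter, so it is rolled up
            # into a folder-like "common prefix" instead of being listed.
            common_prefixes.add(prefix + head + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)


# Hypothetical keys: the store holds these as four unrelated objects.
KEYS = [
    "a/b/c/report.csv",
    "a/b/c/summary.csv",
    "a/b/readme.txt",
    "a/notes.txt",
]

objects, folders = list_with_delimiter(KEYS, prefix="a/b/")
print(objects)  # ['a/b/readme.txt']
print(folders)  # ['a/b/c/']
```

The "folder" a/b/c/ exists only because two keys happen to share that prefix. Nothing is stored for the folder itself, which is why relationships between objects depend on how the keys were laid out in the first place.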
Awesome, yeah, we know S3 buckets quite well at Horangi because it's one of the most, sort of, common security issues that we find in the customers that are using our product for the first time, but, yeah, with that in mind let's talk a bit about access.
So access can obviously be given in a couple of ways. So you got like a bucket policy, bucket ACL, and object ACL. Can you explain each and even address like associated vulnerabilities between them?
Absolutely. So, just kind of to start things off, the ACL model is the original way of accessing data in S3. There were bucket ACLs, which pretty much said how you can secure data in a bucket or manage permissions in a bucket, then there were object ACLs, which are how you manage permissions on an object.
Now, going back to this: the fact that S3 is not a file system is really going to dictate a lot of why these work the way they do. IAM, and we'll talk a little bit about IAM, I think, probably later on, stands for Identity and Access Management. In the old pre-IAM world, there were objects that needed to be permissioned to individuals, groups, and resources, and you'd do this through S3 ACLs. They tended to be very complicated, and it was very easy to lose track: make an update, forget what the update was, and end up in this complicated debugging process. So, for good reason, they are legacy now. They're not deprecated, meaning we can still use them, but they are legacy.
A bucket policy is the next iteration in the world of S3 permissions. It pretty much says who can do what where. So in S3 terms: which user, which they call a principal, can do which actions, like S3 GetObject or PutObject, on what resources, so on what S3 buckets or what prefixes. It's a lot easier to understand the permission model here just because you can see all the policy in one place. You can debug things just by looking at the bucket explorer. The bucket policy tester itself is, you know, interesting, not super easy to use, but it just gives you a single entry point for managing permissions.
The challenge comes from really trying to manage all of these together so some people have bucket ACLs on top of object ACLs with a bucket policy and then you're kind of just trying to debug everything and trying to debug why you can't have permissions to this object and it just gets really really complicated.
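As an illustration of the "who can do what where" shape of a bucket policy, here is a small Python sketch that was not part of the conversation. The account ID, user name, bucket name, and the `describe` helper are all hypothetical; the policy is only built and printed locally, not applied to any bucket.

```python
import json

# Hypothetical principal, bucket, and prefix, for illustration only.
BUCKET_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowVinooReadWriteUnderPrefix",
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::123456789012:user/vinoo"},  # who
            "Action": ["s3:GetObject", "s3:PutObject"],                    # what
            "Resource": "arn:aws:s3:::example-bucket/a/b/*",               # where
        }
    ],
}


def as_list(value):
    """Policy fields may hold a single value or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def describe(policy):
    """Render each statement as a 'who may do what on where' sentence."""
    sentences = []
    for stmt in as_list(policy["Statement"]):
        principal = stmt["Principal"]
        if isinstance(principal, dict):
            who = [p for values in principal.values() for p in as_list(values)]
        else:
            who = [principal]
        actions = ", ".join(as_list(stmt["Action"]))
        resources = ", ".join(as_list(stmt["Resource"]))
        sentences.append(
            f'{stmt["Effect"]}: {", ".join(who)} may {actions} on {resources}'
        )
    return sentences


print(json.dumps(BUCKET_POLICY, indent=2))
for sentence in describe(BUCKET_POLICY):
    print(sentence)
```

Having the whole policy in one document like this is what makes it easier to read and review than permissions scattered across bucket ACLs and per-object ACLs.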
Yeah, I don't pretend to have to deal with this every day anymore but definitely had my fair share of dealing with it back in the day.
Let's kind of bring this alive with a case study. In a recent breach, an S3 bucket was actually totally unsecured and exposed around three terabytes of sensitive airport employee data across Colombia and Peru.
How could this kind of ... how could this have happened and what would lead to that sort of issue in the wild?
Yeah, great question. I think one of the things that we need to remember as technologists is that managing these things can be incredibly complicated. There's a number of ways to accomplish the same task. You know, we just described the bucket policy, bucket ACL, object ACL, example as one of them.
Managing your infrastructure has actually never been more difficult, especially because the infrastructure is no longer necessarily yours. It can live in these cloud vendors and so managing permissions on an S3 bucket just can be so difficult and so cumbersome that an accident actually goes unnoticed for potentially months or even years. I don't have too much information about what went wrong in this specific case, but it's entirely possible to imagine a scenario where a developer comes in, seemingly makes an innocuous change, and opens up permissions broadly or, as an example, removes a bucket policy in its entirety and all of a sudden, you know, ACL or something along the same lines can actually access the data now.
For a lot of S3 buckets, the way permissions are actually managed is security by obscurity. There's a lot of data out there and there's a lot of iterations of bucket policy changes that can cause things to go down in a pretty drastic way.
So I think in scenarios like this it's easy to blame the developers, but I actually think what it is is almost a culture of not understanding that these infrastructure changes are actually code and need to be controlled as such. Version controlling your bucket policies using technologies like Terraform right from the outset can be very, very powerful, but more importantly, the fact that this was noticed in the open means either an audit didn't occur or they're not using a technology that would be able to alert on these types of issues proactively.
And so I think a couple of things may have gone wrong here, and obviously, data leaking is a very dangerous and very unfortunate situation as a whole.
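One way to picture the kind of proactive alert described here is the following Python sketch. It is an assumption-laden simplification, not a description of what happened in the breach or of how any product works: both policies, the bucket name, and the `public_statements` helper are hypothetical, and the check only looks at bucket policy statements (it ignores ACLs, conditions beyond their presence, and account-level settings).

```python
def as_list(value):
    """Policy fields may hold a single value or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def is_public_principal(principal):
    """True when the principal is the wildcard, i.e. anyone at all."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        return "*" in as_list(principal.get("AWS", []))
    return False


def public_statements(policy):
    """Return Allow statements open to everyone with no Condition attached."""
    return [
        stmt
        for stmt in as_list(policy.get("Statement", []))
        if stmt.get("Effect") == "Allow"
        and is_public_principal(stmt.get("Principal"))
        and "Condition" not in stmt
    ]


# Hypothetical policy as it was reviewed and committed to version control.
REVIEWED_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "PipelineRead",
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::123456789012:role/pipeline"},
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::example-bucket/*",
    }],
}

# Hypothetical policy after a seemingly innocuous change.
LIVE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "OpenedUpByAccident",
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::example-bucket/*",
    }],
}

for name, policy in [("reviewed", REVIEWED_POLICY), ("live", LIVE_POLICY)]:
    flagged = [stmt["Sid"] for stmt in public_statements(policy)]
    print(name, "->", flagged or "no public statements")
```

Run on every change, even a check this small would surface the broadened statement at the time it was introduced rather than months later.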
Yeah, definitely, and you know, obviously that's kind of where Warden comes in: helping organizations that are moving fast in the cloud and giving that sense of comfort that something's out there checking, because, of course, things are changing all the time. That's the beauty of the cloud, that you can iterate very quickly, but it's also one of the dangers, which is where Warden, I think, really helps a lot of our customers in identifying those issues quickly and also helping them fix them, as you know.
So let's switch the game a bit here and talk about IAM audits, so maybe you can explain what is IAM and what are audits, and how those actually are used in practice?
Sounds good. So IAM stands for Identity And Access Management. It's pretty much the user management system of AWS. There are users, there are groups, there are roles.
You can actually assume a role and take a certain operation. IAM is, at its core, a way of managing what a user can do and what permissions are attached to that user. As a basic example, users are limited in permissions just like bucket access can be limited in permissions, so I can say something in my bucket policy, like Vinoo can access resource A slash B slash C in this S3 bucket, but if my IAM permissions don't allow me to read S3 buckets I'm not going to be able to actually access that resource.
So the user permissions and the bucket permissions actually need to happen in conjunction for these to work together properly. Now, in terms of auditing ... I think it's funny when we say auditing. Auditing almost sounds like this invasive-like process of a doctor coming in and saying I need to look at x, y and z, you know, thing, make sure you're actually operating properly or operating in the same way ...
I think we should actually change the culture around audits and change the culture around the feeling of an audit. I think this may be an unpopular answer, but I think we should think of audits like penetration tests - they happen randomly without advance notice and, you know, if the system fails an audit, it's not the worst thing. It means we caught something internally that could have been caught externally.
The idea is, I mean now the idea obviously is we would like a 100 percent permissioning, security, and things to work sanely and right off the bat, but in reality that's not how software works, so I would even say the idea of an audit, extrapolating even further, is: rather than have this fixed interval or some invasive process that happens, I think we should change the culture to ensuring audits are not reactive but are actually set up on demand.
Like Warden is a great example of this, right? How do I actually build an AWS permission set or operate an AWS account in a way such that if I were to make a misconfiguration, I'm at least alerted of it, it's surfaced, and it can be actioned by anyone who would need to understand what happened. The whole thing has actually started moving to a world where it's not necessarily just an audit anymore, like someone coming in and giving you a PDF.
Systems are actually auditing themselves. Netflix has Chaos Monkey, which randomly kills servers in a data center, and that is actually a form of auditing. It's making sure that Netflix is able to stay highly available in the event of downtime, and I think when we start thinking about it as less of a reactive operation and more of a, hey, there's a thing that we need to do just to keep this secure, and it's okay if something goes wrong, but we need to be in front of that and address it as quickly as possible, we almost ....
We remove this assigning of blame and we also make it a growth and learning opportunity.
Actually, I think I've mentioned this to you, Paul. So many of the best practices that I have in my head are because of a technology like Warden. It flagged something and I realized, oh that was probably not the best practice. So it's almost, you know, with having these audits especially with a reputed vendor coming in, you actually democratize knowledge and people can share the learnings and best practices.
I watched a previous episode where you were talking about the Log4J vulnerability and how you were able to make a patch and distribute that, or make a check and distribute that.
Those types of knowledge sharing are invaluable among organizations that need to secure data and need to secure anything.
An audit to me is, and it was kind of a long speech, but I don't think we should think of these as these negative things. I think right now they happen to happen at a fixed interval or when some compliance period is kicking in, but changing the culture around that is, I think, mutually beneficial for the auditor, the engineers who are building the system, and the companies.
Yeah, I totally agree and, I mean, of course, I like the idea of them being automated for many reasons, efficiency being the main one, but also because our product does that for the specific areas we kind of touched on.
In your eyes, how often should IAM audits be done and how often should they be conducted?
So I would say consistently, frequently, as much as possible. I think, you know, it's ... an audit to me is not necessarily an introspection of a system for reprimanding the individuals who set it up; it's really a learning opportunity for the whole organization, and I think if you ... an audit to me is almost like a code review for the architecture of your system. We do a code review every time we make a change. We run integration tests every time we make a change. We make sure that the system can deploy and we blue-green deploy it or roll it back depending on whether it works or not every time we make a change.
I think we should think about auditing the same way.
There is a way to describe integration testing in your cloud environment, but it tends to be very complicated. Alerting on every change and actually running a set of tests, just like Warden does, is actually incredibly powerful, and at Veraset this was a game-changer for us, you know. We had the ability to make changes in disparate parts of the codebase, run the Warden security check, and actually action anything that came out of it fairly quickly.
So I guess, answering your question, at the minimum, I would say monthly for just an infrastructure audit, but I think this can increase much more aggressively as we talk about permissions audits or IAM audits, and so again my goal is always let's minimize the level of effort required to do an audit. You know, if we're spending engineers' time internally to reassess our own work, it's going to be frustrating, but again this democratized vendor of best practices coming in and flagging things that may not be standard is really powerful, so when you bring in someone, getting these audits done should be an asynchronous process.
Yeah, I think that makes a lot of sense and I like the way you worded it as being like a code review because, you know, a code review is part of the process at this point and everyone knows they have to go through it when they make a change, right? And, yeah, I think with an audit, you know, if you can, and you can afford it, then automating it as much as possible on the data side of things is just as important, especially in your infrastructure and in your access controls, right, because, I mean, that's where a majority of the attacks happen, right?
It's basically infrastructure configuration mistakes and IAM being over-permissive. So those are the areas where you want to focus, and definitely using a product makes it easy, but, you know, even if you don't have the budget for that, setting up something manual at a certain interval is a good first step at least.
So what do you think about the compliance standards like GDPR, PCI, et cetera? Do you think they really help, and how do you think about that when you're running an engineering team?
Yeah, so I love the fact that there are standards. Standards come from somewhere and they define some minimum operating behavior that tech firms should be compliant with. I do think we need to think about these standards as necessary but not sufficient from an organizational data security perspective.
Certainly GDPR, and, you know, CCPA is a similar one coming out, they provide some minimal protections in that the data must be handled in a particular way, but they say very little about how the data must be stored and secured. Even copies like backups is a new topic I know CCPA is talking about, like how do we actually prune data out of, let's say, an S3 bucket or an object stored in Glacier, for example.
So I think the interesting thing is: the standards do a really good job of giving us some form of a baseline but they're not where I think we should be as a data-centric world.
Again, people are collecting data at an unprecedented rate and securing this data matters, just like in the example you gave me with the Colombian data set. Leaks of that data can be detrimental, and not just to your operation but sometimes even to people's health, well-being, and lives, so securing that really starts with meeting compliance expectations, which you 100 percent absolutely should meet, and going beyond that.
Yeah, for sure.
Well, I think that's kind of my list of specific questions. I've got a bit more of a fun one before we get to the final question: we talked a lot about buckets today, so I'm kind of interested to know what are some of the items on your bucket list, personally but also for your career.
Absolutely. So I think from a personal perspective, I just got married, which is exciting!
I definitely want to be able to travel and adventure around the world like we did before COVID. So from a personal perspective, I think I really want to get back into travel, get back into experiencing different cultures, revisiting, I guess, you and Singapore, Paul.
And, so you know, just really getting back into a more healthy environment where like the world is open again and we can actually move around it.
From a ... I don't know if it's a bucket list item, but actually, I mean, it is ... I definitely want this to happen before I die. I think from the professional side, I really think, you know, products like Warden are just ... I was running a small data company Veraset and I don't know what we would have done without Warden.
And if I think about ... at its core what we're trying to safeguard, it really is data and I'm not you know one of those people with a tinfoil hat saying, oh you know, Paul, everything's going to be hacked all the time, but we've been collecting data as humanity at an unprecedented rate, and securing this data, actually ensuring that it's even usable, making sense out of it, is I think, such a professional passion for me that I know it's an area I want to spend some time and energy moving forward.
There are serious privacy implications to some data sets. There are HIPAA implications, PCI-DSS implications.
They're just looking at the standards, so I think for me from a per ... I guess professional perspective, I really want to focus on how we can use data, operationalize it in a safe, privacy-compliant, security-compliant, and meaningful way like if we're collecting it and not using it, that's a problem.
So I think ... I don't know what this looks like yet and there are companies that are making some great headway. OpenMind is a great one, but I really want to spend some time, energy professionally focused on this data space.
Yeah, definitely, it's the future. I mean we saw, of course, Palantir here and it continues to grow even since then. Yeah, 100 percent agree that both securing and analyzing and working with data is the future, and there's going to be a lot of great companies built in that space across those spaces, I guess I should say.
In terms of security specifically, what do you kind of see happening in terms of misconfigurations and IAM in 2022, and then, you know, do you have any advice you'd like to leave with our listeners around that?
Absolutely. I guess like the ... so if we look at S3 permissions as a whole, there are generally four areas that cause issues. First, it's just something getting outdated, right, like I have a policy ... a contract ends with a customer but access isn't pulled, for example. So it's really managing these policies and managing when something gets outdated. The frequent permission refresh is that big issue.
Second: overly broad permissions, you know, the dreaded S3 star, "I should be able to read, write, do whatever." I think it's so problematic when you see somebody who just wants something to work give it star permissions and then all of a sudden the data is deleted, and we're like, how did this happen? Oh well, I gave permissions of everything to everyone. Third: management as a whole, like actually managing the cross interactions between ACLs and S3 policies is incredibly complicated.
And I think the fourth is really just version controlling things. So when you make a change, how do I actually revert back to working versions, or I guess, more importantly, how do I root cause when something goes wrong?
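The "S3 star" problem in the second item lends itself to a simple automated check. The Python sketch below is an illustration added alongside the transcript, with a hypothetical policy and a hypothetical helper name, `wildcard_actions`; it flags only the two bluntest patterns (`*` and `s3:*`) and is not a complete least-privilege analysis.

```python
def as_list(value):
    """Policy fields may hold a single value or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def wildcard_actions(policy):
    """Return (Sid, action) pairs for Allow statements granting every action."""
    findings = []
    for stmt in as_list(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        for action in as_list(stmt.get("Action", [])):
            if action == "*" or action.lower() == "s3:*":
                findings.append((stmt.get("Sid", "<no Sid>"), action))
    return findings


# Hypothetical policy: one narrowly scoped statement, one "just make it work".
POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadReports",
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::example-bucket/reports/*",
        },
        {
            "Sid": "JustMakeItWork",
            "Effect": "Allow",
            "Action": "s3:*",
            "Resource": "arn:aws:s3:::example-bucket/*",
        },
    ],
}

print(wildcard_actions(POLICY))  # [('JustMakeItWork', 's3:*')]
```

Keeping policies in version control and running a check like this on each change touches the second and fourth items at once: the broad grant is caught early, and the history shows when it was introduced.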
So I think, I mean, these are the areas of like serious problems right now and so I would say in 2022, 2021, 2022 I would expect a lot of these to actually be addressed and ... starting from the idea of version control, Infrastructure as Code, I think, will continue to be a top-of-mind concern. When a cloud can go down, being able to operate across clouds or in a multi-cloud way becomes more and more important. Technologies like Terraform make this super easy. The same code in one place is deployed to many places.
That was the first one. I really think also in terms of management, in terms of understanding cross interactions, buying leverage is a big piece of advice, like, it's so easy and tempting and it feels like it's cheaper to just try and build cloud experts internally so they can, you know, audit and run their own checks, but it's so difficult to develop a full-scale understanding of what's happening and when in Amazon, or in Google, right?
So buying leverage in the form of experts, in the form of vendors, in the forms of anyone who can actually come in and really share best practices is so important. And so I can see developments in the Amazon front on all of these. They released AWS Access Points which worked in some regards to address some of the issues that we had with bucket policies but not all of them, so I expect to see some developments there.
And I think just overall like, you know, as technologies ... as Amazon store ... S3 rather exists as a system to store data, there are going to be developments on the technological side as well. We all know S3 readers support technologies like Parquet but new cutting edge Apache tools like Apache Iceberg are also gonna start becoming used more and more frequently not just from an audit perspective, but from a data access perspective.
So I think there's a ton that can happen in this space some of it from Amazon and a lot of it from external vendors.
I guess you asked me for some advice too. The biggest piece of advice I would say is attitude reflects leadership. You know, if you're running a company that works with data in any way, which is virtually any company, understanding and setting the tone for how to handle the cloud, how to manage data in the cloud, how to actually use data in a meaningful and sane way, is ultra important not just for the culture of your organization but also for the safe handling and security of the data itself.
So my biggest piece of advice is really, especially for the executives that are running these organizations - buy leverage and ensure that the people who are reporting to you understand how important data security is, understand how important cloud security is, and really build the culture of your company around that.
Awesome, and thanks for the advice, Vinoo. I think you know lots of interesting nuggets there for the listeners. Appreciate you taking the time of your early morning to do a podcast with us and I hope you have a great morning!
Thank you guys so much. Yeah, I really appreciate being on here.