All right, so this is the talk on SpiceDB. Thanks, everyone, for showing up so early in the morning. I'm starting to lose my voice because yesterday was a long day of talking and meeting awesome people. This is my first FOSDEM.
So who am I? My name is Jimmy Zelinskie. I'm the co-founder of a company called AuthZed, and AuthZed builds SpiceDB. Previously, I worked at Red Hat and CoreOS, so I've been around the container and Kubernetes ecosystem for a pretty long time, basically since the beginning. I'm a maintainer of OCI, which is the standard specification for Linux containers, and I've also started a bunch of projects in that space, notably the Kubernetes Operator Framework and some others.
This talk is entitled SpiceDB. But since FOSDEM is more of a developer community conference, I wanted to focus less on this talk being a vendor pitch for SpiceDB and more on a level set about the problems in the authorization space, and the history and status quo of that, so that everyone understands what might be the best tool to solve their problems. I'm not going to try to sell you SpiceDB for all problems, because the more informed you are, the better you can pick the product that's actually going to complement your software stack and what you need. And that means there are going to be way more qualified people using SpiceDB and way more qualified people using other authorization tooling. Obviously, I'm the most jazzed about SpiceDB because I created it.
So why are we all here? We're all here because there is a not-for-profit organization called OWASP, which is the Open Worldwide Application Security Project. They got started in the early 2000s, and they're famous for having this list called the Top 10. The Top 10 is basically an enumeration of the highest risks, the highest threats, for web security. As of 2017, broken access control was number five. As of 2021, broken access control is number one.
That means this is the biggest threat to the web and to all the internet-facing applications running on it. But really, the question is, how did we get to this point? How did this happen, and how did it happen so quickly? I'm not going to point any fingers. What I'm going to do is dive into two different groups of stakeholders in the history of authorization. There is academia, the people publishing papers in this space and defining concepts. And then there are the industry practitioners who are actually building the software and realizing these systems as they're connected to the web.
I'm going to start with academia. On the right-hand side, you're going to see a timeline, and on the left-hand side, there are going to be some notes. Not on this slide, but on later ones, you'll see QR codes in this corner as well. Those QR codes link to the specific paper that introduced each concept, so if you're interested in any of these, feel free to scan them.
Our history of authorization starts in the 80s. It really gets kicked off with the publication of the Trusted Computer System Evaluation Criteria, which is a security practices book published by the US Department of Defense. It outlines a lot of different security practices that are effectively a part of the United States military. In it, they describe two different access control systems, discretionary and mandatory.
Discretionary effectively means that if you created the idea or the information, you can share it. And if you're then given access to it, you can share it too. It's at your discretion. I use file systems and Google Docs as examples here. It's not a perfect one-to-one match, but if someone shares a file with you on a UNIX file system, you can copy that file if you have read access.
And then you can set whatever permissions you want on that copy and share it, and similarly with Google Docs. So it's at your discretion how you share that information once you're given read access.
Then there's mandatory access control, which is effectively a long, exhaustive list of all the access for a particular thing. People are most familiar with SELinux as the example of this. If you're unfamiliar with SELinux, it's a way of locking down the Linux kernel. Honestly, it comes with a negative connotation, because mandatory access control is very verbose and very difficult to get right: you have to enumerate absolutely everything. Some people say that the three-letter agency at the US government that created it are the only people who actually know how to configure it correctly. I don't know if that's true or how many people use it. I know Red Hat is one of the folks that does promote SELinux.
The one thing about this slide I really wanted to drive home is that these ideas are as old as the military and war itself. There's nothing novel about the 80s as the time when these ideas got invented. What actually happened is that someone only thought to write them down in the 80s. It took that long, after using these ideas for many, many years.
So we jump roughly 10 years, 9 years, to 1992, which happens to also be the year I was born. That makes me feel relatively old. In 1992, we get this paper published on role-based access control. Role-based access control, often called RBAC, is where most people believe the state of the art for authorization systems is. The core idea is that there is a group that is assigned access to a particular thing, and those groups are called roles. You then map users into these roles, and by means of being in a role, you get access delegated to you.
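To make the role mapping concrete, here is a minimal sketch in Python. It is not from the talk: the role names, user names, and the `rbac_allows` helper are all hypothetical, and it only shows the one part every RBAC implementation shares, which is users mapped into roles that carry access.

```python
# Hypothetical data for illustration only; none of these names come from the talk.
ROLE_PERMISSIONS = {
    "admin": {"read", "write", "delete"},
    "viewer": {"read"},
}

USER_ROLES = {
    "alice": {"admin"},
    "bob": {"viewer"},
}


def rbac_allows(user: str, action: str) -> bool:
    """Return True if any role assigned to the user grants the action."""
    roles = USER_ROLES.get(user, set())
    return any(action in ROLE_PERMISSIONS.get(role, set()) for role in roles)


print(rbac_allows("alice", "delete"))
print(rbac_allows("bob", "write"))
```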
The number one problem with RBAC is that everyone defines it differently. If you build any enterprise software, you're going to talk to clients and they're going to ask you for RBAC. But if I look at two different enterprise applications, they implement RBAC entirely differently. The only commonality is this mapping of users into groups that then have access. This is going to be a recurring theme across all of these papers published in academia, anything ending in "BAC", because they're documenting concepts, not specifications that would give you a cohesive, designed, and secure system. Most famously, the biggest issue with RBAC is that there really is no scope. If you say someone is an admin, does that mean they're an admin of the entire web app? Does that mean they're an admin of a particular resource in the app? You just don't know until you build it yourself. So there's not really an easy way to reason about these systems until you actually touch them.
So we jump well into the future now, to 2015. This is when the paper on ABAC, which is attribute-based access control, is written. Effectively, the idea behind ABAC is to generalize RBAC and say that the role you're assigned is just one attribute your user can have. Another attribute might be that you logged in from a particular IP address, and many other dynamic attributes can be assigned to you. The really important thing about ABAC is that it provides real-time context. Now you can write rules like: are they connecting from this country, this subnet, at this time? You can delegate access for particular windows of time and perform more logic on the attributes that folks have.
And now we're going to take a huge digression back to 1965. If you're unfamiliar, Multics is an operating system that was developed between MIT, GE, and Bell Labs.
You might not remember it, but it inspired an operating system you're probably familiar with: Unix. Unix is actually an attempt at porting Multics concepts to less expensive hardware. Multics is often credited as the first operating system that had access control for the file system. I don't know if that's true, but it's often credited as that. In Multics, you have a file system tree, so you get hierarchical structure. Then at every branch, which would be a file or a directory, you can have five different attributes assigned. You get read, write, execute, and append, which are all file operations you'd be familiar with. But there's a fifth one that's super interesting called trap, which gives you the ability to do callbacks into functions. It was initially designed so you could do file walking in user space. The reason I bring up Multics is that there was inheritance, there was ABAC, and there were user-defined functions in an authorization system in 1965, whereas in academia the ideas behind attributes were published in 2015. So there are systems using these concepts that maybe haven't been formalized and written down in a concrete form. This is a huge issue with the whole space, because people are doing things, but they're not really studying how to make these systems robust with these ideas. They're more just documenting these ideas ad hoc.
So getting back to the normal timeline, we hit 2019. It's actually in 2007 that the term relationship-based access control is coined. The idea behind it is that you establish a chain of relationships, like: Jimmy is a speaker at FOSDEM, and speakers at FOSDEM have access to the FOSDEM speaker Matrix chat. If you can follow these chains of relationships, you can conclude that Jimmy has access to the FOSDEM speaker room.
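The chain-following idea can be sketched as a plain graph search. This is an illustrative Python sketch under stated assumptions, not how any particular system implements it: the identifiers (`user:jimmy`, `group:fosdem_speakers`, `room:fosdem_speaker_chat`) and the `has_access` helper are made up for the example, and relationships are simplified to unlabeled directed edges.

```python
from collections import deque

# Hypothetical relationships, stored as directed edges "subject -> object".
RELATIONSHIPS = {
    "user:jimmy": ["group:fosdem_speakers"],
    "group:fosdem_speakers": ["room:fosdem_speaker_chat"],
}


def has_access(subject: str, resource: str) -> bool:
    """Breadth-first search: access exists if some chain of
    relationships links the subject to the resource."""
    seen = {subject}
    queue = deque([subject])
    while queue:
        node = queue.popleft()
        if node == resource:
            return True
        for neighbor in RELATIONSHIPS.get(node, []):
            if neighbor not in seen:  # guard against cycles
                seen.add(neighbor)
                queue.append(neighbor)
    return False


print(has_access("user:jimmy", "room:fosdem_speaker_chat"))
```

The `seen` set matters once groups can contain other groups, since nested membership can form cycles.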
So this term is coined around then, and it's looking forward at what tech in the Web 2.0 era will look like. It's published initially while considering how Facebook's social graph works internally. When you share photos on Facebook, you say, "friends of friends can view this." You're literally defining access in terms of relationships to yourself.
So we hit 2019, and that's when Google publishes a paper called Zanzibar, which documents an internal system at Google powered by these concepts. The reason I have 2019 for ReBAC is that Google is documenting a concrete implementation. Unlike a lot of these other papers talking about concepts, it's talking about an application of these concepts and giving you a framework for how to use them effectively and correctly across multiple products at Google. Then in 2021, SpiceDB is open sourced, which implements similar concepts to Zanzibar. Obviously, I'm going to get into that later. There are other "BAC" models, but these are the primary ones that I see in industry. You can dive into Wikipedia if you're interested in the others.
Now for the industry side of things. We're leaving academia. Industry has this problem: you go to build a web application, and your first job is just to build the MVP, the minimum viable product. So you do what you do with everything in a web application, which is store data in a database, probably the relational database you're using for everything else. And you try to check whether a user has particular access based on some data you store in that database. It might be a role, if you're inspired by RBAC. Or maybe it's just an enumeration of the list of users that can do a particular thing. So you may have written code that looks like this.
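The slide's code is not captured in this transcript. As a stand-in, here is a hypothetical sketch of that kind of MVP check in Python, assuming an in-memory SQLite database and a made-up `document_viewers` table that simply enumerates which users may view which documents.

```python
import sqlite3

# Hypothetical schema: one row per (document, user) pair that is allowed to view.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE document_viewers (document_id TEXT, user_id TEXT)")
conn.execute("INSERT INTO document_viewers VALUES ('doc1', 'jimmy')")
conn.commit()


def can_view(db: sqlite3.Connection, user_id: str, document_id: str) -> bool:
    """Return True if the user is listed as a viewer of the document."""
    row = db.execute(
        "SELECT 1 FROM document_viewers"
        " WHERE document_id = ? AND user_id = ? LIMIT 1",
        (document_id, user_id),
    ).fetchone()
    return row is not None


print(can_view(conn, "jimmy", "doc1"))
```

A check like this answers only direct grants. Groups, groups of groups, or inherited access each need new tables and new query logic.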
But the problem is that this falls over at some point in time. Either you built a system that is fundamentally just really slow and it now has to be way faster than you ever intended it to be, or you get users of your software that demand new functionality that is not possible for you to implement until you refactor your authorization code. A great example of that is recursive teams. If you have groups of users, what if you have groups of groups? Or groups of groups of groups? That is something most people don't build in their initial MVP. And when you get a request like that, you're forced to completely rewrite your authorization system. The other thing that could happen is that your company buys another company based on a different continent. That means all the requests for checking permissions now have to travel across an ocean if they want to be correct. That's a huge problem. Making sure that the performance is going to be viable and that the answers you get to authorization questions are correct is difficult.
So you hit one of these big issues, and then you're forced to enter the cycle that I'm going to get into. These numbers are kind of fudged, but the whole point is that if you take an engineer, probably one with expertise in that web app who has worked on this authorization system, it's going to take them a while to implement the change. It's going to be super sensitive, because someone else is going to have to review it. That person also has to be deeply embedded in that code base. They're going to be extraordinarily careful, because any mistake in this code base is going to be a CVE: it's giving access to people that shouldn't otherwise have it. So that's going to take a long time. Then you're going to do QA.
You might have to perform a security audit before you can deploy this software, because you're deploying to enterprise environments. And you're probably going to want to take extra time rolling out these changes into production. You probably don't want to deploy to everyone all at once. You probably want to deploy to a small subset first, just in case you find something wrong with the code. All of this takes time. And the problem is that it puts the security of your software at odds with development velocity. Fundamentally, it's going to take you too long to add this functionality, and you're going to want to take shortcuts. But shortcuts are security flaws in your software.
Then it's rinse and repeat. You don't know how long it will be until the pain builds up to where you're forced to rewrite these authorization systems again. That is entirely a mystery box. You could finish, or not even be finished, rewriting your authorization system, and all of a sudden a new user sets some requirement for you. And you're doomed. You have to completely rewrite the thing you just re-architected to be future proof.
So how do we fix this never-ending cycle? OWASP themselves have recommendations for this. They say you should no longer adopt RBAC, but take concepts from ABAC and ReBAC. Obviously, I'm biased towards ReBAC because I think it's a more modern approach. But the OWASP folks also give you some high-level benefits for why you would adopt these newer models over RBAC. I'm going to take this from the ReBAC perspective. When you're doing a graph-like thing, a relationship-based system, you're forced to talk about individual entities: this user, Jimmy, has access to this particular document. Because you're doing that, it earns this buzzword, fine-grained. You're not resolving Jimmy to a role or a group.
You're following Jimmy directly to the document. You're talking about individual entities in the system, and as a result, you get more fine-grained access. I'm not trying to generalize about any users or paint over anything. I'm talking about the exact objects I care about. That means you can develop systems where you delegate access to a particular row in a database or a cell in a spreadsheet. And all of these systems are designed for speed, because they understand they're going to have to store a lot of data to be this fine-grained.
Then, because your applications only talk about the direct objects they care about, none of the relationships in between get written in your code. You just ask the question: can this user perform this action on this thing? How they got access to it, and any later refactoring of how they get access, does not live in your code base anymore. That means you can make changes to your permission system and not change a single line of code in any of your web applications. And believe me, when you do that for the first time, it is a magical feeling, because you don't have to touch any code.
Then there's also multi-tenancy and management ease, which is just simplicity around modeling. And with ABAC and ReBAC systems, you're paying it forward. RBAC might be really easy conceptually for you to implement at the beginning. But the ABAC and ReBAC systems are more focused on forward thinking. If you need to make changes like I just described, you can change ReBAC designs without changing code. It may be a little more effort to get started building and integrating with one of these systems. But by day two, if you ever need to make a change, it's going to pay dividends.
So I wanted to get deeper into this Zanzibar paper I talked about earlier, which kicked off the interest in ReBAC that you see today.
Basically, Zanzibar is a purpose-built graph database that is very specifically optimized for one thing, which is finding a path in a graph. Finding that path means that the user has access to that particular thing. It's one of the few good things that came out of Google+. There are only two things that came out of Google+: Zanzibar, internally at Google, and Google Photos.
The novelty of this paper is that it solves an authorization problem with a focus on distributed systems. If you'll notice, the title of the paper is "Zanzibar: Google's Consistent, Global Authorization System." It is fundamentally trying to tackle authorization as a distributed systems problem, which is not really something anyone else had done in the past. They acknowledge that if they're going to deploy one system at Google, it needs to work across all geos in the world. It has to be extremely reliable, and it can never be wrong. These are really difficult requirements. The anecdote I like to use is that when you're on a cloud provider like Amazon and you go to provision something like an S3 bucket, you're always choosing a region. But if you go to set IAM rules in a cloud provider like Amazon, you don't pick the region. That is because these systems fundamentally have to be global. And when you're designing them yourself at a particular scale, you need to think about how you're going to make your system global.
This paper inspired two companies, Carta and Airbnb, to go and implement their own internal systems based on its ideas. Neither of them is truly 100% authentic to the original paper, I would say. Each is rather the paper fused with the requirements of their business at the time.
I think the real superpower of Zanzibar, though, is this: if you go to send someone a Google Doc in Gmail and they don't have access, Gmail will pop up a box and tell you, hey, you didn't give access to this person. That fundamentally means that Gmail has a way to ask questions and check permissions that are built into Google Drive. So you can have one central source of truth for authorization data that your whole application suite, all your microservices, can share. This is incredibly powerful, because not only does it allow integrations like this, it also gives you a central source of truth where, if you need to audit something, you can just ask that one service. It's the only service you have to trust. It's the only service you have to query if you're trying to dig into any of this data, say when you have an outage or an incident and you need to understand what the access control looked like.
So you might be wondering, how do I Zanzibar? This is exactly what we set out to do. Basically, the year after the paper was published, my co-founders and I left Red Hat to found AuthZed and build SpiceDB in the open. There were some folks experimenting with the ideas around ReBAC at the time, but no one was really moving the needle towards making this a production thing that you could use in a real enterprise environment or at a real tech company. We originally prototyped the thing in Python. It was type-annotated, lazily evaluated, functional Python. It was way faster than you'd ever think Python should be, but it was not fast enough, so we ended up rewriting it in Go and open sourcing that. The name is inspired by Dune, because internally at Google, the project was called Project Spice, because the ACLs must flow.
The timing for that has been really good with the Dune resurgence in the movies, and internally at AuthZed, all of our software is named with Dune references as homage. If we fast forward to today, the SpiceDB community has gotten contributions from a lot of companies, big names like Netflix, GitHub, Google, Red Hat, and Plaid. And there are production users ranging from small startups, where it's just the co-founders, all the way up to Fortune 50 companies.
But I still haven't told you what SpiceDB is. SpiceDB is, as I described with Zanzibar earlier, an extremely parallel graph database. Developers apply a schema, just like you would for a relational database. I've given an example schema here, modeling a Google Doc. Then they store data inside that database and query that data according to that schema. The magic is that you can make schema changes in a forward-compatible way, which lets you modify your permission system without changing any code.
We don't have a SQL API, despite being a database. We give you gRPC and HTTP APIs, and the primary interface we recommend is gRPC, for latency reasons. Because authorization is in the critical path of everything your web applications are going to do, and possibly everything at your business, you really have to make sure this stuff is fast. Everything needs to be kept in memory. Everything needs to be returned in single-digit milliseconds. So gRPC is pretty critical for that. In addition to the main server, we also expose servers for powering dev tools, so you can get auto-complete and similar features in your editor, and also integration testing services.
It's Kubernetes native, designed that way from the beginning, since our background is all in Kubernetes. SpiceDB is self-clustering.
If you deploy SpiceDB directly onto Kubernetes, it will discover other nodes and automatically start to divide and shard the in-memory graph it uses to serve this data across them. We also offer a SpiceDB operator in the open source, which does automated updates for SpiceDB. Notoriously, zero-downtime updates for a database are very tricky. So we took that problem off the table for most people and automated it for anyone using Kubernetes.
We remain true to Zanzibar's goals of consistency at scale, so we have pluggable data storage systems. Depending on what your requirements are, say you need to deploy everywhere on the globe, you can store all of your raw relationship data in something like Spanner or CockroachDB. Then you can run regional deployments of SpiceDB that exist as independent caches for those geos. But fundamentally, they share all the same core data and they're consistent across those environments. If that sounds too complicated for you, or you don't really need it because you're a single-region shop, that's fine. We also have deep integrations with Postgres and MySQL if you just want to use something like Aurora or Amazon RDS. And then there's also an in-memory option for testing.
We also have a tool called Zed. Zed is the command-line tool. It manages cluster credentials and backups, and it gives you a command for every single SpiceDB API. Here I give an example of running a permission check with the debug flags. You can see it gives you the whole graph traversal. It shows you a tree of how it computed whether or not someone has access, with timing data associated with all of that, so you can see where things slow down.
We have a web IDE. The two things you just saw, SpiceDB and Zed, we compile to WebAssembly and then run in the browser.
We build all of that on top of Monaco, the engine that powers VS Code, and give you a full IDE where you don't have to install any of the software I just showed you. You can just go to play.authzed.com and start playing with this stuff. You can run Zed against live data. You can load in test data. And we can exhaustively generate all of the paths available in the graph for you, so there's something like model checking happening here. You can prove exhaustively that all of the ways the graph can be traversed are the ways you think they are. That lets you prove that a system is correct without deploying it into production or having someone do an extremely long security audit of your process. Then you can check this into CI/CD. If you make a change to the schema, you can guarantee that certain assertions always pass and that everything is exhaustively checked.
Zanzibar is not a silver bullet. We've had to extend Zanzibar in a bunch of different ways. SpiceDB remains true to all of the core concepts that you'll find in Zanzibar, but not everyone is Google. Not everyone relies on users being represented the same way, so we are more flexible about how people can model their own users. And we add on developer experience, because at Google they can say, you're forced to use this software. When you're building open source software, you can't force people to use it. You have to compel them to use your software by offering a better experience than what they're currently doing. We've also added contextual relationships, which bring in ABAC. That means relationships can exist dynamically, based on context that you provide at runtime. That was a joint project with Netflix.
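As a rough illustration of a relationship that only holds under runtime context, here is a small Python sketch. It is not SpiceDB's API or schema language. The `Relationship` class, the `check` function, the `in_office_network` condition, and the `10.0.0.0/8` subnet are all assumptions made up for this example.

```python
import ipaddress
from dataclasses import dataclass
from typing import Any, Callable, Mapping, Optional

# Hypothetical subnet used as the runtime condition.
OFFICE_NETWORK = ipaddress.ip_network("10.0.0.0/8")


def in_office_network(context: Mapping[str, Any]) -> bool:
    """Condition: the request context carries an IP inside the office subnet."""
    ip = context.get("ip")
    return ip is not None and ipaddress.ip_address(ip) in OFFICE_NETWORK


@dataclass(frozen=True)
class Relationship:
    subject: str
    relation: str
    resource: str
    # When set, the relationship only counts if the condition holds at check time.
    condition: Optional[Callable[[Mapping[str, Any]], bool]] = None


RELATIONSHIPS = [
    Relationship("user:jimmy", "viewer", "document:roadmap", in_office_network),
    Relationship("user:alice", "viewer", "document:roadmap"),
]


def check(subject: str, relation: str, resource: str,
          context: Mapping[str, Any]) -> bool:
    """Return True if a matching relationship exists and its condition, if any, passes."""
    for rel in RELATIONSHIPS:
        if (rel.subject, rel.relation, rel.resource) != (subject, relation, resource):
            continue
        if rel.condition is None or rel.condition(context):
            return True
    return False


print(check("user:jimmy", "viewer", "document:roadmap", {"ip": "10.1.2.3"}))
print(check("user:jimmy", "viewer", "document:roadmap", {"ip": "203.0.113.7"}))
```

The stored relationship is the same in both calls. Only the context supplied at check time changes the answer.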
So if you're wondering how you SpiceDB, you can go to our Discord, discord.gg/spicedb, or check out GitHub. Basically, anywhere on the internet where you'd expect to find open source projects, SpiceDB is there. So thanks, everyone. Thank you.