Hello everyone. Thanks for coming. My name is Dejan. Unfortunately Marco couldn't be
here today. He got a call but yeah. What I want to talk about today is so we saw a lot
of sessions today about producing gas bombs and producing the data a little very little.
I think only Philip sessions was about managing actually the produce data right. So the challenge
we try to tackle with the justification project is how to get all these data that are currently
being produced by the more and more organizations like S bombs but also X files and more and more
advisory data that we get and get them into some kind of manageable system because without that
information is just a bunch of mostly JSON files spread all over the place right. So what we want
to what we try to do is to provide a system that will get all this data put it in into a system
that can be searchable and queryable and actually get us get us actionable information. So making
software development more proactive in managing security but also making it much more easier to
respond to the security issues. And yeah as I said these got us to start working on a
justification project which basically set these goals for itself. So being able to ingest and
store all kind of S bombs and VEX documents are open source but also proprietary company products
right. Discover for those ingested S bombs and VEX is learn about all the new vulnerabilities
and advisories related to the packages inside of the S bombs and being able to explore and search
those information but also create an API that can be
integratable in other systems and provide us to share this information with the rest of the
developer toolchain like IDEs and CI CD tools. So ideally we would want to mark all the vulnerable
dependencies directly in the developer's IDE and also for example fail the builds that
tries to build a software that contains some of the dependencies that are known to be vulnerable.
When we started to do this sometime last year this time last year we also figure out that
there is another open source initiative that revolves around the similar ideas and it's called
GUAC. It was mentioned in the previous session as well and I will cover it a little bit more here.
So GUAC stands for Graph for Understanding Artifact Composition and the idea is to being able to
ingest all different kinds of artifact documents like S bombs and VEX files and advisory data from
all kinds of sources and basically create a graph ontology of that. So at first we started just
experimenting with the GraphQL database but today ontology is based on the GraphQL API and can be
implemented by the multiple persistent backends. That's on the left side right on the right side
of the graph we also want to be able to query all these data. So GUAC should be able to provide us
with all the answers about what are the dependencies in my S bomb, how these dependencies correlate
with each other, so what's dependent on the what, so it's easy to find all the graph tree of dependency
in your project but also being able to attach to this particular dependency all the vulnerability
and the advisories and VEX data that we can find in additional systems.
This is the basic architecture. Let me just see how much time I have here. But I basically
explained it in the previous graph. So we can collect documents from different sources,
we can certify them against different sources like OSV or DApps Dev, get it all through the
GraphQL API ontology into a database. Two currently supported databases today is POSGRES,
relational database that we use basically and works just fine and there's an Orango DB back end
which is a pure GraphQL back end right and then on the other side provide the GraphQL API to
be able to query that and provide a bunch of CLIs that it can be able to extract the data from the
system. So in the classification project we try to provide a little bit more functionality on top
of that. First of all we want to be able to actually not just ingest all the data about
different relations in the database but we also want to provide a central place to store all
your documents for the organization. So it provides an S3 compatible storage for storing
and ingesting all the company's data into a single place so it can be an S3
bucket in the AWS but also for local deployments it can be some kind of a
Minio instance for that. It has what we call walkers for different kind of CSEF repositories
so that we can automatically ingest Asgum and Vex files and then provide what we can see
on top and on the bottom. So what we call a single pane of glass like a nice UI to be able to
search all this data that we have but also the Exort API as I said for integrating the system
to the rest of the developer tool chain. So there's a nice VS Code plugin that can work basically
with justification today and automatically from the project get all the dependencies and flag
vulnerabilities if it's found in the system. So I thought to do a little demo so let's see how
it's going to work. So Neil it will be easier. So here we can see the UI with some pre-loaded data
and we can see that we have basically what we call six products here which are actually
six S-bombs that are already already ingested in the system and a large number of CVs that
have been collected from multiple sources and we can see that we identified around 2000 packages
for these S-bombs and most importantly from the Vex files ingested here we identified 29
advisories for these. So if we go to a certain product we can see a couple of information
obtained from the S-bombs so we can see the basic metadata that we have. Usually we can see all
the packages and how they relate to each other. I think this S-bombs is pretty flat in structure so
there's no much dependency going on there but the most important thing is that we can see
different kinds of advisories that are against and also immediately see which
actual packages are being affected by these advisories. We can go back and forth through
this system so we can go to the actual package see that it's actually affected by this vulnerability.
We can also go from the package and find the S-bombs that it belongs to, the S-bombs or the product
but also what we can provide is that nice search capability as we said like maybe at some point
you don't remember exact vulnerability we're looking for so you can basically
just do a full text search or maybe yeah and find that there's a packages related to that but also
find the exact vulnerabilities that we talked about a little bit earlier. So this is just
like a basic demo right? I have a little bit more time just to explain so what were the challenges
for us and I think we heard in a lot of sessions all about these challenges so it's mostly
still early adopters everywhere, tools are immature including the project I'm working on so we definitely
don't consider it mature but also there's a lot of inconsistency in the data wherever you look right
so we heard today about all the multiple computing formats in S-bombs space and all the work that
people are doing to bring that more closer and together over time which I think is awesome.
We also heard a nice discussion about all the different kind of
identifiers and you can see so if you work only with one source of data then it's easier but then
if you try to correlate this S-bomb with this Vex file and this S-bomb is using PURELs and these
are the CPEs it's becoming impossible to correlate data and build the graph basically
properly. Also what we found is that even all these things are standards there's a lot of unwritten
rules in all the organizations about how they are presenting their data so the documents will pass
but what you have as an information from the document really depends so I think yeah it's good
that you're all here and there's a lot of things to do right because it's early early days. For the
project itself we'll try to additionally simplify architecture and the deployment model we're all
about microservices and Kubernetes for now which is okay but I think we could reach much more
people with simplifying how much resources and where they can deploy a project like this
and go into supporting more standards. So you saw here just basic searches and basic
correlation I think once we have much more data in the system we can get much more
vision from all this data in and provide that as that's the value of the project in my opinion
right and continue working on the future integrations because in my mind if you do continue doing this
right I think at some point in a couple years all these infrastructures should be invisible to
developers right so it should be part of your developer toolchain automatically working in
VS code in all the Git for pipelines and everything right so we are just beginning
that's it so justification side doesn't have too much data saying about immature projects but
there's a dev box sandbox that you can try there's a code there and we always on the
metric channel so if you're interested please reach out and yeah. I'm going to ask the question
are you using the SPX libraries for helping with the ingestion?
No no we're using yeah sorry yeah the question is are we using existing SPX libraries yes we are
yeah so there's one in Golan using in in guac but there is also in Rust one using the classification
itself because they are good yeah. So why is the reason that you decided to start a project from
the ground instead of help at least four or five open source projects big ones that already do
exactly what they do but not yet on the level but mostly 90 percent that we are doing today.
Why you not helping that one instead of creating one? So yeah why we are starting a new project
instead of instead of helping others so first of all we joined the guac project which is also
another new project but yeah I can't answer that I mean a lot of people were involved in that kind
of decision but we are trying to be as much I mean it's all open source we are contributing to other
projects so it's not a closed source product basically yeah. So one of your early slides said
this can be used to sort of share S-bomb data can you talk a little about that feature how
you this can be used to sort of send S-bomb data around to other projects?
So it's not about yeah sorry about it so about sharing the S-bomb data it's not about sharing
the data but providing the API so the external systems can query things so basically the VSCode
plugin would get all the URLs from the current project and being able to query this and get
actionable item back so there's no any distributed sharing of the data just integration API.
Okay please thank you. Thank you.