Hi everyone, good afternoon. You're in the Python Dev Room, and we're going to have Nigel Brown speaking about how we can trust third-party code in Python. Often we don't realize all the dependencies we pull in when we install a package from PyPI, and Nigel will be talking about how we can avoid getting dodgy packages. Thank you very much, Nigel.

Thank you. Right. My name is Nigel Brown. I've been programming since 1981, as a kid; I got a job about 12 years later. I've done mobile devices, security, data, lots of different languages. I currently work at a company called Stacklok, where I do some data science and some engineering. If you're interested in the supply chain, and frankly who isn't these days, you'll love Stacklok. You should check them out. This talk covers some of the ideas we've been grappling with there for the last nine months or so.

Here are some supply chain attacks, recent examples. I don't know much about these attacks; I'm not a security researcher. But every time I read about one, I feel vaguely uncomfortable, because these are things that could apply to me, on the whole. That's why we're looking at them, and the flames on the slide show that they're scary things.

Now, some recent lawmaking. We've got Executive Order 14028 in the States and the Cyber Resilience Act proposals in the EU. The EO pushes SBOMs. What's an SBOM? A software bill of materials. It's probably too much detail to go into right now; there are tracks over in the other building about this. SBOMs are more of a first step than a solution, but they're a step in the right direction. Creating them sounds simple, but the practicalities get in the way, and doing something with them is still more of an art than a science. They are progressing, though. The key point is that responsibility for the security of your code is shifting towards vendors, and that means it's shifting towards you, on the whole. There are some more scary flames there, because that's quite scary.

Supply chain attacks aren't new. It all boils down to who and what you trust. The key point is that insecurity most often comes from behavior rather than technology. Why are supply chain attacks becoming more fashionable? Maybe they're easier than they used to be; maybe everything else got harder; perhaps they were always there and we just didn't notice. I don't know the answer, but there is a lot more focus on them these days.

So, a word on trust. Basically, we want to trust some third-party code. That circle represents us: the potential victims, the developers. The supply chain is how that code actually gets to us. We generally get code delivered as some form of package, and the source and the package have to live somewhere. Sometimes they live in the same place, as in Go, which is a good example; sometimes they're in different places, in some package repository. These can be private, but we're talking mostly about open source. An important point: we have to download it. All of these are potential failure points in the software supply chain. Of course, there are multiple versions, changing all the time, a moving target, and normally there are tags in the source repository that point to the different versions. These are delivered as a bag of files to us, on our laptops or our servers. At this point, we can scan them.
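To make that concrete, here's a minimal sketch of pulling a package's version history, that moving target of releases, from PyPI's public JSON API. The package name is just an example, and `requests` is assumed to be installed:

```python
# Minimal sketch: list a package's release versions and upload dates
# using PyPI's public JSON API.
import requests

def release_dates(package: str) -> dict[str, str]:
    """Map each version of `package` to its earliest file upload time."""
    resp = requests.get(f"https://pypi.org/pypi/{package}/json", timeout=10)
    resp.raise_for_status()
    releases = resp.json()["releases"]
    return {
        version: min(f["upload_time"] for f in files)
        for version, files in releases.items()
        if files  # some versions have no files attached
    }

# Sort by date rather than version string, which sorts lexically.
for version, uploaded in sorted(
    release_dates("requests").items(), key=lambda kv: kv[1]
):
    print(version, uploaded)
```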
We can do vulnerability scanning and we can do static code analysis, and we definitely should do that. And the code has owners. The point here is that you can't really trust code; it just is what it is. It's the owners you're trusting. The question we're faced with a lot of the time is: do we trust the right people? And it's not just the code owners; there are multiple other people, contributors. We trust those people because of their reputation. Reputation comes from several sources: various media; personal knowledge, since you might know some of the developers; and quite often we trust a community of one sort or another. Companies have reputations too, sometimes good, sometimes bad. How do you trust a company? If you've got closed source, that's the only trust you've got, actually. So the web of trust here is building up.

Now, "turtles all the way down" is an expression of infinite regress. I heard it once and thought it would be a good metaphor for this stuff. It turns out, when I was looking for an image, that Cole Kennedy had thought the same thing, so I nicked his image, because it displays this quite well. The average medium-sized project has about 1,500 transitive dependencies: you depend on something, and it depends on other things. You can investigate one package at a time; you can look at its origins, look at the people, perhaps do a code audit. But doing thousands of them is hard work; it would just take too long. We probably want automation to help with this, and that's one of the things we're working on: trying to give this thing some oil to keep it going. This web of trust, the supply chain, can be attacked at any point here. And it can break at any point; it doesn't have to be attacked. Also, there are thousands of ways you can draw this diagram; it doesn't have to look like this. But there is complexity there, and it's messy.

So what do we do about this mess? What we currently do: we really like CVEs, and that's because we can count them, we can fix them, we can show improvements. But they've been guilty of a little bit of misdirection, actually. In reality, only about 2% of them are exploitable, so if you're not careful, you end up doing a lot of work that you don't actually have to. That figure comes from a Red Hat report, and I've seen other estimates of a similar size.

Another thing you can do is static code analysis. Currently, it's mostly signature-based: it finds things that we've found before. We think there may be more legs in grabbing features from the source code and running them through a neural net, which may or may not be more effective; there's lots of research out there, but there's still lots to do. We think we'll be doing some of this work ourselves at Stacklok, but that's more for the future. Criticisms aside, we should definitely do CVE monitoring and static code analysis. Don't take anything I say here as an excuse for not doing those things.

Another idea is to look at the metadata rather than the code itself: descriptions of the package, links to the source repositories, activity around it, et cetera. It's a bit like classic traffic analysis in security, or fraud detection in banking. We're looking at the behavior around the package rather than the actual code itself.
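To give a flavor of what looking at the metadata can mean in practice, here's a naive, hypothetical sketch built on a package's PyPI metadata. The signals and the equal weighting are illustrative assumptions, not Stacklok's actual scoring:

```python
# Hypothetical sketch of a naive metadata-based trust score for a
# PyPI package. The signals and weights are illustrative only.
import requests

def naive_trust_signals(package: str) -> dict[str, bool]:
    info = requests.get(
        f"https://pypi.org/pypi/{package}/json", timeout=10
    ).json()["info"]
    urls = info.get("project_urls") or {}
    return {
        # Malicious packages often ship with no description at all.
        "has_description": bool((info.get("description") or "").strip()),
        # Does the package claim a source repository anywhere?
        "links_source_repo": any(
            u and ("github.com" in u or "gitlab.com" in u)
            for u in urls.values()
        ),
        "has_author_or_maintainer": bool(
            info.get("author") or info.get("maintainer")
        ),
    }

signals = naive_trust_signals("requests")
score = sum(signals.values()) / len(signals)  # 0.0 (bare) .. 1.0 (connected)
print(signals, score)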
So, this is a graph. Basically, malicious packages look different from non-malicious packages, on the whole. The ones on the left, these little blue dots, are malicious packages. The ones on the right are non-malicious packages, surrounded by a nice bunch of purple users and orange source repositories. You probably can't see the detail from where you're sitting; the point is they look different most of the time. Sometimes you get good packages over here that are isolated, and you get malicious packages over there that are well connected, so there's no guarantee: some of the malicious packages look fine. But most malicious packages don't make any effort to hide the fact that they're malicious. If you look at their metadata, it's quite obviously suspect: there's no description, no effort put in at all. Unfortunately, a lot of legitimate packages look like that as well, which makes it a little bit harder.

We started off by putting a neural net on this: a classifier, classifying packages into malicious and non-malicious. It worked beautifully. But so what? You don't really need a neural net to tell the difference between those two things; you just need to look at whether it's got any metadata associated with it. So, not necessarily very fruitful, and we don't need a neural net. Instead, we built a simple score. We took some malicious packages, mostly Python, though we've just started on some Rust and npm as well. We looked at the activity and the provenance, which I'll come on to a bit later, and we normalized against the whole set of packages we've ingested. You can see here that most of the malicious packages, and these are just the malicious packages, scored really low. So hey, it looks like we can spot malicious packages using the metadata.

Not so fast. Unfortunately, the base rate let us down. As I mentioned, we do get low scores for malicious packages, but we've got at least ten times as many good packages that score zero as well, which isn't great. So a low score means maybe a one-in-ten chance of a malicious package; we don't know for sure one way or the other, so you've got to go on to your code analysis then. I should also point out this isn't a representative sample. We don't have a labeled data set of all the malicious packages in the PyPI repos, because we haven't found them all yet. We sample as best we can, but we don't know. Does that handicap matter? Probably not, because most of the packages we actually want to use are probably at the far end of the scale: they do have good descriptions, they do have good information, and they are linked up. There are some exceptions.

Right. We currently act as if vulnerabilities are all there is, and as if they're all deadly. That creates a lot of work for everyone, as I mentioned earlier. But we're only really worried about things that can hurt us, and the reality is more like this: most vulnerabilities can't hurt us. We should use things like OpenVEX, an emerging standard, to describe the vulnerabilities that are actually exploitable in place; then we only have to deal with the shaded bit between the two circles. Obviously, you want to fix all vulnerabilities eventually, but there's a prioritization system we can employ here.
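As a rough illustration of that prioritization, here's a sketch that filters scanner findings against OpenVEX-style statements, keeping only vulnerabilities not marked `not_affected`. The document below is hand-written and simplified from the real OpenVEX format:

```python
# Sketch: prioritize scanner findings using OpenVEX-style statements.
# The VEX document here is a simplified, hand-written example.
scanner_findings = ["CVE-2023-0001", "CVE-2023-0002", "CVE-2023-0003"]

vex_document = {
    "statements": [
        {"vulnerability": {"name": "CVE-2023-0001"}, "status": "not_affected"},
        {"vulnerability": {"name": "CVE-2023-0002"}, "status": "affected"},
    ]
}

status = {
    s["vulnerability"]["name"]: s["status"]
    for s in vex_document["statements"]
}

# Deal first with findings that are affected, or that have no VEX
# statement yet and so are effectively still under investigation.
actionable = [
    cve for cve in scanner_findings
    if status.get(cve, "under_investigation") != "not_affected"
]
print(actionable)  # ['CVE-2023-0002', 'CVE-2023-0003']
```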
Another thing to note is that malicious code doesn't always involve CVEs, and there are other things that can hurt us that aren't CVEs. Malicious code leverages bad habits, like leaked keys and manual processes. We've got abandoned code that gets taken over, or just never updated. But bugs, bad habits and abandonware can also hurt us accidentally, without being malicious. Malice isn't everything. We want to avoid all of these bad things, and most of the things we actually want to know about are hidden from us: malicious code is hidden by stealth, buggy code is hidden by incompetence or apathy. And since we started patching CVEs, bad actors have moved increasingly to zero-day exploits. Let's remember, most code isn't malicious. But when we look at the metadata, buggy, poorly maintained, abandoned and malicious code all look similar. And if we can't tell them apart, you have to ask yourself the question: do you really want to use any of them?

So, given that this is a hard problem, why not do something simpler, which is to invert the question? Look for the good, not the bad. It's like looking after your health instead of focusing on disease. The good bits are everything outside the circle; we want all the rest of the code. And for the Rust developers who insist that code can be finished, it's this bit as well, the abandoned bit. So what does this look like? We want things that probably won't hurt us, the inverse of what we just had: good coding and hygiene habits, active development, regular releases, developers we trust, and so on. The key point is that looking for good things is easier, because goodness isn't hidden.

Right, so, I mentioned provenance. The first challenge is provenance. If you're going to do anything with any of this code, scan it, whatever you like, you need provenance. Provenance means origin: we need to find out where the code came from. Starjacking is when a package lies about its origin and pretends to be a better package than it is. You'll find that lots of different packages claim the same source repository in the package systems; it's very common.

So how do we find provenance? Remember the executive order earlier mentioned SBOMs. An SBOM is basically a shopping list for your piece of code, whatever it is: operating system, game, package. It's a document of provenance, is what it is. What you put in an SBOM isn't quite standard yet, but it's becoming more standard; there's lots of standardization work going on, in OpenSSF among others, and there's a track over in the other building that covers this. It's probably where we want to go: we want to be able to record these things strongly. Now, if you've got an SBOM, you want to put it somewhere safe; you don't want people tampering with your SBOMs. So a thing that's becoming more common is Sigstore. This is artifact signing, storing signatures in a transparency log, a distributed ledger. It gives us cryptographically strong provenance, and it circumvents most of the delivery problems we've got. There's a sort of convergence on this; it's being used more and more in the community. I think it's where we're going to end up, and it does solve a lot of problems.
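As a sketch of what verifying against that transparency log can look like with the sigstore Python client (`pip install sigstore`): the artifact path and signing identity below are placeholders, and the exact subcommands and flags can vary between client versions:

```python
# Minimal sketch: verify a downloaded artifact against Sigstore's
# transparency log using the `sigstore` client's CLI. The identity and
# issuer are placeholders; use the maintainer's real signing identity.
import subprocess

result = subprocess.run(
    [
        "python", "-m", "sigstore", "verify", "identity",
        "dist/example_pkg-1.0.0.tar.gz",          # hypothetical artifact
        "--cert-identity", "maintainer@example.com",
        "--cert-oidc-issuer", "https://github.com/login/oauth",
    ],
    capture_output=True,
    text=True,
)
print("verified" if result.returncode == 0 else result.stderr)
```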
But the fact is that, at the moment, most code isn't signed, and I think it will be a few years before it is. In the meantime: historical provenance. That's a Stacklok thing. Basically, we take the tags from the source repo and the versions from the package repository, and we see if we can match the dates up. If the dates match, we say it's got some provenance. It's a statistical process, and it's quite hard to fake. There's a whole video on that on our website, along with blogs and things like that, so I won't go into it any further here.

Now, just because you've got rock-solid provenance for some code and you know where it came from, there's no real shortcut for saying whether it's any good. The old-fashioned ways are the only ways: you test it, you measure it, you do static code analysis and code review. That requires the provenance, of course, because you don't want to be reviewing some other bit of code that doesn't apply to your package. And you become intimate with it, and with all those turtles and packages, intimacy takes a lot of work. But we've got a community of people, so to make this viable at any scale, you want to share the work with the community. And we want to automate this, because you don't want to be on email talking to people all the time.

I've mentioned reputation a couple of times: the reputation of the people and the companies we're talking about. What do we know about someone? Perhaps we know them personally. We mostly know the size of a company, but not much about it internally. We guess, and we hope. Do we even care? The executive order says that we do, apparently. So that's where reputation currently comes from, I think. Where should it come from? It should come from prior art, participation, recommendations. Generally, we want some proof, and generally, we want to automate this.

So, the key points. Once again: look for good things, they're easier to spot. You don't trust code, you trust people. Trust is complex, and it can break in many places. Reputation is important. Communities can share the work. And automation makes this possible at scale. Shameless plug: that's the kind of stuff we're working on at Stacklok. We're open to ideas. Try our tools, they're on the website; join the conversation on Discord; and look at the source if you like, it's open, or soon will be. And that's the end of the presentation. Any questions, please?

Great presentation, Nigel. Thank you very much. We have time for one question. There; I'm coming to you, one second.

Thank you for the talk. Maybe you were being humble, but what exactly does your product at Stacklok do to apply all you've said? Do you enter your packages and it says where the vulnerabilities are, or how does it work in practice?

If you go to trustypkg.dev, you'll get to a web portal where you can type in the name of a package and it'll give you a score. What we're doing is increasing the number of facets the score is based on: we've got provenance measures in there, and we're going to be doing a reputation engine for it as well. So there's the website, and you can go straight there. And to bring this to the developer, there's a VS Code plugin: as you type along, when you import a package, it'll put a squiggly line underneath it and say, yeah, this has got a low score.
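As an aside, here's a rough sketch of the date-matching idea behind that historical provenance facet. It assumes a local clone of the claimed source repository, Python 3.11+ for the date parsing, and a deliberately simplified matching rule compared with the real thing:

```python
# Rough sketch of "historical provenance": do the package's release
# dates on PyPI line up with the tag dates in its claimed source repo?
import subprocess
from datetime import datetime, timedelta

import requests

def pypi_release_dates(package: str) -> dict[str, datetime]:
    releases = requests.get(
        f"https://pypi.org/pypi/{package}/json", timeout=10
    ).json()["releases"]
    return {
        v: min(
            # Trailing "Z" needs Python 3.11+ for fromisoformat.
            datetime.fromisoformat(f["upload_time_iso_8601"])
            for f in files
        )
        for v, files in releases.items()
        if files
    }

def git_tag_dates(repo_path: str) -> dict[str, datetime]:
    out = subprocess.run(
        ["git", "-C", repo_path, "for-each-ref",
         "--format=%(refname:short) %(creatordate:iso-strict)", "refs/tags"],
        capture_output=True, text=True, check=True,
    ).stdout
    tags = {}
    for line in out.splitlines():
        tag, date = line.split(" ", 1)
        # Normalize tags like "v1.0.0" to match PyPI version strings.
        tags[tag.lstrip("v")] = datetime.fromisoformat(date)
    return tags

def matched_fraction(package: str, repo_path: str, slack_days: int = 2) -> float:
    """Fraction of PyPI versions with a same-named tag dated nearby."""
    releases, tags = pypi_release_dates(package), git_tag_dates(repo_path)
    hits = sum(
        1 for v, when in releases.items()
        if v in tags and abs(tags[v] - when) <= timedelta(days=slack_days)
    )
    return hits / len(releases) if releases else 0.0
```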
Obviously, some of the low scores are absolutely fine, but it gives you an indication that you've got to do some more investigation. There are ways around most of this stuff, but it gives you flags. But yeah, go to the website; it's fairly intuitive, you don't need instructions for it. Cool. Thanks for the question.

Thank you very much, Nigel. Please feel free to reach out to our speaker afterwards. Thank you.