Hi, my name is Kristof Beyls. One of the things I do in LLVM is serve as a member of the LLVM Security Group. I realized that there hasn't been much publication or sharing of what exactly the Security Group does, so I thought: let's see if there's anything interesting to say about the LLVM Security Group.

So what's the purpose of the Security Group? It exists to make sure that when security issues are discovered, there is a way to responsibly disclose them. That means that when there is a need to take action before an issue becomes fully publicly known, the Security Group enables that. Its purpose is not to do everything related to security: everything that can be done fully in the open probably should be done elsewhere.

A little bit of history. The idea of setting up a Security Group dates from late 2019. It took about a year and a bit, and the group was up and running by 2021. Now, three years later, we've been operating for, well, three years.

Who can be on the group? There's a page on the LLVM website (the URL will show up on a later slide) which describes that members can be individual contributors, security researchers, and vendor contacts. Here "vendor" typically means people who pick up LLVM, build it, and release it, something along those lines. Currently there are about 20 members, and the vast majority are vendor contacts.

How do you report a security issue? So far we've been using the Chromium bug tracker, which seems a bit odd, but three years ago it was the easiest way to get the Security Group going, because that tracker has features to make sure you can report issues in confidence. We are planning to move to something different, most likely something integrated into GitHub, given that most of the other project infrastructure is now on GitHub.

So what kinds of reports have we received over the past three years? In total, about 47 reports; in the next couple of slides I'll go into a bit more detail. This is a sunburst diagram; hopefully what it represents will become clear over the next couple of slides.

The part highlighted here: 27 issues were considered by the Security Group not to be actual security issues, or at least not issues that needed special treatment and could not be made public immediately. Here is a rough breakdown; this is my personal classification, so it's possible I made a mistake somewhere. Four reports were just empty. Three issues were reports about Chromium, because we use the Chromium bug tracker and that confused people; I don't blame them. Five bugs turned out to be just very regular, normal bugs that probably should have been reported publicly. There were 12 issues where people reported a memory vulnerability, like a buffer overflow, happening in a tool rather than a library: for example, a buffer overflow in the parser of the compiler. We explicitly document that we don't consider those to be security issues, or at least not issues that need to be handled in a coordinated way. Two issues related to undefined behavior in source code: someone showed a small program and said, hey, the compiler does something that is clearly a security issue; but the source program, written in C or C++, had undefined behavior, and the compiler actually did what the standard allows (a small example of this pattern follows after this list). And finally, there was one discussion on improving supply chain security, which is very useful, but there's no need to hold that discussion in confidence.
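To make that last category concrete, here is a small, hypothetical C++ example of the pattern; the actual reported programs were not published, so this assumes the classic signed-overflow case. The program looks miscompiled, but the compiler is within its rights.

```cpp
#include <climits>
#include <cstdio>

// Hypothetical illustration (not one of the actual reports). Signed integer
// overflow is undefined behavior, so the optimizer may assume x + 1 never
// overflows and fold this comparison to "always false".
bool wraps_when_incremented(int x) {
  return x + 1 < x;
}

int main(int argc, char **) {
  // At -O0 this often prints 1 when given an argument (two's complement
  // wrap-around); at -O2 Clang typically prints 0. Both outcomes are
  // allowed by the standard, because the overflow is undefined behavior.
  std::printf("%d\n", wraps_when_incremented(argc > 1 ? INT_MAX : 0));
  return 0;
}
```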
So let's jump to the issues that the Security Group did consider to be security issues. Two of them were judged not to need coordinated action; it was okay to handle them as public issues. One was a particular sanitizer not reporting an issue in a particular niche use case; the other was a Clang warning not being enabled by default. I won't go into the details, but these could be discussed in public.

18 issues, or 38% if I calculated correctly, were deemed security issues that required some kind of coordination. Again, a very rough classification. One was about the compiler generating incorrect code. In general, not every instance of incorrect code generation is considered a security issue, but in this case, if I remember correctly, it was an incorrect access to the frame pointer, and we could see how that could be exploited. Three issues were memory vulnerabilities in libc++. Seven issues were roughly related to supply chain security; I put them in that category. And seven issues related to gaps in hardening features. I say hardening features; they could also be called security mitigations. The slide shows a few examples of these hardening features, like stack canaries and branch protection. The number of hardening features implemented in compilers has really taken off over the past 10 years, let me put it that way.

Going back to supply chain security: what kinds of issues did we see there? One issue was about the Visual Studio Code plugin for clangd. The plugin was potentially trusting workspaces in cases where maybe the user should first confirm: yes, this is a fresh workspace, I trust it. So that's a kind of supply chain security issue in a tool LLVM builds itself. There was one issue that said: something potentially suspicious happened here; did someone try to put a backdoor into the LLVM compiler? It turned out to be extremely unlikely that that actually happened, but I'm quite happy that when people have suspicions, they actually report them and they get investigated. Two issues were related to infrastructure: I think one was a person accidentally publishing their GitHub access token, so in principle everyone had access to commit to the repository; the other was something about the website that could be improved. And three issues were about out-of-date dependencies on libraries. Not in the core of LLVM, but a few of the utilities around it are implemented in Python, and these typically pin an exact version of a Python library; those pins were out of date, and there were security-related consequences.

Hardening features again: we had seven issues, in two categories. Four issues were related to gaps in existing mitigations, so the mitigation already existed. Three issues were cases where a new vulnerability had been detected in another system, or maybe in some hardware, and the request was: before this is made publicly available, we would like a new security mitigation feature in LLVM to help people work around the issue.
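As a concrete illustration of what these hardening features look like from a user's point of view, here is a minimal sketch. The flag spellings are the usual Clang ones and the file is made up, so treat this as an assumption-laden example rather than guidance from the talk.

```cpp
// hardening_demo.cpp: a minimal sketch of code that benefits from hardening.
// Typical Clang invocations (flag availability depends on target/toolchain):
//   x86-64:  clang++ -O2 -fstack-protector-strong -fcf-protection=full hardening_demo.cpp
//   AArch64: clang++ -O2 -fstack-protector-strong -mbranch-protection=standard hardening_demo.cpp
#include <cstring>

void copy_name(const char *src) {
  char buf[16];
  // If src is longer than 15 characters, this overflows buf. With
  // -fstack-protector-strong, a stack canary sits between buf and the saved
  // return address, so the process aborts on return instead of jumping to an
  // attacker-chosen address.
  std::strcpy(buf, src);
}

int main(int argc, char **argv) {
  if (argc > 1)
    copy_name(argv[1]);
  return 0;
}
```

Note that the two branch-protection flags are target-specific alternatives (x86 versus AArch64), not meant to be combined on a single command line.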
So I really raced through just showing a few stats on the kinds of issues we get. Can we draw some takeaways from them? I think yes.

First of all, achievements, success. All of the reported issues seem to have been processed appropriately, and I think that means the Security Group is working as expected. So that's a success. If you would like to have a closer look yourself, the Security Group publishes a transparency report every year, at that link. One thing I do not know is whether some people have published an issue on the public bug tracker that really is a security issue and should have been reported to the Security Group first. I'm not entirely sure how to get statistics on that; if you have any thoughts, please do contact me later.

So those were the good things: the Security Group is working as expected. But from these stats we can also derive some ideas for what could be done better, and where we need more work. I think there are at least five different categories. First, a little more clarity on the threat model and on what a security issue is. The issues that, from my experience at least, were by far the most complex to handle were the ones related to mitigations in the compiler. Quite a few supply chain issues came in, so there's probably also something to do there. I already said we should move away from the Chromium bug tracker. And then, something that isn't quite visible from just looking at the issues themselves: I think we also need to improve how we communicate known security issues to everyone who has an interest and who should know. I'll go into a little more detail on each.

First of all: what is a security issue, and what is the threat model? What the Security Group currently uses as its definition of a security issue, its threat model, is actually documented. A few things to highlight there. LLVM is used in a very wide variety of use cases, and it's very hard for everyone, across all those different use cases, to agree on what counts as a security issue; we'll have to come up with a reasonable consensus. And before we can say, hey, this is really the threat model we follow, we need buy-in from most of the people who commit to LLVM, because otherwise how can you make sure that patches follow the threat model? There are actually a few thousand committers to LLVM every year, so that may be a bit of work. Maybe the most important guideline currently is: if you're not quite sure whether an issue is a security issue or not, please err on the safe side and report it to the Security Group. You saw that for a slight majority of the issues we received, we decided this is not a security issue and it can be handled as a regular issue. It's better to be safe than sorry, so by all means, please do report an issue if you think it might be a security issue.

What are the security-sensitive parts of LLVM? That is currently not explicitly written up; I think we need to improve that section. Which parts, or which kinds of issues, are not security issues? There is actually one definition here: basically, if a malicious input file to a front end makes Clang fall over with a segmentation fault or something like that, that's probably due to a memory vulnerability in Clang, but in general we don't consider it a security issue.

The second category of things that could be improved: issues related to hardening features. Based on the reports we received, I think for two or three of them we said: no, the hardening feature works as expected.
You as a user thought you would get more protection, but in a specific, detailed example you didn't get it. I think the documentation for many of the hardening features could be made quite a bit better: more explicit and more refined about exactly what you get. And there are a few things we could do to increase the quality of implementation of these hardening features; I have a few ideas there. One is to build a binary scanner that checks a binary the compiler produced: does it actually apply the security hardening feature correctly everywhere? I built a prototype based on BOLT, but that's a whole other presentation at some point (a tiny sketch of the idea follows at the end of this section). I already said that we should probably improve documentation. Some of the mitigations are built in LLVM and made available through Clang, the C/C++ front end; but if you go look at whether they are also applied to other languages, you see some gaps. For example, for the Rust front end, maybe some of these mitigations could also be enabled, and maybe for other languages too. And maybe it could also help to make more compiler developers aware of the special things to look out for from a security point of view. So together with a few other people, I started an open-source book to try to collect that information: what should a compiler developer know about security? The link to that open-source book is at the bottom of the slide.

I said something about supply chain; there's something we could improve there too. There are a few categories of supply chain issues that we saw. Some were about the web infrastructure of the project. Some were about features in toolchains that help with developing software; maybe something is needed there too, from the supply chain angle. And some were about securing against malicious injection of code into LLVM binaries: making sure that a malicious actor wouldn't be able to modify the LLVM binaries, or the binaries LLVM produces itself. One thing that I think is a really good, very recent change is having an OpenSSF Scorecard for the project. That's a list of recommendations on best practices for software projects in general, from a supply chain security point of view. Currently LLVM has a score of 5.2, so there's probably room for improvement there too.

Moving away from the Chromium tracker: I'm not going to spend any more time on that.

Finally, some thoughts on better communicating security issues. Maybe people think of CVEs as the way to communicate to the whole world: there was a security issue here, and this is what you do. It turns out that on the toolchain side, quite a few of the security issues we receive are ones for which most people would say you shouldn't create a CVE. For example, a gap in a mitigation doesn't make one specific program immediately exploitable, and so quite a few people say you don't create CVEs for that. But at the same time, these mitigations are widely used and people rely on them, so there should be a way to communicate to users of toolchains: we found something you probably want to know about, and you can make your own decision whether to update your toolchain or whether you're okay. There are a few ideas on the slide for how to do that: maybe just a security label in the issue tracker, or something different.
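Returning to the binary-scanner idea mentioned above: below is a tiny, hypothetical sketch of that kind of check, not the BOLT-based prototype. It assumes a POSIX system with objdump on the PATH and an x86-64 binary built entirely with -fcf-protection, and it applies the simplistic rule that every function should then begin with an endbr64 landing pad. A real scanner would have to account for functions that legitimately omit the landing pad, so this version over-reports.

```cpp
// scan_cet.cpp: check that every function in an x86-64 binary starts with
// endbr64, as expected when the whole binary was built with -fcf-protection.
// Relies on objdump; error handling is kept minimal for brevity.
#include <cstdio>
#include <iostream>
#include <string>

int main(int argc, char **argv) {
  if (argc != 2) {
    std::cerr << "usage: scan_cet <binary>\n";
    return 2;
  }
  std::string cmd = "objdump -d --no-show-raw-insn " + std::string(argv[1]);
  FILE *pipe = popen(cmd.c_str(), "r"); // POSIX
  if (!pipe)
    return 2;

  char line[512];
  std::string func;          // function we are currently disassembling
  bool expect_first = false; // next instruction line is the function's first
  int gaps = 0;

  while (std::fgets(line, sizeof line, pipe)) {
    std::string s(line);
    // Function headers look like: "0000000000001130 <main>:"
    auto lt = s.find('<');
    auto hdr = s.find(">:");
    if (lt != std::string::npos && hdr != std::string::npos) {
      func = s.substr(lt + 1, hdr - lt - 1);
      expect_first = true;
    } else if (expect_first && s.find('\t') != std::string::npos) {
      // First instruction of the function: it should be the landing pad.
      if (s.find("endbr64") == std::string::npos) {
        std::cout << "missing endbr64 at start of " << func << "\n";
        ++gaps;
      }
      expect_first = false;
    }
  }
  pclose(pipe);
  std::cout << gaps << " function(s) without endbr64\n";
  return gaps ? 1 : 0;
}
```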
But I guess I'm almost running out of time, so let me move on. A little bit on how you can participate as a general LLVM developer or LLVM user. Please, if you think you've found a particular issue and you think there's a particular security angle to it, report it appropriately to the Security Group rather than as a public bug report. When needed, spread the word that LLVM does have a Security Group and a process to report security issues. Every month we have a public online sync-up, so if you would like to talk with a number of people on the Security Group about anything relevant, please have a look; there's also a link to the calendar with the exact schedule. And if you're a vendor contact or a researcher and would like to participate, please do reach out. We do welcome new additions to the group.

So, in summary: the LLVM Security Group has been operating pretty well, I would say, over the past three years. If you analyze the issues, there are a number of areas for improvement left. And again, if you encounter an issue where you think, hey, maybe there's a security issue here, try to remember to report it appropriately. And with that, I thank you for listening, and hopefully there are a few more minutes for questions.

One minute and 20 seconds. Right.

I don't think I really understood the idea of a vulnerability in a tool not being a concern for the Security Group. If the tool is part of the LLVM project, why is it not?

All right. So the question is, I think, roughly along this line: how come an issue in a tool is not considered a security issue? I think it boils down to...

In the very first class in your stats, you said there were two issues that you decided didn't need special treatment.

All right. Okay. Let me go back to that. So we had two issues, one in a sanitizer, one in a Clang warning. The classification is mine; I just looked at exactly what happened on those issues. On those issues, the decision the group took was that it was not necessary to keep the issue private; the issue could be made fully public. That means fixes get implemented more quickly, which is a good thing. On the other hand: do we think anyone would immediately become the target of an attacker because we made this information public? For these two issues, we thought the chances of that were really, really, really low. And so the trade-off was that it was better to make the issue public rather than keep it private. Hopefully that makes sense.

Yeah. Well, maybe we have time for one more question, if there is one.

I was curious: since you said that you have some vendors in the group, do you think they would be open to giving bounties for some of these vulnerabilities, if they are significant enough? Because a real bug in the codegen could be significant, especially if mitigations are involved.

So the question is: would vendors be open to a bug bounty program? Sure. All I can say is that so far I haven't seen anyone even suggest the idea from the vendors' side, so who knows.

Bounties for fixing. Yeah. All right.