So, it's my pleasure to introduce Rishi Rai.
He's one of the GSOC students from last year and worked on GCC LTO.
Welcome here. He actually came here and got some travel funding from the new toolchain fund.
So, that's also a thing the community can offer to students.
Not all times and so on, there's conditions but it worked in this time.
Okay.
Hello everyone. I am Rishi Raj and today I'll be presenting my GSOC work on GCC's LTO and along with I will share my experience with first time contribution to GCC.
So, basically this is little about me. I am under garaged to end from India and I love terminal.
Because I am lazy and it helps me to automate a lot of tasks.
And also I keep changing my distro about each two, three months and this I think this is very problematic because a lot I spent a lot of time in that.
Apart from that in my free time I love to read fiction, travel and I play badminton a lot.
So, moving on. So, before discussing my project, let's discuss some of the things related to my project.
So, first let's discuss about the LTO.
So, traditionally what compiler used to do is like to for optimization it is done at the compile time.
So, each of the compile compilation unit is optimized independent of other compilation unit.
But when we think about a real world project we are missing a lot of context because in real world project there are several of there are several compilation unit which are related to each other.
And if we can like and if we can find some context between these compilation unit then we can optimize more.
So, in LTO what we do is instead of optimizing along with optimizing at compile time we also compile at linking time.
So, linker is aware of all the translation unit so there is obviously more optimization.
Only downside being a longer compile time and more usage of memory during compilation.
In GCC to enable LTO you can either use minus FLTO flag or minus FAT LTO objects flag.
So, basically the second flag is to save GIMPAL bytecode along with LTO IL in the LTO object file.
Now, let's discuss the basic structure of a ELF file.
ELF file is basically used for storing binary libraries and executables on Linux and Unix based system.
So, here you can see that there is a ELF header.
ELF header basically contains the metadata about the ELF file to identify the ELF file on which system it was produced etc.
Apart from this there is a program header table which is not relevant to my talk because it's used for executable and I am going to talk about the object files.
So, our main interest is section header table.
Basically it contains the different information about different section along with their names.
So, in LTO this is the typical structure of ELF files.
So, you can see these sections prefixed with .gnu.ltu prefix.
So, these sections are produced by LTO streamer.
So, now let's discuss the role of an assembler in producing LTO object file.
So, right now we discussed that these all section with .gnu.ltu prefix are produced by LTO streamer.
And these all are already in the binary format.
So, assembler doesn't touch this like basically the assembler take this as an input and output it the object file.
The only thing which assembler does is it produces .sim tab which contains the symbols .strtab which contains the string representation of section names
and .strtab which contains the symbol name in the string representation.
Apart from that if you are compiling with debugging flag then it also produce various debug symbols along with their relocations.
So, assembler basically performs two functions.
First is to generate this three section .sim tab .strtab .strtab and if debugging option is enabled then it produces these debug sections.
So, now my project was to bypass assembler.
So, first let's discuss why we wanted to bypass assembler.
So, we already talked about like what was the function of a assembler in producing LTO object file.
It was basically two things and those two things were not very complicated it can be actually done by compiler.
So, what benefit we will get if we produce that from compiler.
So, like we already noted the fact that a lot of sections with the GNU LTO prefix were produced by the LTO streamer.
And compiler just have to scramble through that take that input and produce as output in the object file.
So, there was a lot of I over it.
So, here you can see the result.
So, with bypass we are taking like 14 seconds to compile a single compilation unit file and with default part it was taking 21 seconds.
So, only for a single compilation unit there was a difference of 7 seconds.
So, if we like we are currently testing it on the real world project like Firefox and all and we will publish the result.
And we hope to see a drastic improvements.
So, to implement the project we along with my mentor divided into two parts.
The first part was to extend the libivirty.
Basically libivirty is a helper module which have various function to write and read from an object file.
So, we extended the libivirty to output symbol table and a string table.
So, the first task which a similar was performing was done by this.
The second task was to output various debug sections and symbols along with their relocation directly.
So, this was little bit complicated for me.
The first thing was that GCC there was not sufficient documentation of GCC LTO debug architecture on the GCC website.
So, I got in contact with Richard who is the who have implemented the GCC debug architecture and he helped me through it.
And also the documentation of Dwarf debugging format was a little bit intimidating as a beginner,
but as we proceeded further with help of my mentor I was successfully able to navigate that.
And as of now we have like I have successfully implemented the support for ELF file format and it can be found in the devil by bypass ASM branch of the GCC repository.
So, you can go there and see that and if you have some comments you can reach back to me.
For now we are only supporting x86 target for relocations and we are generally we are eventually planning to roll out to other architectures too.
And we will also provide support for other object files other than ELF too.
So, now let's talk a little bit about how I got this opportunity to work in open source with GCC.
So, basically I tried getting into open source before a few times but it was a bit intimidating for me because due to the large code base and all those things.
So, when I like one of my friend recommended me to try for G-SOC and when I was going through various projects then this project interested me and thanks to DOH his newbies guide was very helpful me to introducing to GCC.
And then I applied and eventually got selected.
So, basically G-SOC is an opportunity for first new contributors to get started with open source and applications for this year is opening on 18 March.
If any if anyone of you are interested then you can sure apply.
I sought out the people who made it possible for my contribution first Google for organizing G-SOC which provided me a good platform to launch me to the open source world.
My mentors Jan and Martin for helping and guiding throughout the summers to complete the project.
Thomas and Dev for securing the sponsorship and also for my visa process.
Thanks a lot.
Or also to the GNU tool transfer which sponsored these attendance.
Thanks for your attention.
These are my socials.
If you want you can connect me here.
Any plans for the non-LTO case?
So, you put in a regular assembly like machine called.
But it will be repetition of a similar only right in non-LTO case.
So, the question is repeat the question that we have on the video.
So, he is asking if we have any plans to extend it for non-LTO case.
So, in non-LTO case one need to be simply repetition like we will be building another assembler right.
We will have to anyways like assemble all of those stuff.
Can you please repeat?
It is actually not done.
So, we should be extend as we need it for the forward.
Any more questions?
Yeah, you said that you need more relocation right for the fire business.
Is it just like relocation for the port?
Or do you really need all kinds of relocation?
So, he is asking that we like I was telling that we need more relocation.
So, is it like relocation for pointers or relocation we need more relocation.
So, basically for every architecture the relocation structure is different right.
So, for each debug section we need to output a corresponding relocation so that during the late debug phase the linker could identify that debug section and link it.
So, for each architecture we need to change that structure and output corresponding relocation.
Basically the address and the add-in will be different in the different format.
Thank you.
How about this work in case other debugging formats get sad that they need to be in LTO.
Because I see that you did some of the stuff in talk to us right.
Yeah.
Can you please repeat like I was?
Well, I think that the LTO streamer part of the stuff that it dumps, UY pass, right, directly is basically how I mean how attached you work to the door for.
Oh, okay.