Hi, everybody. My name is Shivam. I work for KDAB, and this summer I also took part in Google Summer of Code with LLVM, working on this project: mapping LLVM values to the corresponding source-level expressions.

But why? The challenge is understanding compiler optimizations. Compilers perform many different sorts of optimizations, and it is not always possible for code to be optimized or, in our case, vectorized. Our first motivation was vectorization, because we wanted to improve the optimization remarks emitted by the vectorizer. Your compiler cannot always vectorize your code: sometimes there are data dependencies that prevent it. In those cases the compiler has to emit a good remark, and I'll show you what Clang currently generates as a remark. Understanding why and how these optimizations occur is not always straightforward. Even the authors of the vectorizer don't always know what's going on when vectorization doesn't happen.

Consider this example. You can see there is a data dependency between A[i] and A[i + 3], so Clang will not be able to vectorize this loop. Now look at the remark that Clang produces: loop not vectorized, and you can use pragma loop distribute. So the compiler can try to distribute the loop, and then it might be able to vectorize it in some sense. But just look at the remark.
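The slide with the loop is not part of the transcript. As a minimal sketch, assuming the loop stores to A[i + 3] a value computed from A[i] (the direction of the dependence and all names below are assumptions, not taken from the talk), the following C code shows why a loop-carried dependence at distance 3 gets in the way. It compares the sequential loop with a naive "load a whole chunk, then store a whole chunk" execution, which is roughly what a vector of that width would do. It says nothing about what a particular Clang version decides for this exact loop.

```c
#include <assert.h>
#include <string.h>

#define N 16
#define DIST 3    /* assumed dependence distance: a[i + 3] is computed from a[i] */
#define MAX_W 8   /* largest chunk width this sketch supports */

/* Sequential semantics: iteration i + 3 reads the value iteration i stored. */
static void scalar_loop(int *a, int n) {
    for (int i = 0; i + DIST < n; i++)
        a[i + DIST] = a[i] + 1;
}

/* Naive vector-like semantics: per chunk of width w, do all loads, then all stores. */
static void chunked_loop(int *a, int n, int w) {
    assert(w >= 1 && w <= MAX_W);
    int i = 0;
    for (; i + DIST + w <= n; i += w) {
        int tmp[MAX_W];
        for (int k = 0; k < w; k++)
            tmp[k] = a[i + k] + 1;        /* every load of the chunk happens first */
        for (int k = 0; k < w; k++)
            a[i + DIST + k] = tmp[k];     /* then every store of the chunk */
    }
    for (; i + DIST < n; i++)             /* scalar remainder */
        a[i + DIST] = a[i] + 1;
}

/* Returns 1 if chunked execution with width w matches the sequential result. */
static int same_result(int w) {
    int x[N] = {0};
    int y[N] = {0};
    scalar_loop(x, N);
    chunked_loop(y, N, w);
    return memcmp(x, y, sizeof x) == 0;
}
```

Calling same_result with a width of 2 or 3 returns 1, because no chunk reads an element that the same chunk also writes. A width of 4 returns 0: the chunk loads a[i + 3] before the store to a[i + 3] has happened, so it sees a stale value. Which two accesses collide like this is the information the remark would ideally report.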
It is not clear what actually went wrong here or where the data dependency is. It doesn't tell you where the data dependency actually was so that you can improve the code itself. Right? The remark is just that, and you can see it isn't really clear.

Now suppose you had a remark like this. Nothing much, just two expressions: the dependence source and the dependence destination. Then you know there is a data dependency between these two locations, and if you know the code, you can modify it in a way that makes it possible for the compiler to vectorize it. So this would surely enhance the remarks: the remark includes the exact source and destination of the dependency, and it pinpoints the lines where they occur.

Let's look at the impact of these enhanced remarks. The first is clarity for developers: they can quickly see where the dependencies actually occur, improve their code, and probably make it vectorizable. The second is efficiency: they save time by reducing the need for deep debugging to find where the data dependency was. You just look at the optimization remark and you get quite a lot of information: there is a data dependency between these two loads and stores.
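What such an enhanced remark might carry can be sketched in C. Everything here is hypothetical: the struct, the function name, and the message format are illustrative assumptions, not Clang's actual output and not an LLVM API. The point is only that the remark holds two source-level expressions plus a location.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record for an enhanced remark; not an LLVM data structure. */
struct dep_remark {
    const char *file;        /* source file containing the loop */
    int line;                /* line the remark points at */
    const char *dep_source;  /* source-level expression of the dependence source */
    const char *dep_dest;    /* source-level expression of the dependence destination */
};

/* Formats the remark into buf and returns the snprintf result, that is,
   the length the full message needs, excluding the terminating null. */
static int format_dep_remark(char *buf, size_t len, const struct dep_remark *r) {
    assert(buf != NULL && r != NULL);
    return snprintf(buf, len,
                    "%s:%d: remark: loop not vectorized: "
                    "dependence source: %s, dependence destination: %s",
                    r->file, r->line, r->dep_source, r->dep_dest);
}
```

Filling the record with a file name, a line number, and the two expression strings and passing it to format_dep_remark yields a single line that names both ends of the dependence, so a developer who knows the code can see which two accesses to look at.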
So let's look at the approach we took to solve this problem. The approach was very simple: utilize the debug information that is available in the intermediate representation to recreate the variable and function names lost during optimization.

The optimizations are actually a problem in our case, because we currently don't know how to rebuild instructions that are lost during optimization. For example, if you have a mul instruction in the IR, the compiler might optimize it into a shift left. The mul was the original information that was available in the source code, but now we have a shift left, so we lose the context of what the source-level information actually was. That is still a problem for us. We had a different approach for that, but we saw that its performance was very bad. The idea was to clone the module so we could look at what happened after each optimization: you keep a clone of the original IR and see what changes after every optimization pass. For each transformation pass you have to look at what got changed and keep a record of it: the original instruction was a mul, but now it has been changed to a shift left. So you basically have to cache the expressions in order to rebuild things afterwards, and if nothing changed, you just proceed.

So let's see how to utilize the information that is available in the IR. LLVM uses a small set of intrinsic functions, if you are aware of them, that are provided for debug information. They take different arguments that carry metadata. These intrinsic functions are named with the prefix llvm.dbg, and they help you track the
variables through the optimizations and code generation. If you dump your IR with the -g flag, you will see llvm.dbg.value or llvm.dbg.declare. Those contain everything related to the source level. They carry metadata, and the metadata can give you a lot more information about what was actually in the source, for example variable names. When you trace the metadata, you can get the variable name from the actual source. For us these two intrinsic functions were very important: llvm.dbg.declare and llvm.dbg.value.

Let's try to understand them a bit. You can see that i is allocated, and just below it you can see the call to the intrinsic function llvm.dbg.declare. You can see three arguments in the intrinsic. The first always represents the address of the variable. The second is metadata pointing to, as you can see here, a DILocalVariable. This is a metadata node, and it contains the variable name as it was in the source. You can see the name was i in the source expression, so when you trace back the information you can retrieve the name. So the second argument really helps us a lot: it is the source information, like the name. The third argument is a DIExpression. Generally a DIExpression is useful for complex expressions: if you have an expression like int a = b + c, a DIExpression can hold that. So that is what llvm.dbg.declare is used for.

The second one is llvm.dbg.value. It is very similar; it's just that when i gets updated, when a value is updated, the update goes into an llvm.dbg.value for the same variable. So we now have enough information to at least try to build the source expressions, but only if the
code is compiled with debug info on, that is, with the -g flag. We use the intrinsics as a bridge.

Our focus was on memory accesses and vectorization, as I said. We really wanted this project for vectorization at first, and we also have a plan to push it into the debugger, so debuggers can utilize this information to provide better information. But the main goal initially was the vectorization pass. Vectorization is a transformation pass, and a transformation pass can always query an analysis pass. Our work is an analysis pass, so the vectorization pass in LLVM can query it. So the project's contribution is that we have built an analysis pass that generates these mappings and provides better remarks for vectorization, or for anything else that requires them.

Let's look at the implementation details. For us the points of interest are load and store instructions, because of vectorization: we want to analyze the memory access patterns, which is what is useful for the remark. For example, take a look at this C code. If you compile it with clang -O2 -g and emit the LLVM IR, I think it should be visible: you can see the calls to the intrinsic function llvm.dbg.value, and we can build the expressions from them. As I said, the first expression was 2 multiplied by n1, but we compiled with optimizations on, so the multiply instruction went away and was replaced by a shift left. That's why you see the shift-left operator here and not the multiply. That's a problem for the accuracy of the expressions, and we still do not have a good plan for generating the expressions accurately, because many times these things go away when optimizations are on,
and it has always been a hard problem how to debug when optimizations are on. It's a classic problem that we still have to look into.

You can see from this example that computing the equivalent source expressions of interest involves walking through the LLVM IR and using the information provided by the debug intrinsics. Even though our current interest is loads and stores, we still have to build expressions for every instruction, because when you trace back the operands of a load or store you can reach any sort of instruction: it may be a binary instruction, it may be a GEP instruction. So we have to build expressions for those too. And as I said, optimizations can make it impossible to recover the original source expression. As you saw, 2 * n1 is optimized to n1 << 1, so recovering the original expression may not be possible every time.

So let's look at how we proceed. It's just a basic algorithm that I want to go through. We start by traversing the IR. We identify the operations of interest; currently that is load and store instructions, so we look specifically for those in the IR. Then we trace their operands, which may be other instructions or may lead to metadata from which we can retrieve information such as the name. Then, using that metadata, we reconstruct the source expressions from all the information we have collected. That's about all.

Now the current state. It is not yet upstream in LLVM; the PR is here. If you have experience or are active in the area of optimizations, analysis passes, or transformation passes in LLVM, I would
like you to have a look at the patch. If you have some experience, try to review the code as well and give some feedback, so we can proceed in more detail. It is still a new analysis and it still needs a lot of work, for structs as well. So, as I mentioned, we need more review on the patch and some active work from me as well, and if any of you are interested, please reach out.

As I said, structs pose a unique challenge. When we tried to build the expressions for structs it was very difficult. If you look at them in the intermediate representation, they look very odd, and it is not clear to me how they end up represented that way in the IR. It's not as simple as building expressions for an array. So structs are still a problem.

Accurate source-level expressions in the presence of optimizations are still a problem too, and there isn't always a one-to-one mapping between the source code and the IR. We still don't know what to do in these cases. Here you see a pointer dereference and ptr[0]: the IR can be the same for these two patterns, and we don't know which one to pick. One solution we discussed is that the debug information also contains information about the file. There is a DIFile in the debug info, so we have the file path, and we could actually open the file, go to that line, and retrieve what the ground truth actually was. The second option was a fallback: since we don't know what was there, just fall back to either of them. The DIFile approach was actually quite easy, but it's not good performance-wise: you open the file, go to that line, and retrieve the text.

So that's it for the talk. Thank you for listening, and if you
are interested in knowing more about this project and the algorithm, please reach out to me by mail or, for example, on Discord. So, thank you. Any questions? Yes.

Why do you need to rebuild the entire sequence of expressions for each of the values? Why not just the specific values that cause the dependence, and the line from the file, just like opt-viewer does?

Can you rephrase the question?

You know, when you emit opt remarks, there's a tool called opt-viewer that just puts everything inline. Between that and what you have here, it seems you would get excellent results in terms of debuggability if you just did what opt-viewer does, plus you specify which of the values are causing the dependence, as you said, and what the reason for the failed optimization was.

Okay, so the question is basically about using opt-viewer, right?

Yeah, I was just suggesting a more limited view than you have here, and not trying to reconstruct everything.

We are still not reconstructing everything. We are not focusing on mapping the whole IR to expressions; we are still focusing on those loads and stores, as I said. We pick up the loads and stores and see whether there are any GEP instructions, because a GEP instruction actually involves a chain of instructions, so we still have to build the expressions for the loads and stores. And opt-viewer is still not good at emitting those remarks; it's still very abstract in that sense, if I remember correctly. So I'm not sure how to go with opt-viewer, but we are still doing this for loads and stores and tracing back the information.

Yes?

Not sure, but one thing I can guess is that opening a file is not very good performance-wise, and then going down to that line, because there could be many lines of code in the
code base, so you have to go to that particular line. It would be very bad for performance, I think.

Okay. And is there no theory about whether it would be more beneficial to tell the programmer that the error, or the suboptimal choice they made, was between lines 27 and 28, compared to generating some arbitrarily complex expression that might not be representative of what the programmer originally wrote?

I'm not sure. Okay, yeah, I think it would be fine then. If you are choosing to emit remarks, then you know that this is not good for performance, right? So if you want the actual correct remarks, you have to pay for it in performance, and then yes, it would be possible. We were also talking about preserving the metadata in LLVM as it goes through the passes, but metadata in LLVM is designed so that it can be dropped at any time, so we still cannot preserve the metadata information. That's still a challenge. There is a lot going on on that side.

Okay, thank you.

Okay, thank you for joining. When you leave, make sure to take everything with you.