[00:00.000 --> 00:12.600] Alright, we are going to start the next talk. Next up, we have Kiran. Kiran will give us [00:12.600 --> 00:24.200] an update on Flang. Hello everyone. As Christoph introduced, I'm Kiran. I work at Arm in [00:24.200 --> 00:29.160] the Fortran compiler team, and my task today is to give you all an update [00:29.160 --> 00:36.640] on the progress we have made with Flang development. This slide shows the contents [00:36.640 --> 00:41.480] of my presentation today. It's fairly simple. First, I start with an overview of the Flang [00:41.480 --> 00:47.960] compiler. Then I give you a summary of the story so far. Then [00:47.960 --> 00:55.280] I describe the current status of the compiler. Finally, I go over a few of the major development [00:55.280 --> 01:05.600] efforts going on currently. This slide gives a brief overview of the Flang compiler. [01:05.600 --> 01:10.600] Flang is a new Fortran frontend developed from scratch. It has a traditional compiler [01:10.600 --> 01:16.680] flow. It's an LLVM-based Fortran frontend; it's actually the Fortran frontend of LLVM. [01:16.680 --> 01:22.760] It generates LLVM IR, but there is a difference from the other frontend in the LLVM project, [01:22.800 --> 01:30.120] Clang. While Clang lowers from the AST directly to LLVM IR, Flang uses a high-level intermediate [01:30.120 --> 01:37.640] representation called Fortran IR, or FIR. That's what Flang generates. It uses the [01:37.640 --> 01:46.040] MLIR infrastructure for FIR, and MLIR interfaces with LLVM through the LLVM dialect. FIR is [01:46.040 --> 01:52.840] converted to the LLVM dialect, and then the LLVM pipeline kicks in. This is basically the [01:52.840 --> 01:57.680] diagram that I have on the left-hand side. Given a Fortran program, parsing [01:57.680 --> 02:02.360] and semantic checks happen first. From that you get a Flang parse tree that's fairly [02:02.360 --> 02:09.400] well defined. Then that code is lowered into Fortran IR and calls to the runtime. Then [02:09.400 --> 02:19.280] the Fortran IR is converted to the LLVM dialect. This slide summarizes the story so far with [02:19.280 --> 02:26.680] the Flang compiler. Looking at the slide, this project started in 2018. It was during [02:26.680 --> 02:32.200] EuroLLVM 2018 that news about this compiler started to come out, that there was a new Fortran [02:32.200 --> 02:40.000] frontend being written from scratch. One year later, in April 2019, the project was [02:40.000 --> 02:46.480] accepted as the Fortran frontend of LLVM. Again one year later, in 2020, it was merged [02:46.480 --> 02:54.400] into the LLVM project as LLVM Flang. When this happened, there was some code that was [02:54.400 --> 03:03.400] left behind. The project actually split into two repositories. The first one was in the LLVM [03:03.400 --> 03:09.880] project, where the parsing and semantic checks and the code for the runtime lived. All [03:09.880 --> 03:15.080] the code that lowers from the parse tree to Fortran IR got left behind because it was [03:15.080 --> 03:22.440] not ready at that time. It began to take on a life of its own. People now had to keep these [03:22.440 --> 03:28.120] two repositories in sync, sometimes commit to both repositories, and carry all the overhead [03:28.120 --> 03:35.480] of maintaining a downstream project. Fortunately, sometime in April 2022, people decided to [03:35.480 --> 03:40.920] freeze all the downstream development and push all the code upstream.
Sometime [03:40.920 --> 03:48.680] in July 2022, this whole code base landed in the LLVM project repository. Since then, the project [03:48.680 --> 03:56.000] has mostly followed the LLVM contribution process and all the social guidelines [03:56.000 --> 04:05.080] associated with it. When it was merged into LLVM, it was mostly a Fortran 95 compiler, [04:05.080 --> 04:10.440] but there were still a few missing pieces. The code was stabilized further, and all unimplemented [04:10.440 --> 04:16.480] features were marked with TODOs, so that if you try to compile an unsupported feature, [04:16.480 --> 04:23.080] it gives a message saying that the feature is not supported rather than crashing. [04:23.080 --> 04:28.600] At the same time, development has continued to support newer Fortran standards, [04:28.600 --> 04:35.960] with features from Fortran 2003, Fortran 2008, and so on. [04:35.960 --> 04:40.200] A lot of bug fixes went in as well, and people also started to look at some performance [04:40.200 --> 04:52.600] work. This slide summarizes the current status of the compiler. The compiler is not [04:52.600 --> 04:58.720] yet ready for general use, but it is already fairly advanced in its support for various [04:58.720 --> 05:04.480] Fortran constructs. The driver is temporarily called flang-new, and executables can be [05:04.480 --> 05:09.200] generated, but you have to use an option called -flang-experimental-exec. [05:09.200 --> 05:14.160] Feature development for Fortran 95 is mostly complete, and as I mentioned before, development [05:14.160 --> 05:19.800] of Fortran 2003 and later features is in progress. The compiler has been tested with [05:19.800 --> 05:27.520] various commercial and free test suites. It has also been verified with some HPC benchmarks [05:27.520 --> 05:36.200] like SNAP and CloverLeaf. We have also used the SPEC benchmarks to test it. We are continuing [05:36.200 --> 05:42.880] to test it with other benchmarks like the OpenMP version of SPEC and other open-source [05:42.880 --> 05:47.360] applications like OpenRadioss, and things like that. [05:47.360 --> 05:55.080] The driver's name, flang-new, and the need for the experimental flag are currently [05:55.080 --> 06:02.000] being discussed. It is possible that those requirements will go away soon, and then people [06:02.000 --> 06:09.720] can just type flang to compile their applications. There is a Discourse thread on that, currently [06:09.720 --> 06:17.920] under discussion. This slide summarizes the support level of [06:17.920 --> 06:25.000] Flang for the various Fortran standards. Fortran is a living language. As you can see, it has [06:25.000 --> 06:29.640] gone through a lot of revisions. There is another revision coming this [06:29.640 --> 06:35.880] year and one later in this decade. It is a living language, and it is continuing to [06:35.880 --> 06:40.560] make progress, adding new features and so on, but that makes the job of the compiler [06:40.560 --> 06:45.920] engineers much harder, because you are always trying to catch up. [06:45.920 --> 06:51.040] The earliest standard that we track is Fortran 77, and that work is complete. Fortran [06:51.040 --> 06:58.400] 95, as I mentioned before, is complete except for a few bits here and there. Work on Fortran 2003 [06:58.400 --> 07:04.960] is in progress, particularly on polymorphic types, but the parsing, semantics, and runtime [07:04.960 --> 07:11.680] mostly work.
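As a rough illustration of the driver usage described above, here is a minimal sketch; the file name is made up, and the flag requirement may have changed since this talk:

    ! hello.f90 -- any small Fortran program will do
    program hello
      print *, 'Hello from Flang'
    end program hello

    ! Compile and link; at the time of the talk the experimental option was
    ! needed for the driver to produce an executable:
    !   flang-new -flang-experimental-exec hello.f90 -o hello
    !   ./hello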
Similarly for Fortran 2008: parsing, semantics, and runtime work, [07:11.680 --> 07:18.160] but some of the features are still in progress. When you come to Fortran 2018, none [07:18.160 --> 07:23.480] of the lowering or code generation work has happened yet, whereas the parser, semantics, and [07:23.480 --> 07:28.080] runtime should work fine for that code, modulo any bugs. [07:28.080 --> 07:36.320] Now, I have said that the compiler is able to compile a lot of Fortran code. What does [07:36.320 --> 07:42.960] the performance look like? This slide gives you a summary of where this compiler stands [07:42.960 --> 07:51.040] with respect to other compilers. The benchmark I have used for this slide is SPEC CPU [07:51.080 --> 07:57.800] 2017, taking all the Fortran benchmarks from it, either pure Fortran or mixed Fortran. [07:57.800 --> 08:02.880] I have compared it with two compilers. One is GFortran 12, and the other [08:02.880 --> 08:07.980] one is the compiler called Classic Flang. That is a compiler that was previously open [08:07.980 --> 08:14.440] sourced by PGI, and it is actually the Fortran frontend of many of the existing [08:14.440 --> 08:19.960] commercial compilers like AMD's, Arm's, and Huawei's. [08:19.960 --> 08:26.960] So I have compared it with both of these compilers, and what I have at the bottom here is the [08:26.960 --> 08:32.760] geometric mean over the performance of all the benchmarks in this suite. [08:32.760 --> 08:40.760] You can see that the runtime with Flang is around 1.5 times that of GFortran, [08:40.760 --> 08:49.080] whereas compared to Classic Flang it is around 1.38 times. For some of the benchmarks [08:49.080 --> 08:54.600] we are mostly on par, but for some of the others there is still some work [08:54.600 --> 09:01.480] to be done. The comments column summarizes what are probably the issues [09:01.480 --> 09:05.360] that cause this performance difference. [09:06.040 --> 09:13.520] Some of them are familiar things like alias analysis; others are things like intrinsic [09:13.520 --> 09:21.360] inlining and function specialization. Fortran has a lot of intrinsic functions, and generally [09:21.360 --> 09:26.360] these are all implemented in the runtime. Because there are so many of these intrinsics, [09:26.360 --> 09:31.640] the runtime is often written in a generic fashion, so you might not get the best performance [09:31.760 --> 09:36.420] if you call the runtime function. Also, for simple arrays it does not make much [09:36.420 --> 09:40.560] sense to incur that overhead, particularly if the function is being called in a loop [09:40.560 --> 09:47.560] or something like that, so many times it is better to inline that code. In benchmarks [09:47.560 --> 09:55.060] like exchange2, there are a lot of calls to intrinsics like count, sum, any, minloc, and so on [09:55.060 --> 10:00.920] that could possibly be inlined to get more performance. And exchange2 happens to be [10:00.920 --> 10:07.240] one benchmark where, if you perform function specialization, you get a lot of benefit. [10:07.240 --> 10:12.720] Function specialization is a process where, if you know the arguments to a function, [10:12.720 --> 10:18.000] you can generate a specialized version of that function based on the known parameter [10:18.000 --> 10:19.500] values. [10:19.500 --> 10:26.960] There are also other issues, mostly associated with how arrays are handled in Fortran.
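As a concrete illustration of the intrinsic-inlining point, here is a small made-up sketch (not taken from the benchmark) of code where calling a generic runtime routine for every intrinsic is costly and inlining the operation into the surrounding loop pays off:

    ! Hypothetical example: intrinsics called inside a hot loop.
    ! Each count() or sum() call may end up in a generic runtime routine;
    ! for small arrays of known shape it is often faster to inline the
    ! equivalent loop instead of paying the call overhead every iteration.
    subroutine hot_loop(grid, steps, total)
      integer, intent(in)  :: grid(9, 9), steps
      integer, intent(out) :: total
      integer :: i
      total = 0
      do i = 1, steps
        total = total + count(grid == i) + sum(grid(:, 1))
      end do
    end subroutine hot_loop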
The way [10:26.960 --> 10:32.600] the standard interprets arrays, or rather array expressions, [10:32.600 --> 10:38.480] in Fortran is that there is, conceptually, always a temporary associated with them. But when we generate code, [10:38.480 --> 10:43.760] if there are a lot of temporaries, a lot of time is consumed just [10:43.760 --> 10:48.600] copying these arrays around. In many cases these copies [10:48.600 --> 10:53.760] can be optimized away, but we do not yet do a good job of it, and that is what is causing [10:53.760 --> 10:55.120] this performance issue. [10:55.120 --> 11:00.240] A few engineers have been working on this for some time now. A few months back we were [11:00.240 --> 11:09.880] around 2x, but we are closing the gap as fast as possible. Now, in this [11:09.880 --> 11:15.080] slide and the following slides, I summarize some of the major development efforts. I have [11:15.080 --> 11:21.440] probably missed some of the efforts, but what I am going to cover is this: the first one [11:21.440 --> 11:27.000] is high-level FIR, a new dialect that is being written and that sits just above [11:27.000 --> 11:34.480] Fortran IR; I will come to the reason for that. Second, I will have one [11:34.480 --> 11:40.280] or two slides about polymorphic types and how they are handled in Flang. I will look at some [11:40.280 --> 11:46.480] of the performance work that is being done. And I will briefly summarize the work already done, [11:46.480 --> 11:53.640] and the work still going on, in OpenMP as well as the driver. [11:53.640 --> 12:02.440] When the compiler work started, the IR that we had was Fortran IR, which represents [12:02.440 --> 12:08.080] a lot of the Fortran constructs. But what we found is that, although it does model a [12:08.080 --> 12:14.640] lot of the Fortran constructs, there is still some gap between the Fortran source program [12:15.000 --> 12:20.040] and the Fortran IR. Some information is lost, like what the variables [12:20.040 --> 12:24.440] in the source program are and what annotations were on those variables, [12:24.440 --> 12:31.440] and losing that meant we might not be able to do some performance optimizations. [12:31.440 --> 12:37.520] That was one issue. The other issue was [12:37.520 --> 12:41.880] that the lowering was also proving to be a bit difficult because of the big gap [12:41.880 --> 12:47.720] between the source and the IR. That is the reason why a new intermediate representation [12:47.720 --> 12:54.640] was introduced between the source and Fortran IR: HLFIR, or high-level [12:54.640 --> 13:01.040] FIR. As I mentioned before, it enables optimizations, and because it carries more information from [13:01.040 --> 13:04.800] the source, it is likely to produce better debug information. [13:04.800 --> 13:11.320] This IR basically introduces two major concepts. One is that it models expressions [13:11.440 --> 13:16.760] that are not buffered. As I mentioned before, array expressions in Fortran generally involve [13:16.760 --> 13:23.760] temporary arrays, and whenever we introduce those arrays, or the buffers associated [13:24.560 --> 13:31.560] with them, the code starts to look very low level. Expressions that are not associated [13:33.520 --> 13:39.800] with these buffers remain higher-level concepts, so if you have chains of intrinsic [13:40.160 --> 13:44.600] functions that operate on arrays, it is easy to do some processing there to simplify [13:44.600 --> 13:48.920] those expressions.
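A tiny illustrative sketch of such a chain (not from the slides): evaluated naively, each intermediate array expression is materialized into its own temporary buffer, whereas kept as high-level expressions the whole chain can be fused into a single loop with no buffers at all.

    subroutine chain_example(a, b, r)
      real :: a(:), b(:), r
      ! a - b and abs(a - b) are intermediate array expressions; buffering
      ! each one costs an allocation and a copy, while fusing the whole
      ! chain reduces it to a single loop accumulating into r.
      r = sum(abs(a - b))
    end subroutine chain_example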
It also introduces [13:48.920 --> 13:55.120] the concept of variables. There is an operation in high-level FIR, hlfir.declare, which [13:55.120 --> 13:59.400] collects information about each variable in a single place, so that you know this [13:59.400 --> 14:03.960] is the variable and what its properties are. Some of these might be modeled by attributes: [14:03.960 --> 14:08.760] if you say that a Fortran variable is a pointer or allocatable, that is marked as [14:08.760 --> 14:15.720] an attribute. It also identifies the shape of the array, if it is an array, [14:15.720 --> 14:22.720] along with the memory associated with it. The initial lowering is then from [14:23.120 --> 14:29.200] the Fortran source to a mix of high-level FIR and FIR, then the high-level FIR is [14:29.200 --> 14:35.000] converted again to FIR, and then the rest of the pipeline kicks in as always. I will [14:35.080 --> 14:40.560] not go into the details, but if people are interested there is a detailed RFC [14:40.560 --> 14:47.560] in the Flang documentation. But I will try to show you an example. [14:47.560 --> 14:53.760] This is the Fortran source code, and what we have is a declaration of two arrays. The [14:53.760 --> 14:57.520] first array, called Y, is a two-dimensional array; the second one is a one-dimensional [14:57.520 --> 15:04.520] array. Then we are summing the rows or columns of the first array, Y, and storing [15:05.680 --> 15:12.680] the result in the result array, the array called S.
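From that description, the source would have roughly this shape (a reconstruction; the routine name and the dimension being summed over are assumptions):

    subroutine row_sums(y, s)
      real :: y(:, :)      ! two-dimensional input array
      real :: s(:)         ! one-dimensional result array
      ! Sum y along one dimension and store the result in s.
      s = sum(y, dim=1)
    end subroutine row_sums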
I tried to put the FIR code for this on [15:13.720 --> 15:19.280] a slide, but it turned out to be too much, and it would not fit on one slide or even two. [15:19.280 --> 15:25.080] So I have just left the comments here, whereas the generated code is completely gone. You [15:25.120 --> 15:30.920] can see that there is a fir.alloca which allocates that array, and the name of the variable [15:30.920 --> 15:36.120] is part of that alloca. You can also see that there is a call to the runtime for the sum [15:36.120 --> 15:43.120] function. You can see from the comments that the runtime call [15:45.280 --> 15:50.440] allocates a buffer on the heap and returns it, then that heap buffer is copied [15:50.480 --> 15:57.480] to the real variable, and the original heap buffer is deallocated. Not much more to say here, but if you [15:57.680 --> 16:04.680] come to the HLFIR version, you can actually fit it on a single slide because it models [16:05.280 --> 16:10.400] things at a higher level. The important difference to notice here is that there are [16:10.400 --> 16:15.320] two hlfir.declare operations that declare the variables that [16:15.320 --> 16:20.920] are in your program. You can see that the two arrays, S and Y, have [16:20.920 --> 16:26.600] a representation, and instead of a runtime call you have an operation called [16:26.600 --> 16:32.240] hlfir.sum that returns something called an HLFIR expression. There is no [16:32.240 --> 16:36.840] buffer associated with it, the runtime is not called, and there is no memory allocated as [16:36.840 --> 16:42.760] of now. Then that expression is assigned to the original variable, that is, [16:42.760 --> 16:49.440] the result array S. So basically I just want to show that [16:49.440 --> 16:55.160] this is at a higher level: it has the concept of variables, and it also has [16:55.160 --> 17:01.960] the concept of expressions. I will now move on to polymorphic types. Polymorphic [17:01.960 --> 17:08.960] types came in as part of the Fortran 2003 standard. The types are only known at runtime; Fortran [17:09.080 --> 17:14.160] has the class keyword for specifying such a type. If you have a class type, it can [17:14.160 --> 17:19.440] refer either to that type or to any type that extends from it. Extension is Fortran's [17:19.440 --> 17:25.680] name for the inheritance concept found in some other languages. Again, I will not [17:25.680 --> 17:29.400] go into much detail; I have an example on the next slide, but if people are interested [17:29.400 --> 17:34.320] they can check the RFC. In the example that I have on the left-hand [17:34.400 --> 17:41.000] side, there is a type called point; we call these derived types in Fortran. Then [17:41.000 --> 17:45.480] you have a three-dimensional point type that extends point and basically just adds another [17:45.480 --> 17:52.480] field to it. We have a class variable called p3d, and then we call a subroutine [17:56.400 --> 18:03.400] called foo, and this subroutine accepts its argument as a class of the base type. [18:04.680 --> 18:10.580] Then there is a construct called select type in Fortran that can, at runtime, [18:10.580 --> 18:15.040] identify which type it is. So if its type is the 3D point type, one thing is printed; if [18:15.040 --> 18:20.520] its type is point, something else is printed. The modeling mostly follows what is [18:20.520 --> 18:27.520] in this code. We have something called fir.type, and of those types the ones in green [18:27.560 --> 18:33.160] are the extended type and the ones in red are the base type. You can see that there is a [18:33.200 --> 18:40.200] fir.class that has a fir.type inside it, and then you have the fir.select_type construct, [18:41.960 --> 18:48.960] which tries to match against the base type or the extended type and then branches [18:50.000 --> 18:55.280] off to the basic blocks that handle each case: basic block one for the extended type and [18:55.280 --> 19:00.920] basic block two for the base type. At runtime, when you generate further, [19:00.960 --> 19:06.480] lower-level code like LLVM IR, there will be some comparison instructions that compare whether [19:06.480 --> 19:11.320] it is that type. Types will probably be represented as structures that are global, so you can [19:11.320 --> 19:15.080] compare against those to know what the real type is.
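Based on that description, the example would look roughly like this in Fortran (a reconstruction, not the exact slide code; the component names, the extended type's name, and the messages are assumptions):

    module points
      type :: point
        real :: x, y
      end type point

      type, extends(point) :: point_3d
        real :: z            ! the extended type just adds one more field
      end type point_3d
    contains
      subroutine foo(p)
        class(point) :: p    ! polymorphic: p may be a point or any extension of it
        select type (p)
        type is (point_3d)
          print *, 'got a point_3d'
        type is (point)
          print *, 'got a point'
        end select
      end subroutine foo
    end module points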
Next, I move on to alias analysis. [19:15.080 --> 19:22.080] Alias analysis is important both to distinguish between [19:24.200 --> 19:30.240] different arrays that can potentially alias and to say that two arrays definitely cannot [19:30.240 --> 19:36.520] alias. The rules for aliasing in Fortran are different from those in C, so we [19:36.520 --> 19:43.520] cannot directly reuse what is there for C. In general, with exceptions [19:43.720 --> 19:49.680] and a lot of other special cases, arrays do not overlap unless you specify that [19:49.680 --> 19:55.640] one array is a pointer and another array is a target; then the two can overlap. [19:56.440 --> 20:00.760] Ideally we should benefit from the restrict patches that are being worked on, but that [20:00.760 --> 20:06.320] work is not yet complete. We also have some issues with pointer escapes. That is [20:06.320 --> 20:13.320] all captured in an RFC by Slava. But we still need to do some kind of alias analysis, [20:14.440 --> 20:19.040] because otherwise, as we saw in some of the earlier slides where I showed the performance [20:19.040 --> 20:25.200] results, performance is hampered by the lack of alias information in the LLVM [20:25.240 --> 20:30.000] optimizer. We probably have some more information at the FIR level, but much of the optimization [20:30.000 --> 20:36.000] is currently delegated to LLVM, so LLVM still needs that information to do the optimizations. [20:36.000 --> 20:42.720] As a first step, what we have done is to distinguish between [20:42.720 --> 20:49.640] accessing the descriptor and accessing other memory. As I mentioned before, [20:49.640 --> 20:55.000] Fortran has arrays, and it is very good at arrays. Sometimes, [20:55.000 --> 20:59.720] to pass an array you cannot just pass a pointer; you might need to [20:59.720 --> 21:06.720] pass additional information like its rank, its extents, [21:07.320 --> 21:13.600] its lower and upper bounds, and whether [21:13.600 --> 21:18.160] the array has a stride. All of this information can be passed [21:18.160 --> 21:22.440] in the descriptor. The descriptor is generally modeled as a structure at the LLVM level, [21:22.440 --> 21:28.200] so you have to go and fetch its contents by loading from it. Now, these loads [21:28.200 --> 21:33.200] can potentially alias with other arrays if you load from them directly. So we are trying [21:33.200 --> 21:39.800] to distinguish these accesses using alias analysis information, using TBAA. That is what we [21:39.800 --> 21:46.800] have on this slide. I do not know how clear it is, but this is in the MLIR LLVM dialect, [21:47.800 --> 21:53.320] not in the LLVM IR representation. You can see the TBAA information being generated [21:53.320 --> 22:00.320] here. If you know TBAA, it is mostly trees: if one node is an ancestor of another [22:00.640 --> 22:05.880] node, then they may alias, but if they are in separate subtrees, they do not alias. You can see [22:05.880 --> 22:12.720] that the one in gray is any data access, and the one in yellow is for accessing [22:12.720 --> 22:16.360] a descriptor member. [22:16.360 --> 22:20.200] And if you go back to the source code, what I have here is a simple program [22:20.200 --> 22:26.200] with a subroutine; a subroutine is a procedure that does not return a value in Fortran. [22:26.200 --> 22:32.360] It has two arguments, A and B; A is an array and B is a scalar, both of integer type. I am loading [22:32.360 --> 22:38.200] the value at the tenth location and putting it in the variable B. You [22:38.200 --> 22:44.840] can see that whenever we access the descriptor, which in this case [22:44.840 --> 22:51.640] is accessed possibly for a stride, we use the TBAA tag in yellow, and whenever [22:51.640 --> 22:57.880] we access something related to B, we use the gray one. So you can establish [22:57.880 --> 23:04.720] that these do not alias, and when this happens in loops, we sometimes get performance benefits.
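Reconstructed from that description, the source is roughly the following (the subroutine name is an assumption):

    subroutine load_tenth(a, b)
      integer :: a(:)   ! assumed-shape: passed via a descriptor holding bounds
                        ! and strides, so reading a(10) first needs loads from
                        ! the descriptor (for example, the stride)
      integer :: b      ! plain scalar: a direct data access
      b = a(10)
    end subroutine load_tenth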
[23:05.720 --> 23:13.560] The next one is code generation for assumed-shape array arguments. As I have mentioned before, [23:13.560 --> 23:17.920] Fortran has different kinds of arrays, and one kind is assumed-shape. Assumed-shape means [23:17.920 --> 23:23.560] that a dummy argument takes the shape of the array that you pass to it. [23:23.560 --> 23:31.000] So you can pass it an array of a [23:31.000 --> 23:39.960] known size or an unknown size, and it will accept both. This causes [23:39.960 --> 23:45.240] some issues, particularly because the arrays can also be strided. If the [23:45.240 --> 23:49.240] array is strided and you have a loop working on that array, then to fetch [23:49.240 --> 23:53.480] consecutive elements from the array you have to increment by [23:53.480 --> 24:00.520] the stride, and usually you have to load from the descriptor to find the stride. Then [24:00.560 --> 24:07.720] you use that stride to find the next element. Sometimes this can be handled with [24:07.720 --> 24:13.520] gather and scatter loads and stores, but sometimes it is not possible. In many cases, though, the [24:13.520 --> 24:17.960] stride is actually one: you are actually passing a contiguous array. So we can do some [24:17.960 --> 24:24.240] versioning. Represented at the source level, if you have some input code [24:24.240 --> 24:30.640] like this, with an array called x that you are looping over, then you can [24:30.640 --> 24:37.640] create versions of it: if the stride is one, run this version; otherwise, run that one. And [24:37.640 --> 24:42.440] the unit-stride side of the versioned loop becomes easier to optimize [24:42.440 --> 24:49.880] and vectorize.
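Here is a sketch of that versioning idea written at the source level, purely for illustration: Flang performs this on the IR, and the is_contiguous intrinsic below merely stands in for the unit-stride check that the compiler would derive from the descriptor.

    subroutine scale(x)
      real :: x(:)                 ! assumed-shape, possibly strided
      integer :: i
      if (is_contiguous(x)) then
        ! Fast version: the compiler may assume unit stride here,
        ! which makes the loop straightforward to vectorize.
        do i = 1, size(x)
          x(i) = 2.0 * x(i)
        end do
      else
        ! General version: element addressing goes through the stride
        ! loaded from the descriptor.
        do i = 1, size(x)
          x(i) = 2.0 * x(i)
        end do
      end if
    end subroutine scale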
I just have two more slides, and I will probably [24:50.040 --> 24:55.480] just run through them. We are nearing OpenMP 1.1 completion; there are still a few items [24:55.480 --> 25:01.600] to complete, like privatization, atomic, reduction, and detailed testing, but a lot of other things [25:01.600 --> 25:06.240] are also going on in parallel: there is basic support for the task and SIMD constructs. We have been [25:06.240 --> 25:12.480] able to run it with some SPEC Speed benchmarks and it works. Things in progress include target [25:12.480 --> 25:19.280] offloading, which has just started, task dependencies, and new loop-related constructs. We have also made [25:19.280 --> 25:26.320] a lot of progress with the driver. It can now generate executables, and what is new is that [25:26.320 --> 25:31.280] we now handle target specification, fast math, and MLIR-level optimizations; previously only [25:31.280 --> 25:38.040] LLVM optimizations were controllable, and now we can control MLIR optimizations as well as LLVM pass plugins. [25:38.040 --> 25:43.200] People are continuing to work on LTO, saving optimization records, and supporting something [25:43.200 --> 25:50.400] called stack arrays. This final slide just says that you are all welcome to contribute [25:50.400 --> 25:54.520] to this project, and the details are here. Thank you. [26:13.200 --> 26:23.040] Yeah, as of now we do not have that. Basically, the question was, when you traverse [26:23.040 --> 26:29.520] the various layers in MLIR and LLVM IR, is the debug information preserved? [26:29.520 --> 26:35.280] The whole point of the hlfir.declare operation is to put that information somewhere [26:35.280 --> 26:41.280] in MLIR at the highest level, and then we plan to propagate it further down. The hlfir.declare [26:41.280 --> 26:46.400] has a corresponding fir.declare it will lower to, and from the fir.declare, [26:46.400 --> 26:51.080] when we convert to LLVM, we will just pass on that debug information. But debug support [26:51.080 --> 26:58.560] is quite early in Flang; as of now only function names and line numbers are supported, and [26:58.560 --> 27:05.360] maybe not even by default: it is just a pass run separately over the code to add that, and the [27:05.360 --> 27:12.360] driver support is still pending. Yes. [27:12.360 --> 27:27.280] Everything that is in the standard, plus some well-known extensions, is there, and a lot [27:27.280 --> 27:31.840] of dusty-deck code was tested with it, but I do not know whether the specific thing [27:31.840 --> 27:37.120] that you have in mind is supported or not; you would have to try it out or look at the documentation. [27:37.120 --> 27:45.360] So, the question was whether old Fortran 77 code or legacy extensions are supported. [27:45.360 --> 27:53.520] So, I am trying to understand what needs to be done for OpenMP, because OpenMP is a completely [27:53.520 --> 27:59.600] separate project, right? Yeah. So, the question is what needs to be done separately for OpenMP. [27:59.680 --> 28:07.680] As far as OpenMP is concerned, all the OpenMP work is mainly represented at [28:07.680 --> 28:13.960] the MLIR level by a separate dialect called the OpenMP dialect. That dialect sits [28:13.960 --> 28:21.400] in the main MLIR repository, and when we generate code from the source program, we create [28:21.400 --> 28:26.280] these additional operations from the OpenMP dialect, and they have regions in MLIR, so they [28:26.320 --> 28:32.320] can capture things like a parallel directive much better compared to LLVM. [28:32.320 --> 28:38.320] So, roughly speaking, OpenMP is a set of intrinsics, more or less? [28:38.320 --> 28:45.820] I mean, not quite: whenever you say intrinsic, it implies some kind [28:45.820 --> 28:50.360] of function, but these are MLIR operations. So, if you have something like a [28:50.400 --> 28:56.400] parallel directive, there is an operation in MLIR called omp.parallel, and it might have [28:56.400 --> 29:02.400] a lot of clauses, like what the threading model is and things like that. [29:02.400 --> 29:08.000] All of that information is captured at that level, along with the code that comes in that [29:08.000 --> 29:12.760] parallel region. And what we do now is that we are trying to share code with Clang: [29:12.760 --> 29:17.560] there is some code that was refactored out of Clang and put into something called the OpenMP [29:17.640 --> 29:24.640] IRBuilder. When we lower from this dialect to LLVM, we use that to generate the [29:24.640 --> 29:28.160] LLVM IR. Thank you.
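To connect that last answer to source code, a parallel region like the following generic example (not taken from the talk) is the kind of construct that gets represented as an omp.parallel operation in the OpenMP dialect, with the body of the construct carried as the operation's region:

    subroutine hello_threads()
      use omp_lib, only: omp_get_thread_num
      !$omp parallel
      ! Everything inside the parallel construct becomes the region of the
      ! omp.parallel operation when Flang lowers this to MLIR.
      print *, 'hello from thread', omp_get_thread_num()
      !$omp end parallel
    end subroutine hello_threads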