Thank you all very much. Thank you for coming along. My name's Jeremy Bennett. I'm Chief Executive of Embecosm. Embecosm is an engineering-heavy company: we only have one full-time non-engineer in the whole company, and it's not me, so I actually am a working engineer as well as running the company. We develop open source, mostly compiler tool chains, but we also do open source AI, we have some open source operating system work, and, because most of what we do is pre-silicon, we do an awful lot of open source silicon chip modelling. But I'm also here with another hat on, which is that I am chair of the OpenHW Group's software task group, so I'm responsible for all the software developed for the OpenHW Group, and I'll talk a bit more about them in a bit. This talk is part technical, but it's partly about the practical side of how we go about developing complex software for an open architecture like RISC-V. So let me tell you a bit about the OpenHW Group. It's a not-for-profit, member-driven collaboration, and it's a mixture of industry (companies like mine, and some big companies that you will recognise; NXP is a member, for example), academics, so quite a few universities are members and part of it, and individual members. You can contribute as an individual, and some of the work I'm going to talk about has been done by people who are individual members.
And the goal is high-quality open source hardware development, and that high quality is the key thing: the sort of open source hardware that you can put in a commercial chip and be confident you can send off to be fabricated. It's collaborative and it's open, an open development model, so all these things are open to all. Now, the organisation is the OpenHW Group, but its cores are known as CORE-V. We have a huge family of processors, which I'll talk about a bit, everything from the smallest 32-bit RISC-V designs to the biggest 64-bit RISC-V designs. These are standard RISC-V cores, but with some custom ISA extensions. The chief executive is Rick O'Connor, and those of you who've been around for a few years will remember Rick, because he was the first chief executive of RISC-V International, and he's moved from the open specification world to actually delivering real silicon IP. Let's look at the ancestry. The OpenHW Group grows out of an academic-industry collaboration called the Parallel Ultra-Low Power (PULP) project, originally between ETH Zürich, the University of Bologna and STMicroelectronics. STMicroelectronics is no longer active in it, and it predates RISC-V: the first part of PULP was done with OpenRISC, but for many years now it has been a RISC-V project. The idea is to build very, very low-power multi-core systems.
So that's where we come from. The cores started off as academic research cores, and the point about academic research cores is that they are designed to push the forefront of knowledge forward. They're not designed to be exhaustively tested for manufacturing and use in commercial deployment; that's not the purpose of a university. So the natural transition is that the OpenHW Group takes that outstanding technical base and takes it to a robust standard. We have loads and loads of members. I've copied this off the website, and it's the wrong format (I really want a wider, flatter one), but you'll probably see some logos there; you'll see Embecosm's logo. The astute among you will notice that Amazon Web Services appears to be both a member and a partner: I think they transitioned from partner to member, and the slide wasn't properly updated. So we have lots and lots of members; I think it's up to about 70 now. And it might be worth saying that when you become a member, yes, if you're a corporate you have to pay a membership fee, but that's not the big thing. You can only become a member if you commit resource in terms of what you're going to contribute, and that dwarfs any membership fee. You cannot be a member unless you're going to do something; we don't have sleeping members, you've got to be an active member. In terms of organisation it's very lightweight, and this is one of the big contrasts with RISC-V International: at a technical level, we only have five committees.
We have an overarching technical working group, which has overall responsibility for the engineering direction. It has co-chairs, Gerard Vamont from Thales and David Lynch (I can't remember where David Lynch comes from, actually), joint chairs from two companies, and that meets every month as the final arbiter of technical matters. Then all the work of the organisation is handled by just four task groups. The cores task group, headed up by Arjan Bink from Silicon Labs, is responsible for tracking the development of the cores. The work is done by the member companies, but these are the groups that have oversight and make sure the quality criteria are maintained. There's a verification task group, and verification is actually separated from core development, although the two are tightly tied together; that's led by Simon Davidmann of Imperas. There's a hardware task group, and that's a slightly strange name, perhaps, but the point of the hardware task group is that, though this is fundamentally a group developing silicon IP, we do have to have reference implementations, and the first reference implementation should be coming out later this year, which is the CORE-V MCU, and that has one of the 32-bit cores in it. And lastly, the software task group, which I lead, assisted by Yun Haisheng from Alibaba T-Head, and that's responsible for all the software projects. Again, it's oversight; it's not doing the work, because as a member you have to do the work.
We do have a bit of a problem there, because we have mostly hardware members and not enough software members. In terms of the roadmap, we've got a flagship application core, the CVA6, a 64-bit full-blown RISC-V core designed to run Linux, and that comes out of the top-end PULP core under development. There is a smaller 32-bit core, the CV32A5, which is designed to try and do a small Linux system, though of course we've got the issue of getting Linux running on 32-bit at all. You can probably only just make it out, but each of these projects has a target technology readiness level. Are people familiar with technology readiness levels? A little bit. Most of these are aimed at technology readiness level 5, which is proven in the environment. CVA5 is a bit different: it's aimed at TRL 4, which is proven in the lab. We've got other projects at different levels, but mostly we're aiming at TRL 5. The flagship, the first project, was the CV32E40P, a microcontroller-class 32-bit RISC-V implementation. The first version of that is complete, the second version is under development, and the work Nandni was talking about with built-ins is primarily focused at the CV32E40P v2.
You'll see sitting there something called the CV32E41P, and that's a bit unusual because it's only going to TRL 3, which is proof of concept. It's being developed by Huawei under this group in order to test out the Zc* extensions, the new compressed extensions, and Zfinx, where you have a shared register bank for floating point and integer. It's only a proof-of-concept chip to verify that those work. Then we have a couple of more forward-looking ones. CV32E40S is a version of the original CV32E40P aimed at security applications, and the really exciting one is the CV32E40X. That's a bit further out, because the X is a generic extension interface at the hardware level: it's designed so you can take the core and really easily add a wide range of extensions, indeed to the extent that you can do the floating point extension through the CV32E40X. So that's the roadmap we're sitting on. What about the software projects? Well, I'm going to focus on the tool chains, the LLVM tool chain and the GNU tool chain, and, because we haven't yet got silicon, on a couple of modelling projects, QEMU and the Verilator model. We have other projects; I'm responsible for eight projects in total. There's the SDK, the software development kit, which is actually joint with the hardware group, and the hardware abstraction layer, so that we can make our software more easily portable as we do more and more of these chips.
FreeRTOS: we're microcontroller class, so if you're going to go for an RTOS, the obvious one to start with is FreeRTOS. And Linux, which is really aimed at the CVA6. So we've got eight projects under our belt. We have a rigorous engineering process. Those of you who work for big engineering companies will be familiar with gate-based processes; all of the way we manage these projects is through a gate-based process. You have a project concept gate: that's where you propose a project. You want to work on this project, so you explain what it is and why we need to do it, and the project will only go ahead if it's voted for by enough members. And members can't just vote and say, hey, that's a good idea, I'll say yes to everything: if you vote for it, you're going to commit resource to it. The critical thing is that this is a doing organisation. So you then go and explore it, and then we get to a project launch gate. At that stage, not only do you know what and why, but now you know how you're going to do it: what do I need to do to get this? And then the big one, the last one, is plan approved. Plan approved means you've resourced it, so you know when you're going to deliver it, and that's quite a big hurdle for some projects. And then away the project work goes (this is a bit simplified; we all know work starts a bit in advance), and eventually you get to project freeze, where it's all done. Now, this is brilliant for hardware.
It's a very hardware-centric view of the world, because a freeze means something: it's when your chip's gone off to be fabbed. It doesn't work quite so well for software, so we've modified it for software. The first two stages are quite generic, because typically we're working with a big block of common software. We're not writing a compiler from scratch; we're taking the GNU infrastructure, GCC, Binutils and all that, millions of lines of it, and most of that is unchanged while we're changing a bit. So the what, the why and the how are quite generic, probably common to all CORE-V chips, certainly to all the 32-bit ones. And then for each specific chip we have multiple plan-approved gates, which is where we work out when we're going to deliver it: for CV32E40P v1, for v2, for the S, for the X and so forth. So we have a whole set, but it's still the same process: you need to know it's properly resourced and so forth. We'll see that in action a little bit, but that whole engineering focus pervades everything. So let's set the context: what's in a compiler tool chain? It's not just the compiler: it's the assembler, the low-level utilities, the debugger, the libraries, the emulation libraries, the standard C library, the C++ libraries. And in the ultimate world, if you look at GCC, it's many, many languages: Ada, the C and C++ family, Fortran, Java, Go; these days Rust and Modula-2.
And of course we've got things that sit at the high level, like OpenMP and OpenACC. That's a huge amount of stuff, and I hasten to say we're a long way from having all of that for CORE-V. But it's a lot of code: it's north of 12 million lines, and it's a year or so since I last measured those figures. So it's big, and we're trying to get all of that to work seamlessly on CORE-V. Now, we're not doing it from scratch; of course, we're starting from RISC-V and then adding stuff to it. And LLVM has the same components, but with different names and a different set of languages. So let's look at the CORE-V ISA extensions. There are nine we're concerned with, and eight of those come from the PULP project: extra addressing modes (post-incrementing load and store), hardware loops, more ALU operations, some special-case branching operations, MAC instructions, the event load (that's a multi-core feature), and then the PULP bit manipulation and the PULP SIMD extensions. Those are not the standard bit manipulation and SIMD; those are different ones, with years of development behind them. The reason they're different is that they predate the RISC-V bit manipulation extension and the RISC-V SIMD extension. And you can see the PULP SIMD is big, and the reason Nandni knows so much about built-ins is that she's done the built-ins to support those 220 SIMD instructions.
So she's nearly finished, and then I think she's going away for a long holiday where she never looks at another built-in again. And we've got Zc*. We had the first GCC implementation supporting Zc*. Of course, these are standard RISC-V compilers; you can still use them for plain RISC-V. And the hot news is that CORE-V GCC has a pull request to add Zc* 1.0.1, which is the freeze candidate; once that's been reviewed, it will go in. So we've got a lot, and the tool chain work is all about supporting these PULP extensions. In terms of the built-ins, you've heard all about them, but it's a lot of functions, and it's not just Nandni; Nandni has a team working on this. And we've got a naming convention: built-ins for RISC-V are __builtin_riscv_<vendor>_<name>. We've got so many that we're actually splitting that into <ISA extension> and <name>. There is a rule, though: if your built-in corresponds to a standard built-in, then you must use the standard name. So, for example, in our arithmetic we have abs, so we use __builtin_abs, which is a standard GCC built-in. The built-ins actually have the same name for 32-bit or 64-bit. That is not overloading in the C++ sense, because either you're compiling for a 32-bit target or for a 64-bit target; you have one or the other, so there's no clash. It's also the case that built-ins are not just another way of writing inline assembler.
There's not a one-to-one mapping. For example, for the SIMD add-scalar, there are actually two different ways of adding a scalar in the CORE-V SIMD: one where the scalar is in a register, and one where it's a small integer, for which there's actually an add-immediate instruction. We don't have two built-ins. There's a single built-in, and if the second argument is a small constant that fits, it will generate the immediate form of the instruction; otherwise it will load the value into a register. That's part two of Nandni's talk for the future, because that's quite a lot harder to do in a built-in. There is a specification; it's big, 57 pages long if you generate a PDF. That's one of the things about built-ins: they're not quite as easy as you think. There are things you can get wrong, and you genuinely have to think and review. It's under review at the moment; it's not finalised. So, testing. Now, if we're going to do full testing of a tool chain, we need a target which has all these ISA extensions. They're not standard RISC-V, so I can't just take standard RISC-V QEMU or whatever. You can do some testing: the standard GNU assembler tests, for example, don't need an executable target, they are pattern matching (have you generated something that looks right?), and you can do the same with built-ins.
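As a concrete illustration of the two points above (the standard-name rule for abs, and a single built-in covering both the register and the immediate encodings of the SIMD add-scalar), here is a portable C model of the semantics involved. The function names and the two-halfword lane layout are assumptions for illustration only; a real CORE-V tool chain would map these operations to single instructions.

```c
#include <stdint.h>

/* abs corresponds to a standard GCC built-in, so the rule is to use
 * the standard name (__builtin_abs) rather than invent a vendor one.
 * Its reference semantics: */
static int32_t ref_abs(int32_t x) {
    return x < 0 ? -x : x;
}

/* Model of a SIMD "add scalar" over two 16-bit lanes packed into a
 * 32-bit word: the scalar is added to each lane independently, with
 * wraparound in each lane.  At the source level there is one built-in;
 * whether the compiler emits the add-immediate encoding (small
 * constant) or the register encoding is invisible to the programmer,
 * because the semantics are identical. */
static uint32_t ref_add_sc_h(uint32_t vec, int16_t scalar) {
    uint16_t lo = (uint16_t)(vec & 0xFFFFu);
    uint16_t hi = (uint16_t)(vec >> 16);
    lo = (uint16_t)(lo + (uint16_t)scalar);
    hi = (uint16_t)(hi + (uint16_t)scalar);
    return ((uint32_t)hi << 16) | lo;
}
```

An execution test for a built-in would then compare the instruction's result against a reference model like this.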
You saw that in Nandni's compile-time-only testing, where you do a scan-assembler to check: has that built-in generated just the one instruction I want? But more generically, you need to be able to execute your tests. So we have two things. One is QEMU for CORE-V. That's a project being led by Weiwei Li of the Programming Language and Compiler Technology (PLCT) lab at the Chinese Academy of Sciences in Beijing. That's a work in progress; we're expecting it to become available later in 2023. And secondly, we're using Verilator models. How many people here are familiar with Verilator, hands up? Okay, most of you. For those who aren't: it's a tool that takes a hardware design in Verilog or SystemVerilog and generates a cycle-accurate C++ model from it. Verilator models are really useful because, first, they're easy to integrate into tool chain testing and, second, they are what's called implementation models. Most models are built from the specification of the chip; these are built from the actual implementation, so we know that what we're testing is what physically goes on the chip. And it's not enough just to have a model; you've got to be able to hook into it. So typically you wrap it in some form of debug server so you can connect GDB or LLDB, and that's a work in progress, due to be completed in the next few weeks. Then you'll be able to actually run on a model of the actual hardware. Testing policy: the LLVM project uses the LLVM integrated tester.
That is a set of several tens of thousands of tests, from source code down to LLVM IR: very comprehensive, and we use it. But it isn't a set of execution tests. Now, LLVM does have an executable test suite, but it is a set of applications to run under an operating system, and for a small microcontroller they're not suitable; they need an operating system there. So we can't use the LLVM test suite. Instead, we use a subset of the GNU regression tests to test the LLVM compilers, and that's widespread; it's not something we've invented, it's been done for years. It's only a subset because there is no point in running the GNU tests of the internal representations inside the GNU compiler, but things like the torture tests are absolutely fine whether they're run on LLVM or GCC; they're compiler-agnostic. The GNU tools just use the GNU regression tests. Something that we're very hot on is exhaustive testing. It's not just "oh, I tried one thing and it seemed to work"; it's "let's make sure we've not missed things". So, starting at the assembler, we have both positive and negative testing. By positive testing we mean testing that the tools do what you want; by negative testing I mean testing that they don't do things when they shouldn't.
So, for example, if we're looking at an instruction that takes a six-bit signed constant, we will test it with minus 33 (too small a negative number), minus 32 (the biggest negative six-bit number), zero (you should always test zero, because it's a special case), probably minus seven and plus five, and then 31 (the biggest positive six-bit number) and 32, which is too big. And we check those bounds. Actually, we added those tests for Zc* and found a whole load of bugs in the Zc* spec as a consequence. So we do that sort of thing. And we also test the extensions themselves. I've got the ELW instruction, so I test that the ELW instruction is handled by the assembler when I specify the xelw extension, and I also test that it doesn't get recognised when I don't specify it. That's really important. One thing I would observe, and I've seen this for a long time, is that RISC-V is incredibly weak on its assembly-level testing: there are about ten times as many tests for Arm and x86 as there are for RISC-V. So it's important to do that. And we do CORE-V-specific ld testing (we're a vendor), because we're adding some relocations: do they work? At the moment we've got compilation-only tests of built-ins, using scans for assembler instructions, but when we've got those models running, the QEMU and the Verilator model, we'll be adding execution tests: not only do I generate the right assembly instruction, but it does what I expect.
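Those boundary values follow directly from the range of an n-bit signed immediate, which a small helper makes explicit. This is a sketch of the test-selection reasoning, not code from the actual test suite:

```c
/* Does 'value' fit in a 'bits'-wide signed immediate field?
 * An n-bit two's-complement field holds -(2^(n-1)) .. 2^(n-1)-1,
 * so for n = 6 that is -32 .. 31.  The negative tests sit just
 * outside the range (-33 and 32), the positive tests sit on the
 * bounds, at zero, and at a couple of interior values. */
static int fits_simm(long value, int bits) {
    long min = -(1L << (bits - 1));
    long max = (1L << (bits - 1)) - 1;
    return value >= min && value <= max;
}
```

Writing the check this way makes it easy to generate the same boundary cases for every immediate width an instruction set uses.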
Not only do I generate the built-in, but it does what I expect. So we'll be adding those in as well. And all that testing ties into the difference that this is a commercial-grade core and its associated tool chains. So, resourcing. Resourcing is an issue, because these days on a chip you'll spend twice as much on the software as you do on the hardware, but the OpenHW Group is inherently mostly hardware company members. We've got plenty of software members, but we're still in a minority, so it is a challenge to get enough resourcing. Embecosm makes a lot of contribution as part of our membership, and the PLCT lab in China makes a big contribution; that's where it's coming from. And we actually double up: we use this very much as part of our graduate training programme as well, so it allows us to train a new generation of compiler engineers. But we're also seeing companies like Silicon Labs and Dolphin Design step up, because they help to fund those software companies to do the work. So thank you to both of those, and we do need more of that to come along, and I expect it will. Second: we are not going to maintain out-of-tree forks of GCC and LLVM. It's a thankless task, and it takes up a lot of time. The goal is to get upstream as vendor extensions. This is not a new thing; this has long been part of GCC and part of LLVM.
And when you have that triple that says what your target is, the thing in your standard compiler that says riscv32-unknown-elf-gcc or whatever, that "unknown" is the vendor field, and we should be using it. So if you get these tool chains, you'll find they build as riscv32-corev-elf-gcc, with corev as the vendor. And that's absolutely standard; it's been around forever. You can see in the GCC build, for example, there are variants of SPARC and RISC-V for different manufacturers, for people who make radiation-hardened versions for space and so forth. That all works fine. RISC-V is designed for this; it's extensible, that's the whole point. There is a missing piece of the jigsaw, which is that in the RV32 psABI specification you have relocations. There are 256 possible relocation values, and the top 64 of those are reserved for vendors. That's probably enough for any one vendor, but it's not enough for all vendors, and it requires a centralised way of controlling them. We know how to solve this problem: every time you need a relocation, to say that this bit of assembler needs its memory offset adjusted at link time, you put down two relocations. The first says which vendor you are, and that's just one new relocation with a 32-bit value, so we can have four billion vendors. The second says which of those 64 vendor relocations you mean, so there's a full set of vendor relocations for every vendor. So we know the concept.
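The two-relocation scheme just described can be sketched as data. The record layout and the relocation numbers below are invented purely for illustration; the real encoding is whatever the psABI group eventually ratifies:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical relocation record: a type plus one 32-bit payload. */
typedef struct {
    uint32_t type;
    uint32_t value;
} Reloc;

enum {
    R_VENDOR_MARKER = 191, /* invented number: "vendor id follows"     */
    R_VENDOR_BASE   = 192  /* invented: start of the 64 vendor slots   */
};

/* Emit the pair: first a relocation naming the vendor (a 32-bit id,
 * so the vendor namespace is effectively unlimited), then the
 * vendor-local relocation drawn from the 64 reserved slots.  The
 * linker reads the marker to decide how to interpret the record
 * that follows it. */
static size_t emit_vendor_reloc(Reloc out[2], uint32_t vendor_id,
                                uint32_t vendor_slot /* 0..63 */) {
    out[0] = (Reloc){ R_VENDOR_MARKER, vendor_id };
    out[1] = (Reloc){ R_VENDOR_BASE + vendor_slot, 0 };
    return 2;
}
```

The design choice is that the 64 reserved values stay per-vendor rather than being handed out centrally, because the preceding marker record disambiguates which vendor's numbering applies.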
One of my other team, Pietra Ferrara, who's sitting somewhere in the audience, is doing the proof of concept to demonstrate that this works. It turns out the GNU linker is rather running at its limits with the complexity of RISC-V, so it's not a completely trivial task, but we need it before we can fully upstream all this. The rest is all ready to go, and it's all done to upstream standards. There's another thing we found. You noticed I showed you two versions of the CV32E40P, and they've got different instruction encodings. We thought it would be good to support both encodings, and when you specify an architecture you're allowed to specify, say, rv32imac_xelw, and then append 1p2 to say I want version 1.2. That's all part of the standard way you name an architecture, but it turns out it's not supported in the GNU assembler, and furthermore the GNU assembler is not written in such a way that it's ever going to be easy to support. We gave up on that; in fact, we're only going to support the latest version, and that probably ties in with the way RISC-V International is going. So those, if you like, are the key issues we're addressing. On upstreaming, we're almost certainly going to upstream the ISA extensions that don't need vendor-specific relocations, and we'll put the others up once the vendor-specific relocations are ratified by the psABI group. So: get involved. The projects are all on GitHub.
The Open Hardware Group has its own repository, and if you don't [28:19.240 --> 28:23.160] like building from source, you can go to the Embecosm website and download pre-built [28:23.160 --> 28:29.480] tool chains for GCC and LLVM, for CORE-V, for every operating system under the sun: [28:29.480 --> 28:38.160] all flavours of Linux, Mac, Windows, whatever. So, get involved. Each of these projects has [28:38.160 --> 28:45.760] a project lead. Charlie Keeney leads the LLVM project. Chun Yu-Liao from PLCT (remember [28:45.760 --> 28:49.280] I said how there are different versions for the different variants?) is in charge [28:49.280 --> 28:55.800] of the specific project for CV32E40Pv2. Nandi Jamnadas, who you heard from just now, leads [28:55.800 --> 29:03.400] the GNU tools project and is also responsible for the CV32E40Pv2. Wei Wei Li from PLCT runs [29:03.400 --> 29:08.080] the QEMU project, and I'm responsible for the Verilator modelling, because I'm a Verilator [29:08.080 --> 29:15.320] guy. Part of this is about bringing on a new generation. We actually help a new generation [29:15.320 --> 29:20.600] along and train them. So, there are half-hour calls. I'm sorry about the time if you live in America, [29:20.600 --> 29:25.640] because most of the people involved are either in China or in Europe, so they're on Friday [29:25.640 --> 29:29.560] mornings. There's a half-hour call on LLVM run by Charlie, and there's a half-hour call [29:29.560 --> 29:34.360] on GNU run by Nandi. And the idea is that we'll review people's work collectively. We'll review [29:34.360 --> 29:39.400] their pull requests, and it's as much a training and learning thing as anything. So, if you [29:39.400 --> 29:44.840] want to get into this stuff, it's actually quite a good way to get a bit of free training. [29:44.840 --> 29:48.720] And that's it. So, that's me. That's Embecosm. That's the Open Hardware Group. Thank [29:48.720 --> 29:59.240] you very much.
So, we've got a few minutes for questions. I'm happy to take any questions. [29:59.240 --> 30:00.240] Yes? [30:00.240 --> 30:19.240] Yes? Yes. So, I'm working in a hardware research group at a university. We do a lot of tape-outs. [30:19.240 --> 30:20.240] Previously, we've always used the PULP cores from ETH Zurich or the University of Bologna. But sometimes [30:20.240 --> 30:21.240] that gives us some trouble because, for example, I'm doing compiler development right now, and [30:21.240 --> 30:22.240] then last week I discovered that there was a bug in GDB and nobody is working on GDB anymore [30:22.240 --> 30:30.240] for the specific version that we tape out. So, I was just wondering, do you maybe have [30:30.240 --> 30:36.240] a time frame for the upstreaming of these extensions? And if tomorrow [30:36.240 --> 30:41.240] we do a tape-out, should I tell my colleagues to use an Open Hardware Group core, or should I tell [30:41.240 --> 30:46.240] them to stay with the PULP cores in general? [30:46.240 --> 30:50.480] Okay. So, for the recording: the question was about, if you're working on [30:50.480 --> 30:55.080] the ETH PULP cores, which are still there as fantastic research cores, should you use [30:55.080 --> 31:01.880] the old PULP compiler or should you use the CORE-V compiler? So, I think there's not a [31:01.880 --> 31:07.640] black-and-white answer to that. The PULP compiler is a fork of GCC from 2017. So, it's quite [31:07.640 --> 31:11.680] a long way out of date, and that means it hasn't got the latest RISC-V stuff in there. When we [31:11.680 --> 31:15.840] started on the GCC for this, we actually looked at whether we could roll that forward, and [31:15.840 --> 31:24.040] it wasn't a sensible starting point. We started from scratch from the latest GCC.
So, in terms [31:24.040 --> 31:28.760] of which core you use, I believe ETH Zurich is slowly moving over to using the CORE-V [31:28.760 --> 31:33.720] cores more, because you may as well use these hardened [31:33.720 --> 31:39.680] cores. In that case, the obvious thing is to use the CORE-V tool chains, and though they're [31:39.680 --> 31:44.120] not yet upstream, they're all public, and there are pre-compiled ones you can pull [31:44.120 --> 31:48.720] down. There is a problem if you're using the old PULP cores, because, remember, I talked [31:48.720 --> 31:56.280] about that version 1 and version 2. The old PULP things are so old that they predate the finalization [31:56.280 --> 32:02.000] of the RISC-V encoding space, and actually the instruction encodings trample on future [32:02.000 --> 32:09.000] encoding spaces for RISC-V. So, version 2 fixes all that, and all the version 2 instruction [32:09.000 --> 32:16.360] encodings are now RISC-V compliant. They sit in the custom-0/1/2/3 blocks. What that [32:16.360 --> 32:20.200] means is you can't use this compiler to compile for the old PULP encodings, because we haven't got the version 1 stuff, due to [32:20.200 --> 32:24.880] the versioning issue I talked about. So, that might [32:24.880 --> 32:29.800] be a factor you have to bear in mind there. But the old compiler, I've looked at the old [32:29.800 --> 32:36.040] compiler, and it comes down to this: it's a research compiler. It wasn't designed to be tested; it [32:36.040 --> 32:40.280] was designed to prove concepts, and I feel very strongly that that's the job of [32:40.280 --> 32:45.800] universities, to prove concepts, not to do the exhaustive testing we do; it's a different purpose. So, it's a different [32:45.800 --> 32:50.800] type of compiler, but it does mean that occasionally you get weird behaviour. Yeah, so I haven't [32:50.800 --> 32:55.200] really answered the question, but I've given you the decision points to look at.
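For reference, the custom blocks mentioned here are the four major opcodes (bits 6:0 of a 32-bit instruction word) that the RISC-V unprivileged ISA's base opcode map sets aside for vendor extensions, which is why version 2 encodings placed there cannot collide with future standard extensions. A minimal sketch of checking which block an instruction word falls into:

```python
# The four major opcodes reserved for custom extensions in the
# RISC-V base opcode map (bits [6:0] of a 32-bit instruction).
CUSTOM_OPCODES = {
    0b0001011: "custom-0",
    0b0101011: "custom-1",
    0b1011011: "custom-2",  # marked custom-2/rv128 in the spec
    0b1111011: "custom-3",  # marked custom-3/rv128 in the spec
}

def custom_block(insn):
    """Return which custom block a 32-bit instruction word sits in, or None."""
    return CUSTOM_OPCODES.get(insn & 0x7F)
```

Anything outside these four opcodes belongs to the standard encoding space, which is exactly what the pre-finalization version 1 encodings trampled on.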
I'd love [32:55.200 --> 33:02.280] you to use CORE-V, because then you'd be tempted to join in and help here. Any more questions? [33:02.280 --> 33:09.280] Yes, right at the back. [33:09.280 --> 33:20.280] Absolutely. Yes, I should have said, yeah. So, we have a lot of projects under there, [33:20.280 --> 33:24.160] and in that roadmap I showed, if you look closely you'll see the dates are [33:24.160 --> 33:27.640] all wrong because some of these have moved out, and we've got a load of projects like [33:27.640 --> 33:32.240] the Tristan project that we heard of earlier which are under the Open Hardware Group. And [33:32.240 --> 33:40.680] for those of you who use David and Sarah Harris's textbook for design, okay, the Wally processor [33:40.680 --> 33:46.160] is being re-implemented as a RISC-V processor, and that is being done under the Open Hardware [33:46.160 --> 33:51.640] Group. So, your next generation of textbook will have an Open Hardware Group Wally processor [33:51.640 --> 33:58.440] in it. So, yeah, there's more than just those cores I mentioned. And if you are working [33:58.440 --> 34:02.720] on a core and you think you might want to put it in this framework, come and talk to [34:02.720 --> 34:05.840] one of us. You can talk to the director, Rick O'Connor, or if you don't know him, come to me and I will [34:05.840 --> 34:06.840] introduce you. Yes? [34:06.840 --> 34:14.400] I only have a stupid question. So, I work mostly on applications, actually, and in our development [34:14.400 --> 34:20.000] we're starting to converge: developers and testers are sort of converging [34:20.000 --> 34:25.640] into one team. Now, you were saying that you actually have this split where some people [34:25.640 --> 34:30.640] do the cores and others do the verification. Would it also be possible for those to converge at [34:30.640 --> 34:31.640] some point?
[34:31.640 --> 34:36.080] So, the question, which I can paraphrase, is why do we have a [34:36.080 --> 34:41.720] separate cores task group and verification task group? They do work very closely together. [34:41.720 --> 34:45.920] This is specifically about hardware verification; it's not about software verification, where I think [34:45.920 --> 34:50.840] the argument is completely different, and for the software, verification and development [34:50.840 --> 34:57.760] are closely integrated. I think because hardware verification is so formally structured, there [34:57.760 --> 35:02.200] is actually a case to be made for keeping them separate and having the design team and [35:02.200 --> 35:07.960] the verification team distinct. So, it sort of makes sense. I'm really a software guy; [35:07.960 --> 35:13.200] I'm not an expert on hardware. But it does sort of make sense. The two teams work [35:13.200 --> 35:18.760] very closely together, but the split allows one team to focus on the UVM-based test and verification [35:18.760 --> 35:27.760] flow and another to work on the actual implementation of the chip. Any more questions? Okay. Thank [35:27.760 --> 35:32.440] you all very much. That brings the RISC-V dev room to an end, and I hope you enjoyed it. [35:32.440 --> 35:33.440] Thank you. [35:33.440 --> 35:43.440] Thank you.