Thank you all very much. Thank you for coming along. My name's Jeremy Bennett. I'm Chief Executive of Embecosm. Embecosm is an engineering-heavy company: we only have one full-time non-engineer in the whole company, and it's not me, so I actually am a working engineer as well as running the company. We develop open source, mostly compiler tool chains, but we also do open source AI, we have some open source operating system work, and, because most of what we do is pre-silicon, we do an awful lot of open source silicon chip modelling. But I'm also here with another hat on, which is that I am chair of the OpenHW Group's software task group, so I'm responsible for all the software developed for the OpenHW Group, and I'll talk a bit more about them in a bit. This talk is part technical, but it's partly about the practical side of how we go about developing complex software for an open architecture like RISC-V. So let me tell you a bit about the OpenHW Group. It's a not-for-profit, member-driven collaboration, and it's a mixture of industry (companies like mine, and some big companies that you will recognise; NXP is a member, for example), academics, so quite a few universities are members and part of it, and individual members. You can contribute as an individual, and some of the work I'm going to talk about has been done by people who are individual members.
And the goal is high-quality open source hardware development, and that high quality is the key thing: the sort of open source hardware that you can put in a commercial chip and be confident you can send off to be fabricated. It's collaborative and it's open, an open development model, so all these things are open to all. Now, the organisation is the OpenHW Group, but its cores are known as CORE-V. We have a huge family of processors, which I'll talk about a bit, everything from the smallest 32-bit RISC-V designs to the biggest 64-bit RISC-V designs. These are standard RISC-V cores, but with some custom ISA extensions. The chief executive is Rick O'Connor, and those of you who've been around for a few years will remember Rick, because he was the first chief executive of RISC-V International, and he's moved from the open specification world to actually delivering real silicon IP. Let's look at the ancestry. The OpenHW Group grows out of an academic-industry collaboration called the Parallel Ultra-Low Power (PULP) project, originally between ETH Zürich, the University of Bologna and STMicroelectronics. STMicroelectronics is no longer active in it, and it predates RISC-V: the first part of PULP was done with OpenRISC, but for many years now it has been a RISC-V project. The idea is to build very, very low-power multi-core systems.
So that's where we come from. The cores started off as academic research cores, and the point about academic research cores is that they are designed to push the forefront of knowledge forward. They're not designed to be exhaustively tested for manufacturing and use in commercial deployment; that's not the purpose of a university. So the natural transition is that the OpenHW Group takes that outstanding technical base and takes it to a robust standard. We have loads and loads of members. I've copied this off the website, and it's the wrong format (I really want a wider, flatter one), but you'll probably see some logos there; you'll see Embecosm's logo. The astute among you will notice that Amazon Web Services appears to be both a member and a partner: I think they transitioned from partner to member, and the slide wasn't properly updated. So we have lots and lots of members; I think it's up to about 70 now. And it might be worth saying that when you become a member, yes, if you're a corporate you have to pay a membership fee, but that's not the big thing. You can only become a member if you commit resource in terms of what you're going to contribute, and that dwarfs any membership fee. You cannot be a member unless you're going to do something; we don't have sleeping members, you've got to be an active member. In terms of organisation it's very lightweight, and this is one of the big contrasts with RISC-V International: at a technical level, we only have five committees.
We have an overarching technical working group, which has overall responsibility for the engineering direction. It has co-chairs, Gerard Vamont from Thales and David Lynch (I can't remember where David Lynch comes from, actually), joint chairs from two companies, and that meets every month as the final arbiter of technical matters. Then all the work of the organisation is handled by just four task groups. The cores task group, headed up by Arjan Bink from Silicon Labs, is responsible for tracking the development of the cores. The work is done by the member companies, but these are the groups that have oversight and make sure the quality criteria are maintained. There's a verification task group, and verification is actually separated from core development, although the two are tightly tied together; that's led by Simon Davidmann of Imperas. There's a hardware task group, and that's a slightly strange name, perhaps, but the point of the hardware task group is that, though this is fundamentally a group developing silicon IP, we do have to have reference implementations, and the first reference implementation should be coming out later this year, which is the CORE-V MCU, and that has one of the 32-bit cores in it. And lastly, the software task group, which I lead, assisted by Yun Haisheng from Alibaba T-Head, and that's responsible for all the software projects. Again, it's oversight; it's not doing the work, because as a member you have to do the work.
We do have a bit of a problem there, because we have mostly hardware members and not enough software members. In terms of the roadmap, we've got a flagship application core, the CVA6, a 64-bit full-blown RISC-V core designed to run Linux, and that comes out of the top-end PULP core under development. There is a smaller 32-bit core, the CV32A5, which is designed to try and do a small Linux system, though of course we've got the issue of getting Linux running on 32-bit at all. You can probably only just make it out, but each of these projects has a target technology readiness level. Are people familiar with technology readiness levels? A little bit. Most of these are aimed at technology readiness level 5, which is proven in the environment. CVA5 is a bit different: it's aimed at TRL 4, which is proven in the lab. We've got other projects at different levels, but mostly we're aiming at TRL 5. The flagship, the first project, was the CV32E40P, a microcontroller-class 32-bit RISC-V implementation. The first version of that is complete, the second version is under development, and the work Nandni was talking about with built-ins is primarily focused at the CV32E40P v2.
You'll see sitting there something called the CV32E41P, and that's a bit unusual because it's only going to TRL 3, which is proof of concept. It's being developed by Huawei under this group in order to test out the Zc* extensions, the new compressed extensions, and Zfinx, where you have a shared register bank for floating point and integer. It's only a proof-of-concept chip to verify that those work. Then we have a couple of more forward-looking ones. CV32E40S is a version of the original CV32E40P aimed at security applications, and the really exciting one is the CV32E40X. That's a bit further out, because the X is a generic extension interface at the hardware level: it's designed so you can take the core and really easily add a wide range of extensions, indeed to the extent that you can do the floating point extension through the CV32E40X. So that's the roadmap we're sitting on. What about the software projects? Well, I'm going to focus on the tool chains, the LLVM tool chain and the GNU tool chain, and, because we haven't yet got silicon, on a couple of modelling projects, QEMU and the Verilator model. We have other projects; I'm responsible for eight projects in total. There's the SDK, the software development kit, which is actually joint with the hardware group, and the hardware abstraction layer, so that we can make our software more easily portable as we do more and more of these chips.
FreeRTOS: we're microcontroller class, so if you're going to go for an RTOS, the obvious one to start with is FreeRTOS. And Linux, which is really aimed at the CVA6. So we've got eight projects under our belt. We have a rigorous engineering process. Those of you who work for big engineering companies will be familiar with gate-based processes; all of the way we manage these projects is through a gate-based process. You have a project concept gate: that's where you propose a project. You want to work on this project, so you explain what it is and why we need to do it, and the project will only go ahead if it's voted for by enough members. And members can't just vote and say, hey, that's a good idea, I'll say yes to everything: if you vote for it, you're going to commit resource to it. The critical thing is that this is a doing organisation. So you then go and explore it, and then we get to a project launch gate. At that stage, not only do you know what and why, but now you know how you're going to do it: what do I need to do to get this? And then the big one, the last one, is plan approved. Plan approved means you've resourced it, so you know when you're going to deliver it, and that's quite a big hurdle for some projects. And then away the project work goes (this is a bit simplified; we all know work starts a bit in advance), and eventually you get to project freeze, where it's all done. Now, this is brilliant for hardware.
It's a very hardware-centric view of the world, because a freeze means something: it's when your chip's gone off to be fabbed. It doesn't work quite so well for software, so we've modified it for software. The first two stages are quite generic, because typically we're working with a big block of common software. We're not writing a compiler from scratch; we're taking the GNU infrastructure, GCC, Binutils and all that, millions of lines of it, and most of that is unchanged while we're changing a bit. So the what, the why and the how are quite generic, probably common to all CORE-V chips, certainly to all the 32-bit ones. And then for each specific chip we have multiple plan-approved gates, which is where we work out when we're going to deliver it: for CV32E40P v1, for v2, for the S, for the X and so forth. So we have a whole set, but it's still the same process: you need to know it's properly resourced and so forth. We'll see that in action a little bit, but that whole engineering focus pervades everything. So let's set the context: what's in a compiler tool chain? It's not just the compiler: it's the assembler, the low-level utilities, the debugger, the libraries, the emulation libraries, the standard C library, the C++ libraries. And in the ultimate world, if you look at GCC, it's many, many languages: Ada, the C and C++ family, Fortran, Java, Go; these days Rust and Modula-2.
And of course we've got things that sit at the high level, like OpenMP and OpenACC. That's a huge amount of stuff, and I hasten to say we're a long way from having all of that for CORE-V. But it's a lot of code: it's north of 12 million lines, and it's a year or so since I last measured those figures. So it's big, and we're trying to get all of that to work seamlessly on CORE-V. Now, we're not doing it from scratch; of course, we're starting from RISC-V and then adding stuff to it. And LLVM has the same components, but with different names and a different set of languages. So let's look at the CORE-V ISA extensions. There are nine we're concerned with, and eight of those come from the PULP project: extra addressing modes (post-incrementing load and store), hardware loops, more ALU operations, some special-case branching operations, MAC instructions, the event load (that's a multi-core feature), and then the PULP bit manipulation and the PULP SIMD extensions. Those are not the standard bit manipulation and SIMD; those are different ones, with years of development behind them. The reason they're different is that they predate the RISC-V bit manipulation extension and the RISC-V SIMD extension. And you can see the PULP SIMD is big, and the reason Nandni knows so much about built-ins is that she's done the built-ins to support those 220 SIMD instructions.
So she's nearly finished, and then I think she's going away for a long holiday where she never looks at another built-in again. And we've got Zc*. We had the first GCC implementation supporting Zc*. Of course, these are standard RISC-V compilers; you can still use them for plain RISC-V. And the hot news is that CORE-V GCC has a pull request to add Zc* 1.0.1, which is the freeze candidate; once that's been reviewed, it will go in. So we've got a lot, and the tool chain work is all about supporting these PULP extensions. In terms of the built-ins, you've heard all about them, but it's a lot of functions, and it's not just Nandni; Nandni has a team working on this. And we've got a naming convention: built-ins for RISC-V are __builtin_riscv_<vendor>_<name>. We've got so many that we're actually splitting that into <ISA extension> and <name>. There is a rule, though: if your built-in corresponds to a standard built-in, then you must use the standard name. So, for example, in our arithmetic we have abs, so we use __builtin_abs, which is a standard GCC built-in. The built-ins actually have the same name for 32-bit or 64-bit. That is not overloading in the C++ sense, because either you're compiling for a 32-bit target or for a 64-bit target; you have one or the other, so there's no clash. It's also the case that built-ins are not just another way of writing inline assembler.
There's not a one-to-one mapping. For example, for the SIMD add-scalar, there are actually two different ways of adding a scalar in the CORE-V SIMD: one where the scalar is in a register, and one where it's a small integer, for which there's actually an add-immediate instruction. We don't have two built-ins. There's a single built-in, and if the second argument is a small constant that fits, it will generate the immediate form of the instruction; otherwise it will load the value into a register. That's part two of Nandni's talk for the future, because that's quite a lot harder to do in a built-in. There is a specification; it's big, 57 pages long if you generate a PDF. That's one of the things about built-ins: they're not quite as easy as you think. There are things you can get wrong, and you genuinely have to think and review. It's under review at the moment; it's not finalised. So, testing. Now, if we're going to do full testing of a tool chain, we need a target which has all these ISA extensions. They're not standard RISC-V, so I can't just take standard RISC-V QEMU or whatever. You can do some testing: the standard GNU assembler tests, for example, don't need an executable target, they are pattern matching (have you generated something that looks right?), and you can do the same with built-ins.
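As a concrete illustration of the two points above (the standard-name rule for abs, and a single built-in covering both the register and the immediate encodings of the SIMD add-scalar), here is a portable C model of the semantics involved. The function names and the two-halfword lane layout are assumptions for illustration only; a real CORE-V tool chain would map these operations to single instructions.

```c
#include <stdint.h>

/* abs corresponds to a standard GCC built-in, so the rule is to use
 * the standard name (__builtin_abs) rather than invent a vendor one.
 * Its reference semantics: */
static int32_t ref_abs(int32_t x) {
    return x < 0 ? -x : x;
}

/* Model of a SIMD "add scalar" over two 16-bit lanes packed into a
 * 32-bit word: the scalar is added to each lane independently, with
 * wraparound in each lane.  At the source level there is one built-in;
 * whether the compiler emits the add-immediate encoding (small
 * constant) or the register encoding is invisible to the programmer,
 * because the semantics are identical. */
static uint32_t ref_add_sc_h(uint32_t vec, int16_t scalar) {
    uint16_t lo = (uint16_t)(vec & 0xFFFFu);
    uint16_t hi = (uint16_t)(vec >> 16);
    lo = (uint16_t)(lo + (uint16_t)scalar);
    hi = (uint16_t)(hi + (uint16_t)scalar);
    return ((uint32_t)hi << 16) | lo;
}
```

An execution test for a built-in would then compare the instruction's result against a reference model like this.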
You saw that in Nandni's compile-time-only testing, where you do a scan-assembler to check: has that built-in generated just the one instruction I want? But more generically, you need to be able to execute your tests. So we have two things. One is QEMU for CORE-V. That's a project being led by Weiwei Li of the Programming Language and Compiler Technology (PLCT) lab at the Chinese Academy of Sciences in Beijing. That's a work in progress; we're expecting it to become available later in 2023. And secondly, we're using Verilator models. How many people here are familiar with Verilator, hands up? Okay, most of you. For those who aren't: it's a tool that takes a hardware design in Verilog or SystemVerilog and generates a cycle-accurate C++ model from it. Verilator models are really useful because, first, they're easy to integrate into tool chain testing and, second, they are what's called implementation models. Most models are built from the specification of the chip; these are built from the actual implementation, so we know that what we're testing is what physically goes on the chip. And it's not enough just to have a model; you've got to be able to hook into it. So typically you wrap it in some form of debug server so you can connect GDB or LLDB, and that's a work in progress, due to be completed in the next few weeks. Then you'll be able to actually run on a model of the actual hardware. Testing policy: the LLVM project uses the LLVM integrated tester.
That is a set of several tens of thousands of tests, from source code down to LLVM IR: very comprehensive, and we use it. But it isn't a set of execution tests. Now, LLVM does have an executable test suite, but it is a set of applications to run under an operating system, and for a small microcontroller they're not suitable; they need an operating system there. So we can't use the LLVM test suite. Instead, we use a subset of the GNU regression tests to test the LLVM compilers, and that's widespread; it's not something we've invented, it's been done for years. It's only a subset because there is no point in running the GNU tests of the internal representations inside the GNU compiler, but things like the torture tests are absolutely fine whether they're run on LLVM or GCC; they're compiler-agnostic. The GNU tools just use the GNU regression tests. Something that we're very hot on is exhaustive testing. It's not just "oh, I tried one thing and it seemed to work"; it's "let's make sure we've not missed things". So, starting at the assembler, we have both positive and negative testing. By positive testing we mean testing that the tools do what you want; by negative testing I mean testing that they don't do things when they shouldn't.
So, for example, if we're looking at an instruction that takes a six-bit signed constant, we will test it with minus 33 (too small a negative number), minus 32 (the biggest negative six-bit number), zero (you should always test zero, because it's a special case), probably minus seven and plus five, and then 31 (the biggest positive six-bit number) and 32, which is too big. And we check those bounds. Actually, we added those tests for Zc* and found a whole load of bugs in the Zc* spec as a consequence. So we do that sort of thing. And we also test the extensions themselves. I've got the ELW instruction, so I test that the ELW instruction is handled by the assembler when I specify the xelw extension, and I also test that it doesn't get recognised when I don't specify it. That's really important. One thing I would observe, and I've seen this for a long time, is that RISC-V is incredibly weak on its assembly-level testing: there are about ten times as many tests for Arm and x86 as there are for RISC-V. So it's important to do that. And we do CORE-V-specific ld testing (we're a vendor), because we're adding some relocations: do they work? At the moment we've got compilation-only tests of built-ins, using scans for assembler instructions, but when we've got those models running, the QEMU and the Verilator model, we'll be adding execution tests: not only do I generate the right assembly instruction, but it does what I expect.
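Those boundary values follow directly from the range of an n-bit signed immediate, which a small helper makes explicit. This is a sketch of the test-selection reasoning, not code from the actual test suite:

```c
/* Does 'value' fit in a 'bits'-wide signed immediate field?
 * An n-bit two's-complement field holds -(2^(n-1)) .. 2^(n-1)-1,
 * so for n = 6 that is -32 .. 31.  The negative tests sit just
 * outside the range (-33 and 32), the positive tests sit on the
 * bounds, at zero, and at a couple of interior values. */
static int fits_simm(long value, int bits) {
    long min = -(1L << (bits - 1));
    long max = (1L << (bits - 1)) - 1;
    return value >= min && value <= max;
}
```

Writing the check this way makes it easy to generate the same boundary cases for every immediate width an instruction set uses.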
Not only do I generate the built-in, but it does what I expect. So we'll be adding those in as well. And all that testing ties into the difference that this is a commercial-grade core and its associated tool chains. So, resourcing. Resourcing is an issue, because these days on a chip you'll spend twice as much on the software as you do on the hardware, but the OpenHW Group is inherently mostly hardware company members. We've got plenty of software members, but we're still in a minority, so it is a challenge to get enough resourcing. Embecosm makes a lot of contribution as part of our membership, and the PLCT lab in China makes a big contribution; that's where it's coming from. And we actually double up: we use this very much as part of our graduate training programme as well, so it allows us to train a new generation of compiler engineers. But we're also seeing companies like Silicon Labs and Dolphin Design step up, because they help to fund those software companies to do the work. So thank you to both of those, and we do need more of that to come along, and I expect it will. Second: we are not going to maintain out-of-tree forks of GCC and LLVM. It's a thankless task, and it takes up a lot of time. The goal is to get upstream as vendor extensions. This is not a new thing; this has long been part of GCC and part of LLVM.
And when you have that triple that says what your target is, the thing in your standard compiler that says riscv32-unknown-elf-gcc or whatever, that "unknown" is the vendor field, and we should be using it. So if you get these tool chains, you'll find they build as riscv32-corev-elf-gcc, with corev as the vendor. And that's absolutely standard; it's been around forever. You can see in the GCC build, for example, there are variants of SPARC and RISC-V for different manufacturers, for people who make radiation-hardened versions for space and so forth. That all works fine. RISC-V is designed for this; it's extensible, that's the whole point. There is a missing piece of the jigsaw, which is that in the RV32 psABI specification you have relocations. There are 256 possible relocation values, and the top 64 of those are reserved for vendors. That's probably enough for any one vendor, but it's not enough for all vendors, and it requires a centralised way of controlling them. We know how to solve this problem: every time you need a relocation, to say that this bit of assembler needs its memory offset adjusted at link time, you put down two relocations. The first says which vendor you are, and that's just one new relocation with a 32-bit value, so we can have four billion vendors. The second says which of those 64 vendor relocations you mean, so there's a full set of vendor relocations for every vendor. So we know the concept.
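The two-relocation scheme just described can be sketched as data. The record layout and the relocation numbers below are invented purely for illustration; the real encoding is whatever the psABI group eventually ratifies:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical relocation record: a type plus one 32-bit payload. */
typedef struct {
    uint32_t type;
    uint32_t value;
} Reloc;

enum {
    R_VENDOR_MARKER = 191, /* invented number: "vendor id follows"     */
    R_VENDOR_BASE   = 192  /* invented: start of the 64 vendor slots   */
};

/* Emit the pair: first a relocation naming the vendor (a 32-bit id,
 * so the vendor namespace is effectively unlimited), then the
 * vendor-local relocation drawn from the 64 reserved slots.  The
 * linker reads the marker to decide how to interpret the record
 * that follows it. */
static size_t emit_vendor_reloc(Reloc out[2], uint32_t vendor_id,
                                uint32_t vendor_slot /* 0..63 */) {
    out[0] = (Reloc){ R_VENDOR_MARKER, vendor_id };
    out[1] = (Reloc){ R_VENDOR_BASE + vendor_slot, 0 };
    return 2;
}
```

The design choice is that the 64 reserved values stay per-vendor rather than being handed out centrally, because the preceding marker record disambiguates which vendor's numbering applies.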
One of my other team, Pietra Ferrara, who's sitting somewhere in the audience, is doing the proof of concept to demonstrate that this works. It turns out the GNU linker is rather running at its limits with the complexity of RISC-V, so it's not a completely trivial task, but we need it before we can fully upstream all this. The rest is all ready to go, and it's all done to upstream standards. There's another thing we found. You noticed I showed you two versions of the CV32E40P, and they've got different instruction encodings. We thought it would be good to support both encodings, and when you specify an architecture you're allowed to specify, say, rv32imac_xelw, and then append 1p2 to say I want version 1.2. That's all part of the standard way you name an architecture, but it turns out it's not supported in the GNU assembler, and furthermore the GNU assembler is not written in such a way that it's ever going to be easy to support. We gave up on that; in fact, we're only going to support the latest version, and that probably ties in with the way RISC-V International is going. So those, if you like, are the key issues we're addressing. On upstreaming, we're almost certainly going to upstream the ISA extensions that don't need vendor-specific relocations, and we'll put the others up once the vendor-specific relocations are ratified by the psABI group. So: get involved. The projects are all on GitHub.
The Open Hardware Group has its own repository, and if you don't [28:19.240 --> 28:23.160] like building from source, you can go to the Embecosm website and download pre-built [28:23.160 --> 28:29.480] tool chains for GCC and LLVM, for CORE-V, for every operating system under the sun: [28:29.480 --> 28:38.160] all flavours of Linux, Mac, Windows, whatever. So, get involved. Each of these projects has [28:38.160 --> 28:45.760] a project lead. Charlie Keeney leads the LLVM project. Chun Yu-Liao from PLCT (remember [28:45.760 --> 28:49.280] I said how there are different versions for the different variants?) is in charge [28:49.280 --> 28:55.800] of the specific project for CV32E40Pv2. Nandi Jamnadas, who you heard from just now, leads [28:55.800 --> 29:03.400] the GNU tools project and is also responsible for the CV32E40Pv2. Wei Wei Li from PLCT runs [29:03.400 --> 29:08.080] the QEMU project, and I'm responsible for the Verilator modelling, because I'm a Verilator [29:08.080 --> 29:15.320] guy. Part of this is about bringing on a new generation. We actually help a new generation [29:15.320 --> 29:20.600] along and train them. So, there are half-hour calls. I'm sorry about the time if you live in America, [29:20.600 --> 29:25.640] because most of the people involved are either in China or in Europe, so they're on Friday [29:25.640 --> 29:29.560] mornings. There's a half-hour call on LLVM run by Charlie, and there's a half-hour call [29:29.560 --> 29:34.360] on GNU run by Nandi. And the idea is that we'll review people's work collectively. We'll review [29:34.360 --> 29:39.400] their pull requests, and it's as much a training and learning thing as anything. So, if you [29:39.400 --> 29:44.840] want to get into this stuff, it's actually quite a good way to get a bit of free training. [29:44.840 --> 29:48.720] And that's it. So, that's me. That's Embecosm. That's the Open Hardware Group. Thank [29:48.720 --> 29:59.240] you very much.
So, we've got a few minutes for questions. I'm happy to take any questions. [29:59.240 --> 30:00.240] Yes? [30:00.240 --> 30:19.240] Yes? Yes. So, I'm working in a hardware research group at a university. We do a lot of tape-outs. [30:19.240 --> 30:20.240] Previously, we've always used the PULP cores from ETH Zurich or the University of Bologna. But sometimes [30:20.240 --> 30:21.240] that gives us some trouble because, for example, I'm doing compiler development right now, and [30:21.240 --> 30:22.240] then last week I discovered that there was a bug in GDB and nobody is working on GDB anymore [30:22.240 --> 30:30.240] for the specific version that we tape out. So, I was just wondering, do you maybe have [30:30.240 --> 30:36.240] a time frame for the upstreaming of these extensions? And if tomorrow [30:36.240 --> 30:41.240] we do a tape-out, should I tell my colleagues to use an Open Hardware Group core, or should I tell [30:41.240 --> 30:46.240] them to stay with the PULP cores in general? [30:46.240 --> 30:50.480] Okay. So, for the recording: the question was about, if you're working on [30:50.480 --> 30:55.080] the ETH PULP cores, which are still there as fantastic research cores, should you use [30:55.080 --> 31:01.880] the old PULP compiler or should you use the CORE-V compiler? So, I think there's not a [31:01.880 --> 31:07.640] black-and-white answer to that. The PULP compiler is a fork of GCC from 2017. So, it's quite [31:07.640 --> 31:11.680] a long way out of date, and that means it hasn't got the latest RISC-V stuff in there. When we [31:11.680 --> 31:15.840] started on the GCC for this, we actually looked at whether we could roll that forward, and [31:15.840 --> 31:24.040] it wasn't a sensible starting point. We started from scratch from the latest GCC.
So, in terms [31:24.040 --> 31:28.760] of which core you use, I believe ETH Zurich is slowly moving over to using the CORE-V [31:28.760 --> 31:33.720] cores more, because you may as well use these hardened [31:33.720 --> 31:39.680] cores. In that case, the obvious thing is to use the CORE-V tool chains, and though they're [31:39.680 --> 31:44.120] not yet upstream, they're all public, and there are pre-compiled ones you can pull [31:44.120 --> 31:48.720] down. There is a problem if you're using the old PULP cores, because, remember, I talked [31:48.720 --> 31:56.280] about that version 1 and version 2. The old PULP things are so old that they predate the finalization [31:56.280 --> 32:02.000] of the RISC-V encoding space, and actually the instruction encodings trample on future [32:02.000 --> 32:09.000] encoding spaces for RISC-V. So, version 2 fixes all that, and all the version 2 instruction [32:09.000 --> 32:16.360] encodings are now RISC-V compliant. They sit in the custom-0/1/2/3 blocks. What that [32:16.360 --> 32:20.200] means is you can't use this compiler to compile for the old PULP encodings, because we haven't got the version 1 stuff, due to [32:20.200 --> 32:24.880] the versioning issue I talked about. So, that might [32:24.880 --> 32:29.800] be a factor you have to bear in mind there. But the old compiler, I've looked at the old [32:29.800 --> 32:36.040] compiler, and it comes down to this: it's a research compiler. It wasn't designed to be tested; it [32:36.040 --> 32:40.280] was designed to prove concepts, and I feel very strongly that that's the job of [32:40.280 --> 32:45.800] universities, to prove concepts, not to do the exhaustive testing we do; it's a different purpose. So, it's a different [32:45.800 --> 32:50.800] type of compiler, but it does mean that occasionally you get weird behaviour. Yeah, so I haven't [32:50.800 --> 32:55.200] really answered the question, but I've given you the decision points to look at.
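For reference, the custom blocks mentioned here are the four major opcodes (bits 6:0 of a 32-bit instruction word) that the RISC-V unprivileged ISA's base opcode map sets aside for vendor extensions, which is why version 2 encodings placed there cannot collide with future standard extensions. A minimal sketch of checking which block an instruction word falls into:

```python
# The four major opcodes reserved for custom extensions in the
# RISC-V base opcode map (bits [6:0] of a 32-bit instruction).
CUSTOM_OPCODES = {
    0b0001011: "custom-0",
    0b0101011: "custom-1",
    0b1011011: "custom-2",  # marked custom-2/rv128 in the spec
    0b1111011: "custom-3",  # marked custom-3/rv128 in the spec
}

def custom_block(insn):
    """Return which custom block a 32-bit instruction word sits in, or None."""
    return CUSTOM_OPCODES.get(insn & 0x7F)
```

Anything outside these four opcodes belongs to the standard encoding space, which is exactly what the pre-finalization version 1 encodings trampled on.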
I'd love [32:55.200 --> 33:02.280] you to use CORE-V, because then you'd be tempted to join in and help here. Any more questions? [33:02.280 --> 33:09.280] Yes, right at the back. [33:09.280 --> 33:20.280] Absolutely. Yes, I should have said, yeah. So, we have a lot of projects under there, [33:20.280 --> 33:24.160] and in that roadmap I showed, if you look closely you'll see the dates are [33:24.160 --> 33:27.640] all wrong because some of these have moved out, and we've got a load of projects like [33:27.640 --> 33:32.240] the Tristan project that we heard of earlier which are under the Open Hardware Group. And [33:32.240 --> 33:40.680] for those of you who use David and Sarah Harris's textbook for design, okay, the Wally processor [33:40.680 --> 33:46.160] is being re-implemented as a RISC-V processor, and that is being done under the Open Hardware [33:46.160 --> 33:51.640] Group. So, your next generation of textbook will have an Open Hardware Group Wally processor [33:51.640 --> 33:58.440] in it. So, yeah, there's more than just those cores I mentioned. And if you are working [33:58.440 --> 34:02.720] on a core and you think you might want to put it in this framework, come and talk to [34:02.720 --> 34:05.840] one of us. You can talk to the director, Rick O'Connor, or if you don't know him, come to me and I will [34:05.840 --> 34:06.840] introduce you. Yes? [34:06.840 --> 34:14.400] I only have a stupid question. So, I work mostly on applications, actually, and in our development [34:14.400 --> 34:20.000] we're starting to converge: developers and testers are sort of converging [34:20.000 --> 34:25.640] into one team. Now, you were saying that you actually have this split where some people [34:25.640 --> 34:30.640] do the cores and others do the verification. Would it also be possible for those to converge at [34:30.640 --> 34:31.640] some point?
[34:31.640 --> 34:36.080] So, the question, which I can paraphrase, is why do we have a [34:36.080 --> 34:41.720] separate cores task group and verification task group? They do work very closely together. [34:41.720 --> 34:45.920] This is specifically about hardware verification; it's not about software verification, where I think [34:45.920 --> 34:50.840] the argument is completely different, and for the software, verification and development [34:50.840 --> 34:57.760] are closely integrated. I think because hardware verification is so formally structured, there [34:57.760 --> 35:02.200] is actually a case to be made for keeping them separate and having the design team and [35:02.200 --> 35:07.960] the verification team distinct. So, it sort of makes sense. I'm really a software guy; [35:07.960 --> 35:13.200] I'm not an expert on hardware. But it does sort of make sense. The two teams work [35:13.200 --> 35:18.760] very closely together, but the split allows one team to focus on the UVM-based test and verification [35:18.760 --> 35:27.760] flow and another to work on the actual implementation of the chip. Any more questions? Okay. Thank [35:27.760 --> 35:32.440] you all very much. That brings the RISC-V dev room to an end, and I hope you enjoyed it. [35:32.440 --> 35:33.440] Thank you. [35:33.440 --> 35:43.440] Thank you.