We'll get started. Sylwester will introduce us to PyPartMC.
Thank you for coming. I'm Sylwester Arabas. I work at the AGH University in Kraków, Poland. This is a project carried out together with a team from the University of Illinois Urbana-Champaign in the US. So PyPartMC is the highlight here, but from the perspective of this conference I should probably read the subtitle, namely: how to engineer a Python-to-Fortran binding in C++ for use in Julia and MATLAB, and why to do it.
The package that this tool interfaces with is called PartMC. It's a Monte Carlo simulation package for aerosol particles that are, for example, floating in the air. It's an open-source tool developed for more than 20 years at Urbana-Champaign. And just one line about the physics: usually it's a kind of box model, so it studies processes without a spatial context, but it also has an option to be coupled with the WRF weather simulation and forecasting model. So here is the HPC context. It simulates things like air pollution and its evolution due to collisions of particles, condensation, chemical reactions, et cetera. On the technical side, it's actually an object-oriented code base written using quite a classic subset of Fortran, but still in a very much object-oriented manner. Despite 20 years of heritage, it has a very comprehensive test suite, and I would say it could be an example of best practices in Fortran.
However, its usage poses several challenges, for example to students who intend to start off using it from a Jupyter notebook. These challenges are related, first of all, to multiple dependencies and the need to compile it. There is no ready workflow for getting updates. The automation of simulations, analysis, et cetera, usually involves shell scripts. The input and output are handled through multiple text files.
And to analyze output from these simulations, one usually needs to actually look at or use some of the Fortran code the simulation is based on. So the question that was posed when we started was how to bring together these two seemingly separate worlds. On the right-hand side there is the simulation package, PartMC, with its Fortran code base, a bit of C code, and different dependencies. And then there is the perspective of a modern student, let's say, who starts with Jupyter and expects basically everything to be importable and interoperable with other libraries, SciPy, NumPy, et cetera. So the goals would be to lower the entry threshold for installation and usage, to ensure that the same experience is available on different operating systems, and also to streamline the dissemination of studies based on the simulation tool, for example for peer review with scientific journals.
The status of the project as of now, of PyPartMC, these Python bindings, is that after two years of development we released version one, and it's on PyPI. We also published a description of the package in the SoftwareX journal. So we are kind of ready for a rollout.
Today I will talk more about the internals, and the internals start with pybind11. Even though we are talking about Python and Fortran, we actually picked pybind11, which is a C++ tool for developing Python packages, as our backbone. Here are some highlights. For those who are new to it, the project is quite a remarkable success, I would say, with over 300 contributors on GitHub, 2,000 forks and 14,000 stars. Congratulations to pybind11. And it's very useful. So this is how it fits into the picture: essentially, we developed, in C++, in C and in Fortran, so it's a triple-language project, something that uses pybind11 and a few other components to automate the building of PartMC and offer it as a Python package.
Probably what's also worth mentioning here is that most of the work on PyPartMC was around substituting the text-file input/output with a JSON-like, Python-native, let's say Pythonic, input/output layer. As I mentioned, the original project has an object-oriented structure, so we also tried to couple Python's garbage collector with the Fortran functions that are provided for creating and deallocating objects. There are many, many dependencies that the project has, in Fortran, in C, and in C++. Here, let me just mention that we picked Git submodules as a tool to pin the versions of these dependencies, which is useful because the pip install command is able to grab packages from a Git repository, and this includes all the submodules with their versions.
So let me now present a bit of code and how it looks from a user perspective. In this example, please don't look particularly at the individual lines of code, maybe just at the bulk of the code and the type of code. Here on the left we have the Fortran hello world for using the PartMC package, and on the right three text files that would be the minimum to start the simplest simulation. Now, this is the end result that uses the PyPartMC layer. Essentially the same can be obtained with a single file, starting with importing from the PyPartMC wrapper and then using this kind of JSON-like notation, which here is essentially lists and dictionaries that are wrapped.
One achievement, and one big advantage of using Python, is that by providing Python wrappers you are also catering to Julia users. For example, here, through the PyCall.jl package, essentially the same code and the same logic can be obtained for Julia users using PyPartMC. And finally, an example using MATLAB, which ships with a built-in Python bridge that also allows using PyPartMC to access the Fortran code from MATLAB.
So these three examples I've shown are actually part of our CI. We have them in the README file, and on CI we execute the Julia, the Python, the Fortran, and the MATLAB examples, uploading the output as artifacts, and there is an assert stage that checks whether the output from all these languages matches. By the way, the timings here are essentially compilation and setup, so it's not that Fortran runs much faster: the execution is always done through the Fortran code base and binary, but clearly compiling just the Fortran code is faster than setting up the Python, Julia, or MATLAB environment.
And now, how it works in practice when looking at the code. This diagram might not be perfectly visible, but the right column is the C++ layer, here is the C layer, here is the Fortran layer, and here is the user code, either in Julia, MATLAB, or Python. The different color here depicts the package that we are interfacing with. So if we start with this README code here, the user's Python code, we have some imports and the instantiation of a single object of this AeroData class as an example. And what happens if we call it? First it goes through... it's barely visible, I guess. Anyhow, this is the kind of outer layer of the C++-implemented Python package, and now I hope it's more visible. This is how one works with pybind11. This is the C++ code where we define a module for Python. Creating a Python class from C++ code looks roughly like this, with some templates defining the class that we interface, how to handle memory allocation, and defining particular methods. Here there is an init method, a kind of constructor, and this constructor, when called, goes through the C++ code of this AeroData class that we wrap, but quickly, on our way to Fortran, we need to go into what is written here at the top: C-bound signatures for the Fortran functions.
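The CI assert stage described above, which compares the output of the same example run from several languages, might look roughly like the following sketch. The function name outputs_match, the tolerance, and the sample strings are assumptions made for illustration; the real CI works on files uploaded as artifacts.

```python
import math


def outputs_match(outputs, rel_tol=1e-9):
    """Check that every language printed the same number as the first entry.

    `outputs` maps a language name to the text captured from its example run.
    """
    values = {lang: float(text) for lang, text in outputs.items()}
    reference = next(iter(values.values()))
    return all(
        math.isclose(value, reference, rel_tol=rel_tol)
        for value in values.values()
    )


# Hypothetical placeholder strings, standing in for the contents of the
# artifacts produced by each CI job (not real PartMC output).
captured = {
    "fortran": " 1.5000000E-05",
    "python": "1.5e-05",
    "julia": "1.5e-5",
    "matlab": "1.5000e-05",
}

assert outputs_match(captured), "outputs differ between languages"
```

Comparing parsed numbers with a relative tolerance, instead of comparing raw text, avoids false failures caused by each language formatting floating-point values differently.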
These cannot throw exceptions: exception handling across these languages is essentially undefined behavior, depending on the compiler. This is how it looks from the C++ perspective. When we now look at the C signatures here at the top, they match what is later defined in Fortran with Fortran's built-in ISO C binding module. Whenever you see bind(C) or the c_ types, these ensure within the Fortran code that we can access this code from C. Each of these routines is written for our wrapper and is essentially a thin wrapper around the original Fortran routines that we wanted to wrap, for example the one below, spec_file_read_aero_data.
So now we finally get to the wrapped code. This is the unmodified code that we access, and it sits in a Git submodule of the PyPartMC project. Now the fun starts when this Fortran code actually calls its input/output layer. A simulation usually takes something like 20 different text files to be read through, and these text files are nested. So what we've done is replace one of the components of the original Fortran package with our implementation, which starts in Fortran, then goes through a C layer back to C++, which then uses JSON for Modern C++. This is a C++ library that helps get very readable C++ code for using JSON, and this was our solution for replacing the multiple text files with what, from the user perspective, are essentially in-memory MATLAB, Julia, or Python objects.
We also have online documentation for the project generated from the source code, and as you can see here, for example, the types are hinted correctly. So even though in Fortran, in principle, the parameter ordering is the key, we do inform Python users about the types of the arguments.
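The idea of serving a reader that expects nested text files from an in-memory nested structure can be sketched as a small cursor object. The class NestedInput, its method names, and the keys used below are hypothetical and chosen only for illustration; the actual PyPartMC implementation lives in Fortran, C and C++.

```python
class NestedInput:
    """Cursor over nested dicts that stands in for nested text input files."""

    def __init__(self, root):
        self._stack = [root]

    def enter(self, key):
        """Descend into a sub-section, as if opening a nested file."""
        section = self._stack[-1][key]
        if not isinstance(section, dict):
            raise TypeError(f"'{key}' is not a nested section")
        self._stack.append(section)

    def leave(self):
        """Return to the enclosing section, as if closing a nested file."""
        if len(self._stack) == 1:
            raise RuntimeError("already at the top level")
        self._stack.pop()

    def read_real(self, key):
        """Read a floating-point value from the current section."""
        return float(self._stack[-1][key])

    def read_string(self, key):
        """Read a string value from the current section."""
        return str(self._stack[-1][key])


# Hypothetical settings; the keys are illustrative, not PartMC's actual names.
settings = {
    "run_type": "particle",
    "environment": {"rel_humidity": 0.95, "pressure": 1e5},
}

reader = NestedInput(settings)
run_type = reader.read_string("run_type")
reader.enter("environment")
humidity = reader.read_real("rel_humidity")
reader.leave()
print(run_type, humidity)
```

The calling code keeps its sequential "open, read, close" logic, while the data comes from an ordinary dictionary supplied by the user instead of from files on disk.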
So, to start a summary: what we achieved with the PyPartMC wrapper is a single-command pip installation on Windows, Linux and OS X, with the exception that for Apple Silicon we are still struggling to get it done, and help is welcome if any of you is a Fortran hacker who could help us produce universal binaries. We provide access to the unmodified internals of the underlying PartMC package from Python, MATLAB, and also C++. So as a side effect, a by-product of this goal of providing a Python interface, we also got Julia, MATLAB and C++ layers.
Something that might not have been obvious from the original plan, and that we actually ended up using extensively, is that this provides us with a nice tool for the development of other Python packages, because we can use PartMC in test suites to verify against the established simulation package. It is also maybe a non-trivial way to use pip. Since C and Fortran are probably not the technologies where you see mainstream package managers coming in or being established, we managed to ship Fortran code to users of Windows, OS X and Linux, as different variants of binary packages, through pip. So it's probably one way of thinking of the PyPI.org platform. And from the point of view of what I mentioned earlier, it provides students or researchers using this package with a tool to disseminate their research workflows, including input data, output data and the analysis workflow, in a single file, for example a Jupyter notebook, for paper peer review.
And finally, PyPartMC allows extending the Fortran code with some Python logic. Since we expose the internals of the package, the time stepping in a simulation can actually be done from Python. So if you have, let's say, 10 different steps of the simulation done in Fortran, you can add an 11th one that is in Python, Julia or whatever.
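Driving the time stepping from Python and appending one more step to the compiled ones can be sketched as follows. Everything here is a hypothetical stand-in: first_order_loss plays the role of a wrapped Fortran process, record_diagnostic is the extra step written in Python, and the state is a plain dictionary and not a PyPartMC object.

```python
def first_order_loss(state, dt):
    """Stand-in for a wrapped Fortran process: 10% loss per unit time."""
    state["number_conc"] *= 1.0 - 0.1 * dt


history = []


def record_diagnostic(state, dt):
    """Extra step written in Python, run after the compiled ones."""
    history.append(state["number_conc"])


def run(state, steps, n_steps, dt):
    """Advance the state by calling every step once per time step."""
    for _ in range(n_steps):
        for step in steps:
            step(state, dt)
        state["time"] += dt
    return state


state = {"time": 0.0, "number_conc": 1000.0}
run(state, [first_order_loss, record_diagnostic], n_steps=3, dt=1.0)
print(state["time"], history)
```

Because the loop itself lives in Python, adding, removing or reordering steps needs no recompilation of the underlying package.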
And the final point, probably one of the key things here, is that having statically linked all the dependencies, we can actually use the package on platforms such as Colab or the JupyterHubs of various institutions by doing just pip install and import, which would otherwise require getting a lot of dependencies and a lot of compile-time tooling available.
Take-home messages. I wanted to underline that pybind11, despite being a C++ tool, is actually a valuable thing for interfacing Fortran with Python. This is linked to the fact that pybind11 offers CMake integration. So your C++ projects can have build automation in CMake, and CMake handles Fortran well, so this was the key thing here. The glue-language role of Python is, I think, nicely exemplified here with Julia and MATLAB, including CI. Static linkage of the dependencies was essential for us, for example due to the fact that there is no standardized ABI for Fortran: different versions, even of the same compiler, have binary incompatibilities. This was essential to get it working on platforms such as Colab or other JupyterHubs, but it prevented us from publishing the package on Conda due to the Conda policy of no static linkage. We've used more than 10 Git submodules for tracking our dependencies from the GitHub repo. As I mentioned, help is welcome in getting the universal binaries generated with gfortran. The CI using MATLAB is possible thanks to the MATLAB actions. MathWorks, the producer of MATLAB, offers GitHub Actions that do not require any MATLAB license. So if one wants to run MATLAB code on GitHub, this is important, and I just wanted to thank them.
And finally, a fun fact, or a positive thing: when we submitted the paper about the project to the SoftwareX journal, during the peer review the reviewers indeed tried the code and provided us with feedback that also helped. So it was kind of positive that it did work. Let me acknowledge funding from the US National Science Foundation and the Polish National Science Centre, and thank you for your attention.
Any questions?
Yes, thank you for that presentation. My question is: what exactly did you keep in Fortran, and what did you pass to the Python side? Is it arrays, or just single values?
So the question, if I understand correctly, is about what kind of data we are passing during the simulation. The Monte Carlo simulations here track particles in a kind of attribute space that describes their physical and chemical properties. It's usually a 20- or 30-dimensional attribute space that is randomly sampled. So we have vectors of these particles in this attribute space. Usually this could be from thousands to hundreds of thousands of particles, each of which has something like 30 attributes. From the Python perspective, the user usually does not really use the raw data of the simulation, the state vector, just some aggregate information, which is passed back to Python as enumerables that can be used with NumPy, but we don't actually assume that it must be NumPy. So one can use just lists if they are enough. I hope that answers it.
My question is just because we need some raw data from the Fortran side on the Python side, and then it's some two-dimensional matrix. There we have some problems, in that we need to know where we keep the data.
We are not exposing particle locations in memory. The values are always returned as new objects to Python, because it is never the state vector of the simulation. It's just some aggregate information that characterizes it in a simpler way. So usually we have just a one-dimensional enumerable.
For you it's much more simple. Thank you. Time for one more question. If there is one. Okay, if not we'll wrap up here because apparently there's a queue outside to get in for the next talks. Thank you. Thank you very much.