[00:00.000 --> 00:11.480]  So, hello and welcome to my talk about green software engineering and more specifically
[00:11.480 --> 00:17.200]  about building energy measurement tools and ecosystems around software.
[00:17.200 --> 00:23.680]  My name is Arne and I work for Green Coding Berlin, which is a company that specializes
[00:23.680 --> 00:31.040]  in making open source tools for energy aware software measurement.
[00:31.040 --> 00:38.000]  I would like to take you on a tour today of a concept for a possible future ecosystem
[00:38.000 --> 00:43.880]  we imagine where energy consumption of software is a first world metric and available for
[00:43.880 --> 00:48.040]  every developer and user.
[00:48.040 --> 00:50.640]  So let's have a look at a hypothetical scenario.
[00:50.640 --> 00:56.280]  The Windows 10 operating system typically comes with a minimum system requirements.
[00:56.280 --> 01:00.280]  So if you look on the vendors web page, you can see it has a processor that is needed,
[01:00.280 --> 01:05.440]  one gigahertz, one gigabyte of RAM, a particular amount of hard disk space, graphics ports,
[01:05.440 --> 01:06.440]  etc.
[01:06.440 --> 01:12.520]  However, what is never given is the power on, for instance, idle that this operating system
[01:12.520 --> 01:18.240]  uses on this reference hardware that it apparently already specifies.
[01:18.240 --> 01:20.640]  So this should be pretty doable, right?
[01:20.640 --> 01:23.640]  Also something like power the desktop activity.
[01:23.640 --> 01:28.360]  So how much power does it use just to go around in the operating system, opening the
[01:28.360 --> 01:33.160]  file explorer, using the taskbar and stuff like this.
[01:33.160 --> 01:37.400]  On the reference system, for instance, that with Microsoft specifies or on a reference
[01:37.400 --> 01:41.240]  system that we or the community specifies.
[01:41.240 --> 01:43.880]  And imagine then you can make could make informed choices.
[01:43.880 --> 01:51.480]  So by just saying, hey, I'm looking at Windows 10, and I see that it has 45 watts in idle.
[01:51.480 --> 01:54.120]  But apparently, my computer is mostly in idle.
[01:54.120 --> 01:58.600]  So it might be more interesting to use Ubuntu, for instance, which has just 20 watts in idle
[01:58.600 --> 02:00.800]  or desktop activity is even lower.
[02:00.800 --> 02:05.360]  So why not choose this operating system if energy is my main concern.
[02:05.360 --> 02:10.160]  And this is what I, what I cherish the most in the operating system, or which is an important
[02:10.160 --> 02:11.940]  metric for me.
[02:11.940 --> 02:18.280]  If you think this process even further, you can think about comparing energy of applications
[02:18.280 --> 02:24.480]  very specific, not only in the idle scenario or in one scenario, but in very specific usage
[02:24.480 --> 02:30.760]  scenarios that are ingrained to how people typically use such an application.
[02:30.760 --> 02:35.120]  What you see here is two radar charts on the left side is WhatsApp, and on the right side
[02:35.120 --> 02:36.120]  is Telegram.
[02:36.120 --> 02:38.200]  Please keep in mind that these are concept pictures.
[02:38.200 --> 02:44.800]  So this is not actually the energy that these application use for this use case.
[02:44.800 --> 02:49.800]  But let's say your use case is that you message a lot with an app, but you don't do that many
[02:49.800 --> 02:51.360]  video calls.
[02:51.360 --> 02:55.280]  So if you look then at WhatsApp, you see here that it has quite a high energy budget when
[02:55.280 --> 02:59.360]  it comes to messaging, whereas Telegram has quite a lower budget.
[02:59.360 --> 03:04.080]  Telegram is, however, very bad when it comes to video where WhatsApp could be, for instance,
[03:04.080 --> 03:05.080]  better.
[03:05.080 --> 03:09.720]  So let's say that you are mostly doing messaging with your application, and you would like
[03:09.720 --> 03:14.520]  to keep your battery life, or maybe use Telegram on the desktop, your desktop energy consumption
[03:14.520 --> 03:20.600]  low, then with such metrics, you could actually make an informed decision if WhatsApp or Telegram
[03:20.600 --> 03:26.160]  is the better app for you if energy is an important concern.
[03:26.160 --> 03:31.520]  And imagine as a developer, if you think even one step further, that you go to GitHub or
[03:31.520 --> 03:36.160]  to GitLab or wherever your software is hosted, and you look in the repository and you see
[03:36.160 --> 03:42.720]  right away with something like an open energy batch, how we call it internally, to see how
[03:42.720 --> 03:48.360]  much the software, you see it down here, how much the software is actually using for its
[03:48.360 --> 03:52.720]  intended use case that the developer of the software had in mind.
[03:52.720 --> 03:57.880]  So you can compare one software that maybe has very limited use case to another software
[03:57.880 --> 04:04.560]  or library, just by the energy budget, because you have the metrics so readily available.
[04:04.560 --> 04:10.760]  We actually try to build these tools, and I would like to take you in this very short
[04:10.760 --> 04:15.320]  time frame that we have been given by FOSDAM, so just about 20 minutes, I would like you
[04:15.320 --> 04:21.440]  to take a tour through our projects that we are doing, more as an appetizer, so you see
[04:21.440 --> 04:26.560]  what we are working on and what we think could be possible or a possible ecosystem in the
[04:26.560 --> 04:29.640]  future.
[04:29.640 --> 04:36.360]  You will be presented with a view that looks like such, so the green metrics tool, EcoCI,
[04:36.360 --> 04:38.840]  Open Energy Badge and Cloud Energy.
[04:38.840 --> 04:43.480]  So the green metrics tool is what I would like to talk about today, mostly, because
[04:43.480 --> 04:49.720]  I think it is the tool that outlines our concept of transparency in the software community
[04:49.720 --> 04:54.800]  the best, and then we'll talk about later about our approaches for CI pipelines or restricted
[04:54.800 --> 04:58.040]  environments like the cloud.
[04:58.040 --> 05:03.080]  So first of all, I think it makes sense, although I know people tend to hate diagrams
[05:03.080 --> 05:09.960]  or flowcharts to some degree, but I think it makes sense to quickly go over how the concept
[05:09.960 --> 05:13.440]  of the tool works from a high-level perspective.
[05:13.440 --> 05:18.440]  So in order to measure software, we follow the container-based approach.
[05:18.440 --> 05:24.000]  So we assume that your software is already in a containerized format or can be put in
[05:24.000 --> 05:25.360]  such a format.
[05:25.360 --> 05:29.840]  So for instance, even a Firefox browser, if you want to measure desktop applications,
[05:29.840 --> 05:33.080]  can be put in a container and be measured with our tool.
[05:33.080 --> 05:39.360]  Also machine learning applications, simple command line applications, but also web applications.
[05:39.360 --> 05:44.400]  Typically when you develop software, you already have infrastructure files like Docker files,
[05:44.400 --> 05:50.600]  Docker compose file, or even a Kubernetes file available, which our tool can consume
[05:50.600 --> 05:58.160]  in all fairness, Kubernetes is still a work in progress, but Docker files can consume.
[05:58.160 --> 06:03.640]  And then what the tool basically orchestrates the containers and attaches every reporter
[06:03.640 --> 06:06.400]  that you want in terms of measuring metrics.
[06:06.400 --> 06:12.000]  So here we are still very similar to typical data logging approaches like Datadoc does
[06:12.000 --> 06:14.480]  it for instance, or other big players.
[06:14.480 --> 06:20.600]  So the memory, the AC power, DC power, the network traffic, CPU percentage, CPU and RAM
[06:20.600 --> 06:27.680]  is all locked during the execution of what we call a standard usage scenario.
[06:27.680 --> 06:32.120]  So in the first couple of slides, I've shown you the concept of looking at software from
[06:32.120 --> 06:34.840]  how is it typically used.
[06:34.840 --> 06:39.440]  And people already have thought about this concept quite a lot when they make end-to-end
[06:39.440 --> 06:45.180]  tests with their software, because this is a typical flow that a user goes through in
[06:45.180 --> 06:49.920]  your application, or unit tests, which might be very reduced amounts of functionality that
[06:49.920 --> 06:56.160]  is tested in a block, or benchmarks that are already inside of the software repository,
[06:56.160 --> 07:01.280]  session replays, shell scripts, build files that basically measure where we could measure
[07:01.280 --> 07:02.280]  your build process.
[07:02.280 --> 07:07.440]  All of this is already available typically, and our tool can consume these files, will
[07:07.440 --> 07:13.960]  run these workflows and then tell you the energy budget over the time of this run in
[07:13.960 --> 07:14.960]  particular.
[07:14.960 --> 07:19.200]  This slide is more just, if you're not too familiar with Docker, the idea is just to
[07:19.200 --> 07:24.440]  have every service or every component of the application in a separate container, so that
[07:24.440 --> 07:31.000]  we can later on better granularize the metrics and better look at which component might be
[07:31.000 --> 07:36.960]  interesting to look at if you want to do energy optimizations in particular.
[07:36.960 --> 07:40.560]  When you use the tool, and I will just go quickly over that and then probably go with
[07:40.560 --> 07:44.520]  you through a live version of what we are hosting at the moment, you will get a lot
[07:44.520 --> 07:45.520]  of metrics.
[07:45.520 --> 07:50.560]  So you will obviously get something like the CPU utilization, or the average memory that
[07:50.560 --> 07:53.800]  was used, or maybe the network bandwidth that was used.
[07:53.800 --> 07:59.880]  But what is interesting for this dashboard, and basically it's USP, is that you get also
[07:59.880 --> 08:05.080]  the energy metrics from the CPU, from the memory, you get a calculation what the network
[08:05.080 --> 08:13.040]  has used in energy, and you get convoluted or basically aggregated values where it makes
[08:13.040 --> 08:18.520]  often sense to look at CPU and memory in conjunction, or it makes sense to look at all the metrics
[08:18.520 --> 08:23.800]  that you have available to get something like a total energy budget.
[08:23.800 --> 08:28.160]  Then you obviously can look also at the AC, so at the wall plugs, so not only what is
[08:28.160 --> 08:33.000]  your CPU and your RAM using, but what is the total machine using, or something that we
[08:33.000 --> 08:37.760]  have in our lab as a setup, you just look at the main board, so not on the outside of
[08:37.760 --> 08:42.120]  the PSU, so what is basically plugged in the desktop computer, but only the power that
[08:42.120 --> 08:45.160]  flows directly into the main board.
[08:45.160 --> 08:50.300]  And here you can see that our tool automatically calculates the CO2 budget based on the energy
[08:50.300 --> 08:54.520]  that it has used for this run.
[08:54.520 --> 08:59.680]  The tool also shows you which reporters have been used in an overview, and then it tells
[08:59.680 --> 09:04.960]  you a lot of charts, so this is a sample chart, and what the tool can basically give you is
[09:04.960 --> 09:10.000]  not only an overview capability, but also an introspection where you, for instance,
[09:10.000 --> 09:12.920]  are interested in the idle time of the application.
[09:12.920 --> 09:17.120]  So what is my application doing when no user is interacting with it?
[09:17.120 --> 09:23.460]  Is it actually using energy, and is this too much energy for my belief or for the belief
[09:23.460 --> 09:24.960]  of the community?
[09:24.960 --> 09:29.480]  So for instance, here we have an example of a setup, of a WordPress setup that we have
[09:29.480 --> 09:38.440]  done with an Apache, a Puppeteer container that runs Chrome, and also a MariaDB instance.
[09:38.440 --> 09:42.280]  And you can see here that here are a couple of requests that have been done to the WordPress
[09:42.280 --> 09:47.440]  instance, and then we are basically just idling, but still the web server is doing quite some
[09:47.440 --> 09:52.480]  work, and there have been no web sockets active, so why is there server and database activity
[09:52.480 --> 09:59.120]  here? Is this valid, is this maybe some caching, some housekeeping, or is this unintended behavior?
[09:59.120 --> 10:04.920]  We picture that our tool could highlight such energy hotspot or energy malfunctions, as
[10:04.920 --> 10:10.520]  we call them, to better understand how software uses energy.
[10:10.520 --> 10:16.440]  You can also look at energy anomalies, so we work sometimes with features like TurboBoost,
[10:16.440 --> 10:20.600]  which is typically not turned on in cloud environments, but very often for desktops,
[10:20.600 --> 10:24.520]  which brings a processor in kind of like an overdrive state so that it can react very
[10:24.520 --> 10:28.400]  quickly in a frequency above its normal frequency.
[10:28.400 --> 10:34.640]  However, what we have done here in this example, we have run a constant CPU utilization, but
[10:34.640 --> 10:39.720]  as you can see here, the CPU clocks at different frequency over the time, and sometimes it
[10:39.720 --> 10:43.600]  uses exponentially more energy for the same tasks.
[10:43.600 --> 10:50.200]  So it finishes quicker, but it uses more than only a linear amount more of energy to do
[10:50.200 --> 10:51.320]  the task.
[10:51.320 --> 10:55.600]  So this is a very interesting insight that our tool can, for instance, deliver when you
[10:55.600 --> 11:00.020]  try for energy optimizations of your software.
[11:00.020 --> 11:05.040]  So what is the whole idea that we have behind all this project? And let me move myself down
[11:05.040 --> 11:09.120]  here a little bit so you can see the full slide.
[11:09.120 --> 11:17.040]  We want to create an open source community or a green software community that focuses
[11:17.040 --> 11:22.440]  on the transparency of software so that you have basically an interface, which we call
[11:22.440 --> 11:29.200]  the usage scenario, where you can measure software against and then ask later on questions
[11:29.200 --> 11:33.800]  against a database or against an API, which has measured all these softwares, questions
[11:33.800 --> 11:38.800]  like how much does this software consume? Is there a more carbon friendly alternative,
[11:38.800 --> 11:44.480]  or is there a software that makes less energy requests, less network requests?
[11:44.480 --> 11:50.040]  The idea, if these softwares are available in your country, so Yucca to my knowledge
[11:50.040 --> 11:55.720]  is, for instance, from the US and code check is more like a German application, is we want
[11:55.720 --> 11:58.960]  to be the Yucca or the code check of software.
[11:58.960 --> 12:05.840]  So we want to deliver answers to developers where they can ask questions about the energy
[12:05.840 --> 12:11.640]  budgeting of a library, of a software, or of a functionality by providing a framework to
[12:11.640 --> 12:15.600]  make these measurements.
[12:15.600 --> 12:19.480]  So let me move up here again and then back to the slides.
[12:19.480 --> 12:23.560]  So let me show you our other tools that we believe are needed to build an ecosystem around
[12:23.560 --> 12:29.440]  green software because software is not only running in desktop environments or is not
[12:29.440 --> 12:34.080]  only on a single machine, it also runs a lot in the clouds, where these measurements that
[12:34.080 --> 12:38.640]  we have, and I would like to encourage you to read a bit on what sensors are available
[12:38.640 --> 12:45.440]  in our tool, but where these sensors are not available, which is for instance in the cloud.
[12:45.440 --> 12:51.120]  So let me bring up my browser again.
[12:51.120 --> 12:54.720]  So if you are on the homepage and you have seen the green metrics tool that I've just
[12:54.720 --> 13:01.000]  talked about, you'll also see that we have the cloud energy project and the EcoCI project.
[13:01.000 --> 13:08.440]  So EcoCI focuses on measuring the energy of software in a continuous integration pipeline
[13:08.440 --> 13:10.800]  that for instance runs in a virtual machine.
[13:10.800 --> 13:13.320]  Our focus is currently on GitHub actions.
[13:13.320 --> 13:18.360]  In order to estimate the energy in a virtual machine, because you cannot measure, you have
[13:18.360 --> 13:22.920]  no access to the wall plug in the data center, you have no access to sensors in the CPU or
[13:22.920 --> 13:27.440]  whatever, you have to estimate the machine based on measurements that you already have
[13:27.440 --> 13:29.680]  for the same hardware.
[13:29.680 --> 13:34.560]  If you click on cloud energy, you can see here that we have based our machine learning
[13:34.560 --> 13:40.380]  model on a research paper from InterACTC and the University of London, and they have basically
[13:40.380 --> 13:47.880]  taken the data from the spec power database, which is an open database for servers that
[13:47.880 --> 13:53.760]  have been measured just with a fixed workload to compare it against each other.
[13:53.760 --> 13:58.680]  And based on this data, we can create a machine learning model, which is also free and open
[13:58.680 --> 14:05.440]  source to use, that is just a Python tool, which you call with the information that you
[14:05.440 --> 14:06.440]  have.
[14:06.440 --> 14:10.040]  So let's say you have the information that your CPU is from Intel, that the frequency
[14:10.040 --> 14:15.080]  that you're running is 2.6 gigahertz, you have 7 gigabytes of RAM, and you know the CPU
[14:15.080 --> 14:17.040]  has 24 threads.
[14:17.040 --> 14:18.560]  But you don't know any more info.
[14:18.560 --> 14:25.880]  You don't know if it's a Skylake processor or a more modern internal processor.
[14:25.880 --> 14:30.080]  You have no more information because the hypervisor limits this to you.
[14:30.080 --> 14:34.080]  So if you give the model more information, it can give you more accurate estimates, but
[14:34.080 --> 14:37.160]  it can also work with the limited information in the cloud.
[14:37.160 --> 14:42.480]  And then it spits out to the standard out the current watchers that you have been using,
[14:42.480 --> 14:47.040]  and then you can reuse that in a tool that we build upon that.
[14:47.040 --> 14:50.280]  So now that you've understood that there is a machine learning model behind the idea,
[14:50.280 --> 14:52.960]  I would like to bring you to EcoCI.
[14:52.960 --> 14:58.320]  So EcoCI is a GitHub action that is based on the work from the Cloud Energy Project
[14:58.320 --> 15:05.440]  that can give you in a GitHub action the information of how much a CI pipeline has used in terms
[15:05.440 --> 15:07.160]  of energy.
[15:07.160 --> 15:15.120]  So if you go, for instance, to the GitHub repository, you can also go to the marketplace.
[15:15.120 --> 15:17.040]  So we go one step further.
[15:17.040 --> 15:20.560]  And here you can see you can directly use it.
[15:20.560 --> 15:22.360]  It is very easy to use.
[15:22.360 --> 15:27.520]  It just needs two calls to initialize a tool and then one more call whenever you want to
[15:27.520 --> 15:29.160]  get a measurement.
[15:29.160 --> 15:36.480]  And what it does for you, so let's quickly go to our repository where we actually use
[15:36.480 --> 15:41.720]  GitHub actions to measure every of our workflows in the tool.
[15:41.720 --> 15:46.760]  So we click on actions, let's say we go to manual test run virtual machine, we click
[15:46.760 --> 15:48.360]  on main.
[15:48.360 --> 15:51.840]  And you see here, I've run this run yesterday, it succeeded.
[15:51.840 --> 15:54.800]  So our log tells us, hey, all tests have to work fine.
[15:54.800 --> 15:57.920]  So to run a work point fine and also the API.
[15:57.920 --> 16:03.480]  And for this run in the Azure Cloud where GitHub actions runs as virtual machines, I
[16:03.480 --> 16:06.560]  have used 650 joules of energy.
[16:06.560 --> 16:08.640]  And you get a nice ASCII graph over time.
[16:08.640 --> 16:12.680]  We were a bit limited here in the graphs we can display in the GitHub actions overview.
[16:12.680 --> 16:16.680]  But you can see here at what point in time the energy, for instance, is the highest and
[16:16.680 --> 16:22.840]  then maybe look at the later tests if they, if you deem them to be more energy consuming
[16:22.840 --> 16:29.200]  than for instance at the start where it was using only a fixed amount of energy.
[16:29.200 --> 16:35.160]  So this gives a developer and also a user the information how much energy is not only
[16:35.160 --> 16:41.240]  the software using, but also the development of the software is it maybe using more than
[16:41.240 --> 16:45.440]  we want as developers or maybe even as a community.
[16:45.440 --> 16:50.800]  And these are all concept tools to just get a first start of what we, what we think could
[16:50.800 --> 16:58.080]  be possible, of what we think could be possible in a new future where software is basically
[16:58.080 --> 17:05.880]  measured and the data of the, of its usage is constantly published by developers also.
[17:05.880 --> 17:09.880]  The idea is then to have something like an open energy batch that is basically in every
[17:09.880 --> 17:15.600]  repository that tells you for this software and for this usage scenario that comes with
[17:15.600 --> 17:16.600]  it.
[17:16.600 --> 17:21.640]  So be it for instance running the tests or be it for instance building the containers
[17:21.640 --> 17:24.200]  or the intended use case of the software.
[17:24.200 --> 17:29.480]  So let's say the NumPy library of Python has an energy batch where it says, hey, for 1000
[17:29.480 --> 17:35.800]  times 1000 metrics multiplication, this software uses this amount of energy on the reference
[17:35.800 --> 17:38.920]  system that we have specified.
[17:38.920 --> 17:42.840]  And when you use the same reference systems to compare software against each other, you
[17:42.840 --> 17:47.640]  come to a scenario that we have basically shown from the starters in the first slides
[17:47.640 --> 17:53.560]  where you can basically tell is the one software more energy hungry than the other one comparing
[17:53.560 --> 17:56.680]  the same use case.
[17:56.680 --> 18:00.680]  So let me quickly get back to my slide deck.
[18:00.680 --> 18:02.600]  So let's wrap up.
[18:02.600 --> 18:05.800]  Measuring software energy consumption we believe is still too hard.
[18:05.800 --> 18:10.920]  The goal should be easy as starting a Docker container and it should happen transparently.
[18:10.920 --> 18:15.680]  Therefore we have created the green metrics tool which can reuse Docker files and infrastructure
[18:15.680 --> 18:19.360]  files to make it very easy to orchestrate your architecture.
[18:19.360 --> 18:24.400]  And then in a flow that you already have, be it a puppeteer file or be it just a shell
[18:24.400 --> 18:30.800]  script, you can run that with our tool just as a parameter appended and it will tell you
[18:30.800 --> 18:36.240]  how much energy has been used over this particular scenario that you feed in.
[18:36.240 --> 18:38.040]  Measuring software is also very complex.
[18:38.040 --> 18:43.400]  So this is what we have integrated best practices or tool like pausing between measurements,
[18:43.400 --> 18:49.480]  letting systems idle before you actually use them, turning functionalities like SGX off,
[18:49.480 --> 18:54.340]  looking at if TurboBoost is on and very more features.
[18:54.340 --> 18:58.480]  Just inline measuring like Datadoc or other providers are doing it at the moment, we believe
[18:58.480 --> 19:02.480]  is not enough and is too arbitrary to talk about energy.
[19:02.480 --> 19:05.760]  Software must be measured against a standard usage case.
[19:05.760 --> 19:12.080]  So we provide standard usage cases for software as an interface, but we ask you the community
[19:12.080 --> 19:19.840]  also or we need to see over time what are the standard usage cases we can all agree on.
[19:19.840 --> 19:23.680]  A software must be comparable to another similar software in terms of energy.
[19:23.680 --> 19:28.460]  This is why we need these standard usage cases to make it comparable.
[19:28.460 --> 19:34.040]  This also means it must be measured on reference machines that everybody has access to that
[19:34.040 --> 19:38.640]  we want to provide for the community as a free service.
[19:38.640 --> 19:42.240]  Energy metrics must also be available in restricted environments like the cloud.
[19:42.240 --> 19:46.760]  So I've talked about estimation models that need to be open source and available and for
[19:46.760 --> 19:48.680]  everybody to implement.
[19:48.680 --> 19:54.240]  And energy must be transparent and a first order metric and order in developing and using
[19:54.240 --> 19:55.240]  software.
[19:55.240 --> 19:59.880]  People should know before they use the software how much energy it is consuming.
[19:59.880 --> 20:03.800]  And this is what we are trying to achieve with the tools we are developing.
[20:03.800 --> 20:08.040]  I hope it could pique your interest in our work and in the tools we are developing, some
[20:08.040 --> 20:11.440]  as concepts, some already production ready.
[20:11.440 --> 20:27.920]  And thank you for listening and now I hope it's time for questions.