So, hello, let's start. I think we have 50 minutes in all, so I will hurry up here. In the previous presentations we saw the overall picture, and in the last talk we dug into one system. In this talk we want to dig a little deeper into the details: how to analyze power consumption. What we saw in the last presentation was the power consumption of one system, a bit like the power-supply view: this task consumes this much, and so on. But the question you often have, once you have this data, is how you can optimize the load on your server or your embedded product. What are the causes that the application or the system wakes up too often and cannot go into deep C-states or lower P-states? In the end it is the hardware that consumes the power, and you can save power if you put things into deep sleep states or reduce the frequency. This is really important for saving energy. What we did in the past was write scripts to analyze a workload and find the causes why an application wakes up too often and cannot go into deep sleep states; this is important for power optimization. What I present in the next couple of slides is an application that helps you optimize your workload and makes these causes visible. So what we are talking about is a perf script, an extension to perf. It is not yet mainline; I will send this script to Arnaldo on the mailing list and hopefully it gets merged quickly. Once it is merged it is really easy to use: it's just an apt-get install and everything works. For Yocto and Buildroot it is also really easy to use afterwards. It is important that it can be used on embedded systems, everywhere. How does it work?
It's just a perf record call where you record your workload, with the workload separator, as always. Here I record one minute of workload on all the CPUs. You record everything for one minute, fine, and then you start the report with the power analyzer, and it has different modes. Because I have just 10 minutes, I show one mode here, but there are different modes for different optimizations and analyses. So what are the modes? There are several modes that can be activated and used, and you just enable the mode you want to focus on and dig into the details. This is how it works. What is also important: every mode uses different trace points in the kernel, so usually you record only the trace points you require for the particular analyzer. If you record every trace point, you get a huge amount of data, so normally you limit the data. How does it work? There is the perf script; as always, you record the data, as we saw, for one minute, and it records all the trace points that are required. On the other hand, you can also record only the data required for your particular analysis; which trace points are required is documented. Then you have the data, and you start the report script, which outputs the analyses. Here you start it with the timer mode: what are the timer events? Because there is a lot of data coming out of this, you can often use this data directly and see that something is not working well, too much timer activity, for example. But what it can also do is post-processing, to create graphs or filter things afterwards, because it is a lot of data. And here, as a showcase, is one image that is created. You see the time, and you see a workload, on a logarithmic scale: how often the timers are firing.
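The record/report workflow just described can be sketched as follows. The analyzer's script name and flags here ("power-analyzer", "--mode", "--file-out") are hypothetical, since the tool is not yet mainline, but the perf record invocation follows the usual conventions: system-wide recording with `-a`, `-e` limiting the trace points to what the chosen mode needs, and `--` separating the workload.

```python
# Sketch of the record/report workflow; script and flag names for the
# analyzer itself are assumptions, the perf record syntax is standard.
import shutil
import subprocess

def record_cmd(seconds=60, events=("timer:hrtimer_expire_entry",)):
    """Build a perf record command: system-wide (-a), limited to the
    trace points the chosen analyzer mode requires, with '--'
    separating the workload (here just a sleep as a stand-in)."""
    cmd = ["perf", "record", "-a"]
    for ev in events:
        cmd += ["-e", ev]
    cmd += ["--", "sleep", str(seconds)]
    return cmd

def report_cmd(mode="timer", out_file=None):
    """Build the report call (hypothetical script and flag names)."""
    cmd = ["perf", "script", "report", "power-analyzer", "--mode", mode]
    if out_file:
        cmd += ["--file-out", out_file]
    return cmd

# Recording needs perf installed and usually root for -a, so only
# print the commands here rather than running them.
print(" ".join(record_cmd(60)))
print(" ".join(report_cmd("timer", out_file="timers.csv")))
```

The point of building the event list per mode is exactly the data-volume argument from the talk: recording every trace point system-wide for a minute produces far more data than the analyzer needs.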
Timers are one cause that triggers a transition from a deep C-state back to the active C0 state, so timers are not that good. Often, if you begin analyzing things on your desktop, you see, I think this is kitty, the terminal I use here, it has wake-ups all the time. Why are there wake-ups here? You often see buggy applications that constantly trigger your system, and this prevents it from going into a deep C-state. These are the causes that prevent this, so it's really important. Here you see a workload I started, and you see all the timers that are correlated with starting the workload. You see a lot of kernel timers, and then you can start optimizing. This is just the focus on the timer events, but there are a lot of other events as well. There are also follow-up analyses just for the timer events. For a tickless system, normally, if there is no load, the kernel can really go into a deep sleep state and shuts down the timer tick altogether. But does it really stop the timer tick? You can see it in these images, and you can analyze and optimize things. What are the kernel timers that trigger your system? If you look at the graphs, the resolution is not that good, but you see that there are timer ticks all the time, and the network interrupt timers are firing here, and you can optimize this once you see it and know what is happening. What we see in this graph are the timers that fire for each particular task, so you can optimize per task as well. How many timers are there? In production environments I often see timers firing all the time, uncorrelated. What you can also do: there are system calls for the timer granularity with which you can optimize things.
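A minimal post-processing sketch in the spirit of the per-task timer view above: count timer-driven wake-ups per task to find who keeps the CPUs out of deep C-states. The CSV column layout here is an assumption for illustration; the real analyzer documents its own output format, but the idea is the same.

```python
# Count timer wake-ups per task from CSV-like analyzer output.
# The column layout (timestamp;cpu;task;event) is a made-up example.
import csv
import io
from collections import Counter

SAMPLE = """\
timestamp;cpu;task;event
0.000100;0;kitty;hrtimer_expire_entry
0.000900;0;kitty;hrtimer_expire_entry
0.001200;1;kworker/1:1;timer_expire_entry
0.002500;0;kitty;hrtimer_expire_entry
"""

def wakeups_per_task(csv_text):
    """Aggregate one wake-up per row, keyed by task name."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text), delimiter=";"):
        counts[row["task"]] += 1
    return counts

# The terminal firing three times in a few milliseconds is exactly the
# "wake-ups all the time" pattern described above.
print(wakeups_per_task(SAMPLE).most_common())
```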
For example, with the introduction of hrtimers, the high-resolution timers, timers are no longer coalesced; they trigger exactly at a particular moment in time. With a simple system knob you can say: no, it is not so important that this timer triggers exactly at this time, so that the kernel aligns timers at a particular time and allows a deeper sleep state again. This knowledge can be combined with what you see here, for example: where are the timers? CPU 0 is somehow special; the timers are there. Can you move tasks to CPU 1, for example, so that the other CPU cores can go into a deeper sleep state? All of this is important for doing the optimization. There are some general options; others are not always required and can be turned on with a particular flag. There is a CPU option: often you want the analysis for a particular CPU, so you can limit the data. And there is a file-out option: if you want to do post-processing, as we saw in the images, the data is not put on standard out but into a file, and you can work with it there. The data is also written in a sanitized way, so that you can just use pandas to read the CSV data, and the post-processing is really easy. There are multiple modules provided. This was just a sneak peek at the timer module; there are a lot of other modules as well that you can use later on, but due to the time limit I just highlighted the timer module. One last sneak peek, for example, is the governor. The governor is the component within the kernel that does the prediction and the commanding of the C-states, the deep states. You can select a different governor; normally it is the menu governor, but there are other governors as well. And here you see how often each C-state is commanded, and what is also analyzed: was this a good decision or not?
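Two of the knobs mentioned above can be sketched in Python. The alignment knob is timer slack, reachable via `prctl(PR_SET_TIMERSLACK)` or, as here, `/proc/self/timerslack_ns` (available since Linux 4.6): it tells the kernel that this task's timers may fire a little late, so it can coalesce them and sleep deeper. Moving tasks away from a special CPU is a plain affinity change via the standard library. The chosen values are illustrative, not recommendations.

```python
# Timer slack and CPU affinity as power knobs (Linux-only sketch).
import os

def set_timer_slack_ns(ns):
    # Allow this task's timers to be delayed by up to `ns` nanoseconds,
    # so the kernel may align them with other timers.
    with open("/proc/self/timerslack_ns", "w") as f:
        f.write(str(ns))

def get_timer_slack_ns():
    with open("/proc/self/timerslack_ns") as f:
        return int(f.read())

def pin_to_cpu(cpu):
    # Restrict the calling process to a single CPU (stdlib, Linux-only),
    # e.g. away from CPU 0 if that core turns out to be "special".
    os.sched_setaffinity(0, {cpu})

set_timer_slack_ns(2_000_000)               # 2 ms slack instead of the 50 us default
pin_to_cpu(max(os.sched_getaffinity(0)))    # move to the highest-numbered allowed CPU
print(get_timer_slack_ns(), os.sched_getaffinity(0))
```

Whether a larger slack is acceptable depends on the task: a media player may not tolerate 2 ms, a periodic housekeeping daemon usually will.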
Because the kernel is doing guesswork, right? It guesses: the next wake-up is in 10 milliseconds, because a timer will trigger, so it puts the processor into a particular C-state. But was this the right decision, or was the sleep too shallow? This is also important, and here you can debug the governor. A student of mine also discovered a bug on AMD: for one particular C1 state it switched to the wrong state all the time. I think the fix will be released in the next couple of weeks. So it is really useful for you to see: does the governor do the right job here? This is visible with another analysis, and there are multiple other post-processing steps. And yeah, that's all. I hope this will be integrated into mainline in the next couple of weeks, but if you want, you can use this kernel tree and this particular branch. It's just a perf script, really easy to use out of tree as well. The post-processing scripts cannot be shipped with the kernel; that's not how the kernel works. These are Python scripts, and they will always be available here, based on this. And in the end it is, hopefully, well documented. So yeah, that's all. Questions? [Audience] Processor coverage: is it just x86? What coverage have you got? I mean, I've got an M1 Apple machine; would I be able to run it there if I run Linux on that hardware? [Speaker] Yeah, this script will work on ARM and on x86, for Intel and AMD. There are differences in the P-state tracking, because hardware P-state tracking (HWP) was introduced with Skylake, so that part will not be visible on ARM CPUs. Some analyses will work, some will not, but it is just Linux, and all the major architectures are covered. The more software-oriented analyses, of scheduling events for example, will always run, but the more hardware-specific analyses may not work.
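The governor bookkeeping discussed above can also be inspected statically through sysfs, independent of the trace-based analysis. The paths below are the standard Linux cpuidle sysfs layout; they may be absent in VMs or on kernels without cpuidle support, so everything is read tolerantly.

```python
# Inspect the cpuidle governor and per-C-state statistics via sysfs.
from pathlib import Path

CPUIDLE = Path("/sys/devices/system/cpu/cpuidle")

def current_governor():
    """Return the active cpuidle governor, or None if cpuidle is absent."""
    f = CPUIDLE / "current_governor_ro"
    return f.read_text().strip() if f.exists() else None

def cstate_usage(cpu=0):
    """Per-C-state entry counts and residency for one CPU."""
    stats = {}
    base = Path(f"/sys/devices/system/cpu/cpu{cpu}/cpuidle")
    if not base.exists():
        return stats
    for state in sorted(base.glob("state*")):
        stats[(state / "name").read_text().strip()] = {
            "usage": int((state / "usage").read_text()),    # times entered
            "time_us": int((state / "time").read_text()),   # residency
        }
    return stats

print(current_governor())   # typically "menu" on distro kernels
print(cstate_usage(0))
```

This only shows how often each state was entered, not whether each decision was right; judging too-shallow or too-deep sleeps is exactly what the trace-based governor analysis adds.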
[Audience] Just a follow-up to the previous question: will it work for Graviton and these kinds of proprietary cloud processors? [Speaker] Yeah, it will generally run there. As long as it is Linux on an ARM processor, it will just be the same; no difference there. [Audience] Another question: maybe later on we can install the script on your PC and test it. [Audience] Just a follow-up on the previous question: there is actually an extra library, libopencsd, which gives you a whole lot of extra stuff on most ARM cores, not necessarily Apple's and Amazon's ARM cores, but any that come from the standard designs. [Speaker] One goal was that it runs everywhere, right? It must be general, and we skipped going into the eBPF world. There are advantages to doing things in the kernel, aggregation in the kernel, but this sometimes causes problems on specific ARM SoCs and embedded products. So the design goal was really that it runs everywhere, is easy to use, and is generally available. Working with eBPF, aggregating in the kernel and filtering unwanted data out there, has some advantages, but then you need a toolchain on the embedded product, so it's not that great. Everything I told you was the idea behind the design: no extra libraries, keep it a minimal thing that works everywhere. If you want to do more, and often you do want more when you analyze a particular task, how its scheduling behavior looks, then you need more custom scripting as well, and the libraries you want to use. But that is not covered here; I think there is a lot of data already easily available. Sure, it's a compromise.
[Host] Maybe a question from me: can you give us a few insights about the community? How many developers, how many people contribute? [Speaker] Currently I am the main developer. But in the end it is just a Python script, so it's not really rocket science. There are also students working on this, helping and looking at the details. It's not that much magic; it's just putting things together and making them easy to use. The trace points, Steven Rostedt's work, and all the infrastructure that the kernel provides are the main drivers that make this possible. It's just a script on top. Thank you so much.