First talk is by Uwe.
How to win first place in the kernel patch statistics.
Good morning.
Soundcheck seems still good.
I'll talk to you about how to get many patches in the kernel.
The starter for the talk is the LWN patch statistics.
That is presented after each kernel release where you get statistics.
But actually this shouldn't be your motivation to get patches in the kernel.
This is just a nice side effect.
But it was a good starter for the talk.
First about me and my employer.
I'm Uwe Kleinekunig.
I work at Pangotronics as a kernel engineer since 2008.
I have several jobs in the kernel.
I'm a PWM maintainer.
But I already contributed patches through all the kernel subsystems.
You can reach me via IRC and PGP if you have questions after the talk.
If you are interested in the tools I present,
I didn't create a repository for that.
If you have questions or want to use the tools,
just contact me.
My email address isn't listed here, but you should be able to Google it.
Pangotronics is a company that exists a bit longer than I'm with them.
We're doing embedded Linux consulting, mostly for German industrial customers.
In the kernel, my colleagues and me,
we have several, we're listed several times in the maintainer's file.
So we're working with our customers also in the mainlining business.
We're selling them that mainlining is a good idea.
Yeah.
So if you have a good idea of what to change in the kernel,
this is the process you have to work with.
You put your changes in the end in a mail
and send them to subsystem specific mailing list.
Then ideally you get prompt review by the maintainers who are responsible for the code.
Then the patches are picked up and sent to Linux Torward through in the end,
creates a release from it.
If you have a big series, you have to apply the same things.
You have to do for single patches too.
This is the usual or a short list of things you have to care for.
These are not very hard rules, but this is what I think is the sensible set I use next as a base.
Next is the integration tree for the upcoming kernel release.
This is a good idea because if you send patches based on what is in Linux Torward's tree,
you often get feedback that there is already some development happening
and that your patch doesn't apply, so you have to rebase.
If you use next, this is minimized.
Even if you think you are a good kernel developer and you don't do the beginners mistake,
use check patch. This is a small Perl tool that catches the obvious errors you can do with your patches.
You forget your sign off or there are spelling mistakes and things.
It's much nicer to get these things said to you by check patch then if you send the patches out
and people tell it to you.
The same applies for build testing.
Do build tests ideally on several architectures because even for trivial patches,
it's quite easy to break the build.
The same reasoning as with check patch.
For single patches, it's good to describe the change as well.
The idea is that you want the maintainer to understand your motivation and the things you are changing.
You want to make it easy for them to apply the patches to understand the benefit.
This is still more important if you do massive patch sanding
because you are adding much more burden to the maintainers.
Also, addressing the right people.
You don't want to miss the important people obviously,
but you also don't want to annoy the others.
I once sent a 600 patch series to the kernel mailing list and several people were annoyed.
Don't repeat that.
To get a big project, you have to pick something that applies to many drivers.
What I did in the past is the remove callback for SPI drivers returned an integer,
but that value is ignored by the core,
which resulted in many drivers returning an error code
in the expectation that there is some error handling in the upper layers,
but which is wrong and this resulted in several resource leaks.
The same for platform devices.
This is my current quest, which is a bit more massive
because there are more than 2,000 platform drivers that I have to touch.
I am in the middle approximately, so there are still a few more patches to come.
I have a few further ideas, but I will come to that when I am gone with this quest
because doing more than one such quest at one time is really hard.
Usually it is not hard to find something new to patch.
If you touched all platform device drivers, you have seen quite some stuff
and there is always something you can fix.
What is very helpful to generate the patches is the tool Coxinell.
It allows to describe a patch in a very high level form where you can...
For example, this is a small version of a patch where I first try to identify...
...platform drivers that have a remove function that does not return zero,
which is the first step before converting them to return void.
Maybe I can...
The syntax is just that you say, OK, I have any expression that is not zero
and I just want to patch that in all remove functions of a platform driver,
changes the return value from that non-zero value to zero.
This is just to find the drivers that are affected by the quest.
It is very hard to create a Coxinell patch that does the right thing for all drivers already.
There is always some handwork that you have to find.
For example, for indention, which is usually get wrong by Coxinell.
With Coxinell, you don't have a tree where all drivers are adopted in the end.
If you have 2,000 affected files, you don't want to commit by hand.
You have to apply some shell scripting to make a commit for each file,
which I think is the right thing.
For some maintainers, they prefer to convert all their drivers in their subsystem in a single patch.
But at least for sending it out and for review, it's easier to have one patch per driver.
What I then do is I iterate overall changed files and commit it.
The challenge here is to pick the right subject prefix.
In the first approach, I just put the file name here.
Then I go several times over my branch and use Filter Branch to adapt the subject prefix.
This is depending on the subsystem, how they want it, if they want a capital or a small letter here,
and if the separator is a colon or a hyphen, you have to check all the commits for the subsystem to get this right.
I have a script that I keep in a scratch file.
You see a short part of it where for some common drivers, I can adapt the subject prefix accordingly.
This is much quicker than doing it by hand.
Then here comes my usual workflow for formatting the patches in a mail and sending them out.
This is the usual format patch call.
I always put it in a sub-directory that I always call w. I don't know what it stands for.
Then I have a script that I pass all my patches.
I'll come to that in a moment.
I edit the cover letter, which is a quite important part of patch theories,
where you have to describe the overall idea of what you want to do and to show the benefit of the patch theories.
This is the, I think, or I hope, the first thing that people will read about my patch theories.
This has to be a good description to, again, make it easy for the maintainers to pick it up.
Then I edit the list of recipes.
I edit the recipes to the individual patches and send it out.
Then what is a critical thing for tracking later,
if I note for every patch that I send out in the commit, the message ID I used to send the patch out.
This is important later.
If your patch doesn't get applied, I can quickly find the conversation in my mail client to send a ping or to ask what's up.
Then I put the commits I sent out in a dedicated branch near the top to track all the patches I have already sent out.
This is the L file.
This is generated using the getMaintainer script,
which helps you to identify the interested persons for a given patch.
This is, well, in the end, it's shell script.
Usually you really have to adapt the list of people.
For example, if I send out a patch series adapting several SPI drivers,
I usually want that the SPI maintainer takes the patch series as a whole.
What getMaintainer gives you, however, is that for the, or in this case, for the upmail PWM driver,
that the upmail maintainers are listed as contacts.
What I do then here, of course, one step back, this address append is another script that takes a list of patches
and adds the people listed with minus t to two, to the two header in these patches,
and the addresses on persons with a pass to option c to cc.
So what I usually do for, well, this is a PWM series now,
what I usually do is I replace with editor magic all minus t by minus c to first have them all on the cc list,
and then I change individual lines back to minus t to just address the maintainer.
And then I have a longer VIM command here to fix the syntax because currently it doesn't work.
I start a quote here and I have the descriptions from getMaintainers here in the end,
and this command above just throws away the parented expression and adds the closing quote.
And then I can execute it and have all the people in the right mails.
And what I'm doing here also is for each patch, I add the cover letter here to ensure that each person or each list that gets a patch
also receives the cover letter to give the right context.
This is also important if you have a patch series where you have dependencies where I introduce a helper in the first patch,
and then this is used in the second. It's a good idea to also at least carbon copy the recipe for the second patch
with the patch that introduces the helper such that they can easily understand the second patch.
Here is a short snippet of my git config which is important for sending out or which I rely on.
One is I blind carbon copy me all patches to make sure that I have all patches I send out in my index
to be able to reply later to it.
This is a good idea if you get sent email. This makes git send email ask before sending out each mail,
which is if you have a big folder of patches, you don't want to accidentally send it out.
So this gives you a chance to look again over the list of recipes and abort maybe if there is a problem.
This setting is important for the nodes I added to the commits I had here.
If I rebase them to be included in my tracking branch, the information doesn't get lost, so the nodes are copied on rebase.
For sending patch series out and addressing the right people, it's beneficial to add one series per subsystem.
That means not less. Don't mix several subsystems in a single series and also don't send several series with a similar
or the same topic to the same subsystem. This is maybe a bit subjective.
Some people, for example, NetDev is an example, they say don't send big series if you have say 30 patches,
better use two or three series. It's a bit of experience to know this, but in general it's a good idea to do one series per subsystem.
To save time and communication overhead, it's a good idea to be explicit about the expectations, how your patch series should be merged.
For example, you can write into it, I expect this series to be taken by the SPI maintainer as a whole,
even if there are maybe one or two patches which doesn't fix this topic.
This doesn't have to be fixed and people can disagree, but this is better than if you have, you get no feedback and you don't get your series applied,
and then you have to ask who will apply.
So state your idea and, yeah, such that people can know what you think should be the best path.
Another good idea is a slow start. What I mean by that is if you have a patch quest and you have to address drivers in 50 subsystems,
don't send them all out at once. Start with the first one, pick something actively maintained,
and then take the feedback to improve what you send to the other subsystems.
So first send out one and then you can slightly increase your speed.
But the effect is that you get better descriptions, people ask questions what they don't understand,
and you can improve on what you write to the next maintainers.
Good. I already presented you, I have a branch for all patches in my quest.
I base this on the latest RC1 release. This is a bit smoother than basing it on next,
where there's much more movement and it's easier to rebase from one RC1 to the next RC1,
because this is all linear and you know what patches are really in.
Occasionally it happens that you get a patch into next and it is dropped again,
and in such cases you would lose patches, because they fall out of your tracking branch
as they are included below, and then if you rebase the next time they are just missing.
My tracking patch looks as follows, it's somewhere down below, there is the RC1 release,
and then I have all the patches I sent out, and the few top commits is a collection of the remaining drivers that I have to adopt.
This is one commit for all remaining patches, for all remaining drivers.
In this case it's two such commits, because some drivers are a bit more complicated,
they are not correctly adapted by Coxinell, so I track them separately to be able to take the necessary care.
The top commit is where I rely on all platform drivers being converted,
where I change the remove callback to actually return void, which is only possible after all changes are made.
So it's the top commit to have the series still bisectable.
What is really useful is the cherry command line parameter to lock, which marks all patches with a plus or an equal sign,
and the difference is that the patches marked with an equal sign are already included in the left hand side of the expression here.
So the mailbox patches were already applied in next, but not in the last RC1 yet, so they are still included in my branch,
and the Macintosh patches are not yet included, so they are plus the work in progress patches,
obviously I also have a plus bit, but that's less important.
There's a similar option cherry pick, which has the effect that it lists only the patches that are marked with a plus in this syntax.
This is the one I usually go through if I want to track which patches need some more care,
which need a pin to make the maintainer act on them.
I have this below each patch, ideally I have this marking I added that I already talked about in an earlier slide,
and with not much, which is a full text mail indexer, it's quite easy to open a mailbox that contains the mail with the given message ID,
and all the mails in the same thread.
So if I open the virtual mailbox, well the thread belongs here actually, this is broken in a strange way,
I see the patches I sent out, and in this case I can see, okay there was no reply,
maybe it fell through the cracks at the maintainer or I added the wrong person,
in this case I see it's also nearly a month old, so maybe it's time to send a ping and ask are there any problems or what's the state of the series.
This is very useful to have an easy connection from the git commit to your mail,
and not much integration of much really helps here.
Occasionally it happens that you get feedback where you have to adapt things where things are not so optimal.
In this case B4 is a great tool that I really recommend to use, even if you're not a maintainer, it's quite handy to collect the reviewed by and the act by text.
Occasionally it happens that you already did some restructuring of your branch,
and then git range div is very useful where you can compare the two different histories,
the one that you already adopted, and the one recreated by B4 based on your previous submission,
where you can see the difference where are texts, where are no texts, where did you change the code,
and this really helps to create a single series that has all the improvements that you created on both sides.
This is what I wanted to present to you.
If you have questions either here in the forum or later, don't hesitate to ask me,
or after FOSDEM contact me by email or ISE, send me your questions.
I'm happy to help you for your next quest.
Thank you.
We have time for questions.
I don't know who was first.
I was looking at my send emails and have seen a lot of my patches sent to you
because of GitMaintenors collecting your address often.
Is it a challenge for you now to deal with all these emails because GitMaintenors will do to your many commits everywhere,
often collect your email address?
This is indeed an effect I wasn't aware of.
If you attached all 2,500 platform drivers, you'll get a massive amount of patches in the next few releases.
It's not very helpful to...
If you submit it, it's not helpful to send patches to the person who just do cleanup on the driver
who don't have the real interest in this driver, which also applies to me.
I don't have interest in some obscure IDE driver that I just touched
because it happens to be a platform driver and I changed the remove callback.
On the other hand, it's also really hard to separate the list of people you get from GitMaintenors.
Don't hesitate to keep me in the list. I'm very good at ignoring emails.
I just archived some and it's quite usual to...
If you send patches to, say, 10 people, that you don't get feedback from at least 9 of them.
So this is life and I have a very big mailbox, but I usually can handle it.
Thank you for your presentation.
You have described your send workflow and you're not using B4.
Have you talked to Konstantin, who has developed B4, because you have some special needs about CC and tool handling and cover letter and so on?
No, I didn't. Mark already knows. I don't use B4 because with B4 you cannot individually change the recipients
for the patches in the series.
So what I like to do is if I have a series that touches, again, SPI drivers,
I don't want to send the patch touching the iMix SPI driver to the Atmel SPI driver maintainer.
So the list of persons is really hand-picked, which patch is sent to which parties.
And with B4, at least last time I checked, you can only define the recipients globally.
So you have to send all patches to the same set of people.
No, I didn't talk to Konstantin.
I have little motivation to do that because my patchwork works and I think it's a bit special for these big series
and I'm not sure that there's a big benefit for extending B4 to that because for most people it's the right thing what B4 does.
And the added flexibility for my use case results in a complication for tracking and for usage for all people which is questionable, I think.