Plumbers and CKI hackfest highlights

This September, the CKI team got together for Linux Plumbers. There were a lot of great discussions and we couldn’t take part in every single one, but here are some highlights from the conference, mainly related to CI, workflows and testing!

Plumbers discussions

Laura Abbott led the distribution kernel microconference. The main goal was to discuss common problems arising from maintaining a non-mainline kernel, whether that means keeping in sync with bugfixes, packaging or testing. A lot of the topics circled around leveraging the similar work everyone is doing and figuring out common tooling, and some discussions even led to the distribution testing BoF.

One of the great ideas mentioned was configuration fragments in the mainline kernel: if you want to enable a feature today, you first have to figure out which specific set of config options it requires. Having the options condensed by feature in the mainline kernel itself would make this enabling/disabling simpler and cleaner for everyone (users wanting to compile their own kernels, CI systems, distribution maintainers…). The initial action plan is to send various feature fragments to the automated testing mailing list to compare and discuss specifics before pushing them upstream.
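Mainline already ships a few fragments of this kind under kernel/configs/, and any fragment there can be layered on top of an existing configuration with a plain make target. A minimal sketch, assuming a hypothetical my-feature.config fragment:

    # kernel/configs/my-feature.config (hypothetical): one option per line,
    # pinning everything the feature needs.
    CONFIG_BPF=y
    CONFIG_BPF_SYSCALL=y
    CONFIG_BPF_JIT=y

    # Apply an in-tree fragment (here the real kvm_guest.config) on top of
    # a freshly generated default configuration:
    make defconfig
    make kvm_guest.config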

Another action item, directly related to the previous one, was to update the upstream configuration file merging. Each distribution has its own implementation, usually in the form of a legacy Perl script that everyone depends on heavily but is afraid to touch. Using the upstream version (once it provides the functionality distributions need) instead of maintaining scripts on top of the kernel would both simplify the work of maintainers and make it easier for users to build their own kernels.
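The upstream starting point here is scripts/kconfig/merge_config.sh. A short sketch of how a distribution could layer its fragments with it (the fragment names are made up):

    # Merge distribution fragments on top of a base config; -m stops after
    # the merge instead of also running "make alldefconfig".
    scripts/kconfig/merge_config.sh -m base.config hardening.config distro.config

    # Resolve dependencies and fill in defaults for anything the fragments
    # left unspecified:
    make olddefconfig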

Arguably the most important discussion was the one about testing. Distributions usually carry some patches on top of the mainline or stable releases they follow, but these differences are small enough that comparing test results can help with pinpointing bugs in the base. This is, however, easier said than done, as figuring out which tests to run is hard: the suites are not linked from the actual kernel sources, tests may be failing because of kernel or test bugs, you want a stable test release but also tests for new kernel features…

The obvious result from the testing discussion was closer collaboration with test maintainers, as they know their tests best and can help with debugging failures as well as updating tests. Another very important outcome, one that was later presented at the maintainers summit, is linking test suites from the MAINTAINERS file. This would make life easier not only for distribution maintainers trying to test the newest kernels but also for beginner kernel contributors. The idea was positively received and is already being implemented as part of maintainer entry profiles!
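As a rough illustration of where such a link could live: a MAINTAINERS entry can reference a maintainer entry profile document, and that profile can in turn describe which test suites to run against the subsystem. The subsystem, addresses and paths below are made up:

    SOME SUBSYSTEM
    M:	Jane Maintainer <jane@example.com>
    L:	linux-some@vger.kernel.org
    S:	Maintained
    P:	Documentation/maintainer/profiles/some-subsystem.rst
    F:	drivers/some/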

We couldn’t make this post without mentioning one of the most thought-provoking talks, by Dmitry Vyukov, reflecting on development, testing and workflows. Definitely watch the talk once recordings are available! It called out the most problematic aspects of kernel development: why it takes such a long time to fix security issues in stable releases, the general lack of testing, missing CI, changes introducing new bugs because developers don’t know where to find the right tests to run, a too-high entry bar for potential developers, reviewers not knowing which revision or tree to apply patches to, lost patches and bug reports, … While some of us are aware of these deficiencies and actively work on improving the situation (see the previous paragraphs and, spoiler, look out for an important kernelCI announcement in late October!), a lot of core developers and maintainers were still in denial, and this talk definitely served as an eye opener. The discussions it sparked continued at the maintainers summit and resulted in the creation of a new workflows mailing list, where people are welcome to share their ideas on making the development process and tooling a bit simpler and more consistent across subsystems. Let’s hope these conversations actually lead to a simplified and more robust development (and testing) process and we don’t have to repeat them next year!

Another highlight among the testing talks was the work on improving kselftests so they run properly in CI. LKFT is already running the test suite, and it was suggested that we do the same. We’ll look into enabling the tests once we finish up some packaging-related functionality for upstream kernels.
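For anyone who wants to try them locally, kselftests live in the kernel tree and can be run with standard make targets; a quick sketch (the TARGETS selection is just an example):

    # Build and run the whole kselftest suite from a kernel source tree:
    make kselftest

    # Or limit the run to specific suites:
    make -C tools/testing/selftests TARGETS="timers net" run_tests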

And lastly, we also had a talk about CKI! We gave a high-level overview of the design and the value we bring, and then dived into the implementation details. I’m glad to say the talk was followed by a few hours of discussions with people interested in our project, both asking questions and offering suggestions.

CKI hackfest highlights

As you may have noticed from the previous posts, we also organized a hackfest after Plumbers. The purpose was to discuss all the CI-related workflows, unify them where it makes sense, and collaborate on implementations. Thanks to everyone who participated, it was definitely a great success! I won’t spend much time on the details as you can read them in the Google doc with notes (huge thanks to Major for writing everything down!).

All the discussions essentially boiled down to one thing – having a common place for all test results for upstream kernels. This is something I brought up a while ago on the automated testing list, and while it gained some traction, nothing really moved. Until now, when everyone realized this really is the first step we need to take in order to move forward. There is already a PoC, and people are working on setting up a “production” database and pushing some data there.

Based on the data format and fields used, we can then properly standardize the schema, which could be used by any CI system willing to publish its data. We can also observe the test name and metadata formats and standardize those too. Doing any of this before actually seeing real data doesn’t make sense, as we wouldn’t notice what we’ve totally forgotten about and what’s missing from the standards. Speaking of test standards, documentation of the kernel selftests result format is also planned. As this format is a superset of the already existing TAP result format, it should cover all the needs kernel tests have, and we can then work on migrating other test suites to it.
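To give an idea of that format, the kselftest runner emits TAP-style output along these lines (the test names and results here are illustrative):

    TAP version 13
    1..2
    # selftests: timers: posix_timers
    ok 1 selftests: timers: posix_timers
    # selftests: timers: nanosleep
    not ok 2 selftests: timers: nanosleep # exit=1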

Likely the most important topic was reporting, as running tests has no value if no one is paying attention to the results. We received valuable feedback from the kernel maintainers who joined us. The discussion ranged from what data the email report should contain to per-person report customization. We again agreed that a common dashboard (and eventually a common report) is a must-have here – people don’t need multiple reports saying the same thing. The dashboard link would also enable us to execute longer-running tests (such as performance tests) without blocking the initial report, as people can just click the link and watch new results come in.

Final thoughts

See you at ELC/ATS in a month for an important announcement!