Full picture of CKI merge request testing

Start here if you are new to the merge request workflow

CKI can also test kernels hosted in GitLab. Currently this setup is only deployed for GitLab.com, so this guide references it directly. The administrative setup is described in the documentation for pipeline triggering; these docs focus on the kernel engineer side.

Contacting the CKI team

Throughout this guide, contacting the CKI team is mentioned several times. You can do so by tagging the @cki-project group on gitlab.com, sending an email to cki-project@redhat.com, or pinging us on the #kernelci IRC channel. Please do not contact individual team members directly!

Trusted contributors

If you are a project member, you are considered a trusted contributor. As a trusted contributor, you get access to trigger testing pipelines in specific CKI pipeline projects. The created pipelines will be accessible from the merge request. Configuration options for the pipeline can be overridden by changing the variables section of the .gitlab-ci.yml file. This can be useful if, for example, your changes modify the target used to create the SRPM, as when kernel-ark switched from make rh-srpm to make dist-srpm. Check out our configuration option documentation for the full list of available options.
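As a rough sketch, assuming the SRPM make target is exposed as a pipeline variable (the name srpm_make_target below is illustrative only; the real option names are listed in the configuration option documentation), such an override would live in the variables section of .gitlab-ci.yml:

```yaml
# Illustrative sketch only: the variable name is an assumption, not a
# confirmed option name; consult the configuration option documentation.
variables:
  srpm_make_target: dist-srpm
```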

External contributors

If you are not a project member, you are considered an external contributor. In this case, a bot will leave a comment explaining which project membership is required to be considered a trusted contributor.

Limited testing pipeline

The bot will trigger a limited testing pipeline with a predefined configuration and link it in another comment. The comment with the pipeline status will be updated as the pipeline runs. The bot will create a new pipeline after any changes to your merge request. Changes to the .gitlab-ci.yml file are not reflected in the pipelines created by the bot due to security concerns.

Configuration options used for external contributors’ pipelines are stored in the pipeline-data repository, and any changes to them need to be ACKed by either the kernel maintainers or the CKI team.

Full testing pipeline

For external contributors, the full testing pipelines will fail with permission issues:

Example of failed pipeline due to permissions

Two common messages are Downstream project could not be found (when you don’t have permission to even view the pipeline project) and No permissions to trigger downstream pipeline (when the project is public and you only lack trigger permissions). These messages can be seen e.g. when hovering over the failed trigger job in the pipeline view.

Reviewers can trigger a full testing pipeline by going into the “Pipelines” tab on the MR and clicking the “Run pipeline” button.

This will include any changes to the .gitlab-ci.yml file from the merge request. In the background, this uses GitLab’s feature for running pipelines for MRs from forks in the parent project. Because of that, the following warning needs to be acknowledged:

Warning dialog when triggering a full pipeline for an external contribution

A successful full pipeline is required in order to merge the MR.

What does a high-level CI flow look like?

If you want any kernels built or tested, you have to submit a merge request. This serves as an explicit “I want CI” request, as opposed to testing every push to a git branch, to conserve resources when no testing is required. Merge requests marked as “WIP:” or “Draft:” are also tested. These are useful e.g. if your changes are not yet ready or are throwaway changes (e.g. building a kernel with a proposed fix to help support engineers). The prefix indicates to reviewers that they don’t have to spend their time looking into the merge request.

Once you submit a merge request, a CI pipeline will automatically start. If you want to, you can follow the pipeline’s progress (e.g. by clicking on the “Detached merge request pipeline” link on the merge request web page). The MR CI process utilizes multi-project pipelines, and thus you’ll see a very condensed view.

Multi-project pipeline view

To view the full pipeline, click on the arrow on the right side of the box that says “Multi-project”.

Full Multi-project pipeline

Note that you may have to scroll to the side to see the full pipeline visualisation!

Full testing will likely take a few hours. If the pipeline fails, you’ll automatically get an email from GitLab; this can be changed in your notification settings. The pipeline can fail for different reasons and most of them require your attention, so we suggest checking the status periodically if you decide to turn the emails off. If you opt to receive the emails, this is what they look like:

Failed email example

Notice the clickable pipeline ID in the email (“Pipeline #276757970 triggered” in the example screenshot). Clicking on it will bring you to the condensed multi-project pipeline view, as shown in the first screenshot in this section. You also get links pointing to your merge request ("!455" in the example) and the commit the pipeline was triggered for (“5ea6959d” in the example). You can get to the pipeline from these links as well; however, there’s no need to use them since the direct link is available.

The CKI team has implemented automatic retries to handle intermittent infrastructure failures and outages. Jobs that fail due to such issues will be automatically retried using exponential backoff, so no action is required from you on infrastructure failures. In general, any other failures are something you should check out. Follow the information in the job output to find out what happened and fix up the merge request if needed.

Testing of merge results

The CI pipelines test merge results, i.e. the result of merging the merge request into the target branch.

For each push, GitLab performs a shadow merge with the target branch in the background. A notification in the merge request GUI alerts you if that resulted in any conflicts. This shadow merge will not be updated if the target branch changes. If there are any merge conflicts, the pipeline will still start, but the merge stage will fail with an appropriate error message in the logs. In that case, please rebase your merge request.

Real time kernel checks

In some cases, there will be a second pipeline named realtime_check available for the merge request as well. The purpose of this pipeline is to check whether your merge request is compatible with the real time kernel. Failures in this pipeline are not blocking. They are only an indicator for the real time kernel maintainers that either their branch is out of sync with the regular kernel, or that they will run into problems when merging your changes into the real time tree. You can ignore the realtime_check pipeline; the real time kernel team will contact you if they need to follow up.

Regular and real time kernel pipelines

If the regular pipeline passes but the real time kernel check fails (as shown in the screenshot above), you will see a different marking in the GitLab web UI of your merge request:

Allowed to fail

Automation checking the bare pipeline status is not affected by the realtime_check failure.

Debug kernel testing

CKI does not build or test debug kernels for MRs. The functionality is implemented, but it is currently blocked by the maximum artifact size on the instance (1G on gitlab.com), which is not sufficient to store a debug kernel repository. The CKI team plans to use external artifact storage to work around this limitation, but that requires a significant amount of work. For now, please use build systems like Koji/Brew/COPR to build debug kernels. We’ll update this page when the functionality is available.

Testing outside of CI pipeline

Not all kernel tests are onboarded into CKI and thus they can’t run as part of the merge request pipeline. Please consider automating and onboarding as many kernel tests as possible to increase the test coverage and avoid having to manually trigger the tests every time they are needed.

For testing outside of CI pipelines, developers and QE are encouraged to reuse kernels built by CKI. Kernel tarballs or RPM repository links (usable by dnf and yum) – depending on which are applicable – are provided in the publish job output.

The artifacts are available for at least 6 weeks after job completion.

There are two URLs linked in the job output:

RPM repository linked in the job output

The first URL links to the job artifact page in the GitLab web interface. Access might be limited by GitLab authentication and require credentials. The second URL links to a yum/dnf repo file that does not require any authentication. Both links are also provided to QE in the Bugzilla associated with the MR.

UMB messages for built kernels

CKI provides UMB messages for each built kernel. No extra authentication is needed to access the provided kernel repositories. Note that merge request access (e.g. to check the changes or add a comment with results) will require authentication, and you may want to set up an account to provide results back on the merge request. If you’d like to start using the UMB messages or have any requests about them, please check out the messenger repository or open an issue there.

Note that no UMB messages are going to be sent for any embargoed content, as doing so would make the sensitive information available company-wide. If you are assigned to test any sensitive content, you need to grab the kernel repo files from the merge request pipelines (publish jobs) manually.

UMB testing will also support extended test sets. These are meant to substitute for scratch-build testing done by QE/CI and will not run automatically; an explicit request will be required. Please coordinate with your QE/CI counterparts and CKI if you want to enable such testing.

Testing RHEL content outside of the firewall

There are cases where RHEL kernel binaries need to be provided to partners for testing. As private kernel repositories are only available inside Red Hat, manual steps similar to the original flow using Brew are needed here.

To download the binaries, open the publish job for the right architecture(s). On the right side, there will be a button to download the job artifacts (see the screenshot of this area in the section below). This will get you a zipped local copy of all the artifacts, including information about the internal pipeline state, a diff of the kernel changes and, of course, the kernel repository. You may now unzip the archive and share the directory containing the kernel repository (or the specific binaries from the directory) the same way you did previously.

Targeting specific hardware

CKI automatically picks any supported machine fitting the test requirements. This setup is good for default sanity testing, but doesn’t work if you need to run generic tests on specific hardware. One such example is running generic networking tests on machines with specific drivers that are being modified. The team is working on extending the patch analysis tool to add device and driver filters based on kernel changes, so that targeted hardware is picked where applicable inside the CKI pipeline. Note that this mapping needs to be added manually for specific release/driver/architecture cases, as it first needs to be verified that machines fitting all filters are available for CKI use. If you feel a commonly used mapping combination is missing and would be useful to include automatically, please reach out to the CKI team or submit a merge request to kpet-db. You can find more details about the steps involved in adding the mappings in the kpet-db documentation.

Debugging and fixing failures - more details

To check failure details, click on the pipeline jobs marked with ! or a red X. Those are the ones that failed. You’ll see the job output, with some notes about what’s being done. Some links (like the aforementioned kernel repository links) may also be provided. Extra logs (e.g. full build output) will usually be available in the job artifacts. Check those out to identify the source of the problem if the job output itself is not sufficient. In the web view, job artifacts are available on the right side.

Links to retrieve artifacts

Job artifacts

Fix any issues that appear to be introduced by your changes and push new versions (or even debugging commits). Testing will automatically run again every time you push.

As mentioned previously, no steps are needed for infrastructure failures as affected jobs will be automatically retried. Infrastructure issues include (but are not limited to) pipeline jobs failing to start, jobs crashing due to lack of resources on the node they ran on, or container registry issues. If in doubt, feel free to retry the job yourself. You can find the retry button right above the artifacts links, as shown in the screenshot above.

Note that the retry button doesn’t pull in any new changes (neither kernel nor CKI) and retries the original job as it was before. This is to provide a semi-reproducible environment and consistency within the pipeline. Essentially, only transient infrastructure issues can be handled by retrying. To pull in any new changes, you need to trigger a new pipeline by either pushing to your branch or clicking the Run pipeline button in the pipeline tab on your merge request.

Test failures

Test logs are available in the test stage jobs, as well as linked in DataWarehouse (see the subsection below). If you’re having issues finding out the failure reason, please reach out to the test maintainers. Test maintainer information is directly available in both of these locations. Name and email contacts are available for all maintainers. If the maintainers provided CKI with their gitlab.com usernames, these are logged as well, so you can tag them directly on your merge request.

Example of test information in DataWarehouse, linked from the result checking job:

DataWarehouse test details

Note: If you are working on a CVE, evaluate the situation carefully, and do not reach out to people outside of the assigned CVE group!

Examples of CVE workflow - output in the test jobs:

Waived failed test information in the job logs

In the example above you can see the test status, maintainer information, and direct links to the logs of the failed test. In this case, the test is waived and thus its status does not cause the test job to fail. Tests which are not waived say so in the message, and the message is in red letters, as you can see in the example below. The rest of the message is the same.

Failed and not waived test

Known test failures / issues

Tests may fail due to pre-existing kernel or test bugs, or due to infrastructure issues. Unless they are crashing the machines, we’ll keep the tests enabled for two reasons:

  • To not miss any new bugs they could detect
  • To make QE’s job easier with fix verification; if a kernel bug is fixed the test will stop failing and it will be clearly visible

After testing finishes, the results and logs are submitted into DataWarehouse (our testing database) and checked for matches of any already known issues. If a known issue is detected, the test failure is automatically waived. Pipelines containing only known issues will be considered passing.

It’s the responsibility of both test maintainers and developers/maintainers to submit known issues into the database. If a new issue is added to the database, all previously submitted untagged failures are automatically rechecked and marked in the database (not in the existing CI pipelines!) if they match, and the issue will be correctly recognized on any future CI runs. Please reach out to the CKI team for permissions to submit new issues.

Due to security concerns, only non-CVE testing has integrated known issue detection. If you are working on a CVE, you have to review the test results manually; a clear message in the job logs will state when that’s the case. The test result database is still available if you want to check for known issues yourself.

Overview - result check stage

This stage contains a single job which queries DataWarehouse for results and prints a human-friendly report. Below is an example of a passing report, where the test run only contained known issues (which are automatically waived):

Passing known issues report

If your test run fails and you are convinced the problem is not related to your changes (or the test maintainers confirm this), a new known issue related to the run can be submitted into DataWarehouse. After this is done, the result checker job can be restarted. If the only remaining problems with the run are known issues, the pipeline will be marked as passing. There is no need to rerun the full pipeline or even the testing!

Blocking on failing CI pipelines

No changes introducing test failures should be integrated into the kernel. All observed failures need to be related to known issues. If a failure is not a known issue, it needs to be fixed before the merge request can be integrated. If it is a new failure, but unrelated to the merge request, it should be submitted as a known issue in DataWarehouse (see the section about known issues above).

CI pipeline needs to be changed

  1. Your changes add new build dependencies. New packages can’t be installed ad hoc; they need to be added to the builder container. If you can, please submit a merge request with the new dependency to the container image repository.

  2. Your changes require pipeline configuration changes. This is the case with the make dist-srpm example from above. Check out the configuration option documentation and adjust the .gitlab-ci.yml and .gitlab-ci-private.yml files in the top directory of the kernel repository.

  3. Your changes are incompatible with the current process and the pipeline needs to be adjusted to handle them, a new feature is requested, or a bug is detected. This should rarely happen, but if it does, please contact the CKI team to work out what’s needed.

If you need any help or assistance with any of the steps, feel free to contact the CKI team!

Customized CI runs

CI runs can be customized by modifying the configuration in .gitlab-ci.yml (or .gitlab-ci-private.yml in case of CVE work) files in the top directory of the kernel repository. The configuration option documentation contains the list of all supported variables with information on how to use them.

This can be useful for faster testing of Draft merge requests. Unless these changes are required due to kernel or environment changes (see section above), they should be reverted before the merge request is marked as ready.

Most common overrides include (but are not limited to):

  • architectures: Limit which architectures the kernel is built and tested for.
  • skip_test: Skip testing, only build the kernel.
  • test_set: Limit which tests should be executed.

The variables mentioned above can be used to imitate Brew builds and testing.
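As an illustration, a Draft merge request that only needs a quick single-architecture build could temporarily carry overrides like the following sketch in .gitlab-ci.yml (the exact value formats are assumptions; see the configuration option documentation):

```yaml
# Sketch of temporary overrides for a Draft MR; value formats are
# assumptions, so check the configuration option documentation.
variables:
  architectures: x86_64   # build and test only x86_64
  skip_test: "true"       # build only, skip testing
  # test_set: ...         # alternatively, limit which tests run
```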

It is currently not possible to easily adjust merge request pipelines without modifying the CI file. GitLab has an issue open about this feature.

Future plans

Blocking on missing targeted testing

Every change needs to have a test associated with it (or at least one that exercises the area) to verify it works properly. You can find out whether such tests are already onboarded into CKI via the targeted testing stats in the pipeline logs.

Please work with your QE counterparts to create and onboard tests so your changes can be properly tested.

If no targeted test is detected, integration of the merge request will require an explicit approval and potentially also manual verification. Targeted hardware testing and extended UMB testing may be used to override the missing-test flag, but as this requires a manual step, please still consider including as many tests as possible directly!
