Full picture of CKI merge request testing
CKI can test kernels living in GitLab as well. Currently this setup is only deployed for GitLab.com, and as such this guide will reference it directly. A description of the administrative setup can be found in the documentation for pipeline triggering; these docs focus on the kernel engineer side.
Contacting the CKI team
Throughout this guide, contacting the CKI team is mentioned several times. You can do so by tagging the @cki-project group on gitlab.com, sending an email, or pinging us on the team channel. Please do not contact individual team members of the project!
If you are a project member, you are considered a trusted contributor. As a trusted contributor, you get access to trigger testing pipelines in specific CKI pipeline projects. The created pipelines will be accessible from the merge request. Configuration options for the pipeline can be overridden via the variables section of the .gitlab-ci.yml file. This can be useful if, for example, your changes modify the build targets, as when kernel-ark switched from make rh-srpm to make dist-srpm. Check out our configuration option documentation for the full list of available options.
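As a sketch, such an override lives in the variables section of the CI file; the variable name below is purely hypothetical (check the configuration option documentation for the real option names):

```yaml
# .gitlab-ci.yml — hypothetical variable name, for illustration only;
# the real option names are in the configuration option documentation.
variables:
  srpm_make_target: "dist-srpm"   # e.g. select the make target creating the source RPM
```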
If you are not a project member, you are considered an external contributor. In this case, a bot will leave a comment explaining which project membership is required to be considered a trusted contributor.
Limited testing pipeline
The bot will trigger a limited testing pipeline with a predefined
configuration and link it in another comment. The comment with
the pipeline status will be updated as the pipeline runs. The bot will create a
new pipeline after any changes to your merge request. Changes to the
.gitlab-ci.yml file are not reflected in the pipelines created by the bot
due to security concerns.
Configuration options used for external contributors' pipelines are stored in the pipeline-data repository, and any changes to them need to be ACKed by either the kernel maintainers or the CKI team.
Full testing pipeline
For external contributors, the full testing pipelines will fail with permission issues:
Reviewers can trigger a full testing pipeline by going into the “Pipelines” tab on the MR and clicking the “Run pipeline” button.
This will include any changes to the .gitlab-ci.yml file from the merge request.
In the background, this uses GitLab’s feature that allows running pipelines for MRs from forks in the parent project. Because of that, the following warning needs to be acknowledged:
A successful full pipeline is required in order to merge the MR.
What will a high level CI flow look like?
If you want any kernels built or tested, you have to submit a merge request. This is used as an explicit “I want CI” request as opposed to testing any pushes to git branches, to conserve resources when no testing is required. Merge requests marked as “WIP:” or “Draft:” are also tested. These are useful e.g. if your changes are not yet ready or are a throwaway (e.g. building a kernel with a proposed fix to help support engineers). The prefix indicates to reviewers they don’t have to spend their time looking into the merge request.
Once you submit a merge request, a CI pipeline will automatically start. If you want to, you can follow the pipeline along (e.g. by clicking on the “Detached merge request pipeline” link on the merge request web page) to check progress. The MR CI process utilizes multi-project pipelines, and thus you’ll see a very condensed view.
To view the full pipeline, click on the arrow on the right side of the box that says “Multi-project”.
Note that you may have to scroll to the side to see the full pipeline visualisation!
Full testing will likely take a few hours. If the pipeline fails, you’ll automatically get an email from GitLab. This can be changed in your notification settings. The pipeline can fail for different reasons and most of them require your attention, so we suggest checking the status periodically if you decide to turn the emails off. In case you opt to receive the emails, this is what the email will look like:
Notice the clickable pipeline ID in the email (“Pipeline #276751415 triggered” in the example screenshot). Clicking on it will bring you to the condensed multiproject pipeline view, as shown in the first screenshot in this section. You also get links pointing to your merge request ("!455" in the example) and the commit which the pipeline was triggered for (“a09a61e4” in the example). You can get to the pipeline from these links as well, however there’s no need to use them since the direct link is available.
The CKI team has implemented automatic retries to handle intermittent infrastructure failures and outages. Such failed jobs will be automatically retried using exponential backoff, and thus no action is required from you on infrastructure failures. In general, any other failures are something you should check out. Follow the information in the job output to find out what happened and fix up the merge request if needed.
Testing of merge results
The CI pipelines test merge results, i.e. the result of merging the merge request into the target branch.
For each push, GitLab performs a shadow merge with the target branch in the background. A notification in the merge request GUI alerts you if that resulted in any conflicts. This shadow merge will not be updated if the target branch changes. If there are any merge conflicts, the pipeline will still start, but the merge stage will fail with an appropriate error message in the logs. In that case, please rebase your merge request.
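A minimal sketch of that rebase step, played out in a throwaway repository so it is self-contained (the branch names main and my-feature are assumptions; in a real merge request you would rebase your branch onto the updated target branch and push the result):

```shell
set -e
# Throwaway repository so the sketch is fully self-contained.
tmp=$(mktemp -d) && cd "$tmp"
git init -q .
git checkout -q -b main
git config user.email ci@example.com
git config user.name "example"

git commit -q --allow-empty -m "base"   # shared history
git checkout -q -b my-feature           # the merge request branch
echo feature > feature.txt
git add feature.txt
git commit -q -m "feature change"

git checkout -q main                    # meanwhile, the target branch moves on
echo target > target.txt
git add target.txt
git commit -q -m "target change"

# The rebase step: replay my-feature on top of the updated target branch.
git rebase main my-feature
git log --oneline my-feature
```

After the rebase, the feature commit sits on top of the updated target branch, so the shadow merge performed by GitLab no longer conflicts.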
Real time kernel checks
In some cases, there will be a second pipeline named realtime_check for the merge request as well. The purpose of this pipeline is to check whether your merge request is compatible with the real time kernel. Failures in this pipeline are not blocking. They are only an indicator for the real time kernel maintainers that either their branch is out of sync with the regular kernel, or that they will run into problems when merging your changes into the real time tree. You can ignore the realtime_check pipeline; the real time kernel team will contact you if they need to follow up.
If the regular pipeline passes but the real time kernel check fails (as shown in the screenshot above), you will see a different marking in the GitLab web UI of your merge request:
Automation checking the bare pipeline status is not affected by the realtime_check result.
Testing outside of CI pipeline
Not all kernel tests are onboarded into CKI and thus they can’t run as part of the merge request pipeline. Please consider automating and onboarding as many kernel tests as possible to increase the test coverage and avoid having to manually trigger the tests every time they are needed.
For testing outside of CI pipelines, developers, QE and users are encouraged to
reuse kernels built by CKI. Kernel tarballs or RPM repository links (usable by
yum) – depending on which are applicable – are provided in the
publish job output.
The artifacts are available for at least 6 weeks after job completion.
There are two URLs linked in the job output:
The first URL links to a browsable artifact page. This can use the GitLab artifacts web interface, or an external S3 storage. The second URL links to a yum/dnf repo file. Both links are also provided to QE in the Bugzilla associated with the MR.
Access to the GitLab links is limited by GitLab authentication and requires credentials for RHEL artifacts. Both the S3 storage and the repo files are accessible without authentication: public artifacts are available to anyone, while RHEL artifacts require access from the internal Red Hat VPN.
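As a sketch of reusing such a kernel (the repo URL below is a placeholder; the real .repo link comes from the publish job output):

```shell
# Placeholder URL — copy the real .repo link from the publish job output.
sudo curl -Lo /etc/yum.repos.d/cki-kernel.repo \
    "https://example.com/path/from/publish/job/kernel.repo"
# Install the CKI-built kernel packages from that repository.
sudo dnf install kernel
```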
UMB messages for built kernels
CKI is providing UMB messages for each built kernel. No extra authentication to access the provided kernel repositories is needed. Note that merge request access (e.g. to check the changes or add a comment with results) will require authentication and you may want to set up an account to provide results back on the merge request. If you’d like to start using the UMB messages or have any requests about them, please check out the messenger repository or open an issue there.
Note that no UMB messages are going to be sent for any embargoed content as doing so would make the sensitive information available company wide. If you are assigned to test any sensitive content, you need to grab the kernel repo files from the merge request pipelines (publish jobs) manually.
UMB testing will also support extended test sets. These are meant to substitute the scratch build testing done by QE/CI and will not run automatically; an explicit request will be required. Please coordinate with your QE/CI counterparts and CKI if you want to enable such testing.
Testing RHEL content outside of the firewall
There are cases where RHEL kernel binaries need to be provided to partners for testing. As private kernel repositories are only available inside Red Hat, manual steps similar to the original flow using Brew are needed here.
To download the binaries, open the
publish job for the right architecture(s),
and follow the steps outlined below to access the artifacts. The artifacts
include information about the internal pipeline state, diff of the kernel
changes and, of course, the kernel repository. You may share the complete
artifacts (or only specific packages) the same way you did previously.
Targeting specific hardware
CKI automatically picks any supported machine fitting the test requirements. This setup is good for default sanity testing, but doesn’t work if you need to run generic tests on specific hardware. One such example is running generic networking tests on machines with specific drivers that are being modified.
The team is working on extending the patch analysis tool to add device and driver filters based on kernel changes, so that targeted hardware is picked where applicable inside the CKI pipeline. Note that this mapping needs to be added to specific release/driver/architecture cases manually, as it first needs to be verified that machines fitting all filters are available for CKI use. If you feel a commonly used mapping combination is missing and would be useful to include automatically, please reach out to the CKI team or submit a merge request to kpet-db. You can find more details about the steps involved in adding the mappings in the kpet-db documentation.
Debugging and fixing failures - more details
To check failure details, click on the pipeline jobs marked with
! or a red
X. Those are the ones that failed. You’ll see output of the pipeline, with
some notes about what’s being done. Some links (like with the aforementioned
kernel repositories) may also be provided. Extra logs (e.g. full build output)
will usually be available in the job artifacts. Check those out to identify the
source of the problem if the job output itself is not sufficient. The artifacts
are located either in S3 buckets, or in GitLab artifacts.
Some GitLab artifacts are always available to ensure the correctness of the pipeline functionality, even if the pipeline is using S3 storage. In that case, only pipeline metadata is available in the GitLab artifacts, while the directory usually containing the kernel artifacts is empty.
Artifacts are available at the right side of the job output:
If the pipeline is using GitLab artifacts as the main storage for logs and repositories, they are available in the artifacts directory. If the artifacts directory is empty, the pipeline is using S3 to store the kernel artifacts. The link to them is available under the s3_browse_url key in the artifacts-meta.json file that is present in the GitLab artifacts. The same link is also available at the end of the job output:
A static index page that can be used to recursively download all artifacts is available at the linked location. The directory structure of the artifacts directory is the same in both setups.
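As a self-contained sketch of pulling that link out (the JSON content below is fabricated for illustration; only the file name artifacts-meta.json and the s3_browse_url key come from the pipeline):

```shell
set -e
# Fabricated stand-in for the real file; in a pipeline you would download
# artifacts-meta.json from the job's GitLab artifacts instead.
cat > artifacts-meta.json <<'EOF'
{"s3_browse_url": "https://s3.example.com/cki/pipeline/123/"}
EOF

# Extract the S3 browse link that points at the kernel artifacts.
python3 -c 'import json; print(json.load(open("artifacts-meta.json"))["s3_browse_url"])'
```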
Fix any issues that look to be introduced by your changes and push new versions (or even debugging commits). Testing will be automatically retried every time.
As mentioned previously, no steps are needed for infrastructure failures as affected jobs will be automatically retried. Infrastructure issues include (but are not limited to) e.g. pipeline jobs failing to start, jobs crashing due to lack of resources on the node they ran on, or container registry issues. If in doubt, feel free to retry the job yourself. You can find the retry button right above the GitLab artifacts links, as shown in the screenshot above.
Note the retry button doesn’t pull in any new changes (neither kernel, nor CKI) and retries the original job as it was before. This is to provide a semi-reproducible environment and consistency within the pipeline. Essentially, only transient infrastructure issues can be handled by retrying. To pull in any new changes, you need to trigger a new pipeline by either pushing to your branch or clicking on the Run pipeline button in the pipeline tab on your merge request.
Test logs are available in the test stage jobs, as well as linked in the DataWarehouse (see the subsection below). If you’re having issues finding out the failure reason, please reach out to the test maintainers. Test maintainer information is directly available in both of these locations. Name and email contacts are available for all maintainers. If the maintainers provided CKI with their gitlab.com usernames, these are logged as well so you can tag them directly on your merge request.
Example of test information in DataWarehouse, linked from the result checking job:
Note: If you are working on a CVE, evaluate the situation carefully, and do not reach out to people outside of the assigned CVE group!
Examples of CVE workflow - output in the test jobs:
In the example above you can see the test status, maintainer information and direct links to the logs of the failed test. In this case, the test is waived and thus its status does not cause the test job to fail. Tests which are not waived say so in the message and the message is in red letters, as you can see in the example below. The rest of the message is the same.
Known test failures / issues
Tests may fail due to already existing kernel or test bugs or infrastructure issues. Unless they are crashing the machines, we’ll keep the tests enabled for two reasons:
- To not miss any new bugs they could detect
- To make QE’s job easier with fix verification; if a kernel bug is fixed the test will stop failing and it will be clearly visible
After testing finishes, the results and logs are submitted into DataWarehouse (our testing database) and checked for matches of any already known issues. If a known issue is detected, the test failure is automatically waived. Pipelines containing only known issues will be considered passing.
It’s both test maintainers' and developers' / maintainers' responsibility to submit known issues into the database. If a new issue is added to the database, all previously submitted untagged failures are automatically rechecked and marked in the database (not in the existing CI pipelines!) if they match, and the issue will be correctly recognized on any future CI runs. Please reach out to the CKI team for permissions to submit new issues.
Due to security concerns, only non-CVE testing has integrated known issue detection. If you are working on a CVE, you have to review the test results manually. The test result database is still available if you want to check for known issues yourself. There will be a clear message in the job logs stating so if that’s the case as well.
This stage contains a single job which queries DataWarehouse for results and prints a human friendly report. Below is an example of a passing report, where the test run only contained known issues (which are automatically waived):
If your test run fails and you are convinced the problem is not related to your changes – or check with the test maintainers who confirm it – a new known issue related to the run can be submitted into DataWarehouse. After this is done, the result checker job can be restarted. If the only problems with the run are now known issues, the pipeline will be marked as pass. There is no need to rerun the full pipeline or even the testing!
Blocking on failing CI pipelines
No changes introducing test failures should be integrated into the kernel. All observed failures need to be related to known issues. If a failure is not a known issue it needs to be fixed before the merge request can be included. If it is a new failure, but unrelated to the merge request, it should be submitted as a known issue in DataWarehouse (see section about known issues above).
CI pipeline needs to be changed
- Your changes are adding new build dependencies. New packages can’t be installed ad hoc; they need to be added to the builder container. If you can, please submit a merge request with the new dependency to the container image repository.
- Your changes require pipeline configuration changes. This is the case of the make dist-srpm example from above. Check out the configuration option documentation and adjust the .gitlab-ci-private.yml files in the top directory of the kernel repository.
- Your changes are incompatible with the current process and the pipeline needs to be adjusted to handle them, a new feature is requested, or a bug is detected. This should rarely happen, but in case it does, please contact the CKI team to work out what’s needed.
If you need any help or assistance with any of the steps, feel free to contact the CKI team!
Customized CI runs
CI runs can be customized by modifying the configuration in the .gitlab-ci.yml (or .gitlab-ci-private.yml in case of CVE work) files in the top directory of the kernel repository. The configuration option documentation contains the list of all supported variables with information on how to use them.
This can be useful for faster testing of Draft merge requests. Unless these changes are required due to kernel or environment changes (see section above), they should be reverted before the merge request is marked as ready.
Most common overrides include (but are not limited to):
architectures: Limit which architectures the kernel is built and tested for.
skip_test: Skip testing, only build the kernel.
test_set: Limit which tests should be executed.
The variables mentioned above imitate Brew builds and testing.
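For example, a Draft merge request that only needs a quick x86_64 build might set the following (the value formats are assumptions; only the variable names come from the list above, so consult the configuration option documentation before relying on this):

```yaml
# Sketch only — check the configuration option documentation for the
# exact value formats of each variable.
variables:
  architectures: "x86_64"   # build and test only for x86_64
  skip_test: "true"         # skip testing, only build the kernel
```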
It is currently not possible to easily adjust merge request pipelines without modifying the CI file. GitLab has an issue open about this feature.
Blocking on missing targeted testing
Every change needs to have a test associated with it (or at least one that exercises the area) to verify it works properly. You can find whether such tests are already onboarded into CKI via the targeted testing stats in the pipeline logs.
Please work with your QE counterparts to create and onboard tests so your changes can be properly tested.
In case no targeted test is detected, integration of the merge request will require an explicit approval and potentially also manual verification. Targeted hardware testing and extended UMB testing may be used to override the missing flag, but as this requires a manual step, please still consider including as many tests as possible directly!
Check the documentation about targeted tests to learn how to detect and handle missing targeted testing.

Last modified April 21, 2022: Add the missing dot (a7be78c)