Operations handbook

Procedures for running and fixing the CKI setup

Enabling a new adhoc gitlab-runner

How to put an adhoc gitlab-runner into service

Celebrating success

How to celebrate a success from IRC

Create diagrams

How to add or modify diagrams

Debugging a failing GitLab pipeline job

How to allow a kernel developer to investigate from the inside why a GitLab pipeline job failed

Debugging DataWarehouse triager and issue regexes

How to investigate why a certain test was not triaged in DataWarehouse

Freezing deployments

How to stop continuous deployment for a while

Adding a new approver to kpet-db

How to manage the list of people that can approve merge requests in the kpet-db repository

Migrating to a new cluster

How to migrate the CKI microservices to a new Kubernetes cluster

Fixing a missing brew build

How to investigate a report of a missing brew build

Fixing missing OSCI results

How to investigate a report of missing OSCI results

Adding a new gitlab-runner configuration

How to create a new gitlab-runner configuration and attach it to the correct projects

Adding a new Kubernetes deployment context

How to enable Kubernetes deployments to a new cluster and/or namespace in deployment-all

Adding an expression to the pipeline herder

How to get the pipeline-herder to retry GitLab jobs with certain characteristics

PSI escalation procedure

How to escalate PSI infrastructure problems

Changing the configuration of the RabbitMQ nodes

How to mess with the RabbitMQ cluster without causing an outage

Refresh Beaker machines for a recipe

How to resolve a lack of machines after an outage

Purging a merge request from GitLab

How to remove all data associated with a merge request

Renew expired UMB certificates

How to renew the certificates used to authenticate against shared services, such as UMB

Syncing the data in the staging instance of DataWarehouse

How to restore a production backup into the staging instance of DataWarehouse

Rotating secrets

How to systematically rotate all secrets

Shutting down CKI

How to shut down CKI kernel testing, and communicate a planned or unplanned shutdown

Triggering test pipelines using the CKI bot

Features and details of CKI bot configuration

Investigating UMB problems

How to investigate problems with interfacing to the Red Hat Universal Message Bus

Updating cross-compilers

How to update the cross-compiler packages

Upgrading machines to a newer Fedora version

How to reprovision machines with a newer Fedora version without causing an outage

Updating pipeline images

How to update the container images used in the pipeline

Updating the pipeline repositories

How to update the pipeline repositories

Updating y-stream composes

How to update y-stream composes

Working with Kubernetes

How to access and work with production environments based on Kubernetes

Zstreaming a kernel release

Steps to take when a kernel stream is to be released

Last modified October 7, 2022: Document artifact storage (b3c3b24)