Fixing missing UMB messages

How to resubmit UMB messages into the CKI message bus

Problem

A missing pipeline is reported, and the cause can be traced back to a missing UMB message. After the underlying problem is fixed, the UMB messages sent during the outage need to be resubmitted to the CKI message bus.

Steps

  1. Determine the time range of the outage. Helpful tools include the RabbitMQ dashboards in Grafana, the underlying rabbitmq_* Prometheus metrics, and Datagrepper.

  2. Log in to the staging Pod of the AMQP UMB bridge via

    oc --context mpp-preprod/cki--internal rsh deploy/amqp-bridge-umb-staging
    
  3. Resubmit the messages for the outage time range determined in step 1 via something like

    python3 -m cki.cki_tools.amqp_bridge \
      --from-datagrepper \
      --start "2023-10-01T00:00:00Z" \
      --end "2023-10-01T23:59:59Z"
    

    Verify that the resubmitted messages are being processed by the relevant consumers.

  4. Repeat steps 2 and 3 for the production Pod, logging in via

    oc --context mpp-prod/cki--internal rsh deploy/amqp-bridge-umb
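
For step 1, the outage boundaries often need to be converted between ISO 8601 timestamps and the epoch seconds used in Datagrepper queries. The following is a minimal Python sketch, not part of the CKI tooling: it validates a time range and builds a query URL that could be used to check how many messages fall into it. The base URL `https://datagrepper.example.com/raw` and the helper names are hypothetical, and the `start`, `end`, `topic` and `rows_per_page` parameters assume that the instance behaves like the upstream Datagrepper `/raw` endpoint.

```python
from datetime import datetime, timezone
from typing import Optional, Tuple
from urllib.parse import urlencode

# Hypothetical base URL; replace with the Datagrepper instance actually in use.
DATAGREPPER_RAW_URL = "https://datagrepper.example.com/raw"


def parse_utc(timestamp: str) -> datetime:
    """Parse a timestamp such as 2023-10-01T00:00:00Z as an aware UTC datetime."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def outage_window(start: str, end: str) -> Tuple[int, int]:
    """Return the outage range as (start, end) epoch seconds, rejecting empty ranges."""
    start_epoch = int(parse_utc(start).timestamp())
    end_epoch = int(parse_utc(end).timestamp())
    if end_epoch <= start_epoch:
        raise ValueError("end must be later than start")
    return start_epoch, end_epoch


def datagrepper_query(start: str, end: str, topic: Optional[str] = None) -> str:
    """Build a query URL for the messages in the outage range.

    The parameter names are assumed from upstream Datagrepper; one row per
    page keeps the response small when only the message count matters.
    """
    start_epoch, end_epoch = outage_window(start, end)
    params = {"start": start_epoch, "end": end_epoch, "rows_per_page": 1}
    if topic is not None:
        params["topic"] = topic
    return f"{DATAGREPPER_RAW_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Same example range as in the resubmission command in step 3.
    print(datagrepper_query("2023-10-01T00:00:00Z", "2023-10-01T23:59:59Z"))
```

The same start and end timestamps can then be reused for the resubmission in step 3, so that the range that was checked and the range that gets resubmitted stay identical.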