Fixing missing UMB messages
How to resubmit UMB messages into the CKI message bus
Problem
A missing pipeline is reported, and the cause is traced back to a missing UMB message. After the underlying problem is fixed, the UMB messages sent during the outage need to be resubmitted to the CKI message bus.
Steps
-
Determine the time range of the outage. Tools to help with this are the various RabbitMQ dashboards in Grafana, the underlying rabbitmq_* Prometheus metrics and Datagrepper.
-
Log into the staging Pod of the AMQP UMB bridge via
oc --context mpp-preprod/cki--internal rsh deploy/amqp-bridge-umb-staging
-
Resubmit the messages via something like
python3 -m cki.cki_tools.amqp_bridge \
    --from-datagrepper \
    --start "2023-10-01T00:00:00Z" \
    --end "2023-10-01T23:59:59Z"
Verify that the resubmitted messages are being processed by the relevant consumers.
-
Repeat the same for the production Pod via
oc --context mpp-prod/cki--internal rsh deploy/amqp-bridge-umb
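The time range from the first step has to be written in the same UTC timestamp format as the `--start` and `--end` values shown in the resubmission command. The following is a minimal sketch of one way to derive those two values from the observed outage boundaries. The helper name `outage_window` and the five-minute padding are assumptions made for illustration; they are not part of the CKI tooling. Widening the window may resubmit messages that were already delivered, so check that the consumers tolerate duplicates before adding padding.

```python
from datetime import datetime, timedelta, timezone

# Matches the timestamp format used in the resubmission example above.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def outage_window(outage_start, outage_end, padding=timedelta(minutes=5)):
    """Return (start, end) UTC strings covering the outage plus padding.

    Hypothetical helper: both arguments must be timezone-aware datetimes,
    e.g. as read off a Grafana dashboard and converted by hand.
    """
    if outage_start.tzinfo is None or outage_end.tzinfo is None:
        raise ValueError("timestamps must be timezone-aware")
    if outage_end < outage_start:
        raise ValueError("outage_end is before outage_start")
    # Widen the window on both sides, then normalize to UTC.
    start = (outage_start - padding).astimezone(timezone.utc)
    end = (outage_end + padding).astimezone(timezone.utc)
    return start.strftime(TIMESTAMP_FORMAT), end.strftime(TIMESTAMP_FORMAT)


if __name__ == "__main__":
    start, end = outage_window(
        datetime(2023, 10, 1, 8, 30, tzinfo=timezone.utc),
        datetime(2023, 10, 1, 11, 45, tzinfo=timezone.utc),
    )
    # Print the two options ready to paste into the resubmission command.
    print(f'--start "{start}" --end "{end}"')
```

Requiring timezone-aware input avoids silently treating a local dashboard time as UTC, which would shift the resubmission window by the local offset.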