PSI escalation procedure
How to escalate PSI infrastructure problems
This page has an internal companion page which might contain additional information.
Problem
When PSI infrastructure fails, problems should be escalated in a structured way.
Steps
- File a SNOW ticket:
  - Search for ‘PnT report an issue’
  - Impact: 3 - Affects multiple teams
  - Urgency: 2 - No workaround; blocks business-critical processes
  - Application: DevOps - PSI-OCP (or correct application)
  - Support group: Openshift PNT (or correct group)
- If there is no response on the SNOW ticket after an hour:
  - Ping someone on the exd-infra-escalation Google Chat channel
This flow should only be used for real problems, not one-off failures. Before submitting the initial ticket or pinging people on the Google Chat channel, verify that the problem has been occurring consistently for about 15 minutes and that it is really caused by PSI OpenShift.
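The timing rules in this procedure can be summarized as a small decision helper. The following is only an illustrative sketch: the function name `next_action`, the constant names, and the returned action labels are hypothetical and not part of any PSI or SNOW tooling. It assumes the caller already knows when the continuous failure started and whether PSI OpenShift is the cause.

```python
from datetime import datetime, timedelta
from typing import Optional

# Thresholds taken from the procedure text; the names are hypothetical.
VERIFY_WINDOW = timedelta(minutes=15)         # problem must persist this long
TICKET_RESPONSE_WINDOW = timedelta(hours=1)   # wait this long for a SNOW reply


def next_action(
    failing_since: datetime,
    now: datetime,
    caused_by_psi: bool,
    ticket_filed_at: Optional[datetime] = None,
    ticket_answered: bool = False,
) -> str:
    """Return a label for the next escalation step (illustrative only)."""
    if not caused_by_psi:
        # The flow applies only to problems really caused by PSI OpenShift.
        return "investigate"
    if ticket_filed_at is None:
        # Rule out one-off failures before filing anything.
        if now - failing_since < VERIFY_WINDOW:
            return "keep-watching"
        return "file-snow-ticket"
    if ticket_answered:
        return "follow-ticket"
    if now - ticket_filed_at >= TICKET_RESPONSE_WINDOW:
        # No response after an hour: escalate on the chat channel.
        return "ping-chat-channel"
    return "wait-for-response"
```

For example, a failure that has lasted 20 minutes with no ticket yet yields `"file-snow-ticket"`, while an unanswered ticket that is over an hour old yields `"ping-chat-channel"`.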
Make sure to also add the outage to the outage spreadsheet.
Last modified October 7, 2022: Document artifact storage (b3c3b24)