cki_tools.rpm_cache
The rpm-cache consists of two AWS Lambda functions that provide a transparent
caching proxy for RPM downloads from Fedora mirrors. Requests for .rpm
files are cached in S3 and served via presigned URLs on subsequent hits;
all other requests (repodata, etc.) are forwarded to the origin without
caching.
| Environment variable | Secret | Required | Description |
|---|---|---|---|
| `ORIGIN_HOST` | no | no | upstream mirror hostname, defaults to dl.fedoraproject.org |
| `S3_BUCKET_NAME` | no | yes | S3 bucket for cached RPMs |
| `UPLOADER_LAMBDA_ARN` | no | yes | ARN of the uploader Lambda function (handler only) |
| `CACHE_EXTENSIONS` | no | no | space-separated cacheable extensions, defaults to .rpm |
| `PRESIGNED_URL_EXPIRATION` | no | no | presigned URL lifetime in seconds, defaults to 3600 |
| `CKI_DEPLOYMENT_ENVIRONMENT` | no | no | deployment environment for Sentry tagging |
| `ORIGIN_HEAD_TIMEOUT` | no | no | HEAD request timeout in seconds, defaults to 5 |
| `SENTRY_DSN` | yes | no | Sentry DSN for error reporting |
| `LAMBDA_HANDLER` | no | yes | container image Lambda handler function name |
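
As an illustration of how the variables and defaults in the table fit together, here is a minimal sketch of a config loader. The `RpmCacheConfig` class and `load_config` function are hypothetical names chosen for this example and are not taken from the module; only the variable names and defaults come from the table.

```python
from __future__ import annotations

import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RpmCacheConfig:
    """Hypothetical container for the settings listed in the table."""

    origin_host: str
    s3_bucket_name: str
    uploader_lambda_arn: Optional[str]
    cache_extensions: tuple
    presigned_url_expiration: int
    origin_head_timeout: float


def load_config(env: Optional[Mapping[str, str]] = None) -> RpmCacheConfig:
    """Read settings from the environment, applying the documented defaults."""
    env = os.environ if env is None else env
    return RpmCacheConfig(
        origin_host=env.get("ORIGIN_HOST", "dl.fedoraproject.org"),
        # Required: a missing bucket name raises KeyError.
        s3_bucket_name=env["S3_BUCKET_NAME"],
        # Only needed by the handler, so it may be absent for the uploader.
        uploader_lambda_arn=env.get("UPLOADER_LAMBDA_ARN"),
        # Space-separated list, e.g. ".rpm .drpm".
        cache_extensions=tuple(env.get("CACHE_EXTENSIONS", ".rpm").split()),
        presigned_url_expiration=int(env.get("PRESIGNED_URL_EXPIRATION", "3600")),
        origin_head_timeout=float(env.get("ORIGIN_HEAD_TIMEOUT", "5")),
    )
```

Passing a plain dict instead of `os.environ` keeps the loader easy to exercise in unit tests.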
Lambda functions
The rpm-cache image provides two Lambda handlers selected via LAMBDA_HANDLER:
- `cki_tools.rpm_cache.handler_lambda`: API Gateway entry point that checks S3 for a cached copy, returns a presigned redirect on hit, and falls back to an origin redirect on miss (after triggering the uploader asynchronously).
- `cki_tools.rpm_cache.uploader_lambda`: async worker invoked by the handler on cache miss; downloads the RPM from origin and uploads it to S3.
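
The asynchronous hand-off from handler to uploader could look roughly like the sketch below. It uses the boto3 Lambda client's `invoke` call with `InvocationType="Event"`, which queues the invocation and returns without waiting for the result. The function name `trigger_uploader` and the `{"path": ...}` payload shape are assumptions for illustration; the actual payload used by the module is not documented here.

```python
import json


def trigger_uploader(lambda_client, uploader_arn, path):
    """Queue an asynchronous uploader invocation for the given origin path.

    lambda_client is expected to behave like boto3.client("lambda").
    The payload layout is an assumption made for this sketch.
    """
    return lambda_client.invoke(
        FunctionName=uploader_arn,
        # "Event" means fire-and-forget: the call returns once the event is queued.
        InvocationType="Event",
        Payload=json.dumps({"path": path}).encode("utf-8"),
    )
```

Taking the client as a parameter lets tests pass in a stub instead of a real AWS client.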
Cache behavior
The handler uses a HEAD-first strategy to minimise S3 egress costs:
- `.rpm` requests (origin available): the handler sends a HEAD request to origin first; if the origin returns 200, the client is redirected there directly (no S3 egress). If the RPM is not yet cached, the uploader is triggered asynchronously to populate the cache for future fallback.
- `.rpm` requests (origin unavailable, cached): when the HEAD returns 404 or times out and the RPM exists in S3, the handler redirects to a time-limited presigned S3 URL.
- `.rpm` requests (origin unavailable, not cached): the handler redirects to origin as a last resort (the client will see the origin’s error).
- Non-`.rpm` requests: the handler returns 302 to origin (no caching).
- Errors: any S3/Lambda error falls back to a 302 redirect to origin, so the cache is never in the critical path.
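
The cases above can be condensed into a small pure function. This is only a sketch of the decision table, not the module's code: `Target` and `decide` are hypothetical names, and the HEAD result and S3 lookup are passed in as booleans so that the logic stays free of network calls.

```python
from enum import Enum


class Target(Enum):
    ORIGIN = "origin"
    PRESIGNED_S3 = "presigned_s3"


def decide(path, origin_ok, cached, extensions=(".rpm",)):
    """Return (redirect target, whether to trigger the uploader).

    origin_ok: True if the HEAD request to origin returned 200.
    cached:    True if the object already exists in S3.
    """
    if not path.endswith(tuple(extensions)):
        # Non-cacheable request: always redirect to origin.
        return Target.ORIGIN, False
    if origin_ok:
        # Origin is healthy: redirect there, and fill the cache if needed.
        return Target.ORIGIN, not cached
    if cached:
        # Origin unavailable but a cached copy exists: serve from S3.
        return Target.PRESIGNED_S3, False
    # Origin unavailable and nothing cached: last resort is origin itself.
    return Target.ORIGIN, False
```

In a real handler, any exception raised while computing `origin_ok` or `cached` would be caught and treated as a plain redirect to origin, matching the error rule in the list.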
Request and response format
The handler receives API Gateway v2 (HTTP API) events with a {path+} catch-all
route parameter. The path mirrors the origin URL structure:
- Input: `event["pathParameters"]["path"]` – everything after the hostname, e.g. `pub/fedora/linux/updates/44/Everything/x86_64/Packages/f/foo-1.0.fc44.x86_64.rpm`
- 302 redirect: all responses are redirects via the `Location` header, pointing to either the origin URL or a presigned S3 URL
- 400: returned only when `pathParameters` or `path` is missing
All S3 or Lambda invocation errors fall back to a 302 redirect to origin, so consumers (dnf, rpm-lockfile-prototype) never see an error from the cache itself.
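
To make the request and response shapes concrete, the following sketch parses an API Gateway v2 event and builds the two kinds of response described above. The helper names, the 400 body text, and the `https://` scheme for the origin URL are assumptions for this example; it only covers the plain origin redirect, not the S3 path.

```python
def redirect(location):
    """Build a 302 response in the API Gateway v2 Lambda response format."""
    return {"statusCode": 302, "headers": {"Location": location}, "body": ""}


def bad_request(message):
    """Build a 400 response for malformed events."""
    return {"statusCode": 400, "body": message}


def handle(event, origin_host="dl.fedoraproject.org"):
    """Redirect to origin, or return 400 if the path parameter is missing."""
    # pathParameters may be absent or None, so guard both levels.
    path = (event.get("pathParameters") or {}).get("path")
    if not path:
        return bad_request("missing path parameter")
    return redirect(f"https://{origin_host}/{path}")
```

For example, `handle({"pathParameters": {"path": "pub/fedora/foo.rpm"}})` yields a 302 whose `Location` header is `https://dl.fedoraproject.org/pub/fedora/foo.rpm`.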
Deployment
- Container image: `quay.io/cki/rpm-cache`
- S3 key prefix: cached objects are stored under `cache/<path>` in the configured bucket
- Deployment config: lives in the `deployment-all` repo as an Ansible playbook using the `cki_aws_lambda` role (same pattern as `receiver`)
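
As a rough illustration of the `cache/<path>` key layout, the sketch below maps a request path to an S3 key and asks an S3 client for a presigned GET URL. `s3_key` and `presigned_redirect_url` are hypothetical helpers; `generate_presigned_url` is the boto3 S3 client method, and the client is passed in so that the example does not need AWS credentials.

```python
def s3_key(path, prefix="cache/"):
    """Map an origin path to its S3 object key under the cache prefix."""
    return prefix + path.lstrip("/")


def presigned_redirect_url(s3_client, bucket, path, expires_in=3600):
    """Return a time-limited GET URL for the cached object.

    s3_client is expected to behave like boto3.client("s3").
    """
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": s3_key(path)},
        ExpiresIn=expires_in,
    )
```

The `expires_in` default mirrors the documented `PRESIGNED_URL_EXPIRATION` default of 3600 seconds.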
Testing
- Unit tests: `python -m pytest tests/test_rpm_cache.py`
- Integration tests: `inttests/images/rpm-cache/` (runs via CI image inttest job, or locally with `tox -e image -- inttests/images/rpm-cache`)
Design rationale
This module intentionally avoids depending on cki-lib to keep the Lambda
image small (12 pip packages vs 41+). See the module docstring in
cki_tools/rpm_cache.py for details.