Working with Kubernetes
Problem
You want to access and work with any of the Kubernetes-based environments used for deployment in cee/deployment-all.
Setup
The easiest way to get started with a Kubernetes environment is to access it via the URLs of the Web console.
You can also manage Kubernetes resources using the command line tools. On
Fedora, install the tools with dnf install origin-clients.
There are two options for authentication:
- From the Web console, find your name at the top right and click it. Choose Copy login command from the drop-down menu and paste the command into your terminal. You will be logged in automatically with a one-time token.
- Install and run ocp-sso-token.
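Either way, a quick sanity check after logging in might look like this; the token, server URL, and project name are placeholders, not real values:
oc login --token=sha256~REDACTED --server=https://api.example.com:6443  # the pasted login command
oc whoami                                                               # confirm the authenticated user
oc project my-project                                                   # switch to the project you work in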
OpenShift peculiarities to keep in mind
OpenShift has some strict rules for containers to maintain security:
- No packages can be installed once the container is running. All packages that you need for the container must be installed into the container image itself (see the sketch after this list).
- You cannot be root inside the container and your Linux capabilities are highly restricted. For example, ICMP pings and chown are not allowed.
- Each container starts with an arbitrary UID/GID pair. The pair is different per project, but constant across invocations. Some applications, like git and ansible, have issues with arbitrary UID/GID pairs, but there are workarounds for this. See Handling arbitrary UIDs and GIDs below.
- The default resource allotments set by the namespace LimitRange might be very low, and it might be necessary to explicitly specify how much RAM and CPU the container is allowed to use. Some applications may work with the defaults, but you may experience strange issues or abrupt container restarts from out-of-memory errors. See Resource allocation for details.
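As an illustration of the first and third rules, here is a minimal Containerfile sketch; the base image and package list are examples only, not the actual CKI images:
FROM quay.io/fedora/fedora:latest
# Everything has to be installed at build time; dnf will not work at runtime
RUN dnf install -y git ansible && dnf clean all
# Prepare for an arbitrary UID: the container runs with GID 0, so giving
# group 0 owner-equivalent permissions keeps the app directory writable
RUN mkdir -p /app && chgrp -R 0 /app && chmod -R g=u /app
USER 1001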
Watching a running container
You can watch the logs from a deployment or container using the oc command
line tools. This can be very helpful if you are rapidly iterating on a
DeploymentConfig and trying to see if the container runs properly.
Here's an example that monitors the slack-bot deployment: oc logs -f dc/slack-bot. This will tail the logs as the container runs.
Occasionally, the connection between you and OpenShift will drop. You can keep monitoring the logs indefinitely with a loop like this:
while true; do oc logs --tail 5 -f dc/slack-bot; done
This will force a reconnection each time it disconnects.
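To watch pod status changes alongside the logs, a sketch that assumes the pods carry the deploymentconfig label OpenShift adds to DC-managed pods:
oc get pods -l deploymentconfig=slack-bot --watch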
Handling arbitrary UIDs and GIDs
When a container starts in OpenShift, it is assigned an arbitrary UID/GID
pair. This provides additional security for the host underneath the
container. However, it can make some applications misbehave because calls to id or groups will fail or return strange information.
The following changes are implemented in the CKI project to make it easier to handle these changing UID/GID combinations.
Writable /etc/passwd and /etc/group
The cleanup include file used during container image builds ensures that container images have writable /etc/passwd and /etc/group files. As the arbitrary UID always runs with GID 0, giving the group the same permissions as the owner makes these files writable by the container user:
# Make everybody happy again with arbitrary UID/GID in OpenShift
RUN chmod g=u /etc/passwd /etc/group
Current user/group added to /etc/passwd
The default CKI container image entry point script and cronjob template run the
following commands very early after container startup to ensure the current
user can be found in /etc/passwd:
if [ -w '/etc/passwd' ] && ! id -nu > /dev/null 2>&1; then
echo "cki:x:$(id -u):$(id -g):,,,:${HOME}:/bin/bash" >> /etc/passwd;
fi
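To verify this inside a running container, something like the following should print cki once the entry point has patched /etc/passwd; the dc/slack-bot name is reused from the example above:
oc exec dc/slack-bot -- id -nu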
Resource allocation
By default, the namespace LimitRange might set very low RAM and CPU quotas for each container, and most applications will require higher limits to work properly.
In most cases, caring about Requests and Limits for CPU and memory should be
good enough. While requests and limits are specified on a container level, they
are applied on a Pod level as max(...init containers, sum(containers)), i.e. the larger of the highest init container value and the sum over all regular containers.
Limits are strictly enforced, i.e. Pods can never use more resources than specified. For CPU, cgroups are used to throttle resource consumption. For memory, containers are OOM-killed when exceeding the specified limit.
Requests are used for scheduling decisions, i.e. the total request for all Pods on a node cannot exceed the available resources on that node. Also keep in mind that some Pods on a node might not specify resource requests at all. For resource-hungry Pods, make sure that nodes are available that have enough resources to run the Pod.
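One way to set requests and limits from the command line is oc set resources; the target and the values below are illustrative only:
oc set resources dc/slack-bot --requests=cpu=100m,memory=256Mi --limits=cpu=500m,memory=512Mi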
Cron jobs
Recurring jobs are deployed as CronJobs. Cron jobs are not visible in the standard OpenShift Application Console, but can be found in the OpenShift Cluster Console.
To get a list of all cron jobs from the command line, you can use
$ oc get cronjob
NAME SCHEDULE SUSPEND ACTIVE LAST SCHEDULE AGE
acme-update-cluster-routes-daily 40 4 * * * False 0 11h 63d
cronjobs-acme-certs-daily 30 4 * * * False 0 11h 35d
cronjobs-acme-patch-remote-daily 40 4 * * * False 0 11h 35d
...
As an example, the cronjobs-acme-certs-daily CronJob spawns a Job which spawns a Pod with the actual containers. To get a list of everything related to one schedule, you can use something like
$ oc get job,pod -l schedule_job=cronjobs-acme-certs-daily
NAME COMPLETIONS DURATION AGE
job.batch/cronjobs-acme-certs-daily-1630989000 1/1 22s 2d11h
job.batch/cronjobs-acme-certs-daily-1631075400 1/1 22s 35h
job.batch/cronjobs-acme-certs-daily-1631161800 1/1 24s 11h
NAME READY STATUS RESTARTS AGE
pod/cronjobs-acme-certs-daily-1630989000-rk4cp 0/1 Completed 0 2d11h
pod/cronjobs-acme-certs-daily-1631075400-267zw 0/1 Completed 0 35h
pod/cronjobs-acme-certs-daily-1631161800-2m8wh 0/1 Completed 0 11h
To see the output of a schedule, you can use oc logs with the Job or the Pod, for example:
$ oc logs job.batch/cronjobs-acme-certs-daily-1631161800
...
Checking registration
...
$ oc logs pod/cronjobs-acme-certs-daily-1631161800-2m8wh
...
Checking registration
...
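To debug a schedule without waiting for the next run, you can spawn a one-off Job directly from the CronJob; the job name here is arbitrary:
$ oc create job acme-certs-manual --from=cronjob/cronjobs-acme-certs-daily
$ oc logs -f job/acme-certs-manual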