Working for IBM on the Cloud Pak for Integration, I’ve recently been spending a lot of time with operators, and with how to make running software on OpenShift as easy as possible. I came across a situation where our pods behave when running under the restricted SCC, but can fail to access storage under the less restrictive anyuid SCC. This post is an exploration of why that is, and a suggestion of how to address it.
A lot of my understanding of how SCCs affect UIDs came from the blog post A Guide to OpenShift and UIDs by William Caban Babilonia; it’s a fantastic resource.
The OpenShift documentation on Managing security context constraints is also good for understanding the different SCC settings and strategies.
Using the restricted SCC
Best practice when creating an OpenShift pod is to use the restricted SCC. The restricted SCC adds a number of security features to your running pods, including running as a user and group ID allocated from a range that cannot clash with other namespaces in your cluster.
To take advantage of those features, leave the user and group IDs unspecified when creating pods so that OpenShift can assign them. This all works perfectly when the restricted SCC is in use, because the assignment is done at pod admission, so the metadata is all there for Kubernetes to use when mounting storage volumes.
When Kubernetes mounts storage volumes to pods, it will (caveat: not for all provisioners!) chown the volume recursively to allow the pod to access the files. It does this based (partly) on the fsGroup provided in the pod’s securityContext. Under the restricted SCC, if you don’t specify the fsGroup yourself, one will be provided based on the allocated range for the namespace.
What’s different with anyuid?
The anyuid SCC enables an important use case for OpenShift: running a pod under the user and group defined in the image. In a plain Kubernetes distribution, you can do this by not specifying your own values in the pod spec. In OpenShift, if you don’t specify values in the pod spec and the restricted SCC is applied, it will provide defaults, so the image definition will not be used.
Generally, this is a good thing, and a more secure way to run the containers, but there are situations where you need to run using the image definition. To achieve this, you can apply the anyuid SCC to the pod, which has a higher priority and doesn’t provide defaults to the pod when it is applied.
Unfortunately, if your pod is relying on defaults to be applied to successfully chown your volumes on mount, then applying the anyuid SCC to a pod designed for the restricted SCC can mean you can no longer access mounted storage.
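The difference in defaulting can be sketched as a small Go model. This is a deliberate simplification rather than OpenShift's admission code: `groupRange`, `admitFSGroup` and the range values are hypothetical names and numbers chosen for illustration, and the model only covers fsGroup handling for these two SCCs.

```go
package main

import "fmt"

// groupRange models the block of group IDs allocated to a namespace.
// The type and its fields are hypothetical, for illustration only.
type groupRange struct {
	Start, Size int64
}

// contains reports whether gid falls inside the namespace's range.
func (r groupRange) contains(gid int64) bool {
	return gid >= r.Start && gid < r.Start+r.Size
}

// admitFSGroup is a simplified model of how each SCC treats fsGroup.
// It returns the fsGroup seen after admission (nil when left unset)
// and whether the pod would be admitted under that SCC.
func admitFSGroup(scc string, requested *int64, ns groupRange) (*int64, bool) {
	switch scc {
	case "restricted":
		if requested == nil {
			// Defaulted to the first group in the namespace's range.
			first := ns.Start
			return &first, true
		}
		// An explicit value must fall inside the namespace's range.
		return requested, ns.contains(*requested)
	case "anyuid":
		// Nothing is defaulted and any value is accepted.
		return requested, true
	default:
		// Other SCCs are outside the scope of this sketch.
		return nil, false
	}
}

// describe renders an optional fsGroup for printing.
func describe(g *int64) string {
	if g == nil {
		return "<unset>"
	}
	return fmt.Sprint(*g)
}

func main() {
	ns := groupRange{Start: 1000620000, Size: 10000} // made-up range
	for _, scc := range []string{"restricted", "anyuid"} {
		got, ok := admitFSGroup(scc, nil, ns)
		fmt.Printf("%s: admitted=%t fsGroup=%s\n", scc, ok, describe(got))
	}
}
```

With no fsGroup requested, the model fills one in under restricted and leaves it unset under anyuid, which is the case where Kubernetes has nothing to chown the volume with.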
What, as operator developers, can we do about it?
Ideally, we would know in advance which SCC would be applied, and could modify our resources to specify the metadata whenever OpenShift wasn’t going to do it for us. Unfortunately, there’s no easy way for the operator to know in advance which SCC will be applied to a pod, and it could change over time anyway.
When admitting pods under the restricted SCC, if the fsGroup is not already set, the first group in the range defined by the openshift.io/sa.scc.supplemental-groups annotation on the namespace will be inserted into the pod definition. This is the part that doesn’t happen with anyuid.
To make our pod compatible with both SCCs, what we can do is examine the annotation ourselves, and set the fsGroup for our pod to the first group in the range. This will satisfy the restricted SCC, because the group is in range, and will provide the metadata for Kubernetes to prepare the mount under the anyuid SCC.
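As a minimal sketch of reading the annotation, the Go function below extracts the first group from its value. It assumes the value is a comma-separated list of blocks written as `<start>/<size>` or `<start>-<end>`; `firstGroup` and the sample value are hypothetical, and a real operator would read the annotation from the Namespace object and write the result into the pod template's securityContext.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// The namespace annotation that holds the allocated group range.
const supplementalGroupsAnnotation = "openshift.io/sa.scc.supplemental-groups"

// firstGroup returns the first group ID described by the annotation value.
// Assumption: the value is a comma-separated list of blocks, each written
// as "<start>/<size>" or "<start>-<end>".
func firstGroup(value string) (int64, error) {
	// Only the first block matters for choosing the fsGroup.
	block := strings.TrimSpace(strings.SplitN(value, ",", 2)[0])
	// The start of the range is everything before the first '/' or '-'.
	start := block
	if i := strings.IndexAny(block, "/-"); i >= 0 {
		start = block[:i]
	}
	gid, err := strconv.ParseInt(start, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("parsing %q: %w", value, err)
	}
	return gid, nil
}

func main() {
	// Sample annotations as they might appear on a namespace (made-up range).
	annotations := map[string]string{
		supplementalGroupsAnnotation: "1000620000/10000",
	}
	gid, err := firstGroup(annotations[supplementalGroupsAnnotation])
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("fsGroup to set:", gid)
}
```

Returning an error for a missing or malformed annotation lets the caller decide whether to fall back to leaving fsGroup unset, which keeps the pod valid under the restricted SCC.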
Can you back that up?
I ran some tests, and you can too, using my handy test repo:
I ran these tests on a Red Hat OpenShift on IBM Cloud cluster using ibmc-block-gold storage with OpenShift 4.6.36. The important thing to note is that the volume is always writable where the fsGroup has been defined within the range of groups for the namespace.
The scc-nonroot-no-context job is an example of a pod designed for the restricted SCC only.
The scc-anyuid-no-context job shows what happens when a pod designed for the restricted SCC is run using the anyuid SCC.
The scc-anyuid-fsgroupproject jobs show how adding an fsGroup within the correct range can solve the problem under both SCCs.
Here are my results:
| Job | Job spec fsGroup | Pod Admitted | Effective SCC | Pod spec fsGroup | UID | GID | Groups | Volume UID | Volume GID | Volume perms | Writable | Written UID | Written GID | Written perms |
- Job: The name of the job from the kustomize; it includes the name of the SCC (beyond the default, restricted) that the service account has access to.
- Job spec fsGroup: The fsGroup applied in the pod template in the job spec.
- Pod Admitted: Was the pod allowed to run on the cluster?
- Effective SCC: The SCC that OpenShift used to admit the pod.
- Pod spec fsGroup: The fsGroup in the pod spec after admission (to see if OpenShift changed it).
- UID: The UID of the user in the container.
- GID: The GID of the user in the container.
- Groups: The additional groups of the user in the container.
- Volume UID: The UID of the root dir of the volume.
- Volume GID: The GID of the root dir of the volume.
- Volume perms: The permissions applied to the root dir of the volume.
- Writable: Could the container write to storage?
- Written UID: What UID was the file written with?
- Written GID: What GID was the file written with?
- Written perms: What permissions was the file written with (controlled by umask)?
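To show how the UID, GID, Groups and Volume columns relate to the Writable column, here is a rough Go model of the standard POSIX write check on the volume's root directory. The types, the function name and every value in `main` are hypothetical and are not taken from my results; the root shortcut assumes the container keeps the capability that lets root bypass file permissions.

```go
package main

import "fmt"

// process describes the identity the container runs with.
type process struct {
	UID, GID int64
	Groups   []int64 // additional groups, e.g. the fsGroup
}

// dir describes the root directory of the mounted volume.
type dir struct {
	UID, GID int64
	Perms    uint32 // permission bits, e.g. 0o2775
}

// canWrite applies the usual owner / group / other check for the write bit.
func canWrite(p process, d dir) bool {
	if p.UID == 0 {
		// Assumption: root keeps its permission-bypass capability.
		return true
	}
	if p.UID == d.UID {
		return d.Perms&0o200 != 0 // owner write bit
	}
	inGroup := p.GID == d.GID
	for _, g := range p.Groups {
		if g == d.GID {
			inGroup = true
		}
	}
	if inGroup {
		return d.Perms&0o020 != 0 // group write bit
	}
	return d.Perms&0o002 != 0 // other write bit
}

func main() {
	// Hypothetical: no fsGroup, so the volume stays root-owned with 0755.
	noFSGroup := canWrite(
		process{UID: 1001, GID: 0, Groups: []int64{0}},
		dir{UID: 0, GID: 0, Perms: 0o755},
	)
	// Hypothetical: fsGroup set, volume group-owned by it and group-writable.
	withFSGroup := canWrite(
		process{UID: 1001, GID: 0, Groups: []int64{0, 1000620000}},
		dir{UID: 0, GID: 1000620000, Perms: 0o2775},
	)
	fmt.Println("without fsGroup:", noFSGroup)
	fmt.Println("with fsGroup:", withFSGroup)
}
```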