Host device administration on a Kubernetes node with no SSH access

Using Kubernetes to do Linux disk administration on a Kubernetes node (Talos Linux, wipefs, Piraeus storage)

I’ve recently been building a Kubernetes lab environment to learn some AI and machine learning concepts on. The OS I chose was Talos Linux. Talos is an immutable OS that has no SSH access. When it came time to configure storage, I was in a bit of a pickle. After some research, I was really pleased to find that disk operations can be done through a K8s pod fairly easily. There are a few considerations, though.

environment

  • Talos Linux 1.7.0

  • Kubernetes 1.29.4

  • kubectl installed locally

make sure you can run privileged pods

Since we’re going to be working directly with the host’s devices, the pod will need to run privileged. On clusters that enforce the Pod Security Standards (Talos enables the Pod Security admission controller by default), that means the namespace the pod runs in has to allow privileged workloads; we’ll set that up below.
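
If you’re not sure how Pod Security is enforced in your cluster today, listing the namespace labels is a quick way to check:

kubectl get namespaces --show-labels

Namespaces without pod-security.kubernetes.io labels fall back to whatever default the Pod Security admission controller is configured with.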

storage with Piraeus

I wanted to use an existing NVMe disk in my lab’s single worker node for Piraeus storage, since it seems like a really simple way to run storage in an on-premises Kubernetes environment. I followed the Talos Linux documentation for configuring a storage pool, but there was an issue: the NVMe device I wanted to use already had a partition on it.

kubectl linstor physical-storage create-device-pool --pool-name nvme_lvm_pool LVM talos-6o3-rkz /dev/nvme0n1 --storage-pool nvme_pool

ERROR:
Description:
    (Node: 'talos-6o3-rkz') Failed to pvcreate on device: /dev/nvme0n1
Details:
    Command 'pvcreate --config 'devices { filter=['"'"'a|/dev/nvme0n1|'"'"','"'"'r|.*|'"'"'] }' /dev/nvme0n1' returned with exitcode 5. 

    Standard out: 


    Error message: 
      Cannot use /dev/nvme0n1: device is partitioned

My options were limited: no SSH access to the box, and no evident way to do destructive disk operations with talosctl, Talos Linux’s CLI. I could have flashed a GParted live USB, booted the worker node from it, and easily erased the partition, but I wanted to accomplish this through Kubernetes instead. Here’s what I did:

introduce a namespace with privileged Pod Security labels

The following creates a namespace called disk-utilities in your cluster whose Pod Security labels allow pods to run privileged.

cat <<EOF | kubectl apply --filename -
apiVersion: v1
kind: Namespace
metadata:
  name: disk-utilities
  labels:
    pod-security.kubernetes.io/enforce: privileged
    pod-security.kubernetes.io/audit: privileged
    pod-security.kubernetes.io/warn: privileged
EOF
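
You can confirm the labels landed before moving on:

kubectl get namespace disk-utilities --show-labels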

spin up a pod using an ubuntu container

Really, you can do this with any image that contains a few disk utilities, but I found wipefs (part of util-linux, which is included in the Ubuntu base image) really easy to use for this use case.

cat <<EOF | kubectl apply --filename -
apiVersion: v1
kind: Pod
metadata:
  name: disk-partitioner
  namespace: disk-utilities
spec:
  containers:
  - name: ubuntu
    image: ubuntu:latest
    securityContext:
      privileged: true
    command: ["sleep", "infinity"]
    volumeMounts:
    - name: dev
      mountPath: /dev
  volumes:
  - name: dev
    hostPath:
      path: /dev
  nodeSelector:
    kubernetes.io/hostname: {{your_hostname_goes_here}}
EOF

Two important things to note here:

  • We mount the host’s /dev directory into the container using a hostPath volume and a volumeMount. That’s how we associate the container with the host’s devices!

  • We make sure the pod lands on the correct host using nodeSelector.

I ended up using the node label kubernetes.io/hostname to ensure my pod spins up on the correct host. To list out your node labels, simply:

kubectl get nodes --show-labels

You don’t have to use kubernetes.io/hostname, but make sure whichever label you use is unique to the host!
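
Once the pod is applied, give it a moment to get scheduled and pull the image. kubectl wait can block until it’s Ready:

kubectl wait --namespace disk-utilities --for=condition=Ready pod/disk-partitioner --timeout=120s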

exec into the container

Next, you’ll exec into the container and launch a shell (/bin/bash).

kubectl exec --namespace disk-utilities -it disk-partitioner -- /bin/bash

list block storage devices

Use lsblk to find the device you want to erase.

root@disk-partitioner:/# lsblk

This outputs a list of the host’s block devices; identify the one you want to erase (in my case, /dev/nvme0n1).
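
If the list is noisy, you can trim lsblk down to a few columns to make the target easier to spot (NAME, SIZE, TYPE, and MODEL are standard lsblk columns):

root@disk-partitioner:/# lsblk --output NAME,SIZE,TYPE,MODEL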

erase disk signatures
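
wipefs is destructive, so it’s worth doing a dry run first. The --no-act flag makes wipefs report the signatures it would erase without actually writing anything:

wipefs --all --no-act /dev/{your_device_name_from_lsblk}

When you’re confident you have the right device, erase the signatures for real: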

wipefs --all /dev/{your_device_name_from_lsblk}

exit container

exit

Now I can go back and provision my device pool!

kubectl linstor physical-storage create-device-pool --pool-name nvme_lvm_pool LVM talos-6o3-rkz /dev/nvme0n1 --storage-pool nvme_pool

SUCCESS:
    (talos-6o3-rkz) PV for device '/dev/nvme0n1' created.
SUCCESS:
    (talos-6o3-rkz) VG for devices [/dev/nvme0n1] with name 'nvme_lvm_pool' created.
SUCCESS:
    Successfully set property key(s): StorDriver/StorPoolName
SUCCESS:
Description:
    New storage pool 'nvme_pool' on node 'talos-6o3-rkz' registered.
Details:
    Storage pool 'nvme_pool' on node 'talos-6o3-rkz' UUID is: 7cde04eb-b266-4a59-8711-6731162b9f76
SUCCESS:
    (talos-6o3-rkz) Changes applied to storage pool 'nvme_pool'

LINSTOR (the underlying storage engine used by Piraeus) is happy, since it can now use the unpartitioned device.
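
If you want to double-check, you can list the storage pools LINSTOR knows about (storage-pool list is a standard LINSTOR client command, run here through the same kubectl plugin):

kubectl linstor storage-pool list

nvme_pool should show up for the worker node.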

cleaning up

To clean up, simply delete the namespace that was created. This deletes the pod inside it and, more importantly, removes the privileged namespace so it can’t be hijacked for privileged operations by a would-be attacker later.

kubectl delete namespaces disk-utilities