Nine Kubernetes Tools You Might Not Know

Marvin Beckers

January 27, 2024

Categorized as kubernetes

Everyone working with Kubernetes (most likely) has kubectl installed. Most people also have helm. But what other tools are out there for your daily work with Kubernetes and containers? This post explores a couple of projects that range from somewhat known to downright obscure, but all of them are part of my daily workflows and are my recommendations to aspiring (and seasoned) Kubernetes professionals.

Let’s dive right into our list!

protokol

This one takes the cake as “most obscure”: at the time of writing, I am the only person who has starred it on GitHub. People are seriously missing out.

protokol is a small tool by my friend Christoph (also known as xrstf) that allows you to easily dump Kubernetes pod logs to disk for later analysis. This is especially useful in environments that do not have a logging stack set up that you can query later on. For example, to get all logs from the kube-system namespace until you stop the protokol command (e.g. with Ctrl+C), run:

$ protokol -n kube-system
INFO[Sat, 27 Jan 2024 11:51:47 CET] Storing logs on disk.                         directory=protokol-2024.01.27T11.51.47
INFO[Sat, 27 Jan 2024 11:51:47 CET] Starting to collect logs…                     container=coredns namespace=kube-system pod=coredns-787d4945fb-4q7jv
INFO[Sat, 27 Jan 2024 11:51:47 CET] Starting to collect logs…                     container=coredns namespace=kube-system pod=coredns-787d4945fb-ghskz
INFO[Sat, 27 Jan 2024 11:51:47 CET] Starting to collect logs…                     container=etcd namespace=kube-system pod=etcd-lima-k8s
INFO[Sat, 27 Jan 2024 11:51:47 CET] Starting to collect logs…                     container=kube-controller-manager namespace=kube-system pod=kube-controller-manager-lima-k8s
INFO[Sat, 27 Jan 2024 11:51:47 CET] Starting to collect logs…                     container=kube-proxy namespace=kube-system pod=kube-proxy-rppbc
INFO[Sat, 27 Jan 2024 11:51:47 CET] Starting to collect logs…                     container=kube-scheduler namespace=kube-system pod=kube-scheduler-lima-k8s
INFO[Sat, 27 Jan 2024 11:51:47 CET] Starting to collect logs…                     container=kube-apiserver namespace=kube-system pod=kube-apiserver-lima-k8s
^C
$ tree
.
└── protokol-2024.01.27T11.51.47
    └── kube-system
        ├── coredns-787d4945fb-4q7jv_coredns_008.log
        ├── coredns-787d4945fb-ghskz_coredns_008.log
        ├── etcd-lima-k8s_etcd_008.log
        ├── kube-apiserver-lima-k8s_kube-apiserver_006.log
        ├── kube-controller-manager-lima-k8s_kube-controller-manager_010.log
        ├── kube-proxy-rppbc_kube-proxy_008.log
        └── kube-scheduler-lima-k8s_kube-scheduler_010.log

3 directories, 7 files

protokol comes with a huge set of flags to alter behaviour and to target specific namespaces or pods. It is really useful in troubleshooting situations where you want to grab large parts of the cluster’s current logs, e.g. to grep for certain things. It’s also quite nice in CI/CD systems where logs of pods should be downloaded as artifacts that will be stored alongside the pipeline results.
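
Once protokol has dumped the logs to disk, plain shell tools are enough to sift through them. A minimal sketch, reusing the directory name from the example above (the search term is arbitrary):

$ # search all collected kube-system logs for connection errors
$ grep -ri "connection refused" protokol-2024.01.27T11.51.47/kube-system/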

Tanka

Do you remember ksonnet? No? A lot of people probably don’t. The ksonnet/ksonnet repository was archived in September 2020, which feels like a lifetime ago. ksonnet used to provide Kubernetes-specific tooling based on the jsonnet configuration language, which is basically a way to template and compose JSON data. The generated JSON structures can be Kubernetes objects, which can be converted to YAML or sent to the Kubernetes API directly. In essence, this was an alternative way to distribute your Kubernetes manifests with configuration options.

ksonnet ceased development, but Grafana decided to revive the idea with tanka, which was really nice. The unfortunate truth is that jsonnet is very niche, so niche that the syntax highlighting for my blog doesn’t even support it. The only major project outside of Grafana that seems to use jsonnet is kube-prometheus (which doesn’t use tanka, unfortunately).

I personally find its syntax great though, much better than Helm’s string templating on YAML. See below for a jsonnet snippet that generates a full Deployment object:

local k = import "k.libsonnet";

{
    grafana: k.apps.v1.deployment.new(
        name="grafana",
        replicas=1,
        containers=[k.core.v1.container.new(
            name="grafana",
            image="grafana/grafana",
        )]
    )
}

You might feel some resistance to introducing tanka at your workplace because it has a learning curve, but once it clicks you will never want to go back to helm. If you get buy-in from your colleagues this might be a huge win – the ability to provide standardized libraries for generating manifests can be extremely helpful in giving teams a consistent baseline. So it might be worth trying it out for your next project.
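
If you want to give it a try, the tk workflow is fairly small. A rough sketch of the usual commands (double-check them against the Tanka documentation; environments/default is just the path that tk init scaffolds):

$ tk init                        # scaffold a new jsonnet project with vendored libraries
$ tk show environments/default   # render the manifests locally for review
$ tk diff environments/default   # compare the rendered manifests against the live cluster
$ tk apply environments/default  # apply the rendered manifests to the cluster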

stalk

Another tool made by xrstf! stalk allows you to observe changes in Kubernetes resources over time. This can be very useful when you just can’t understand what is happening to your Deployment (or any other resource) and struggle to catch the changes by repeatedly running kubectl get. Usually, this is most needed when a Kubernetes controller goes into a reconciliation loop. stalk to the rescue - it shows diff-formatted output with timestamps whenever changes happen.

In the example below, the observed changes are limited to the .spec field of a Deployment. stalk starts by showing .spec as it looks at startup and then logs any changes it observes over time (here, the Deployment is scaled down to one replica later on):

$ stalk deployment sample-app -s spec
--- (none)
+++ Deployment default/sample-app v371701 (2024-01-27T12:43:10+01:00) (gen. 2)
@@ -0 +1,30 @@
+spec:
+  progressDeadlineSeconds: 600
+  replicas: 2
+  revisionHistoryLimit: 10
+  selector:
+    matchLabels:
+      app: sample-app
+  strategy:
+    rollingUpdate:
+      maxSurge: 25%
+      maxUnavailable: 25%
+    type: RollingUpdate
+  template:
+    metadata:
+      creationTimestamp: null
+      labels:
+        app: sample-app
+    spec:
+      containers:
+      - image: quay.io/embik/sample-app:latest-arm
+        imagePullPolicy: IfNotPresent
+        name: sample-app
+        resources: {}
+        terminationMessagePath: /dev/termination-log
+        terminationMessagePolicy: File
+      dnsPolicy: ClusterFirst
+      restartPolicy: Always
+      schedulerName: default-scheduler
+      securityContext: {}
+      terminationGracePeriodSeconds: 30

--- Deployment default/sample-app v371701 (2024-01-27T12:43:10+01:00) (gen. 2)
+++ Deployment default/sample-app v371736 (2024-01-27T12:43:13+01:00) (gen. 3)
@@ -1,6 +1,6 @@
 spec:
   progressDeadlineSeconds: 600
-  replicas: 2
+  replicas: 1
   revisionHistoryLimit: 10
   selector:
     matchLabels:

No more head-scratching when two controllers compete over specific fields and update them several times a second.

Inspektor Gadget

If the tools in this blog form a toolbox, Inspektor Gadget is the toolbox in the toolbox. For someone with a sysadmin background (like me) this is a treasure trove when troubleshooting low-level issues. The various small tools in Inspektor Gadget are called – unsurprisingly – gadgets and are based on eBPF. You can even write your own gadgets!

Inspektor Gadget consists of a client component (a kubectl plugin) and a server component, which runs as a DaemonSet on each Kubernetes node once installed.
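
If you already use krew to manage kubectl plugins, getting both components onto a cluster is quick. A hedged sketch of the usual installation flow (consult the Inspektor Gadget docs for the current commands):

$ kubectl krew install gadget   # install the client-side kubectl plugin
$ kubectl gadget deploy         # deploy the DaemonSet to every node in the cluster
$ kubectl gadget undeploy       # remove the server component again when you are done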

In essence, gadgets give you access to system data that you could also fetch from a Kubernetes node’s shell via SSH, but Inspektor Gadget lets you fetch and process this data with container context and across nodes. To show just two of the many available gadgets, below is a snapshot of active sockets in all pods in the current namespace (a real cluster would usually show many more):

$ kubectl gadget snapshot socket
K8S.NODE                 K8S.NAMESPACE            K8S.POD                  PROTOCOL SRC                            DST                            STATUS
lima-k8s                 default                  sample-app-6…bf695-4cv89 TCP      r/:::8080                      r/:::0                         LISTEN

The tracing gadgets are also amazing for understanding what is actually happening in pods over time. If you want to see which DNS requests and responses happen, you can just use the trace dns gadget:

$ kubectl gadget trace dns
K8S.NODE             K8S.NAMESPACE        K8S.POD              PID         TID         COMM       QR TYPE      QTYPE      NAME                RCODE      NUMA…
lima-k8s             default              sample-app…695-4cv89 98533       98533       ping       Q  OUTGOING  A          google.com.                    0
lima-k8s             default              sample-app…695-4cv89 98533       98533       ping       Q  OUTGOING  AAAA       google.com.                    0
lima-k8s             default              sample-app…695-4cv89 98533       98533       ping       R  HOST      A          google.com.         NoError    1
lima-k8s             default              sample-app…695-4cv89 98533       98533       ping       R  HOST      AAAA       google.com.         NoError    0

Seriously, it’s hard to overstate how much information Inspektor Gadget can surface during an ongoing incident or a situational analysis. If you operate Kubernetes clusters, it should be in your go-to toolbox.

skopeo

This one has the most GitHub stars on the list, so it is statistically the tool most people are familiar with, but it still made sense to include it here due to its sheer usefulness.

skopeo is, strictly speaking, not a tool for Kubernetes either – it’s for interacting with container images without needing a full-blown container runtime (which is extremely useful on systems that don’t run docker natively, like macOS or Windows). It can assist both in discovering image metadata and in manipulating images in various ways.

The two subcommands you will likely use most in daily workflows are skopeo copy and skopeo inspect. Here’s an example of inspecting the metadata of an image in a remote registry:

$ skopeo inspect docker://quay.io/embik/sample-app:v0.1.0
{
    "Name": "quay.io/embik/sample-app",
    "Digest": "sha256:efbbf29b92bd8fca3e751c1070ba5bf0f2af31983bfc9b007c7bf26681c59b4c",
    "RepoTags": [
        "v0.1.0"
    ],
    "Created": "2023-04-07T11:38:28.791201794Z",
    "DockerVersion": "",
    "Labels": {
        "maintainer": "marvin@kubermatic.com"
    },
    "Architecture": "amd64",
    "Os": "linux",
    "Layers": [
        "sha256:91d30c5bc19582de1415b18f1ec5bcbf52a558b62cf6cc201c9669df9f748c22",
        "sha256:565a1b6d716dd3c4fdf123298b33e1b3e87525cff1bdb0da54c47f70cb427727"
    ],
    "LayersData": [
        {
            "MIMEType": "application/vnd.oci.image.layer.v1.tar+gzip",
            "Digest": "sha256:91d30c5bc19582de1415b18f1ec5bcbf52a558b62cf6cc201c9669df9f748c22",
            "Size": 2807803,
            "Annotations": null
        },
        {
            "MIMEType": "application/vnd.oci.image.layer.v1.tar+gzip",
            "Digest": "sha256:565a1b6d716dd3c4fdf123298b33e1b3e87525cff1bdb0da54c47f70cb427727",
            "Size": 3189012,
            "Annotations": null
        }
    ],
    "Env": [
        "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
    ]
}
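
skopeo copy is just as useful, for example for mirroring an image from one registry to another without pulling it through a local container runtime first. A sketch (the destination registry is made up for illustration):

$ # copy the image (including all architectures of a manifest list) to another registry
$ skopeo copy --all \
    docker://quay.io/embik/sample-app:v0.1.0 \
    docker://registry.example.com/mirror/sample-app:v0.1.0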

fubectl

fubectl is a collection of handy aliases for your shell so you don’t have to type out kubectl commands all the time. While this is a project hosted by my current employer, I’ve been using it since before joining Kubermatic.

fubectl is a bit hard to show off in a blog post – the repository README does a much better job at that. Besides the obvious aliases (k instead of kubectl, kall instead of kubectl get pods -A, etc.), it does a great job of integrating fuzzy finding via fzf, which makes working with Kubernetes much more interactive.

Getting logs for a pod becomes much easier: run klog, search for the pod by typing fragments of its name, and then pick the right container within that pod in a second step. In a similar fashion, kcns and kcs help with switching between namespaces and contexts without much friction.

Once the various aliases of fubectl are in your muscle memory, you will never go back to running kubectl config get-contexts and kubectl config use-context <context> instead of kcs.
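
To make the difference a bit more tangible, here is roughly how the long-hand workflow compares to the aliases mentioned above (a sketch; the fuzzy selection is interactive and the context name is made up):

$ # without fubectl
$ kubectl config get-contexts
$ kubectl config use-context staging-cluster
$ kubectl get pods -n kube-system
$ kubectl logs -n kube-system coredns-787d4945fb-4q7jv

$ # with fubectl: fuzzy-pick the context, namespace and pod interactively
$ kcs
$ kcns
$ klog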

kube-api.ninja

This completes the xrstf trifecta of Kubernetes tools you should know about. The difference from all other tools on this list is that this one is not a command-line tool but a website. It tracks Kubernetes API changes over time in an easy-to-read table view.

When was a specific resource in a specific API version added to Kubernetes? When was it migrated to another API version? What important API changes are in a specific Kubernetes version (e.g. what APIs might need to be updated in your manifests before upgrading to this Kubernetes version)? kube-api.ninja answers all those questions and many more.

For example, here are the notable API changes for Kubernetes 1.29, showing that some resource types were removed in that version:

kube-api.ninja notable changes for Kubernetes 1.29

kube-api.ninja is also helpful if you are interested in the evolution of APIs. Did you know that HorizontalPodAutoscalers existed as a resource type before Deployments? These days the Kubernetes APIs have stabilized a bit, but I wish I had had this around during the extensions-to-apps migration days.

kubeconform

The last (serious) entry on this list is kubeconform, which is extremely helpful for validating your Kubernetes manifests before applying them. It works great in tandem with helm: first render your Helm chart into YAML, then pass that to kubeconform to check for semantic correctness. A simple CI pipeline like the following can much improve your Helm chart’s development process:

helm template \
  --debug \
  name path/to/helm/chart | tee bundle.yaml
# run kubeconform on template output to validate Kubernetes resources.
# the external schema-location allows us to validate resources for
# common CRDs (e.g. cert-manager resources).
kubeconform \
  -schema-location default \
  -schema-location 'https://raw.githubusercontent.com/datreeio/CRDs-catalog/main/{{.Group}}/{{.ResourceKind}}_{{.ResourceAPIVersion}}.json' \
  -strict \
  -summary \
  bundle.yaml

This will help ensure that PRs changing your Helm chart still produce semantically valid Kubernetes resources, not just valid YAML.

kubeconfig-bikeshed

Okay, okay, okay. This one is shameless self-promotion, so I’ll keep it short. If you struggle with juggling access to many Kubernetes clusters and feel like multiple contexts in your kubeconfig no longer cut it, kubeconfig-bikeshed (kbs) might be for you. I started writing it to replace the various shell snippets I was using to manage access to Kubernetes clusters.

How many of the listed tools did you know already? Hopefully you found some new things to try out in your next troubleshooting session or CI/CD pipeline design.

Private DNS with CoreDNS, Podman and Ansible

Marvin Beckers

August 12, 2020

Categorized as linux

Running a private DNS resolver is useful in quite a few situations, for example in a home lab or on an internal company network; basically everywhere you want to give names to private systems. CoreDNS is a simple DNS server that can be used in such a situation - if the name rings a bell, it’s because CoreDNS has been the standard in-cluster DNS solution for Kubernetes for some time now. In practice, this means that large microservice architectures rely on it to resolve internal and external hostnames.

I decided to go with CoreDNS for my setup because of that fact (it has proven battle-tested in the Kubernetes environments I have worked with) and because of its simple configuration - it needs a single configuration file, called Corefile, that defines the DNS resolver’s behavior. Moreover, CoreDNS is built on a plugin architecture, which allows extending it with custom functionality.

The small network that needed a private DNS resolver in my case was, however, too small to run Kubernetes, so I had to find a different way of running CoreDNS. Since it is a Go program, downloading and running a binary would probably work - but I wanted to run it in a container for better isolation. CentOS 8 added support for a new container runtime largely spearheaded by Red Hat and Fedora called podman. podman is interesting because it gets rid of the omniscient (Docker) daemon, allowing containers to run in a more stand-alone manner. It also supports running containers as non-root users, although we won’t use this feature for now.

To deploy CoreDNS and maintain its configuration, I chose Ansible. Again, mostly because I have worked with it before and because all I need to get started is an SSH key on the target system. The following steps assume that you already have a running CentOS 8 system (likely a VM) somewhere in your private network and that you have access to it. I won’t go into the basics of Ansible, so I recommend searching for tutorials or getting-started guides before tagging along.

At the end of this, we will have a private DNS resolver that is capable of resolving public names (with some filtering in place) and resolving one or more private domains we can adjust to our needs.

Important: You should make sure that port 53 (DNS) of your system is not available on the public internet, securing access with firewall or security group rules.

Required packages

Depending on the disk/CD image you used to install CentOS 8, not all required tools will be available right away. Besides the container runtime we want to use (podman), let’s install two more packages: udica and bind-utils. bind-utils is simply useful because it contains the dig command, which we can use to query our DNS resolver later on.

udica is more interesting: it hasn’t been mentioned yet, but I would like to integrate with the existing security mechanisms on CentOS as much as possible. This means that we will keep SELinux in enforcing mode. udica will enable us to generate SELinux modules that allow running CoreDNS in a podman container while keeping SELinux enabled.

All things considered, translated to an Ansible task this means adding the following to your coredns role:

- name: install base packages
  dnf:
    name:
      - podman
      - udica
      - bind-utils
    state: present

Later on, I will not present tasks for trivial operations in Ansible - I strongly recommend adapting my findings to your needs.

Setting up a Corefile

Before we get too much into the details of deploying CoreDNS in podman, we should focus on defining our requirements for the DNS resolver and templating a Corefile around that.

We need to ensure that on the target system, there is a directory we can put our configuration into. My Ansible role creates /etc/coredns for that purpose:

- name: add directory for CoreDNS configuration
  file:
    path: /etc/coredns
    state: directory
    mode: '0755'

The bad news first: CoreDNS does not come with any kind of built-in ad-blocking solution. A lot of people run services like Pi-hole on their private network, and our solution won’t get as sophisticated as that. The good news is that if you want to block nefarious websites in your DNS resolver, you can still do that. Earlier we defined the requirement for our resolver to resolve all kinds of names, so let’s look into public names first.

Resolve public DNS, with a blocklist

To implement some kind of blocklist for bad DNS names, we can use the hosts plugin of CoreDNS. It reads the content of an /etc/hosts-style file and serves requests based on that. I have included a list from StevenBlack/hosts as a blocklist that resolves the hosts on that list to 0.0.0.0, thus failing resolution. Download one of the lists (depending on what you would like to block) into your Ansible role and copy it to your target system.
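
Fetching the list can be a one-off download into your role’s files directory, for example (a sketch; the exact URL depends on which StevenBlack/hosts variant you pick, and the role path is just an assumption about your layout):

$ curl -L -o roles/coredns/files/blocklist.hosts \
    https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts

A simple copy task in the role then places it at /etc/coredns/blocklist.hosts on the target system.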

Let’s set up the part of our Corefile that will allow our resolver to resolve public names with the restrictions of our blocklist in place. To do so, we can ask CoreDNS to look into our blocklist and fall through to the next lookup method if a name is not on our naughty list. We will rely on /etc/resolv.conf for upstream DNS, but you can adjust the forward plugin configuration we are including to suit your network situation.

. {
    hosts /etc/coredns/blocklist.hosts {
        fallthrough
    }
    forward . /etc/resolv.conf
    log
}

This short configuration snippet instructs CoreDNS in the following ways:

  • It defines a server block for ., which is the root zone, meaning that any requests should go through this configuration.
  • It uses the hosts plugin to try resolving a name from /etc/coredns/blocklist.hosts (our blocklist). If a name is not on that list, it will go to the next method of resolving the name.
  • It forwards any further requests to the DNS servers defined in /etc/resolv.conf.
  • It enables logging of all incoming requests via the log plugin.

Resolve private DNS

Now that public DNS is covered, let’s look into setting up one or more private DNS domains on this resolver. I am using the hosts plugin again to resolve a templated list of DNS names; you can adjust that to your needs and situation.

First, let’s generate another /etc/hosts-style file for our private zone. The minimal viable solution for this would be this Jinja2 template:

{% for entry in coredns_entries %}
{{ entry.ip }}  {{ entry.host }}
{% endfor %}

This requires passing the host variable coredns_entries to our target host in Ansible - a simple coredns_entries configuration could look like this (let’s throw in a variable for our private DNS zone for good measure):

coredns_internal_domain: privatezone.internal
coredns_entries:
    - ip: 10.0.0.1
      host: gateway.privatezone.internal
    - ip: 10.0.0.2
      host: gitlab.privatezone.internal
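
With these variables, the rendered file on the target system should look roughly like this:

$ cat /etc/coredns/internal.hosts
10.0.0.1  gateway.privatezone.internal
10.0.0.2  gitlab.privatezone.internal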

You can get smarter with this (e.g. using the coredns_internal_domain variable to eliminate the need to spell out an FQDN each time), but this is the bare minimum that will get us started. If possible, consider pulling this kind of information out of your inventory. Ansible should template this to the target system, e.g. into /etc/coredns/internal.hosts. Afterwards, adjust the Corefile (which should become a template at this point) to load our internal hosts file:

{{ coredns_internal_domain }} {
    hosts /etc/coredns/internal.hosts
    log
}

. {
    hosts /etc/coredns/blocklist.hosts {
        fallthrough
    }
    forward . /etc/resolv.conf
    log
}

There we go! It’s important to load your private zones before the big catch-all . block, so CoreDNS won’t try to send your private hostnames to public DNS resolvers. Now we have a minimal Corefile on our target system. Let’s look into running CoreDNS as a podman container next.

Running CoreDNS in Podman

The next thing we want to do is figure out the correct podman command to start our CoreDNS container. While there is an equivalent to docker-compose available, let’s stay with the (mostly docker-compatible) podman CLI. We will later wrap this call in systemd to leverage it as a supervisor for our container.

The full podman run call that I came up with is this one:

/usr/bin/podman run --name coredns --read-only -p 10.0.0.1:53:53/tcp -p 10.0.0.1:53:53/udp -v /etc/coredns:/etc/coredns:ro --cap-drop ALL --cap-add NET_BIND_SERVICE coredns/coredns:1.7.0 -conf /etc/coredns/Corefile

Let’s look at the parameters passed here to understand what is going on:

  • --name coredns will set the container name to “coredns”. This means that running another instance of this command will conflict with any running or stopped one.
  • --read-only sets the container’s root filesystem to read-only mode. The root filesystem of a container is transient, so applications should not store any important data there anyway. CoreDNS doesn’t need write access to its root filesystem, so let’s disable this to improve isolation.
  • -p 10.0.0.1:53:53/tcp and -p 10.0.0.1:53:53/udp bind to port 53 on both TCP and UDP for the system’s private IP; replace this with your own system’s private IP (or rather, use Jinja templating for it; we will follow up on that). Binding to the IP might be necessary because systemd-resolved may already occupy the port on localhost, making it impossible to bind to.
  • -v /etc/coredns:/etc/coredns:ro mounts our configuration folder into the container. Again, this is read-only, to ensure integrity of our configuration files. CoreDNS has no need to write back into this directory.
  • --cap-drop ALL and --cap-add NET_BIND_SERVICE drop as many Linux capabilities as possible while keeping NET_BIND_SERVICE, which is required to bind to the port. Again, this improves isolation and is considered good practice to reduce the attack surface.

All in all, this limits what the application running in the container is allowed to do.

Now, if you try to run this command, it will hopefully fail: CoreDNS will be unable to bind to port 53, because SELinux does not allow it yet (if it does not fail or complain in the logs, you probably disabled SELinux previously - shame on you!).
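
If you want to confirm that SELinux is indeed the culprit, the audit log is the place to look. A quick sketch, assuming the audit daemon and its tooling are installed (they usually are on CentOS 8):

$ # show recent SELinux denials (AVC records)
$ ausearch -m avc -ts recent

Let’s fix that.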

Generating a SELinux policy

Remember when we installed udica in the beginning? Now its time to shine has come! udica is a small but awesome tool that allows generating SELinux policies from container definitions. It takes away a lot of pain with containers and active SELinux. With that being said, I urge you to read up on SELinux and how it works, giving you the opportunity to further slim down policies if possible.

Let’s pull the definition of our “coredns” container, put it into a JSON file and pass it to udica:

$ podman inspect coredns > container.json
$ udica -j container.json coredns

Policy coredns created!

Please load these modules using: 
# semodule -i coredns.cil /usr/share/udica/templates/{base_container.cil,net_container.cil}

Restart the container with: "--security-opt label=type:coredns.process" parameter

This generates a coredns.cil file in the current directory. For me, it looked like this:

(block coredns
    (blockinherit container)
    (blockinherit restricted_net_container)
    (allow process process ( capability ( net_bind_service )))

    (allow process dns_port_t ( tcp_socket (  name_bind )))
    (allow process dns_port_t ( udp_socket (  name_bind )))
    (allow process etc_t ( dir ( getattr search open read lock ioctl )))
    (allow process etc_t ( file ( getattr read ioctl lock open  )))
    (allow process etc_t ( sock_file ( getattr read open  )))
)

It’s relatively small and quite declarative, so the main takeaways from this policy should be:

  • it will allow using the net_bind_service capability.
  • it will allow the process to bind to ports labeled dns_port_t for UDP and TCP (which is only port 53).
  • it will allow the process to read from directories labeled etc_t (which is /etc, where our CoreDNS configuration directory is).

You could further lock down this policy with your own SELinux labels on /etc/coredns, but for now, this should suffice. It will get us past SELinux as a gatekeeper and improve our security posture because we didn’t disable SELinux (yay). To enable this policy, run the command in the udica output.

Interlude: SELinux policies and Ansible

As with all configuration steps, this should be included in our Ansible role to make the installation reproducible on fresh systems. Unfortunately, Ansible doesn’t seem to offer a module that can apply this, which means we need to fall back to the shell module. This is how my role handles it:

# tasks/main.yml
[...]
- name: create /etc/udica for custom SELinux policies
  file:
    path: /etc/udica
    state: directory
    mode: '0755'

- name: create coredns SELinux policy file
  copy:
    src: files/coredns.cil
    dest: /etc/udica/coredns.cil
  notify:
    - load coredns SELinux module
[...]

# handlers/main.yml
- name: load coredns SELinux module
  shell: 'semodule -i /etc/udica/coredns.cil /usr/share/udica/templates/{base_container.cil,net_container.cil}'

This creates a handler running semodule, which is called whenever the SELinux policy file changes.

systemd service

We’re almost done! At this point, our CoreDNS container should run successfully and you should be able to resolve hosts with dig @10.0.0.1 google.com (replace 10.0.0.1 with your own private IP) as long as you’re running the podman command from earlier.
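
Resolution of our private zone is just as easy to verify. A quick sketch using the example entries from earlier (again, replace 10.0.0.1 with your resolver’s IP; the output depends on your coredns_entries):

$ dig +short @10.0.0.1 gateway.privatezone.internal
10.0.0.1
$ dig +short @10.0.0.1 gitlab.privatezone.internal
10.0.0.2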

But as mentioned, podman doesn’t come with a supervising daemon - which means that nothing will restart the container once it dies (for whatever reason), nor will it be restarted after a reboot.

To fix this, let’s use something on our system that is already working as a supervisor to services: systemd! Fortunately, integration with systemd works quite well under podman, as long as you follow a template recommended by Red Hat.

This is the Jinja template that I use to generate /etc/systemd/system/coredns.service, which is based on the service unit template recommended by Red Hat. The main difference from a standard systemd service unit is that podman writes PID files that systemd reads to stay informed about the container status.

[Unit]
Description=CoreDNS private DNS in a container

[Service]
Restart=on-failure
ExecStartPre=/usr/bin/rm -f /%t/%n-pid /%t/%n-cid
ExecStartPre=-/usr/bin/podman rm -f coredns
ExecStart=/usr/bin/podman run --name coredns --read-only --security-opt label=type:coredns.process -p {{ ansible_eth1.ipv4.address }}:53:53/tcp -p {{ ansible_eth1.ipv4.address }}:53:53/udp -v /etc/coredns:/etc/coredns:ro --cap-drop ALL --cap-add NET_BIND_SERVICE --conmon-pidfile /%t/%n-pid --cidfile /%t/%n-cid -d coredns/coredns:1.7.0 -conf /etc/coredns/Corefile
ExecStop=/usr/bin/podman stop coredns
KillMode=none
Type=forking
PIDFile=/%t/%n-pid

[Install]
WantedBy=multi-user.target

Beyond what the template requires, note the additions in comparison to the previous iteration:

  • --security-opt label=type:coredns.process will allow the container to use our previously created SELinux policy.
  • -p {{ ansible_eth1.ipv4.address }}:53:53/tcp and -p {{ ansible_eth1.ipv4.address }}:53:53/udp will template in the private IP of my system, which is associated with eth1. Adjust to your situation.

This delivers a fully functional systemd unit called coredns.service that utilises podman and a targeted SELinux policy to run CoreDNS on your local network. Do not forget to write your Ansible role in such a way that changes to the SELinux policy or your configuration files trigger a restart.
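
Once Ansible has templated the unit file onto the system, enabling it is the usual systemd routine (a sketch; in the role itself this is more naturally expressed as a systemd task plus a handler):

$ systemctl daemon-reload
$ systemctl enable --now coredns.service
$ systemctl status coredns.service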

Wrapping up

That’s it - we now have CoreDNS running, safely wrapped in podman, SELinux and systemd. Using Ansible is a nice bonus, allowing us to deploy our private DNS resolver to additional systems with little overhead. Not all steps have been presented as Ansible snippets, but I hope this will prove useful to those writing better roles than I do.

If everything worked well, we will now see a container running in podman:

$ podman ps
CONTAINER ID  IMAGE                            COMMAND               CREATED       STATUS           PORTS                                         NAMES
16f66c7140ed  docker.io/coredns/coredns:1.7.0  -conf /etc/coredn...  2 hours ago  Up 2 hours ago  10.0.0.1:53->53/tcp, 10.0.0.1:53->53/udp  coredns

To access the logs (which includes all requests the server received and answered), we can run podman logs coredns.

This setup obviously has further room for improvement. Next steps to explore could be (but are not limited to):

  • Pull the hostnames of our private DNS zone out of the Ansible inventory during templating.
  • Use the cache plugin in CoreDNS to improve performance and reduce the need to reach out to upstream DNS resolvers.
  • Investigate other CoreDNS plugins for host discovery, exploring the possibilities of Corefile configuration.
  • Expose metrics via the prometheus plugin. This would require running a Prometheus server somewhere on your internal network, but improving visibility on your services is worthwhile.
  • Look into implementing a health check via podman to catch any service malfunction that might not result in the container crashing.