Monitor Security and Best Practices with Kyverno and Policy Reporter

Article published: April 7, 2021

Foreword

This article, like my previous article about Falco and Falcosidekick, is a practical guide to installing, configuring and using Kyverno with additional monitoring using Policy Reporter.

Background

As I described in my last article, there is no single tool that makes a Kubernetes cluster secure. There are different levels that need to be considered, and for each of them there are mechanisms to make them more secure. While Falco increases runtime security through its rule engine and alerts, there are some best practices that should already be taken into account when creating Kubernetes manifests.

In order to observe, monitor, and, if possible, automate these best practices from the start in my company's internal Kubernetes cluster, we looked for a suitable solution.

Requirement analysis

We built this cluster as an internal playground because most of the developers had little to no previous experience with Kubernetes. Therefore, the solution should be as Kubernetes-native as possible to minimize additional learning effort. The main focus is on defining and monitoring best practices and security standards as rules. Additional features are nice to have but not required. Example rules would be to ensure that Pods have resource requests and limits defined, and that production-like environments use NetworkPolicies to restrict possible Pod communication.

Possibilities

At this time, there is one deprecated native resource I would like to mention, as well as two established open-source solutions.

  • Pod Security Policies is defined as a cluster-level resource that controls security-sensitive aspects of the pod specification. This resource is available in every Kubernetes cluster, but it is deprecated and will be removed in the near future. Its core purpose is to validate pods against a fixed set of rules to improve the security of your cluster. The drawbacks of this solution are that the possible rules are fixed and not extensible, only pods can be validated, and the configuration is tied to the user/resource that created the pod.

  • Open Policy Agent Gatekeeper is described as a validating (mutating TBA) webhook that enforces CRD-based policies executed by Open Policy Agent, a policy engine for Cloud Native environments hosted by CNCF. OPA Gatekeeper is a very popular solution and uses Rego as a policy language. It is able to enforce and audit policies and mutate resources.

  • Kyverno is described as a policy engine designed for Kubernetes. Kyverno policies can validate, mutate, and generate Kubernetes resources. The Kyverno CLI can be used to test policies and validate resources as part of a CI/CD pipeline. As with OPA, it is possible to audit or enforce validation policies. Kyverno uses only YAML and CRDs to define policies in a single namespace or the entire cluster.

Decisions

Pod Security Policies are deprecated and not that easy to configure. They have a fixed set of rules and are not extensible, but the rules themselves are still valid and valuable. Thankfully, both Open Policy Agent Gatekeeper and Kyverno offer the Pod Security Policies as ready-to-use policies for their respective engines, so both are a great replacement for PSP. Both solutions support all types of resources and are not restricted to Pods like PSP.

OPA Gatekeeper can validate in audit or enforce mode as well as mutate resources based on policies written in Rego, giving you all the capabilities of a programming language instead of static configurations. Kyverno can also validate in audit or enforce mode, mutate, and generate resources based on policies defined in pure YAML. It offers features such as ConfigMap values, JMESPath, and Kubernetes apiCalls to support dynamic policies based on the current state of your cluster.

OPA Gatekeeper supports high availability, provides metrics, and can be used outside of Kubernetes. Kyverno has an open proposal to support metrics soon and plans HA support for version 1.4.0, but it is limited to the Kubernetes ecosystem.

In making our decision, we focused mainly on simplicity and support for validation and monitoring. We chose Kyverno for one important reason: it is much easier for us to define policies with simple YAML-based manifests. It supports all the features we need to fulfill our requirements, and more. We can define policies for validation in audit or enforce mode and are able to automate behavior with mutate and generate policies.

Because our Kubernetes cluster is only intended as a playground, high availability does not currently play a major role.

In audit mode Kyverno creates custom resources called PolicyReports and ClusterPolicyReports to provide the validation results. The status of these reports cannot be monitored very well, and Kyverno provides no metrics about validation results yet. This is where Policy Reporter comes in, which I will discuss in more detail later.

Environment

Getting Started with Kyverno

Installing Kyverno

Kyverno can be installed with Helm or with static manifests using kubectl. Because I want to benefit from additional features like the PSP-based default policies in audit mode and tweak the configuration for my environment, I will use the provided Helm Chart.

Adding the Helm repository:

helm repo add kyverno https://kyverno.github.io/kyverno/
helm repo update

I want to tweak the values.yaml for Kyverno because:

  • I am using RKE and have a few more namespaces that should be ignored by Kyverno.
  • I want Kyverno to generate the default Pod Security Standard policies.

# values.yaml
# this snippet includes only the changeset

# With the "default" value, this configuration installs 10 PSP-based policies in audit mode.
# This improves your cluster security without additional effort.
#
# Supported values: default/restricted
# For more info: https://kyverno.io/policies/pod-security
podSecurityStandard: default

# filter resources or namespaces that Kyverno should ignore: https://kyverno.io/docs/installation/#resource-filters
config:
    resourceFilters:
    - "[Event,*,*]"
    - "[*,kube-system,*]"
    - "[*,kube-public,*]"
    - "[*,kube-node-lease,*]"
    - "[Node,*,*]"
    - "[APIService,*,*]"
    - "[TokenReview,*,*]"
    - "[SubjectAccessReview,*,*]"
    - "[*,kyverno,*]"
    - "[Binding,*,*]"
    - "[ReplicaSet,*,*]"
    - "[ReportChangeRequest,*,*]"
    - "[ClusterReportChangeRequest,*,*]"
    # additional RKE / Rancher specific namespaces
    - "[*,cattle-system,*]"
    - "[*,cattle-monitoring-system,*]"
    - "[*,fleet-system,*]"

With these values I install Kyverno into the automatically created kyverno namespace.

helm install kyverno kyverno/kyverno -f values.yaml --namespace kyverno --create-namespace

After a few minutes you should see the Kyverno Pod up and running.

kubectl get pod -n kyverno

NAME                      READY   STATUS    RESTARTS   AGE
kyverno-6b456946c-4927d   1/1     Running   0          96s

You should also see the default PSP-based policies provided by Kyverno, created in audit mode. Background is a setting for validation policies that specifies whether the policy should also audit already existing resources (value true) or only newly created resources (value false). This also works for policies in enforce mode.

# cpol is the short version for clusterpolicies
kubectl get cpol

NAME                             BACKGROUND   ACTION
disallow-add-capabilities        true         audit
disallow-host-namespaces         true         audit
disallow-host-path               true         audit
disallow-host-ports              true         audit
disallow-privileged-containers   true         audit
disallow-selinux                 true         audit
require-default-proc-mount       true         audit
restrict-apparmor-profiles       true         audit
restrict-sysctls                 true         audit

Example Policy

Let us see what a manifest of one of these policies looks like.

kubectl get cpol disallow-host-path -o yaml
# I removed unnecessary information like managed fields
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  annotations:
    meta.helm.sh/release-name: kyverno
    meta.helm.sh/release-namespace: kyverno
    policies.kyverno.io/category: Pod Security Standards (Default)
    policies.kyverno.io/description: HostPath volumes let pods use host directories
      and volumes in containers. Using host resources can be used to access shared
      data or escalate privileges and should not be allowed.
  labels:
    app.kubernetes.io/managed-by: Helm
  name: disallow-host-path
spec:
  background: true
  rules:
  - match:
      resources:
        kinds:
        - Pod
    name: host-path
    validate:
      message: HostPath volumes are forbidden. The fields spec.volumes[*].hostPath
        must not be set.
      pattern:
        spec:
          =(volumes):
          - X(hostPath): "null"
  validationFailureAction: audit

As you can see, each policy can define multiple rules. A rule has a match section which offers different ways to select the resources to validate. This policy validates all resources of kind Pod. Other common filters are selector for label selection, namespaceSelector to filter resources in selected namespaces, and the resource name with wildcard support. For detailed information, have a look at the Kyverno Documentation.
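
As an illustration, a hypothetical match section combining these filters could look like the following; the label values app: nginx and team: dev are assumptions for this example.

# hypothetical match section combining the mentioned filters
match:
  resources:
    kinds:
    - Pod
    # only resources whose name starts with nginx
    name: "nginx*"
    # only resources with a matching label
    selector:
      matchLabels:
        app: nginx
    # only resources in namespaces with a matching label
    namespaceSelector:
      matchLabels:
        team: dev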

Each rule has a unique name within the same policy. The validate part is where the magic happens. For a policy in enforce mode, the message is returned by kubectl if your resource definition fails validation. In audit mode, it is displayed in the generated PolicyReportResult for each validated resource.

With a pattern, you define what a resource must satisfy to pass validation; the structure and the available fields are based on the validated resource. In this example, Pods with defined volumes are validated: if a Pod has at least one volume defined, the rule checks that no volume definition uses hostPath, regardless of its value. For more information about validation operations, see the Kyverno Documentation.

For more examples, consider the Best Practices Policies from Kyverno.

Kyverno CRDs also have full support for kubectl explain.

kubectl explain ClusterPolicy.spec.rules.match

KIND:     ClusterPolicy
VERSION:  kyverno.io/v1

RESOURCE: match <Object>

DESCRIPTION:
     MatchResources defines when this policy rule should be applied. The match
     criteria can include resource information (e.g. kind, name, namespace,
     labels) and admission review request information like the user name or
     role. At least one kind is required.

FIELDS:
   clusterRoles <[]string>
     ClusterRoles is the list of cluster-wide role names for the user.

   resources    <Object>
     ResourceDescription contains information about the resource being created
     or modified. Requires at least one tag to be specified when under
     MatchResources.

   roles    <[]string>
     Roles is the list of namespaced role names for the user.

   subjects <[]Object>
     Subjects is the list of subject names like users, user groups, and service
     accounts.

Define custom validation Policies

The goal is to define our requirements and best practices as validation policies. We can choose between a Policy and a ClusterPolicy. The only difference is that a Policy belongs to a single namespace, while a ClusterPolicy is applied to all namespaces. In most cases you use a ClusterPolicy combined with the resourceFilters and namespaceSelectors to validate only resources in a subset of namespaces in your cluster.
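
To show the difference, here is a minimal sketch of a namespaced Policy. The policy name and the label check are made up for this example; the spec works exactly like in a ClusterPolicy.

apiVersion: kyverno.io/v1
kind: Policy
metadata:
  # hypothetical example, only applies inside the test-kyverno namespace
  name: require-app-label
  namespace: test-kyverno
spec:
  validationFailureAction: audit
  rules:
  - name: check-app-label
    match:
      resources:
        kinds:
        - Pod
    validate:
      message: The label app is required.
      pattern:
        metadata:
          labels:
            # any non-empty value is accepted
            app: "?*"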

Check Resource Requests and Limits are configured

The aim of this policy is to check whether Pods outside our resourceFilters have resources configured. We want to audit this configuration: enforced rules are good, but they often have the disadvantage of hindering productivity during development, especially when simple applications are to be tested and tools evaluated. In addition, at the beginning of development the required resources may be unclear, which leads to unnecessarily high or too low values.

By monitoring this missing configuration, I know where to add resource constraints if needed.

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: check-resource-requests-and-limits-configured
  annotations:
    policies.kyverno.io/category: Best Practices
spec:
  rules:
  - match:
      # select all Pod resources
      resources:
        kinds:
        - Pod
    name: check-for-resource-requirements
    validate:
      # message for logs
      message: resource requests and limits should be configured
      pattern:
        spec:
          containers:
            # check if container spec has resource configured
          - resources:
              limits:
                # check if spec.containers[*].limits.cpu is configured with any value
                cpu: "?*"
                memory: "?*"
              requests:
                cpu: "?*"
                memory: "?*"
  # validation mode
  validationFailureAction: audit

You may ask whether this also works for Pods created by other resources like Deployment, Job, StatefulSet, etc. If you configure your policy to validate Pods, Kyverno automatically extends it with additional rules for these kinds of resources. You can verify this behavior by inspecting your newly created policy:

kubectl get clusterpolicy.kyverno.io/check-resource-requests-and-limits-configured -o yaml
# this snippet shows only meta information and
# a single autogenerated rule as example
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  annotations:
    pod-policies.kyverno.io/autogen-controllers: DaemonSet,Deployment,Job,StatefulSet,CronJob
    policies.kyverno.io/category: Best Practices
    policies.kyverno.io/severity: high
spec:
  background: true
  rules:
  ...
  - match:
      resources:
        kinds:
        - DaemonSet
        - Deployment
        - Job
        - StatefulSet
    name: autogen-check-for-resource-requirements
    validate:
      message: resource requests and limits should be configured
      pattern:
        spec:
          template:
            spec:
              containers:
              - resources:
                  limits:
                    cpu: ?*
                    memory: ?*
                  requests:
                    cpu: ?*
                    memory: ?*
    ...

The important part in this snippet is the automatically created annotation pod-policies.kyverno.io/autogen-controllers with DaemonSet,Deployment,Job,StatefulSet,CronJob as its value. By defining this annotation yourself, you can change this behavior; to disable it, use none as the value. See the Kyverno Documentation for details.
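
For example, to disable the rule auto-generation for our custom policy, the annotation could be set in its metadata (only the relevant part is shown):

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: check-resource-requests-and-limits-configured
  annotations:
    policies.kyverno.io/category: Best Practices
    # disable the automatic rule generation for Pod controllers
    pod-policies.kyverno.io/autogen-controllers: none
spec:
  ...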

Check validation Results

To check validation results we need some kind of workload to validate. So, I create a test-kyverno namespace and run nginx as a sample workload.

kubectl create ns test-kyverno
kubectl run nginx --image nginx:alpine -n test-kyverno

For each created namespace that is not excluded by the resourceFilters, Kyverno creates a PolicyReport. This resource includes the validation results for each executed validation rule and resource in this namespace. You can get a summary as follows:

kubectl get polr -n test-kyverno

NAME                   PASS   FAIL   WARN   ERROR   SKIP   AGE
polr-ns-test-kyverno   9      1      0      0       0      1m

As you can see, we have 9 passed and 1 failed validation result. We do not see how many resources have been validated or which validation failed. To get more details, we can describe the resource and filter for results with status fail.

At this time Kyverno PolicyReportResults can pass or fail.

kubectl describe polr -n test-kyverno | grep -i "status: \+fail" -B10

  Message:        validation error: resource requests and limits should be configured. Rule check-for-resource-requirements failed at path /spec/containers/0/resources/requests/
  Policy:         check-resource-requests-and-limits-configured
  Resources:
    API Version:  v1
    Kind:         Pod
    Name:         nginx
    Namespace:    test-kyverno
    UID:          28508c80-73d6-4836-9895-e1fc80069c90
  Rule:           check-for-resource-requirements
  Scored:         true
  Status:         fail

Now we see that our nginx Pod fails resource validation because a basic kubectl run does not configure resource constraints by default. We need to improve our Pod declaration to fulfill our custom policy.

kubectl delete po nginx -n test-kyverno
kubectl run nginx --image nginx:alpine --requests='cpu=1m,memory=10M' --limits='cpu=5m,memory=15M' -n test-kyverno
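
If you prefer a plain manifest over kubectl run flags, a minimal Pod definition that satisfies the policy could look like this sketch; the resource values are arbitrary example values.

apiVersion: v1
kind: Pod
metadata:
  name: nginx
  namespace: test-kyverno
spec:
  containers:
  - name: nginx
    image: nginx:alpine
    # resource requests and limits as required by our custom policy
    resources:
      requests:
        cpu: 1m
        memory: 10M
      limits:
        cpu: 5m
        memory: 15M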

Let us check again.

kubectl get polr -n test-kyverno

NAME                   PASS   FAIL   WARN   ERROR   SKIP   AGE
polr-ns-test-kyverno   10     0      0      0       0      5m

Now the nginx Pod passes all validations. But as you may have noticed, it is not very easy or convenient to check the validation status of your rules this way. An additional problem comes into play when you have multiple namespaces and want to know how many validation failures occur across the cluster: you need to check each PolicyReport in every namespace that your policy affects.
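
Listing the reports across all namespaces at least gives an overview, but you still only get per-namespace summaries and have to describe every single report for details:

kubectl get polr -A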

To solve this problem, I created a tool called Policy Reporter.

Monitor validation Results with Policy Reporter

The story behind Policy Reporter

As I described in the requirement analysis, our goal was to define our best practices as policies, but also to monitor them. As this was not so easy with the given tools, I started to build my own solution for this problem. I am a contributor to Falcosidekick, which inspired me to build something similar for the Kyverno ecosystem.

To be able to monitor the validation results, the first task was to provide Prometheus metrics for common monitoring tools such as Grafana. Also included are an optional integration for the Prometheus Operator and three configurable Grafana dashboards.

The second part was to extend the monitoring capabilities by sending notifications to additional tools when new validation failures are detected. Since Policy Reporter already supported Grafana with metrics, I decided to start supporting Grafana Loki as a log aggregator solution. In the current version, in addition to Loki, it also supports Elasticsearch, Microsoft Teams, Discord and Slack.
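
As a sketch of such a target configuration, assuming the Helm chart exposes a loki.host value for this purpose (check the values.yaml of the chart version you install for the exact keys), forwarding results to Loki could be configured like this:

# values.yaml (Policy Reporter)
# sketch of a Loki target; the loki.host key is an assumption based on the chart at the time
loki:
  host: http://loki:3100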

After the basic functionality was in place, I realized that not everyone has a running monitoring solution in their cluster. Also, there were users who just wanted to try out or demonstrate Kyverno once and therefore did not want to put any additional effort into monitoring. For this reason, in addition to Policy Reporter, I developed the Policy Reporter UI. It is installed as an additional app and provides a standalone interface with charts, tables and filters for a flexible overview of all available validation results.

As a note, I would like to mention that the CRD for PolicyReport is a prototype for a new Kubernetes standard. It is possible that in the future other tools like kube-bench will also use these PolicyReports and thus also be supported by Policy Reporter.

Installing Policy Reporter

Policy Reporter provides a Helm Chart to install it together with the Policy Reporter UI.

Adding the Policy Reporter repository:

helm repo add policy-reporter https://fjogeleit.github.io/policy-reporter
helm repo update

Installation together with the optional Policy Reporter UI:

helm install policy-reporter policy-reporter/policy-reporter --set ui.enabled=true -n policy-reporter --create-namespace

After a few seconds there should be 2 Pods up and running.

kubectl get pods -n policy-reporter

NAME                                 READY   STATUS    RESTARTS   AGE
policy-reporter-747c456dcb-j5j7f     1/1     Running   0          15s
policy-reporter-ui-ddd888675-rv9s8   1/1     Running   0          15s

Using Policy Reporter UI

To see at least one failed validation, we create another workload that violates our resource validation.

kubectl run nginx-invalid --image nginx:alpine -n test-kyverno

Now we can port-forward the UI to localhost to be able to use it.

kubectl port-forward service/policy-reporter-ui 8082:8080 -n policy-reporter

Open http://localhost:8082/ in the browser and you will see the dashboard with our failed nginx-invalid Pod as well as the failed policy-reporter and policy-reporter-ui Deployments. The dashboard displays all failed or erroneous validation results.

Policy Reporter UI

Under Policy Reports you can view all validation results for the selected set of policies with an optional namespace filter. This sample screen shows all the results of our custom ClusterPolicy for the namespace test-kyverno only.

Policy Reporter UI

ClusterPolicy Reports refers to the second CRD provided for validation results; a ClusterPolicyReport is generated when you validate namespaces or other cluster-scoped resources.
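
Assuming the cpolr short name provided by the CRD, these cluster-scoped reports can be listed just like their namespaced counterpart:

# cpolr is the short version for clusterpolicyreports
kubectl get cpolr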

Logs provides a list of the last 200 failed validation results by default. Both are customizable in the Helm Chart. 'Warning' is the default priority for all policies; it can be changed in the Policy Reporter configuration or by setting a 'Severity' in your Kyverno policy, which is available since Kyverno 1.3.5.

Policy Reporter UI
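
As a sketch, to give our custom policy a higher priority in Policy Reporter, we can set the policies.kyverno.io/severity annotation, which already appeared in the policy output above; the value medium is only an example.

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: check-resource-requests-and-limits-configured
  annotations:
    policies.kyverno.io/category: Best Practices
    # mapped to the priority shown in Policy Reporter
    policies.kyverno.io/severity: medium
spec:
  ...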

Using Grafana

To use Grafana as a monitoring solution for more production-like use cases, you can use the provided optional monitoring sub chart.

You can install a local monitoring stack with the kube-prometheus-stack Helm Chart.

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

Installation with default values:

helm install monitoring prometheus-community/kube-prometheus-stack --namespace monitoring --create-namespace

After a few minutes all pods should be in a running state.

kubectl get pods -n monitoring

NAME                                                     READY   STATUS    RESTARTS   AGE
alertmanager-monitoring-kube-prometheus-alertmanager-0   2/2     Running   0          7m35s
monitoring-grafana-58bf4946c8-tt8zn                      2/2     Running   0          7m52s
monitoring-kube-prometheus-operator-5dc569645-v7z29      1/1     Running   0          7m52s
monitoring-kube-state-metrics-6bfb865c69-fvncg           1/1     Running   0          7m52s
monitoring-prometheus-node-exporter-fmdpt                1/1     Running   0          7m52s
prometheus-monitoring-kube-prometheus-prometheus-0       2/2     Running   1          7m32s

Access the installed Grafana with a port-forward like Policy Reporter UI before:

kubectl port-forward service/monitoring-grafana 8080:80 -n monitoring

You can access Grafana with http://localhost:8080 and login with username admin and password prom-operator.

Now let us upgrade Policy Reporter with monitoring enabled and configured for our needs.

# values.yaml
# deploy the UI and monitoring together; both are independent of each other

ui:
  enabled: true

monitoring:
  enabled: true
  namespace: monitoring
  serviceMonitor:
    labels:
      release: monitoring

helm upgrade policy-reporter policy-reporter/policy-reporter -f values.yaml -n policy-reporter

After a refresh you should see three new available dashboards labeled with Policy Reporter.

Grafana Policy Reporter Dashboard

It can take a few minutes until the created metrics are scraped and the results are shown in the dashboards. After 2 minutes I got the following information from the Policy Reporter dashboard.

Grafana Policy Reporter Dashboard

Conclusion

Kyverno is a simple yet powerful tool. It already provides a great security improvement with its included PSP-like policies. With its YAML-based policy engine, it offers a quick start for users with some Kubernetes experience. It was possible to define common best practices as policies with little effort and to set up monitoring with familiar tools using Policy Reporter. We started with audit policies to see how the policies would affect our cluster and to save our developers additional effort initially. We can always decide to move policies from audit to enforce mode to further improve security. In addition to validation, Kyverno offers other helpful features such as resource mutation and generation, but these will be the subject of an upcoming article.