How To Automatically Remove Completed Kubernetes Jobs From CronJobs?

Published August 6, 2024

Problem: Accumulating Completed Kubernetes Jobs

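Every scheduled run of a Kubernetes CronJob creates a new Job, and each finished Job (with its Pods) remains in the cluster as an API object until it is deleted. On a frequent schedule these accumulate quickly and clutter the cluster. Removing finished Jobs automatically keeps the Kubernetes environment clean and easier to manage.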

Implementing Automatic Removal of Completed Jobs

Using History Limits in CronJob Specifications

The CronJob spec has two fields that control how many finished Jobs are retained:

  • .spec.successfulJobsHistoryLimit: The number of successful Jobs to keep. The default is 3. Keeping a few lets you review or audit recent runs.

  • .spec.failedJobsHistoryLimit: The number of failed Jobs to keep. The default is 1. Keeping some lets you inspect recent failures when troubleshooting.

Tip: Balance History Retention

Consider setting different limits for successful and failed jobs. For example, you might keep fewer successful job histories (e.g., 3) but more failed job histories (e.g., 10) to aid in troubleshooting while still managing cluster resources.
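To make the effect of the two limits concrete, here is a small Python model of the pruning rule. It is a simplified sketch, not the controller's actual code: it assumes the oldest Jobs (by start time) are removed first, and the function name and sample Job names are made up for illustration.

```python
from datetime import datetime, timedelta


def jobs_to_prune(finished_jobs, history_limit):
    """Return the names of finished Jobs that fall outside the history limit.

    finished_jobs: list of (name, start_time) tuples for Jobs of one kind
    (all successful, or all failed) created by the same CronJob.
    """
    if history_limit < 0:
        raise ValueError("history_limit must be >= 0")
    # Sort oldest first, so the newest `history_limit` Jobs sit at the end.
    ordered = sorted(finished_jobs, key=lambda job: job[1])
    excess = max(len(ordered) - history_limit, 0)
    return [name for name, _ in ordered[:excess]]


# Hypothetical history: five successful runs and two failed runs.
base = datetime(2024, 8, 6, 12, 0)
successful = [(f"example-cronjob-{i}", base + timedelta(minutes=i)) for i in range(5)]
failed = [(f"example-cronjob-f{i}", base + timedelta(minutes=i)) for i in range(2)]

print(jobs_to_prune(successful, 3))   # the two oldest successful Jobs
print(jobs_to_prune(failed, 10))      # empty: fewer than 10 failed Jobs exist
print(jobs_to_prune(successful, 0))   # every successful Job
```

With a limit of 3 only the two oldest successful Jobs are selected for removal, while a limit of 10 leaves both failed Jobs in place.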

Configuring CronJobs for Zero History Retention

To remove all completed Jobs automatically, you can set both history limits to 0 in your CronJob specification:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: example-cronjob
spec:
  schedule: "*/1 * * * *"
  successfulJobsHistoryLimit: 0
  failedJobsHistoryLimit: 0
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: example-container
            image: example-image
            command: ["example-command"]
          restartPolicy: OnFailure

This configuration tells Kubernetes to keep no finished Jobs: the CronJob controller deletes each Job, along with its Pods and their logs, once it has succeeded or failed. While this keeps your cluster clean, it has some trade-offs:

  1. You won't have any history about Job executions in Kubernetes.
  2. Troubleshooting recent failures becomes harder without access to the Job history.
  3. You may need to use external logging or monitoring systems to track Job execution results.

Consider your needs for Job history retention when deciding to use this zero-retention configuration.

Alternative Approaches to Job Cleanup

Creating a Dedicated Cleanup CronJob

You can create a CronJob that removes completed Jobs regularly. This cleanup Job would run at set times and delete Jobs that meet specific criteria, such as completion time or status.

Design of a cleanup job:

  • Use a container image with kubectl installed
  • Run a script that finds and deletes completed Jobs
  • Grant the cleanup Job's service account RBAC permissions to list and delete Jobs

Pros of this method:

  • Flexible scheduling of cleanup operations
  • Can implement custom logic for Job retention
  • Keeps cleanup logic separate from individual Jobs

Cons of this method:

  • Adds complexity to cluster management
  • Requires maintenance of an additional Job
  • May use more resources compared to built-in limits

Example: Cleanup CronJob YAML

apiVersion: batch/v1
kind: CronJob
metadata:
  name: job-cleanup
spec:
  schedule: "0 1 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: job-cleanup-sa
          containers:
          - name: kubectl
            image: bitnami/kubectl
            command:
            - /bin/sh
            - -c
            # Matches Jobs that report exactly one succeeded Pod (status.succeeded == 1)
            - kubectl delete jobs --field-selector status.successful=1
          restartPolicy: OnFailure
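The CronJob above refers to a service account named job-cleanup-sa, which needs permission to delete Jobs. As a rough sketch, the Python script below builds a ServiceAccount, Role, and RoleBinding as plain dictionaries and prints them as a JSON List that kubectl can apply. The namespace, the Role name job-cleanup-role, and the exact set of verbs are assumptions for illustration; adjust them to your cluster's policies.

```python
import json

NAMESPACE = "default"               # assumption: namespace the cleanup CronJob runs in
SERVICE_ACCOUNT = "job-cleanup-sa"  # matches serviceAccountName in the CronJob
ROLE_NAME = "job-cleanup-role"      # hypothetical name, chosen for this example


def rbac_manifests(namespace, service_account, role_name):
    """Build the RBAC objects that let a service account list and delete Jobs."""

    def meta(name):
        return {"name": name, "namespace": namespace}

    account = {"apiVersion": "v1", "kind": "ServiceAccount", "metadata": meta(service_account)}
    role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": meta(role_name),
        # Assumed minimum for `kubectl delete jobs` with a selector.
        "rules": [{"apiGroups": ["batch"], "resources": ["jobs"],
                   "verbs": ["get", "list", "delete"]}],
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": meta(role_name),
        "subjects": [{"kind": "ServiceAccount", "name": service_account,
                      "namespace": namespace}],
        "roleRef": {"apiGroup": "rbac.authorization.k8s.io", "kind": "Role",
                    "name": role_name},
    }
    # A v1 List lets kubectl apply all three objects from a single document.
    return {"apiVersion": "v1", "kind": "List", "items": [account, role, binding]}


manifests = rbac_manifests(NAMESPACE, SERVICE_ACCOUNT, ROLE_NAME)
print(json.dumps(manifests, indent=2))
```

If you save this under a name of your choice, for example rbac.py, you could pipe its output into kubectl apply -f - to create the objects.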

Using Kubernetes API for Programmatic Cleanup

You can use the Kubernetes API to create a custom solution for Job cleanup. This approach involves writing a program or script that interacts with the API to manage Job lifecycles.

Overview of using Kubernetes API:

  • Use a Kubernetes client library in your preferred programming language
  • Authenticate with the cluster using service account tokens or kubeconfig
  • List Jobs, filter based on completion status and time, and delete as needed

Sample script suggestions:

  • Python script using the official Kubernetes Python client (the kubernetes package)
  • Go program using the client-go library
  • Bash script using kubectl commands in a loop

This method offers flexibility but requires more development effort and maintenance compared to using built-in CronJob settings.
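The list, filter, and delete steps could look roughly like the Python sketch below. It assumes the kubernetes package is installed wherever cleanup() is called. The function names, the sample Job names, and the 24-hour threshold are illustrative choices, not part of any standard tool. The selection logic is kept separate from the API calls so it can be tried without a cluster.

```python
from datetime import datetime, timedelta, timezone

FINISHED_TYPES = ("Complete", "Failed")


def expired_job_names(jobs, now, max_age):
    """Return names of Jobs that finished more than `max_age` ago.

    jobs: iterable of dicts with 'name' and 'finished_at' (a timezone-aware
    datetime, or None while the Job is still running).
    """
    cutoff = now - max_age
    return [job["name"] for job in jobs
            if job["finished_at"] is not None and job["finished_at"] < cutoff]


def summarize(job):
    """Reduce a V1Job object from the Python client to the dict used above."""
    finished_at = None
    for cond in job.status.conditions or []:
        if cond.type in FINISHED_TYPES and cond.status == "True":
            finished_at = cond.last_transition_time
    return {"name": job.metadata.name, "finished_at": finished_at}


def cleanup(namespace, max_age):
    """List Jobs in one namespace and delete those that finished long enough ago."""
    # Imported here so the selection logic works without the client installed.
    from kubernetes import client, config
    try:
        config.load_incluster_config()   # service account token, inside a Pod
    except config.ConfigException:
        config.load_kube_config()        # local kubeconfig, on a workstation
    batch = client.BatchV1Api()
    jobs = [summarize(job) for job in batch.list_namespaced_job(namespace).items]
    now = datetime.now(timezone.utc)
    for name in expired_job_names(jobs, now, max_age):
        # Background propagation also removes the Pods owned by the Job.
        batch.delete_namespaced_job(name, namespace, propagation_policy="Background")
        print(f"deleted {name}")


# Offline demonstration of the selection step with made-up Jobs.
now = datetime(2024, 8, 6, 12, 0, tzinfo=timezone.utc)
sample = [
    {"name": "report-1", "finished_at": now - timedelta(hours=30)},
    {"name": "report-2", "finished_at": now - timedelta(hours=2)},
    {"name": "report-3", "finished_at": None},  # still running
]
print(expired_job_names(sample, now, timedelta(hours=24)))
```

Against a real cluster you would call something like cleanup("default", timedelta(hours=24)), using a service account that is allowed to list and delete Jobs in that namespace.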

Tip: Use Labels for Efficient Cleanup

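Add labels to your Jobs to make cleanup more efficient. For example, add a 'cleanup-after' label holding a timestamp. Label values cannot contain colons, so use a format such as Unix epoch seconds or a date like 2024-08-06 rather than a full RFC 3339 timestamp. Label selectors match only by equality, set membership, or existence, not by ordering, so your cleanup script can use a selector to list just the labeled Jobs and then compare each timestamp itself to decide which ones are ready for removal.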
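As an illustration, the Python sketch below decides which labeled Jobs are due for removal. It assumes the 'cleanup-after' value is stored as Unix epoch seconds, which is one possible encoding rather than a Kubernetes convention, and the Job names are invented. With the official Python client you could pass label_selector="cleanup-after" to list_namespaced_job so that only Jobs carrying the label are returned; the comparison against the current time still happens in the script.

```python
from datetime import datetime, timezone

LABEL = "cleanup-after"  # label key from the tip; value assumed to be epoch seconds


def due_for_cleanup(jobs, now):
    """Return names of Jobs whose cleanup-after time has passed.

    jobs: iterable of (name, labels) pairs, where labels is a dict.
    """
    due = []
    for name, labels in jobs:
        raw = labels.get(LABEL)
        if raw is None:
            continue  # unlabeled Jobs are left alone
        try:
            cleanup_at = datetime.fromtimestamp(int(raw), tz=timezone.utc)
        except (ValueError, OverflowError, OSError):
            continue  # malformed value: skip rather than risk deleting
        if cleanup_at <= now:
            due.append(name)
    return due


def stamp(moment):
    """Encode a datetime as a label-safe string of epoch seconds."""
    return str(int(moment.timestamp()))


now = datetime(2024, 8, 6, 12, 0, tzinfo=timezone.utc)
jobs = [
    ("backup-a", {LABEL: stamp(datetime(2024, 8, 5, tzinfo=timezone.utc))}),
    ("backup-b", {LABEL: stamp(datetime(2024, 8, 7, tzinfo=timezone.utc))}),
    ("backup-c", {}),
    ("backup-d", {LABEL: "soon"}),
]
print(due_for_cleanup(jobs, now))
```

Only backup-a is selected here: backup-b is not yet due, backup-c has no label, and backup-d carries a value that cannot be parsed.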