Profile Applicability:

  • Level 1

Description:

Ensure that the --terminated-pod-gc-threshold argument is configured appropriately in the Kubernetes controller manager. This argument sets the number of terminated pods that can exist before the pod garbage collector starts deleting them. Once the threshold is exceeded, the controller manager automatically cleans up terminated pods to free up resources.
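
The threshold behavior described above can be sketched as a small shell function. This is a simplified model for illustration, not the controller manager's own code; the function name pods_to_delete is hypothetical. It assumes that only the pods in excess of the threshold are removed and that a threshold of 0 or less disables terminated-pod garbage collection.

```shell
# Simplified model of the threshold check (illustration only).
# Usage: pods_to_delete TERMINATED_COUNT THRESHOLD
pods_to_delete() {
  count=$1
  threshold=$2
  # Assumption: a threshold of 0 or less disables terminated-pod cleanup.
  if [ "$threshold" -le 0 ]; then
    echo 0
    return 0
  fi
  # Only the pods above the threshold are candidates for deletion.
  excess=$((count - threshold))
  if [ "$excess" -gt 0 ]; then
    echo "$excess"
  else
    echo 0
  fi
}
```

For example, pods_to_delete 1200 1000 prints 200, while pods_to_delete 500 1000 prints 0 because the threshold has not been exceeded.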

Rationale:

Setting the --terminated-pod-gc-threshold argument ensures that the system does not retain unnecessary terminated pods, which can consume resources and affect cluster performance. Properly configuring this threshold helps maintain optimal resource utilization by cleaning up old pods and preventing resource wastage.

Impact:

Pros:

  • Helps free up resources by automatically cleaning up terminated pods once the threshold is exceeded.

  • Reduces the risk of resource consumption caused by leftover pods, improving cluster performance and efficiency.

Cons:

  • If set too low, useful information or logs from terminated pods might be prematurely deleted, potentially hindering debugging or troubleshooting.

Default Value:

By default, upstream Kubernetes uses a --terminated-pod-gc-threshold of 12500 when the argument is not specified. Review this value and set it explicitly according to the needs of the cluster.

Pre-Requisites:

  • Access to the Kubernetes controller manager configuration.

  • Sufficient privileges (root or administrator access) to modify the controller manager flags.

  • An understanding of the cluster's resource usage patterns and the need for pod cleanup.
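
To get a rough picture of current usage before choosing a threshold, the number of terminated pods can be counted. The sketch below assumes that "terminated" means a pod phase of Succeeded or Failed; the helper name count_terminated is hypothetical, and it reads one phase per line from standard input so it can be tried without cluster access.

```shell
# Count pods in a terminal phase (Succeeded or Failed).
# Input: one pod phase per line on stdin.
count_terminated() {
  # grep -c exits non-zero when nothing matches; "|| true" keeps the exit status clean.
  grep -c -E '^(Succeeded|Failed)[[:space:]]*$' || true
}

# Example against a live cluster (requires kubectl access):
#   kubectl get pods --all-namespaces --no-headers \
#     -o custom-columns=PHASE:.status.phase | count_terminated
```

Comparing this count with the configured threshold shows how close the cluster is to triggering cleanup.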

Test Plan:

Using AWS Console:

  1. Sign in to the AWS Management Console.

  2. Open the Amazon Elastic Kubernetes Service (EKS) console.

  3. Navigate to the "Clusters" section and locate your cluster.

  4. Review the settings for the --terminated-pod-gc-threshold argument in the controller manager configuration.

  5. Ensure that the --terminated-pod-gc-threshold argument is set to an appropriate value based on the cluster's needs.

Using AWS CLI:

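Retrieve the configuration for the Kubernetes controller manager (this assumes the controller manager is exposed as a deployment in kube-system; on a managed control plane such as Amazon EKS it may not be visible):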

kubectl get deployment -n kube-system kube-controller-manager -o yaml

Check for the presence of the --terminated-pod-gc-threshold argument in the controller manager arguments section:

- --terminated-pod-gc-threshold=1000

Ensure that the --terminated-pod-gc-threshold argument is set to an appropriate value. If it is not, update the configuration to set the appropriate threshold.
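
The manual check above can be approximated with a small filter over the retrieved YAML. This is a sketch under stated assumptions: the helper names extract_threshold and audit_threshold are hypothetical, the flag is assumed to appear in the --flag=value form shown above, and the three result labels are illustrative rather than part of any benchmark.

```shell
# Print the value of --terminated-pod-gc-threshold found in the text on stdin,
# or nothing if the flag is absent.
extract_threshold() {
  sed -n 's/.*--terminated-pod-gc-threshold=\([0-9-]*\).*/\1/p' | head -n 1
}

# Classify the setting found in the manifest text on stdin.
audit_threshold() {
  value=$(extract_threshold)
  if [ -z "$value" ]; then
    echo "NOT SET"            # flag missing: the default applies
  elif [ "$value" -gt 0 ]; then
    echo "SET $value"         # explicit positive threshold
  else
    echo "DISABLED $value"    # assumption: 0 or less turns cleanup off
  fi
}

# Example (requires kubectl access and a visible controller manager object):
#   kubectl get deployment -n kube-system kube-controller-manager -o yaml | audit_threshold
```

Whether a reported value is appropriate still has to be judged against the cluster's workload.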

Implementation Plan:

Using AWS Console:

  1. Sign in to the AWS Management Console.

  2. Open the EKS service and navigate to your cluster.

  3. Review the cluster's configuration for the --terminated-pod-gc-threshold argument.

  4. If the argument is missing or misconfigured, update the controller manager configuration to set the correct threshold:

     --terminated-pod-gc-threshold=<desired-threshold>

  5. Save and apply the changes to the Kubernetes controller manager configuration.

Using AWS CLI:

Modify the controller manager deployment by adding or updating the --terminated-pod-gc-threshold=<desired-threshold> argument:

kubectl edit deployment -n kube-system kube-controller-manager

In the deployment YAML, locate the command section under the spec for the kube-controller-manager container.

Add or update the following line to set the threshold:

- --terminated-pod-gc-threshold=1000

Save and exit the editor to apply the changes.
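
Before editing the live object, the change can be previewed on a saved copy of the manifest. The function below is a minimal sketch: the name set_threshold is hypothetical, it only rewrites a flag that is already present in --flag=value form, and it does not add the flag when it is missing.

```shell
# set_threshold VALUE: rewrite an existing --terminated-pod-gc-threshold flag
# in the manifest text on stdin and print the result.
set_threshold() {
  # Reject anything that is not a non-negative integer.
  case $1 in
    ''|*[!0-9]*)
      echo "threshold must be a non-negative integer" >&2
      return 1
      ;;
  esac
  sed "s/\(--terminated-pod-gc-threshold=\)[0-9-]*/\1$1/"
}

# Example on a saved manifest (the file name is a placeholder):
#   set_threshold 1000 < controller-manager.yaml
```

Reviewing the rewritten text first makes it easier to confirm that only the intended line changes.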

Backout Plan:

Using AWS Console:

  1. Sign in to the AWS Management Console.

  2. Open the EKS service and navigate to your cluster.

  3. Locate the controller manager configuration.

  4. If necessary, revert the change by setting the --terminated-pod-gc-threshold argument to its previous value or removing it entirely.

  5. Save and apply the changes to the Kubernetes controller manager configuration.

Using AWS CLI:

To revert the change, edit the controller manager deployment to set the --terminated-pod-gc-threshold argument back to its previous value:

kubectl edit deployment -n kube-system kube-controller-manager

  1. Update the deployment YAML to include the previous value for the --terminated-pod-gc-threshold argument.

  2. Save and exit the editor to apply the changes.
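
Reverting to the previous value is easier if the manifest was saved before the change. The sketch below assumes such a saved copy exists; the helper name previous_flag_line and the backup file are hypothetical. It prints the flag line to restore, or reports that the flag was absent so that it should be removed instead.

```shell
# previous_flag_line BACKUP_FILE: print the flag line to restore,
# based on a manifest copy saved before the change.
previous_flag_line() {
  old=$(sed -n 's/.*\(--terminated-pod-gc-threshold=[0-9-]*\).*/\1/p' "$1" | head -n 1)
  if [ -n "$old" ]; then
    printf '%s\n' "- $old"
  else
    printf '%s\n' "flag was not set; remove it to restore the default"
  fi
}

# Example (the backup file name is a placeholder):
#   previous_flag_line controller-manager.backup.yaml
```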

References: