# Plans

# Overview

Plans capture the individual steps of operational tasks. A plan is organized into phases, and each phase into steps. Each step references the tasks to run for that step. Organizing steps into phases makes it possible to capture complex service behavior.

Plans and phases each have a strategy. The strategy determines whether a plan's phases, or a phase's steps, run in parallel or in serial.

...
tasks:
  - name: deploy-master
    kind: Apply
    spec:
      resources:
        - master-service.yaml
        - master.yaml
  - name: deploy-agent
    kind: Apply
    spec:
      resources:
        - agent-service.yaml
        - agent.yaml
plans:
  deploy:
    strategy: parallel
    phases:
      - name: deploy-master
        strategy: serial
        steps:
          - name: deploy-master
            tasks:
              - deploy-master
      - name: deploy-agent
        strategy: serial
        steps:
          - name: deploy-agent
            tasks:
              - deploy-agent
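
To contrast the two strategies, the following is a minimal sketch of the same plan with its plan-level strategy switched to serial, under the assumption that the agent should only be deployed once the master phase has finished. It reuses the task names from the example above; only the plan-level strategy differs.

```yaml
plans:
  deploy:
    # serial: phases run one after another, in the order listed
    strategy: serial
    phases:
      - name: deploy-master
        strategy: serial
        steps:
          - name: deploy-master
            tasks:
              - deploy-master
      # with a serial plan strategy, this phase starts only
      # after the deploy-master phase has finished
      - name: deploy-agent
        strategy: serial
        steps:
          - name: deploy-agent
            tasks:
              - deploy-agent
```

With the parallel strategy used in the original example, both phases start at the same time.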

The KUDO controller executes the plans of an operator. Only a single plan can be active at a time. How KUDO decides which plan to run depends on the situation, as described in the following sections.

# Deploy and update plans

By default, KUDO starts the deploy plan if no other plan has been deployed yet. When an instance is updated, KUDO starts the upgrade or update plan if one exists. If neither of these plans exists, KUDO starts the deploy plan.

In the example below, the deploy plan creates a service defined in service.yaml. In case of an update, the service's cache needs to be updated. The update-cache.yaml provides the necessary resources to do that and thus is part of the update plan.

...
tasks:
  - name: app
    kind: Apply
    spec:
      resources:
        - service.yaml
  - name: update
    kind: Apply
    spec:
      resources:
        - service.yaml
        - update-cache.yaml
plans:
  deploy:
    strategy: serial
    phases:
      - name: deploy-service
        strategy: serial
        steps:
          - name: deploy
            tasks:
              - app
  update:
    strategy: serial
    phases:
      - name: update-service
        strategy: serial
        steps:
          - name: update
            tasks:
              - update

# Cleanup plans

If an operator includes the optional cleanup plan, that plan runs as part of the deletion of an Instance. The KUDO manager triggers it automatically once the Instance is being deleted, and the Instance is deleted once the plan completes or fails. Trying to trigger it during the normal life cycle of the Instance leads to an error. The cleanup plan starts even if other plans are in progress.

Furthermore, it should be expected that the steps of this plan could fail. For example, users may want to delete an Instance because its deploy plan is stuck. In that case, resources that the cleanup plan tries to remove might not exist on the cluster.

...
tasks:
  - name: database
    kind: Apply
    spec:
      resources:
        - database.yaml
  - name: cleanup
    kind: Apply
    spec:
      resources:
        - cleanup-job.yaml
plans:
  deploy:
    strategy: serial
    phases:
      - name: deploy-database
        strategy: serial
        steps:
          - name: deploy
            tasks:
              - database
  cleanup:
    strategy: serial
    phases:
      - name: cleanup-database
        strategy: serial
        steps:
          - name: cleanup
            tasks:
              - cleanup

Note that it is not necessary to remove the resources from database.yaml explicitly, as all resources belonging to the Instance are removed automatically when the Instance is deleted. However, complex applications sometimes create state that cannot be captured by Kubernetes resources. As this state may have to be removed when removing the respective operator, a cleanup plan can take care of that. The example above bundles the tasks needed to remove this state in a Job provided by cleanup-job.yaml. In general, a cleanup plan is like any other plan, except that it is called before the Instance is deleted.

The cleanup plan is implemented using finalizers. The instance's metadata.finalizers contains the value "kudo.dev.instance.cleanup" while the cleanup plan is in progress.
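
As a rough illustration, an Instance that is being deleted while its cleanup plan runs might look like the sketch below. The apiVersion, the instance name my-instance, and the timestamp are assumptions for illustration; only the finalizer value is taken from the text above.

```yaml
# Sketch of an Instance during cleanup (illustrative values)
apiVersion: kudo.dev/v1beta1        # assumed API version
kind: Instance
metadata:
  name: my-instance                 # hypothetical instance name
  # set by Kubernetes once deletion has been requested
  deletionTimestamp: "2020-01-01T00:00:00Z"   # illustrative
  finalizers:
    # present while the cleanup plan is in progress
    - kudo.dev.instance.cleanup
```

Kubernetes removes an object only after its finalizers list is empty, which is why the Instance remains visible until the cleanup plan has completed or failed.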

# Parameter triggers

Parameters can have optional triggers. A trigger references a plan that will run if the parameter is updated.

The operations required to update a running application can vary depending on which parameter is being updated. For instance, updating REPLICAS may require only a simple update of the deployment, whereas updating APPLICATION_MEMORY may require rolling out a new version via a canary or blue/green deployment.

...
parameters:
- name: REPLICAS
  trigger: deploy
- name: APPLICATION_MEMORY
  trigger: canary
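
A trigger refers to a plan by name, so the operator also has to define a plan with that name. The sketch below shows what a canary plan referenced by the APPLICATION_MEMORY trigger could look like. The phase, step, and task names (rollout, deploy-canary, promote-canary) are hypothetical and not part of the original example; the tasks would have to be defined in the tasks section of the operator.

```yaml
plans:
  # referenced by "trigger: canary" on APPLICATION_MEMORY
  canary:
    strategy: serial
    phases:
      - name: rollout              # hypothetical phase name
        strategy: serial
        steps:
          - name: canary           # first roll out a canary
            tasks:
              - deploy-canary      # hypothetical task
          - name: promote          # then roll out everywhere
            tasks:
              - promote-canary     # hypothetical task
```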

# Executing Plans

In general, the KUDO manager automatically executes a plan when the corresponding parameter changes. However, sometimes this is not enough: you may need to trigger a plan manually, e.g. to create a periodic backup or to restore data after data corruption. Such plans typically do not need a corresponding parameter. KUDO v0.11.0 introduced a new feature for this: manual plan execution. In a nutshell, you can now run:

$ kubectl kudo plan trigger --name deploy --instance my-instance

which triggers the deploy plan and executes it on my-instance. While this looks relatively easy on the surface, the devil is in the details, so let's take a closer look.
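
A plan intended for manual execution is defined like any other plan; typically there is just no parameter whose trigger points at it. Below is a minimal sketch of such a backup plan. The task name backup and the resource file backup-job.yaml are hypothetical.

```yaml
tasks:
  - name: backup                 # hypothetical task
    kind: Apply
    spec:
      resources:
        - backup-job.yaml        # hypothetical Job definition
plans:
  # no parameter references this plan via a trigger,
  # so it only runs when triggered manually
  backup:
    strategy: serial
    phases:
      - name: backup
        strategy: serial
        steps:
          - name: backup
            tasks:
              - backup
```

Assuming an instance named my-instance, this plan could then be started with the command shown above, passing --name backup instead of --name deploy.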

# Plan Life Cycle

Having the ability to trigger plans on demand raises the question: what happens if two plans run concurrently? The answer is: it depends. If two plans are completely independent of each other (e.g. deploy deploys the services and monitor deploys the monitoring pods), both can be executed in parallel. But what if the two plans in question are backup and restore, or deploy and migrate? Some plans are incompatible with others. A few may not even be reentrant: while it is probably OK to restart a running deploy plan, restarting a restore plan might cause data corruption.

We're planning to explore the realm of plan compatibility further in the future. Annotating plan affinity and anti-affinity, reentrant vs non-reentrant plans, plan cancellation and transient plan parameters are some of the topics we're examining. All contributions and feedback are highly welcome.

With all this in mind, how can we ensure correct plan execution? Meet Kubernetes admission controllers.

# Admission Controllers

In a nutshell, Kubernetes admission controllers are plugins that govern and enforce how the cluster is used. They can be thought of as a gatekeeper that intercepts (authenticated) API requests and may change the request object or deny the request altogether. Kubernetes already comes with a bunch of these pre-installed which govern everything from user authorization to the namespace life cycle.

KUDO manager employs an Instance admission controller that governs changes to Instances, making sure that plans do not interfere with each other. Schematically, this looks like the following:

[Figure: Instance update]

The Instance admission controller governs any update to the Instance, whether through manual plan execution or through Instance parameter updates. The general rule of thumb is the following: all plans must be terminal (either successfully with status COMPLETE or unsuccessfully with FATAL_ERROR) before a new plan is allowed to start. A single plan can be restarted while it is running, so for now we assume all plans to be reentrant. While this might not be true for all plans, we think that it covers the 80/20 case, e.g. when a deploy plan is stuck and must be restarted with less memory per node. If a request is rejected, the Instance admission controller returns an error explaining exactly why the update was denied.
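
To make the rule of thumb concrete, the fragment below sketches an Instance status in which one plan is not yet terminal. The field layout (status.planStatus keyed by plan name), the IN_PROGRESS value, and the backup plan are assumptions for illustration and should be checked against your KUDO version; only COMPLETE and FATAL_ERROR are taken from the text above.

```yaml
# Sketch of an Instance status (assumed field layout)
status:
  planStatus:
    deploy:
      name: deploy
      status: IN_PROGRESS    # assumed value: not terminal
    backup:                  # hypothetical plan
      name: backup
      status: COMPLETE       # terminal
```

In this state, a request to start backup would be expected to be rejected because deploy is not terminal, whereas a request to restart deploy itself would be allowed.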

The admission controller also rejects parameter updates that would trigger multiple distinct plans. There are a few exceptions too: for example, a cleanup plan is special and is executed when an Instance is deleted. The cleanup plan cannot be triggered manually and is allowed to override any existing plan (since the Instance is being deleted anyway).
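
For example, with the triggers from the Parameter triggers section, an update that changes both parameters at once would require both the deploy plan and the canary plan, so the admission controller would be expected to reject it. The sketch assumes that the Instance keeps its parameters under spec.parameters; the apiVersion, instance name, and values are illustrative.

```yaml
# Sketch of an Instance update touching two parameters at once
apiVersion: kudo.dev/v1beta1     # assumed API version
kind: Instance
metadata:
  name: my-instance              # hypothetical instance name
spec:
  parameters:
    REPLICAS: "5"                # changed; trigger: deploy
    APPLICATION_MEMORY: "2Gi"    # changed; trigger: canary
```

Applying the two changes as separate updates, each after the previous plan has reached a terminal state, avoids the conflict.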

As of KUDO v0.11.0, the Instance admission controller is optional, though we're planning to make it mandatory in the near future. See the kudo init documentation for more details.