# KUDO Dependencies

One of the most requested features, both internally and externally, has been operator dependencies. We've experienced this pain ourselves: Kafka was one of the first KUDO operators, and it requires the user to manually install Zookeeper as a prerequisite. This sounds simple enough, but it can quickly get out of hand with multiple dependencies, as the Flink demo shows.

We've been thinking about the best way to introduce dependencies to KUDO for a while now. A few months ago we created KEP-29, which aims to solve part of the problem. Dependencies are a complex topic, and KEP-29 does not try to boil the dependency ocean. It limits itself to installation dependencies only, i.e. your operator instance and all its dependencies (including transitive ones) will be installed and/or removed as one unit.

# KudoOperator Task

KUDO operators already have a mechanism for dealing with installation dependencies: plans, phases, and steps with a serial or parallel execution strategy. This mechanism is powerful enough to express any dependency hierarchy, including transitive dependencies. As of KUDO 0.15.x, KUDO supports a KudoOperator task which allows you to specify a dependency on another operator. At the very least it requires the name of the operator to be installed. You can also provide an operator version (defaults to the latest one).

Here is a simple example of a task specifying a dependency on the community Zookeeper operator version 0.3.0 (which will install Zookeeper 3.4.14):

```yaml
tasks:
  - name: deploy-zookeeper
    kind: KudoOperator
    spec:
      package: zookeeper
      operatorVersion: 0.3.0
```

This task can then be used as part of a plan, e.g. deploy, like any other task:

```yaml
plans:
  deploy:
    strategy: serial
    phases:
      - name: prereqs
        strategy: parallel
        steps:
          - name: first
            tasks:
              - deploy-zookeeper
      - name: main
        strategy: parallel
        steps:
          - name: second
            tasks:
              - deploy-main
```

As with other tasks, KUDO will make sure this task is healthy before moving on to the next one. "Healthy", in the case of an operator, means that its deploy plan has finished successfully. Note that any transitive dependencies of the Zookeeper operator itself will also be resolved and installed in the right order. KudoOperator closely mimics the semantics of the kubectl kudo install command. It additionally allows you to specify the appVersion (defaults to the most recent one) and the instanceName (generated by KUDO by default).
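To make the optional fields concrete, here is a sketch of a tasks section that pins both versions of the dependency and gives its instance a fixed name. The instance name zk-for-parent and the deployment.yaml resource are hypothetical, and deploy-main is assumed to be an ordinary Apply task, since its definition is not shown in the plan above.

```yaml
tasks:
  # Dependency on another operator, with the optional fields spelled out.
  - name: deploy-zookeeper
    kind: KudoOperator
    spec:
      package: zookeeper           # required: name of the operator package
      operatorVersion: 0.3.0       # optional: defaults to the latest version
      appVersion: 3.4.14           # optional: defaults to the most recent one
      instanceName: zk-for-parent  # optional, hypothetical name; generated by KUDO if omitted
  # Assumed definition of the deploy-main task referenced in the deploy plan.
  - name: deploy-main
    kind: Apply
    spec:
      resources:
        - deployment.yaml          # hypothetical template in the operator's templates folder
```

Pinning the versions trades automatic updates of the dependency for a reproducible installation, and a fixed instance name may be handy when other resources need to refer to the dependency by a predictable name.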

# Dependency Parametrization

Overall, we want to encourage operator composition by providing a way to encapsulate operators. In other words, operator users should not be allowed to arbitrarily modify the parameters of embedded operator instances. Instead, the higher-level operator should define all parameters that its direct dependency operators need. If a child operator needs to be parametrized, a parameter file can be specified using the parameterFile field.

Let's take a look at an example. Suppose we have two operators, parent and child, and child has a required parameter USERNAME which is empty by default. First, the parent operator needs to specify a parameter file for the child and reference it in the corresponding KudoOperator task:

```yaml
tasks:
  - name: deploy-child
    kind: KudoOperator
    spec:
      package: child-operator
      parameterFile: child-params.yaml
```

child-params.yaml is located in the parent's templates folder along with the other template files:

```yaml
# parent/templates/child-params.yaml
USERNAME: {{ .Params.CHILD_USERNAME }}
```

The child's USERNAME value references the parent's CHILD_USERNAME parameter, which can look like this:

```yaml
# parent/params.yaml
apiVersion: kudo.dev/v1beta1
parameters:
  - name: CHILD_USERNAME
    displayName: "child username"
    description: "username for the underlying instance of child operator"
    required: true
```
When installing the parent operator, the user then has to define CHILD_USERNAME as usual:

```bash
$ kubectl kudo install parent -p CHILD_USERNAME=secret
```
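For illustration, with the command above the child-params.yaml template would be expected to render to something like the following before it is handed to the child instance. This is a sketch of the result, assuming the parameter file is rendered with the parent's parameter values in the same way as other templates. It is not captured output.

```yaml
# parent/templates/child-params.yaml after rendering (illustrative)
USERNAME: secret
```

The child instance then receives USERNAME=secret without the user ever addressing the child operator directly.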

Note that the parent operator may decide to provide a sensible default, or even to hardcode the username and not expose it to the end user at all.
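Both options could look roughly like this. The value kudo-user is made up for the example. The first file gives CHILD_USERNAME a default so the user may omit it at install time:

```yaml
# parent/params.yaml (option 1: expose the parameter, but with a default)
apiVersion: kudo.dev/v1beta1
parameters:
  - name: CHILD_USERNAME
    displayName: "child username"
    description: "username for the underlying instance of child operator"
    default: "kudo-user"   # hypothetical default value
```

The alternative hardcodes the value in the parameter file, so no parent parameter is needed at all:

```yaml
# parent/templates/child-params.yaml (option 2: hardcoded, nothing exposed)
USERNAME: kudo-user        # hypothetical fixed value
```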

# Summary

Obviously, there is more to dependencies than the right installation order. One area that KEP-29 deliberately avoids is life-cycle dependencies. It is easy to imagine a situation where a new operator (e.g. Kafka) may want to depend on an existing Zookeeper instance. However, such life-cycle dependencies present major challenges. What happens when Zookeeper is removed? What happens when Zookeeper is upgraded and the new version is incompatible with the current Kafka instance? How can we ensure compatibility?

We want to gather feedback before we address these issues, so ping us if you have any ideas. For more information and implementation details, take a look at KEP-29, the updated Flink demo, or the dependencies documentation.

# About the author

Alex is a Staff Engineer at D2iQ and he solves problems. Sometimes this involves writing code. Find Alex on GitHub.