Manage Kubernetes Cluster with Terraform and Argo CD

In this article, you will learn how to create a Kubernetes cluster with Terraform and then manage it with Argo CD. Terraform is very useful for automating infrastructure, while Argo CD helps us implement GitOps and continuous delivery for our applications. These two tools complement each other well. Let's consider how they can help us work with Kubernetes in the GitOps style.

For a basic introduction to using Argo CD on Kubernetes, you may refer to this article.

Introduction

First of all, I would like to define the whole cluster and store its configuration in Git. I can't use only Argo CD to achieve that, because Argo CD needs an existing Kubernetes cluster to run on. That's why I need a tool that is able to create a cluster and then install Argo CD on it. In that case, Terraform seems like a natural choice. On the other hand, I don't want to use Terraform to manage apps running on Kubernetes. It is perfect for a one-time activity like creating a cluster, but not for continuous tasks like app delivery and configuration management.

Here’s the list of things we are going to do:

  1. In the first step, we will create a local Kubernetes cluster using Terraform
  2. Then we will install OLM (Operator Lifecycle Manager) on the cluster. We need it to install Kafka with Strimzi (Step 5)
  3. We will use Terraform to install Argo CD from the Helm chart and create a single Argo CD Application responsible for the whole cluster configuration based on Git
  4. After that, the Argo CD Application installs the Strimzi operator, creates an Argo CD Project dedicated to the Kafka installation, and an Argo CD Application that runs Kafka on Kubernetes
  5. Finally, that Argo CD Application automatically creates all the custom resources required for running Kafka

The most important thing here is that everything should happen after running a single terraform apply command. Terraform installs Argo CD, and then Argo CD installs Kafka, which is our sample app in this scenario. Let's see how it works.

[Image: terraform-kubernetes-arch]

Source Code

If you would like to try it by yourself, you may always take a look at my source code. In order to do that you need to clone my GitHub repository. After that, you should just follow my instructions. Let’s begin.
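
For orientation, here is roughly how the repository content referenced throughout this article is laid out (simplified; the Terraform code may be split into several .tf files and the exact file names inside the manifests directories may differ):

main.tf                       # Kind cluster, OLM and Argo CD installation
olm/
  crds.yaml                   # OLM CRD definitions
  olm.yaml                    # OLM installation manifests
argocd/
  application.yaml            # Helm values with the initial cluster-config Application
  manifests/
    cluster/                  # cluster-wide config synced by cluster-config (Strimzi Subscription, kafka Project and Application)
    kafka/
      cluster.yaml            # Kafka cluster and topic definition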

1. Create Kubernetes Cluster with Terraform

In order to easily create a Kubernetes cluster, we will use Kind. There is a dedicated Terraform provider for Kind available here. Of course, you can run Kubernetes on any cloud, and you will also find Terraform providers for that.

Our cluster consists of three worker nodes and a single master node. We need three worker nodes because we will eventually install a Kafka cluster running in three instances, each of them deployed on a different node. Here's our Terraform main.tf file for that step. We need to define the latest version of the tehcyx/kind provider (which is 0.0.12 at the time of writing) in the required_providers section. The name of our cluster is cluster-1. We will also enable the wait_for_ready option to proceed to the next steps only after the cluster is ready.

terraform {
  required_providers {
    kind = {
      source = "tehcyx/kind"
      version = "0.0.12"
    }
  }
}

provider "kind" {}

resource "kind_cluster" "default" {
  name = "cluster-1"
  wait_for_ready = true
  kind_config {
    kind = "Cluster"
    api_version = "kind.x-k8s.io/v1alpha4"

    node {
      role = "control-plane"
    }

    node {
      role = "worker"
      image = "kindest/node:v1.23.4"
    }

    node {
      role = "worker"
      image = "kindest/node:v1.23.4"
    }

    node {
      role = "worker"
      image = "kindest/node:v1.23.4"
    }
  }
}

Just to verify the configuration, you can run the terraform init and then terraform plan commands. After that, you could apply the configuration using terraform apply, but as you probably remember, we will do it at the end, once the whole configuration is ready, so that we can apply everything with a single command.
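
Optionally (this is not part of the original configuration), you can expose the connection details of the new cluster as Terraform outputs. The endpoint attribute is used later in this article, while the kubeconfig attribute is an assumption based on the tehcyx/kind provider documentation, so verify it against the provider version you use:

# Optional outputs with the connection details of the Kind cluster
output "cluster_endpoint" {
  description = "API server endpoint of the Kind cluster"
  value       = kind_cluster.default.endpoint
}

# NOTE: assumes the provider exports a kubeconfig attribute - check the provider docs
output "kubeconfig" {
  value     = kind_cluster.default.kubeconfig
  sensitive = true
}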

2. Install OLM on Kubernetes

As I mentioned before, Operator Lifecycle Manager (OLM) is a prerequisite for installing the Strimzi Kafka operator. You can find the latest release of OLM here. In fact, the installation comes down to applying two YAML manifests to Kubernetes. The first of them, crds.yaml, contains the CRD definitions. The second, olm.yaml, provides all the Kubernetes objects required to install OLM. Let's just copy both these files into the olm directory inside our Terraform repository. In order to apply them to Kubernetes, we first need to enable the Terraform kubectl provider.

terraform {
  ...

  required_providers {
    kubectl = {
      source  = "gavinbunney/kubectl"
      version = ">= 1.7.0"
    }
  }
}

Why do we use the kubectl provider instead of the official Terraform Kubernetes provider? The crds.yaml file contains pretty large CRDs that exceed the size limit of a standard client-side apply, since the whole object would have to fit into the last-applied-configuration annotation. We can easily solve that problem by enabling server-side apply on the kubectl provider. The second issue is that both YAML files contain multiple Kubernetes objects. The kubectl provider handles that with the kubectl_file_documents data source combined with the for_each argument.

data "kubectl_file_documents" "crds" {
  content = file("olm/crds.yaml")
}

resource "kubectl_manifest" "crds_apply" {
  for_each  = data.kubectl_file_documents.crds.manifests
  yaml_body = each.value
  wait = true
  server_side_apply = true
}

data "kubectl_file_documents" "olm" {
  content = file("olm/olm.yaml")
}

resource "kubectl_manifest" "olm_apply" {
  depends_on = [data.kubectl_file_documents.crds]
  for_each  = data.kubectl_file_documents.olm.manifests
  yaml_body = each.value
}

Let's consider the last issue in this section. We create a brand-new Kubernetes cluster in the previous step, so we cannot rely on an existing kubeconfig context while applying the YAML. Fortunately, we can configure the kubectl provider with the output attributes of the kind_cluster resource, which expose the Kubernetes API address and auth credentials.

provider "kubectl" {
  host = "${kind_cluster.default.endpoint}"
  cluster_ca_certificate = "${kind_cluster.default.cluster_ca_certificate}"
  client_certificate = "${kind_cluster.default.client_certificate}"
  client_key = "${kind_cluster.default.client_key}"
}

3. Install Argo CD with Helm

This is the last step on the Terraform side. We are going to install Argo CD using its Helm chart. We also need to create a single Argo CD Application responsible for the cluster management. This Application will install the Kafka Strimzi operator and create further Argo CD Applications, e.g. the one used for running the Kafka cluster. In the first step, we need to do the same thing as before: define the provider and set the Kubernetes cluster address. Here's the provider definition in Terraform:

provider "helm" {
  kubernetes {
    host = "${kind_cluster.default.endpoint}"
    cluster_ca_certificate = "${kind_cluster.default.cluster_ca_certificate}"
    client_certificate = "${kind_cluster.default.client_certificate}"
    client_key = "${kind_cluster.default.client_key}"
  }
}

The tricky thing here is that we need to create the Application right after the Argo CD installation. If we declared it as a regular Terraform resource, the run would fail, because Terraform verifies that the required CRDs exist on Kubernetes, and in this case it requires the Application CRD from argoproj.io/v1alpha1, which is not installed yet. Fortunately, we can use a Helm chart parameter that allows us to pass the declaration of additional Applications. In order to do that, we have to provide a custom values.yaml file. Here's the Terraform declaration of the Argo CD installation:

resource "helm_release" "argocd" {
  name  = "argocd"

  repository       = "https://argoproj.github.io/argo-helm"
  chart            = "argo-cd"
  namespace        = "argocd"
  version          = "4.9.7"
  create_namespace = true

  values = [
    file("argocd/application.yaml")
  ]
}

In order to create an initial Application, we need to use the Helm chart's server.additionalApplications parameter as shown below. Here's the whole argocd/application.yaml file. To keep things simple, the configuration used by Argo CD is located in the same Git repository as the Terraform configuration. You can find all the required YAML manifests in the argocd/manifests directory.

server:
  additionalApplications:
   - name: cluster-config
     namespace: argocd
     project: default
     source:
       repoURL: https://github.com/piomin/sample-terraform-kubernetes-argocd.git
       targetRevision: HEAD
       path: argocd/manifests/cluster
       directory:
         recurse: true
     destination:
       server: https://kubernetes.default.svc
     syncPolicy:
       automated:
         prune: false
         selfHeal: false

4. Configure Kubernetes cluster with Argo CD

The last two steps are managed by Argo CD, which means the work on the Terraform side is done. Now, it's time to install our first application on the cluster. Our example app is Kafka, so firstly we need to install the Kafka Strimzi operator. To do that, we just need to define a Subscription object managed by the previously installed OLM. The definition is available in the repository as the strimzi.yaml file.

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: my-strimzi-kafka-operator
  namespace: operators
spec:
  channel: stable
  name: strimzi-kafka-operator
  source: operatorhubio-catalog
  sourceNamespace: olm
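
Once Argo CD applies that Subscription, you can check (a sketch, assuming the default OLM namespaces) that OLM resolved it and installed the operator:

# the Subscription created from Git
$ kubectl get subscriptions -n operators
# the ClusterServiceVersion should reach the Succeeded phase
$ kubectl get csv -n operators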

We could configure a lot of aspects related to the whole cluster here. However, we just need to create a dedicated Argo CD Project and Application for Kafka configuration. Here’s our Project definition:

apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: kafka
  namespace: argocd
spec:
  clusterResourceWhitelist:
    - group: '*'
      kind: '*'
  destinations:
    - name: '*'
      namespace: '*'
      server: '*'
  sourceRepos:
    - '*'

Let's place the kafka Argo CD Application inside the newly created Project:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: kafka
  namespace: argocd
spec:
  destination:
    namespace: kafka
    server: https://kubernetes.default.svc
  project: kafka
  source:
    path: argocd/manifests/kafka
    repoURL: https://github.com/piomin/sample-terraform-kubernetes-argocd.git
    targetRevision: HEAD
  syncPolicy:
    syncOptions:
      - CreateNamespace=true
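
As a quick sanity check (not part of the repository), you can list the Argo CD Applications and Projects once everything is synced:

$ kubectl get applications -n argocd
$ kubectl get appprojects -n argocd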

5. Create Kafka Cluster using GitOps

Finally, the last part of our exercise. We will create and run a 3-node Kafka cluster on Kind. Here's the Kafka object definition we store in Git. We set 3 replicas for both Kafka and ZooKeeper (used by the Kafka cluster). This manifest is available in the repository under the path argocd/manifests/kafka/cluster.yaml. We expose the cluster on ports 9092 (plain) and 9093 (TLS). Each Kafka broker gets persistent storage mounted through a PersistentVolumeClaim.

apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    replicas: 3
    version: 3.2.0
    logging:
      type: inline
      loggers:
        kafka.root.logger.level: "INFO"
    config:
      auto.create.topics.enable: "false"
      offsets.topic.replication.factor: 3
      transaction.state.log.replication.factor: 3
      transaction.state.log.min.isr: 2
      default.replication.factor: 3
      min.insync.replicas: 2
      inter.broker.protocol.version: "3.2"
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
      - name: tls
        port: 9093
        type: internal
        tls: true
    storage:
      type: jbod
      volumes:
        - id: 0
          type: persistent-claim
          size: 30Gi
          deleteClaim: true
  zookeeper:
    replicas: 3
    storage:
      type: persistent-claim
      size: 10Gi
      deleteClaim: true
  entityOperator:
    topicOperator: {}
    userOperator: {}

We will also define a single Kafka Topic inside the argocd/manifests/kafka/cluster.yaml manifest.

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: my-topic
  labels:
    strimzi.io/cluster: my-cluster
spec:
  partitions: 10
  replicas: 3
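
Once Argo CD syncs the kafka Application (we will trigger that at the end of the article), you can verify the Strimzi resources with a few kubectl commands (a sketch, assuming the kafka namespace used above):

# Strimzi custom resources created from Git
$ kubectl get kafka,kafkatopic -n kafka
# Kafka, ZooKeeper and Entity Operator pods
$ kubectl get pods -n kafka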

Execution on Kubernetes

Terraform

We have already prepared all the required configuration. Let's proceed to the execution phase. If you still haven't cloned the Git repository, it's time to do it:

$ git clone https://github.com/piomin/sample-terraform-kubernetes-argocd.git
$ cd sample-terraform-kubernetes-argocd

Firstly, let’s initialize our working directory containing Terraform configuration:

$ terraform init

Once we do it, we may preview a list of actions to perform:

$ terraform plan

You should receive quite a long output in response, ending with a summary of the resources Terraform is going to create.

If everything looks fine and there are no errors, we may proceed to the final step and begin the process:

$ terraform apply

All 24 objects should be successfully applied.

Now, you should have your cluster ready and running. Let’s display a list of Kind clusters:

$ kind get clusters
cluster-1

The name of our cluster is cluster-1, but the name of the Kubernetes context is kind-cluster-1.
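
To make sure kubectl points at the new cluster, you can list the available contexts and switch to it (assuming the provider updated your default kubeconfig, just like the kind CLI does):

$ kubectl config get-contexts
$ kubectl config use-context kind-cluster-1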

Let's display a list of applications deployed on the Kind cluster. You should have at least Argo CD and OLM installed. After some time, Argo CD applies the configuration stored in the Git repository. Then, you should also see the Kafka Strimzi operator installed in the operators namespace.
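
For example, a quick look at the pods in the relevant namespaces should show the Argo CD and OLM components, and later the Strimzi operator:

# Argo CD components
$ kubectl get pods -n argocd
# OLM components
$ kubectl get pods -n olm
# the Strimzi operator shows up here once Argo CD syncs the cluster-config app
$ kubectl get pods -n operators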

[Image: terraform-kubernetes-apps]

Argo CD

After that, we can go to the Argo CD web console. To access it easily on a local port, let's enable port forwarding:

$ kubectl port-forward service/argocd-server 8443:443 -n argocd

Now you can open the Argo CD web console at https://localhost:8443. The default username is admin. The password is auto-generated by Argo CD. You can find it inside the argocd-initial-admin-secret Kubernetes Secret (on Linux, use base64 -d instead of the macOS base64 -D in the command below).

$ kubectl get secret argocd-initial-admin-secret -n argocd --template={{.data.password}} | base64 -D

Here's the list of our Argo CD Applications. The cluster-config Application has the auto-sync option enabled. It installs the Strimzi operator and creates the kafka Argo CD Application. I could also enable auto-sync for the kafka Application, but just for demo purposes I left it with manual approval. So, let's run Kafka on our cluster. To do that, click the Sync button on the kafka tile.

[Image: terraform-kubernetes-argocd]
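
If you prefer the CLI over the web console, you can trigger the same manual sync with the argocd CLI (a sketch, assuming it is installed and the port-forward started above is still running):

$ argocd login localhost:8443 --username admin --insecure
$ argocd app sync kafka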

Once you do that, the Kafka installation starts. Finally, you should have the whole cluster up and running, with each Kafka and ZooKeeper node running on a different Kubernetes worker node.
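
You can verify the pod placement with the wide output, which prints the worker node assigned to each pod:

$ kubectl get pods -n kafka -o wide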

That's all. We created everything using a single Terraform command and one click in the Argo CD web console. Of course, we could also enable auto-sync for the kafka Application, so we wouldn't even need to log in to the Argo CD web console to get the final effect.

23 COMMENTS

Singou:
great job

    piotr.minkowski:
    Thanks!

Tanko Filibus:
Great content, thank you.

Daniel:
Excellent content. I had problems with the creation of the clusters on a MacBook M1. I ended up trying everything on another computer with an Intel chip.

    piotr.minkowski:
    Yes, that may be an issue. I still have the one with an Intel chip.

      Gabby:
      Having an issue creating the first cluster on an M1 macOS machine. Did you find a solution for it, or do I need to use another machine?

      kind_cluster.default: Creating…

      │ Error: command "docker run --name cluster-1-control-plane --hostname cluster-1-control-plane --label io.x-k8s.kind.role=control-plane --privileged --security-opt seccomp=unconfined --security-opt apparmor=unconfined --tmpfs /tmp --tmpfs /run --volume /var --volume /lib/modules:/lib/modules:ro -e KIND_EXPERIMENTAL_CONTAINERD_SNAPSHOTTER --detach --tty --label io.x-k8s.kind.cluster=cluster-1 --net kind --restart=on-failure:1 --init=false --cgroupns=private --publish=0.0.0.0:80:80/TCP --publish=0.0.0.0:444:444/TCP --publish=127.0.0.1:64044:6443/TCP -e KUBECONFIG=/etc/kubernetes/admin.conf kindest/node:v1.29.2@sha256:51a1434a5397193442f0be2a297b488b6c919ce8a3931be0ce822606ea5ca245" failed with error: exit status 125

      │ with kind_cluster.default,
      │ on main.tf line 16, in resource "kind_cluster" "default":
      │ 16: resource "kind_cluster" "default" {

      Other than that, great article 🙂

        piotr.minkowski:
        Hi, I don't have a Mac with M1… Do you use Docker Desktop for that? Did you also try to create the kind cluster using the `kind` CLI (kind create cluster …)?

Segun Adebayo:
I always enjoy reading your articles. Thanks for this content.

    piotr.minkowski:
    Thanks!

Marco:
Hi. Thanks for your article. Please notice that server.additionalApplications was deprecated in the Argo CD Helm chart 5.0, see:
https://github.com/argoproj/argo-helm/tree/main/charts/argo-cd#500

    piotr.minkowski:
    Yes. Now there is another Helm chart just for adding apps and projects.

giaverma:
That is really attention-grabbing. You're a very professional blogger. I've joined your feed and look forward to more of your wonderful posts.

    piotr.minkowski:
    Thanks!

Dan Voyce:
How do you manage the cluster and Helm resources in this case? There is a longstanding issue where using the Kubernetes and Helm providers in the same state causes eventual disconnection of the K8s cluster (see https://itnext.io/terraform-dont-use-kubernetes-provider-with-your-cluster-resource-d8ec5319d14a for information).
It works… until it doesn't, and there is no recovery from it.

    piotr.minkowski:
    I'm using Terraform to create a cluster and install Argo CD on it. Then I'm managing the cluster continuously with Argo CD, following the GitOps approach.

Alex:
Getting this issue:

│ Error: olm/catalog-operator failed to fetch resource from kubernetes: context deadline exceeded
│ with kubectl_manifest.olm_apply["/apis/apps/v1/namespaces/olm/deployments/catalog-operator"],
│ on main.tf line 66, in resource "kubectl_manifest" "olm_apply":
│ 66: resource "kubectl_manifest" "olm_apply" {

│ Error: olm/olm-operator failed to fetch resource from kubernetes: context deadline exceeded
│ with kubectl_manifest.olm_apply["/apis/apps/v1/namespaces/olm/deployments/olm-operator"],
│ on main.tf line 66, in resource "kubectl_manifest" "olm_apply":
│ 66: resource "kubectl_manifest" "olm_apply" {

When I went into the pod logs for OLM, this is what it says:
Failed to pull image "quay.io/operator-framework/olm@sha256:39081bf0c4a9a167a5244bcc1bceda0a0b92be340776d498e99a51c544cf53ca": rpc error: code = Unknown desc = failed to pull and unpack image "quay.io/operator-framework/olm@sha256:39081bf0c4a9a167a5244bcc1bceda0a0b92be340776d498e99a51c544cf53ca": failed to resolve reference "quay.io/operator-framework/olm@sha256:39081bf0c4a9a167a5244bcc1bceda0a0b92be340776d498e99a51c544cf53ca": failed to do request: Head "https://quay.io/v2/operator-framework/olm/manifests/sha256:39081bf0c4a9a167a5244bcc1bceda0a0b92be340776d498e99a51c544cf53ca": x509: certificate signed by unknown authority
Warning Failed 11m (x4 over 12m) kubelet Error: ErrImagePull
Warning Failed 10m (x6 over 12m) kubelet Error: ImagePullBackOff

Any thoughts?

    piotr.minkowski:
    Maybe these are just temporary problems with quay.io. Did you try once again?

Alan:
I'm failing to pull the image, as I'm getting this error:
Failed to pull image "quay.io/argoproj/argocd:v2.4.2": rpc error: code = Unknown desc = failed to pull and unpack image "quay.io/argoproj/argocd:v2.4.2": failed to resolve reference "quay.io/argoproj/argocd:v2.4.2": failed to do request: Head "https://quay.io/v2/argoproj/argocd/manifests/v2.4.2": x509: certificate signed by unknown authority

Any thoughts?

    piotr.minkowski:
    Hi. No, I don't have any problems with that.

Tometchy:
Awesome article, I was looking for such a demo, thank you! 🙂

    piotr.minkowski:
    You are welcome 🙂

Simon:
Thank you very much for the step-by-step content. However, I have a question: where is the Kubernetes cluster going to be installed, on-prem or in the cloud? Also, I would have appreciated it if you had specified the .tf file name in each section of the code, so that it would be easier for a Terraform newbie like myself. Thank you.

    piotr.minkowski:
    In this particular case, we are creating a local Kubernetes cluster on Docker (with Kind).
