Kubernetes Cluster Creation in VMware Cloud on AWS with CAPV: Part 1

One of the biggest challenges in starting a Cloud Native practice is establishing a repeatable, consistent method of deploying and managing Kubernetes clusters. That’s where ClusterAPI comes in handy! ClusterAPI (CAPI) is a Kubernetes project to bring declarative, Kubernetes-style APIs to cluster creation, configuration, and management. It provides optional, additive functionality on top of core Kubernetes to manage the lifecycle of a Kubernetes cluster. Now you can use Kubernetes to create more Kubernetes!

ClusterAPI is responsible for provisioning all of the infrastructure required to support a Kubernetes cluster. CAPI also provides the ability to perform Day 2 operations, such as scaling and upgrading clusters. Most importantly, it provides a consistent management plane for performing these actions across multiple clusters. In fact, ClusterAPI is a big part of what will allow VI admins to orchestrate and automate the provisioning of Kubernetes clusters natively as a part of vSphere with Project Pacific. Learn more about the Project Pacific architecture and how it utilizes ClusterAPI here.

ClusterAPI Provider vSphere (CAPV)

The ClusterAPI special interest group has helped foster and sponsor implementations of CAPI for specific infrastructure providers. That’s where ClusterAPI Provider vSphere (CAPV) comes in! CAPV is an implementation of ClusterAPI that enables ClusterAPI to deploy Kubernetes clusters to vSphere environments. In Part 1 of this series, I’m going to walk through the process of preparing my VMC environment to support Kubernetes cluster creation via CAPV. I’m also going to detail the steps required to provision the control plane (bootstrap and management clusters) of my CAPV environment.

Environment and Terminology

The environment I am utilizing in this post consists of a couple of different components. First, I have a CentOS jumpbox that is deployed in my on-premises VMware environment. This jumpbox will house what is called the “bootstrap cluster” in CAPI terms. A bootstrap cluster is a temporary cluster that is used to provision a management cluster. In the case of CAPV, we are going to use a KinD (Kubernetes in Docker) cluster, deployed on the jumpbox, as the bootstrap cluster. KinD is a great tool for deploying Kubernetes clusters on a single machine, like your local workstation! Learn more about KinD here.

As mentioned above, the bootstrap cluster is responsible for provisioning the “management cluster.” The management cluster is the cluster where information about one or more Infrastructure Providers (in this case, my VMC lab) is stored. The management cluster also stores information about the different components of workload clusters, such as machines, control planes, bootstrap configuration, etc. The management cluster is responsible for provisioning any number of workload clusters; it is the brains of the operation.

Workload clusters are conformant Kubernetes clusters that our developers’ applications will be deployed to. The high-level workflow is that developers use kubectl to submit .yaml files defining the spec of a workload cluster to the management cluster, and the management cluster handles the creation of all of the resources required to build that workload cluster. Later in this series, I will use the management cluster to provision a 5-node workload cluster to support my applications.

As you may notice in the diagram, I am going to be utilizing our team’s VMware Cloud on AWS SDDC lab environment to support my cluster deployments. In the next section, I’ll go over some prereqs required to prepare the VMC environment for Kubernetes cluster creation via CAPV.


In order to use CAPV to deploy clusters to VMC, I needed to complete a couple of prereqs in the VMC SDDC. First off, I needed to create a network segment that my Kubernetes nodes would utilize to communicate with each other and reach out to the internet to pull OS packages and Docker images from public repos. For information on creating segments in VMC, please refer to the product documentation. I created a routed segment (named sddc-k8-jomann) with a DHCP range to provide IP addresses to my Kubernetes cluster nodes:

After creating the segment, I also needed to create a compute firewall rule that applies to the segment to allow ingress to my cluster nodes. Since our VMC SDDC is hosted behind a VPN, I decided to allow all traffic to the sddc-k8-jomann segment for simplicity’s sake. This will allow users behind the VPN to access their Kubernetes clusters:

Finally, CAPV will deploy clusters that utilize the vSphere CSI Driver in conjunction with the Kubernetes vSphere Cloud Provider to allow developers to dynamically provision persistent storage for their workloads running in the Kubernetes clusters. In order for the vSphere Cloud Provider to be configured during deployment, the Kubernetes nodes need to be able to communicate with vCenter to apply the required configuration. This means I’ll need to create a Management Gateway Firewall Rule to allow communication between my sddc-k8-jomann segment and vCenter on port 443:

Last but not least, I need to load an OVA template into the VMC environment that CAPV will use to build out my Kubernetes nodes. I will be utilizing a CentOS 7 image preloaded with Kubernetes 1.16.3. You can find a list of available images here.

Now that I’ve covered the VMC prereqs, let’s talk about the jumpbox. If you’d like to use this post as a guide (along with the Getting Started Guide put together by the CAPV team), you’ll need to ensure the following tools are installed and configured on the jumpbox:

clusterctl is a tool that CAPV utilizes to automate the creation of the bootstrap and management clusters. It is not required but makes the process of instantiating the management plane of CAPV a lot easier.

Docker is utilized by KinD to create the bootstrap cluster. I’ll also use a “manifests” CAPV Docker image to automate the creation of all the manifests I’ll need to create my clusters.

Finally, kubectl is the Kubernetes command-line tool that allows me to run commands against my Kubernetes clusters. clusterctl will also utilize kubectl during the creation of the bootstrap/management clusters.
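Before moving on, it’s worth confirming that all of these tools are actually on the PATH. A minimal sketch of such a check (check_tools is a helper I wrote for this post, not part of any of these projects):

```shell
# A small helper (my own sketch, not part of CAPV) to confirm the
# prerequisite tooling is available on the PATH before starting.
check_tools() {
  local tool
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool: found"
    else
      echo "$tool: MISSING"
    fi
  done
}

# The tools this walkthrough relies on:
check_tools docker kubectl clusterctl kind
```

Any line reporting MISSING is worth fixing before attempting the clusterctl run later in this post.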

Creating the Bootstrap and Management Clusters

The first thing I’ll need to do is create the management cluster. CAPV provides a “manifests” Docker image that I can use to automatically generate the .yaml manifests that clusterctl will use to create my KinD bootstrap cluster. I’ll also provide an envvars.txt file that contains information about the VMware Cloud on AWS environment that I’ll be deploying the clusters to. See example output below:

# cat envvars.txt

# vCenter config/credentials
export VSPHERE_SERVER='vmc.demolab.com'
export VSPHERE_USERNAME='cloudadmin@vmc.local'
export VSPHERE_PASSWORD='MyPassword!'

# vSphere deployment configs
export VSPHERE_DATASTORE='WorkloadDatastore'
export VSPHERE_NETWORK='sddc-k8-jomann'
export VSPHERE_RESOURCE_POOL='/SDDC-Datacenter/host/Cluster-1/Resources/Compute-ResourcePool/mannimal-k8s'
export VSPHERE_FOLDER='/SDDC-Datacenter/vm/Workloads/mannimal-k8s'
export VSPHERE_TEMPLATE='centos-7-kube-v1.16.3-temp'
export SSH_AUTHORIZED_KEY='<ssh-pub-key>'

# Kubernetes configs
export KUBERNETES_VERSION='1.16.3'

As you can see above, this is where I’ll define things like the datastore, network, resource pool, and folder that the VMs will be deployed to. I also defined a public SSH key that will be loaded onto the VMs that are created in case I need to troubleshoot deployments at the OS level. Finally, I defined the Kubernetes version (1.16.3) that I’d like to be utilized in my deployments, both for management and workload clusters. There are additional optional variables that can be defined in the envvars.txt file, such as VM configuration (memory, CPU, storage) and additional Kubernetes cluster configs. For a full list of those optional values, refer to the CAPV Quick Start Guide.
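Since an empty or misspelled variable is an easy mistake to make, it can save a failed run to confirm that every variable the file is supposed to export is actually set. A minimal sketch (check_envvars is a helper I wrote for this post; the variable names are the ones from my envvars.txt above):

```shell
# Sketch: confirm every required envvars.txt variable is non-empty
# before handing the file to the manifests image.
check_envvars() {
  local file="$1"; shift
  # shellcheck source=/dev/null
  source "$file"
  local var val
  for var in "$@"; do
    # indirect lookup of the variable's current value
    val=$(eval "printf '%s' \"\${$var}\"")
    if [ -z "$val" ]; then
      echo "missing: $var"
    else
      echo "ok: $var"
    fi
  done
}

# Example (assumes envvars.txt is in the current directory):
#   check_envvars ./envvars.txt VSPHERE_SERVER VSPHERE_USERNAME VSPHERE_PASSWORD \
#     VSPHERE_DATASTORE VSPHERE_NETWORK VSPHERE_RESOURCE_POOL \
#     VSPHERE_FOLDER VSPHERE_TEMPLATE KUBERNETES_VERSION
```

Any "missing" line means the manifests image would generate incomplete manifests, so it’s worth catching here.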

A Shoutout to govc

govc is a vSphere CLI tool that is designed as an alternative to the vSphere Web UI. I’ve found govc very useful in confirming the “locations” of the vSphere resources I’ll need to define in the envvars.txt file. From my experience, most of the issues I’ve had with CAPV deployments stem from incorrect values in the envvars.txt file.

I recommend installing and configuring govc and using it to confirm the values utilized in the vSphere Deployment Configs section of the envvars.txt. In my example, I created and sourced the following govc-creds.sh file to ensure govc knows how to reach my VMC environment:

# cat govc-creds.sh

# vCenter host
export GOVC_URL=vmc.demolab.com
# vCenter credentials
export GOVC_USERNAME=cloudadmin@vmc.local
export GOVC_PASSWORD=MyPassword!
# disable cert validation
export GOVC_INSECURE=true

# source govc-creds.sh

Now I can use govc to verify my vSphere config variables. For example, to confirm the VSPHERE_RESOURCE_POOL and VSPHERE_FOLDER variables:

# govc pool.info mannimal-k8s
Name:               mannimal-k8s
  Path:             /SDDC-Datacenter/host/Cluster-1/Resources/Compute-ResourcePool/mannimal-k8s

# govc folder.info mannimal-k8s
Name:        mannimal-k8s
  Path:      /SDDC-Datacenter/vm/Workloads/mannimal-k8s
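As a small convenience (my own sketch, not a govc feature), the Path: values can be pulled out of that output programmatically, so they can land in envvars.txt without copy/paste mistakes:

```shell
# Sketch: extract the "Path:" value from govc pool.info / folder.info
# output so it can be pasted into envvars.txt verbatim.
extract_path() {
  awk -F': *' '/^[[:space:]]*Path:/ { gsub(/^[[:space:]]+/, "", $2); print $2; exit }'
}

# Usage against a live environment (after sourcing govc-creds.sh):
#   govc pool.info mannimal-k8s | extract_path
#   govc folder.info mannimal-k8s | extract_path
```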

I utilized the Path: values from the govc output in my envvars.txt variables to ensure the bootstrap cluster can locate all of the required vSphere resources when provisioning the management cluster. Ok, now back to the fun stuff…

Creating the Management Cluster Manifests

Now that I’ve verified my vSphere config variables with govc, I’m ready to use the following command, which utilizes version v0.5.4 of the “manifests” image, to create the .yaml manifests for the CAPV management cluster. I’ll also use the -c flag to set the cluster name to management-cluster:

# docker run --rm \
  -v "$(pwd)":/out \
  -v "$(pwd)/envvars.txt":/envvars.txt:ro \
  gcr.io/cluster-api-provider-vsphere/release/manifests:v0.5.4 \
  -c management-cluster

Generated ./out/management-cluster/cluster.yaml
Generated ./out/management-cluster/controlplane.yaml
Generated ./out/management-cluster/machinedeployment.yaml
Generated /build/examples/provider-components/provider-components-cluster-api.yaml
Generated /build/examples/provider-components/provider-components-kubeadm.yaml
Generated /build/examples/provider-components/provider-components-vsphere.yaml
Generated ./out/management-cluster/provider-components.yaml
WARNING: ./out/management-cluster/provider-components.yaml includes vSphere credentials

Notice the output of the docker run command gives me the location of various .yaml files that define the configuration of my management cluster. The clusterctl tool will utilize these .yaml files to create the KinD bootstrap cluster as well as the CAPV management cluster running on a VM in my VMC environment.

Now that I’ve got my bootstrap/management cluster scaffolding, I’m ready to use clusterctl to create my bootstrap cluster, which will in turn provision the VM in VMC that will serve as my management cluster. clusterctl will then “pivot” the CAPV management stack from the bootstrap KinD cluster to the CAPV management cluster running in VMC. I’ll use the following clusterctl command, complete with the .yaml files generated by the manifests Docker image, to kick off this process:

clusterctl create cluster \
  --bootstrap-type kind \
  --bootstrap-flags name=management-cluster \
  --cluster ./out/management-cluster/cluster.yaml \
  --machines ./out/management-cluster/controlplane.yaml \
  --provider-components ./out/management-cluster/provider-components.yaml \
  --addon-components ./out/management-cluster/addons.yaml \
  --kubeconfig-out ./out/management-cluster/kubeconfig

Let’s go step by step and examine the output of the clusterctl command:

Creating the Bootstrap Cluster

26007 createbootstrapcluster.go:27] Preparing bootstrap cluster
26007 clusterdeployer.go:82] Applying Cluster API stack to bootstrap cluster
26007 applyclusterapicomponents.go:26] Applying Cluster API Provider Components

The first thing clusterctl does is provision a KinD Kubernetes cluster on the jumpbox server that will serve as the bootstrap cluster for CAPV. Then, clusterctl applies the CAPV components to the Kubernetes cluster and ensures the Provider Components, which contain the VMC environment info, are available to the cluster as well.

Creating the Infrastructure for the CAPV Management Cluster

clusterdeployer.go:87] Provisioning target cluster via bootstrap cluster
26007 applycluster.go:42] Creating Cluster referenced object "infrastructure.cluster.x-k8s.io/v1alpha2, Kind=VSphereCluster" with name "management-cluster" in namespace "default"
26007 applycluster.go:48] Creating cluster object management-cluster in namespace "default"
26007 clusterdeployer.go:96] Creating control plane machine "management-cluster-controlplane-0" in namespace "default"
26007 applymachines.go:40] Creating Machine referenced object "bootstrap.cluster.x-k8s.io/v1alpha2, Kind=KubeadmConfig" with name "management-cluster-controlplane-0" in namespace "default"
26007 applymachines.go:46] Creating machines in namespace "default"

At this point, the bootstrap cluster reaches out to the VMC environment and provisions a VM that will eventually serve as the CAPV management cluster. From the output above, note the bootstrap cluster is creating various objects, including the management-cluster-controlplane-0 machine, as well as instantiating that machine as a Kubernetes cluster using the KubeadmConfig created from the “manifests” Docker image.

If I navigate over to my VMC console, I can observe the VM is created in the resource pool defined in the envvars.txt file referenced earlier in the post:

“Pivoting” the Management Stack

26007 clusterdeployer.go:123] Pivoting Cluster API stack to target cluster
26007 pivot.go:76] Applying Cluster API Provider Components to Target Cluster
26007 pivot.go:81] Pivoting Cluster API objects from bootstrap to target cluster
26007 clusterdeployer.go:128] Saving provider components to the target cluster

Now the fun begins! After creating the VM and instantiating it as a Kubernetes cluster, the bootstrap cluster “pivots” the CAPV management stack over to the newly created management cluster. This ensures that the management cluster has the necessary Provider config to support the creation of workload clusters going forward.

Cleaning Up

26007 clusterdeployer.go:164] Done provisioning cluster. You can now access your cluster with kubectl --kubeconfig ./out/management-cluster/kubeconfig
26007 createbootstrapcluster.go:36] Cleaning up bootstrap cluster.

Now that the management cluster has been created in VMC, clusterctl outputs the location of the kubeconfig file that I’ll use to interact with the management cluster and then deletes the KinD bootstrap cluster. From this point forward, I will use the CAPV management cluster in VMC to create additional workload clusters. To make sure that is the case, I’m going to set the KUBECONFIG environment variable to the kubeconfig file of the management cluster I just created:

# export KUBECONFIG="$(pwd)/out/management-cluster/kubeconfig"

Now, when I use kubectl I am interacting directly with my CAPV management cluster deployed in VMC.
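Before going further, a quick sanity check (a sketch, using the paths from the clusterctl output above) confirms the exported kubeconfig actually exists before pointing kubectl at it:

```shell
# Sketch: verify the KUBECONFIG export points at a real file before
# running further kubectl commands (path from the clusterctl output above).
export KUBECONFIG="$(pwd)/out/management-cluster/kubeconfig"
if [ -f "$KUBECONFIG" ]; then
  echo "kubeconfig found: $KUBECONFIG"
  # With a live management cluster, this lists the control plane node:
  command -v kubectl >/dev/null 2>&1 && kubectl get nodes || true
else
  echo "kubeconfig not found: $KUBECONFIG"
fi
```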


If you’re lucky, clusterctl will work without a hitch and you’ll have your bootstrap and management clusters provisioned on your first try! If you’re like me, things may not go as planned on the first couple of runs… If KinD and Docker are installed and configured correctly, clusterctl should have no issue moving through the Creating the Bootstrap Cluster steps referenced above.

Generally problems occur when the bootstrap cluster is trying to provision the management cluster in the target environment. I’ve found the best way to troubleshoot the process is to view the logs of the capv-system pods on the bootstrap cluster. Normally, if there is a problem during deployment of the management cluster, you’ll see the clusterctl output hang at the following step:

26007 applymachines.go:46] Creating machines in namespace "default"

When a KinD cluster is created, the kubeconfig file is stored in the default location kubectl looks for a config file (${HOME}/.kube/config), unless the $KUBECONFIG environment variable has been set. If no $KUBECONFIG environment variable is set, you can run the following command on the jumpbox server in another terminal to follow the capv-system pod’s logs:

kubectl logs -n capv-system $(kubectl -n capv-system get po -o jsonpath='{.items..metadata.name}') -f

For example, in an earlier deployment, there was a typo in my VSPHERE_RESOURCE_POOL variable that I was able to confirm by viewing the following error message in the capv-system logs:

E1217 21:07:41.348178       1 controller.go:218] controller-runtime/controller "msg"="Reconciler error" 
"error"="failed to reconcile VM: unable to get resource pool for \"default/management-cluster/management-cluster-controlplane-0\": resource pool 'Cluster-1/mannimal-k8s' not found"  
"controller"="vspheremachine" "request"={"Namespace":"default","Name":"management-cluster-controlplane-0"}

As you may notice from my error message, the bootstrap cluster is looking for a vSphere resource pool at Cluster-1/mannimal-k8s and is unable to find it. Using govc, I was able to confirm the full path of the resource pool and correct the VSPHERE_RESOURCE_POOL variable in my envvars.txt file. For additional troubleshooting tips, please refer to the troubleshooting guide in the CAPV documentation.


This concludes Part 1 of my post on automating the deployment of Kubernetes clusters to VMware Cloud on AWS with ClusterAPI Provider vSphere. In this post, I walked through the various steps required to prepare the VMC environment to support cluster creation via CAPV and then stepped through the process of deploying the bootstrap and management clusters with clusterctl.

Join me in Part 2 of my post where I’ll utilize the management cluster to create a workload cluster that I can use to provision my applications!!
