In this post, I’m going to walk through the process of installing and using Velero v1.1 to back up a Kubernetes application that includes persistent data stored in persistentvolumes. I will then simulate a DR scenario by completely deleting the application and using Velero to restore the application to the cluster, including the persistent data.
Meet Velero!! ⛵
Velero is a backup and recovery solution built specifically to assist in the backup (and migration) of Kubernetes applications, including their persistent storage volumes. You can even use Velero to back up an entire Kubernetes cluster for restore and/or migration! Velero addresses various use cases, including but not limited to:
- Taking backups of your cluster to allow for restore in case of infrastructure loss/corruption
- Migration of cluster resources to other clusters
- Replication of production cluster/applications to dev and test clusters
Velero is essentially composed of two components:
- A server that runs as a set of resources within your Kubernetes cluster
- A command-line client that runs locally
Velero also supports backup and restore of Kubernetes volumes using restic, an open source backup tool. Velero needs an S3 API-compatible storage server to store these volumes. To satisfy this requirement, I will also deploy a Minio server in my Kubernetes cluster so Velero is able to store my Kubernetes volume backups. Minio is a lightweight, easy-to-deploy S3 object store that you can run on premises. In a production environment, you’d want to deploy your S3-compatible storage solution in another cluster or environment to protect against total data loss in case of infrastructure failure.
Environment Overview
To set the stage, here is a quick overview of the infrastructure I am using in my lab environment:
- VMware vCenter Server Appliance 6.7u2
- VMware ESXi 6.7u2
- VMware NSX-T Datacenter 2.5.0
- VMware Enterprise PKS 1.5.0
Enterprise PKS handles the Day 1 and Day 2 operational requirements for deploying and managing my Kubernetes clusters. Click here for additional information on VMware Enterprise PKS.
However, I do want to mention that Velero can be installed and configured to interact with ANY Kubernetes cluster of version 1.7 or later (1.10 or later for restic support).
Installing Minio
First, I’ll deploy all of the components required to support the Velero service, starting with Minio.
First things first, I’ll create the velero namespace to house the Velero installation in the cluster:
$ kubectl create namespace velero
I also decided to create a dedicated storageclass for the Minio service to use for its persistent storage. In Enterprise PKS Kubernetes clusters, you can configure the vSphere Cloud Provider plugin to dynamically create VMDKs in your vSphere environment to support persistentvolumes whenever a persistentvolumeclaim is created in the Kubernetes cluster. Click here for more information on the vSphere Cloud Provider plugin:
$ cat minio-storage-class.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: minio-disk
provisioner: kubernetes.io/vsphere-volume
parameters:
  diskformat: thin
$ kubectl create -f minio-storage-class.yaml
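As a quick sanity check (my own addition, not part of the original walkthrough), I can confirm the storage class exists before moving on:
$ kubectl get storageclass minio-disk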
Now that we have a storage class, I’m ready to create the persistentvolumeclaim that the Minio service will use to store the volume backups via restic. As you can see from the example .yaml file below, the previously created storageclass is referenced to ensure the persistentvolume is provisioned dynamically:
$ cat minio-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: velero-claim
  namespace: velero
  annotations:
    volume.beta.kubernetes.io/storage-class: minio-disk
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
$ kubectl create -f minio-pvc.yaml
Verify the persistentvolumeclaim was created and its status is Bound:
$ kubectl get pvc -n velero
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
minio-claim Bound pvc-cc7ac855-e5f0-11e9-b7eb-00505697e7e7 6Gi RWO minio-disk 8s
Now that I’ve created the storage to support the Minio deployment, I am ready to deploy Minio itself. Click here for access to the full .yaml file for the Minio deployment:
$ kubectl create -f minio-deploy.yaml
deployment.apps/minio created
service/minio created
secret/cloud-credentials created
job.batch/minio-setup created
ingress.extensions/velero-minio created
Use kubectl to wait for the minio-xxxx pod to enter the Running status:
$ kubectl get pods -n velero -w
NAME READY STATUS RESTARTS AGE
minio-754667444-zc2t2 0/1 ContainerCreating 0 4s
minio-setup-skbs6 1/1 Running 0 4s
NAME READY STATUS RESTARTS AGE
minio-754667444-zc2t2 1/1 Running 0 9s
minio-setup-skbs6 0/1 Completed 0 11s
Now that our Minio application is deployed, we need to expose the Minio service to requests outside of the cluster via a LoadBalancer service type with the following command:
$ kubectl expose deployment minio --name=velero-minio-lb --port=9000 --target-port=9000 --type=LoadBalancer --namespace=velero
Note: because of the integration between VMware Enterprise PKS and VMware NSX-T Datacenter, when I create a LoadBalancer service type in the cluster, the NSX Container Plugin (which we are using as our Container Network Interface) reaches out to the NSX-T API to automatically provision a virtual server in an NSX-T L4 load balancer.
I’ll use kubectl to retrieve the IP of the virtual server created within the NSX-T load balancer and access the Minio UI in my browser at EXTERNAL-IP:9000. I am looking for the IP address under the EXTERNAL-IP column for the velero-minio-lb service, 10.96.59.116 in this case:
$ kubectl get services -n velero
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
minio ClusterIP 10.100.200.160 <none> 9000/TCP 7m14s
velero-minio-lb LoadBalancer 10.100.200.77 10.96.59.116 9000:30711/TCP 12s
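If you just want the address without scanning the table, a jsonpath query works too (a small convenience I’m adding here; it assumes the load balancer publishes an ip field rather than a hostname):
$ kubectl get svc velero-minio-lb -n velero -o jsonpath='{.status.loadBalancer.ingress[0].ip}'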
Now that Minio has been successfully deployed in my Kubernetes cluster, I’m ready to move on to the next section to install and configure Velero and restic.
Installing Velero and Restic
Now that I have an S3-compatible storage solution deployed in my environment, I am ready to complete the installation of Velero (and restic).
However, before I move forward with the installation of Velero, I need to install the Velero CLI client on my workstation. The instructions detailed below will allow you to install the client on a Linux server (I’m using a CentOS 7 instance).
First, I navigated to the Velero GitHub releases page and copied the link for the v1.1 .tar.gz file for my OS distribution:
Then, I used wget to pull the archive down to my Linux server, extracted the contents of the file, and moved the velero binary into my path:
$ cd ~/tmp
$ wget https://github.com/vmware-tanzu/velero/releases/download/v1.1.0/velero-v1.1.0-linux-amd64.tar.gz
$ tar -xvf velero-v1.1.0-linux-amd64.tar.gz
$ sudo mv velero-v1.1.0-linux-amd64/velero /usr/bin/velero
Now that I have the Velero client installed on my server, I am ready to continue with the installation.
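Before moving on, it’s worth confirming the binary is on my path and reports the expected version (a quick check I’m adding; the --client-only flag skips querying the not-yet-installed server side):
$ velero version --client-only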
I’ll create a credentials-velero file that we will use during install to authenticate against the Minio service. Velero will use these credentials to access Minio to store volume backups:
$ cat credentials-velero
[default]
aws_access_key_id = minio
aws_secret_access_key = minio123
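For reference, these values must line up with the credentials the Minio deployment itself is configured with. As a sketch (an assumption about how minio-deploy.yaml defines its keys, shown only to illustrate the relationship), the Minio container would carry matching environment variables:
env:
- name: MINIO_ACCESS_KEY    # must match aws_access_key_id above (assumed value)
  value: "minio"
- name: MINIO_SECRET_KEY    # must match aws_secret_access_key above (assumed value)
  value: "minio123"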
Now I’m ready to install Velero! The following command will complete the installation of Velero (and restic) where:
- --provider aws instructs Velero to utilize S3 storage, which is running on-prem in my case
- --secret-file is our Minio credentials
- --use-restic ensures Velero knows to deploy restic for persistentvolume backups
- --s3Url value is the address of the Minio service that is only resolvable from within the Kubernetes cluster
- --publicUrl value is the IP address for the LoadBalancer service that allows access to the Minio UI from outside of the cluster:
$ velero install --provider aws --bucket velero --secret-file credentials-velero \
    --use-volume-snapshots=false --use-restic --backup-location-config \
    region=minio,s3ForcePathStyle="true",s3Url=http://minio.velero.svc:9000,publicUrl=http://10.96.59.116:9000
Velero is installed! ⛵ Use 'kubectl logs deployment/velero -n velero' to view the status.
Note: The velero install command creates a set of CRDs that power the Velero service. You can run velero install --dry-run -o yaml to output all of the .yaml files used to create the Velero deployment.
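Once the install has run, those CRDs can be listed with a quick filter (my own addition; Velero’s CRDs live under the velero.io API group):
$ kubectl get crds | grep velero.io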
After the installation is complete, I’ll verify that I have 3 restic-xxx pods and 1 velero-xxx pod deployed in the velero namespace. As the restic service is deployed as a daemonset, I expect to see a restic pod per node in my cluster. I have 3 worker nodes, so I should see 3 restic pods:
Note: Notice the status of the restic-xxx pods…
$ kubectl get pod -n velero
NAME READY STATUS RESTARTS AGE
minio-5559c4749-7xssq 1/1 Running 0 7m21s
minio-setup-dhnrr 0/1 Completed 0 7m21s
restic-mwgsd 0/1 CrashLoopBackOff 4 2m17s
restic-xmbzz 0/1 CrashLoopBackOff 4 2m17s
restic-235cz 0/1 CrashLoopBackOff 4 2m17s
velero-7d876dbdc7-z4tjm 1/1 Running 0 2m17s
As you may notice, the restic pods are not able to start. That is because in Enterprise PKS Kubernetes clusters, the path to the pods on the nodes (/var/vcap/data/kubelet/pods) is a little different than in "vanilla" Kubernetes clusters (/var/lib/kubelet/pods). To allow the restic pods to run as expected, I’ll need to edit the restic daemonset and change the hostPath value as referenced below:
$ kubectl edit daemonset restic -n velero
volumes:
- hostPath:
    path: /var/vcap/data/kubelet/pods
    type: ""
  name: host-pods
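If you prefer a non-interactive change, a kubectl patch along these lines should achieve the same result (a sketch, assuming host-pods is the first entry in the volumes list, as in the excerpt above):
$ kubectl -n velero patch daemonset restic --type json \
    -p '[{"op":"replace","path":"/spec/template/spec/volumes/0/hostPath/path","value":"/var/vcap/data/kubelet/pods"}]'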
Now I’ll verify all of the restic pods are in Running status:
$ kubectl get pod -n velero
NAME READY STATUS RESTARTS AGE
minio-5559c4749-7xssq 1/1 Running 0 12m
minio-setup-dhnrr 0/1 Completed 0 12m
restic-p4d2c 1/1 Running 0 6s
restic-xvxkh 1/1 Running 0 6s
restic-e31da 1/1 Running 0 6s
velero-7d876dbdc7-z4tjm 1/1 Running 0 7m36s
Woohoo!! Velero is successfully deployed in my Kubernetes cluster. Now I’m ready to take some backups!!
Backup/Restore the WordPress Application using Velero
Now that I’ve deployed Velero and all of its supporting components in my cluster, I’m ready to perform some backups. But in order to test my backup/recovery solution, I’ll need an app that preferably utilizes persistent data.
In one of my previous blog posts, I walked through the process of deploying Kubeapps in my cluster to allow me to easily deploy application stacks to my Kubernetes cluster.
For this exercise, I’ve used Kubeapps to deploy a WordPress blog that utilizes persistentvolumes to store post data. I’ve also populated the blog with a test post so I can verify backup and recovery.
First, I’ll verify that the WordPress pods are in a Running state:
$ kubectl get pods -n wordpress
NAME READY STATUS RESTARTS AGE
cut-birds-mariadb-0 1/1 Running 0 23h
cut-birds-wordpress-fbb7f5b76-lm5bh 1/1 Running 0 23h
I’ll also verify the URL of my blog and access it via my web browser to verify current state:
$ kubectl get svc -n wordpress
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
cut-birds-mariadb ClusterIP 10.100.200.39 <none> 3306/TCP 19h
cut-birds-wordpress LoadBalancer 10.100.200.32 10.96.59.116 80:32393/TCP,443:31585/TCP 19h
Everything looks good, especially the cat!!
In order for Velero to understand where to look for persistent data to back up, in addition to other Kubernetes resources in the cluster, we need to annotate each pod that is utilizing a volume so Velero backs up the pods AND the volumes.
I’ll review both of the pods in the wordpress namespace to view the name of each volume being used by each pod:
$ kubectl describe pod/cut-birds-mariadb-0 -n wordpress
---output omitted---
Volumes:
data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: data-cut-birds-mariadb-0
ReadOnly: false
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: cut-birds-mariadb
Optional: false
default-token-6q5xt:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-6q5xt
Optional: false
$ kubectl describe pods/cut-birds-wordpress-fbb7f5b76-lm5bh -n wordpress
---output omitted---
Volumes:
wordpress-data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: cut-birds-wordpress
ReadOnly: false
default-token-6q5xt:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-6q5xt
Optional: false
As you can see, the mariadb pod is using 2 volumes: data and config, while the wordpress pod is utilizing a single volume: wordpress-data.
I’ll run the following commands to annotate each pod with the backup.velero.io/backup-volumes annotation, listing each pod’s corresponding volume(s):
$ kubectl -n wordpress annotate pod/cut-birds-mariadb-0 backup.velero.io/backup-volumes=data,config
$ kubectl -n wordpress annotate pod/cut-birds-wordpress-fbb7f5b76-lm5bh backup.velero.io/backup-volumes=wordpress-data
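To confirm the annotation landed where expected (a quick check I’m adding, not in the original flow):
$ kubectl -n wordpress describe pod cut-birds-mariadb-0 | grep backup.velero.io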
Now I’m ready to use the velero client to create a backup. I’ll name the backup wordpress-backup and ensure the backup only includes the resources in the wordpress namespace:
$ velero backup create wordpress-backup --include-namespaces wordpress
Backup request "wordpress-backup" submitted successfully.
Run `velero backup describe wordpress-backup` or `velero backup logs wordpress-backup` for more details.
I can also use the velero client to confirm the backup completed by waiting for Phase: Completed:
$ velero backup describe wordpress-backup
Name: wordpress-backup
Namespace: velero
Labels: velero.io/storage-location=default
Annotations: <none>
Phase: Completed
--output omitted--
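If you want to confirm the restic volume backups were captured along with the Kubernetes resources, the describe command also accepts a --details flag (an extra step I’m adding here):
$ velero backup describe wordpress-backup --details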
I’ll navigate back to the web browser and refresh (or log back into) the Minio UI. Notice the restic folder, which houses our backups’ persistent data, as well as a backups folder:
I’ll select the backups folder and note the wordpress-backup folder in the subsequent directory. I’ll also explore the contents of the wordpress-backup folder, which contains all of the Kubernetes resources from my wordpress namespace:
Now that I’ve confirmed my backup was successful and have verified the data has been stored in Minio via the web UI, I am ready to completely delete my WordPress application. I will accomplish this by deleting the wordpress namespace, which will delete all resources created in the namespace to support the WordPress application, even the persistentvolumeclaims:
$ kubectl delete namespace wordpress
$ kubectl get pods -n wordpress
$ kubectl get pvc -n wordpress
After I’ve confirmed all of the resources in the wordpress namespace have been deleted, I’ll refresh the browser to verify the blog is no longer available.
Now we’re ready to restore!! I’ll use the velero client to verify the existence/name of the backup that was previously created and restore the backup to the cluster:
$ velero backup get
NAME STATUS CREATED EXPIRES STORAGE LOCATION SELECTOR
wordpress-backup Completed 2019-10-03 15:47:07 -0400 EDT 29d default <none>
$ velero restore create --from-backup wordpress-backup
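The restore gets an auto-generated name based on the backup, and its progress can be tracked with the velero client as well (an optional check; the actual restore name will include a timestamp suffix):
$ velero restore get
$ velero restore describe <restore-name>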
I can monitor the pods in the wordpress namespace and wait for both pods to show 1/1 in the READY column and Running in the STATUS column:
$ kubectl get pods -n wordpress -w
NAME READY STATUS RESTARTS AGE
cut-birds-mariadb-0 0/1 Init:0/1 0 12s
cut-birds-wordpress-fbb7f5b76-qtcpp 0/1 Init:0/1 0 13s
cut-birds-mariadb-0 0/1 PodInitializing 0 18s
cut-birds-mariadb-0 0/1 Running 0 19s
cut-birds-wordpress-fbb7f5b76-qtcpp 0/1 PodInitializing 0 19s
cut-birds-wordpress-fbb7f5b76-qtcpp 0/1 Running 0 20s
cut-birds-mariadb-0 1/1 Running 0 54s
cut-birds-wordpress-fbb7f5b76-qtcpp 1/1 Running 0 112s
Then, I can verify the URL of the WordPress blog:
$ kubectl get services -n wordpress
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
cut-birds-mariadb ClusterIP 10.100.200.39 <none> 3306/TCP 2m56s
cut-birds-wordpress LoadBalancer 10.100.200.32 10.96.59.120 80:32393/TCP,443:31585/TCP 2m56s
And finally, I can access the URL of the blog in the web browser and confirm that the test post that was visible initially is still present:
There you have it!! Our application and its persistent data have been completely restored!!
In this example, we manually created a backup, but we can also use the Velero client to schedule backups on a certain interval. See the examples below (each creates a daily backup of the wordpress namespace):
velero schedule create planes-daily --schedule="0 1 * * *" --include-namespaces wordpress
velero schedule create planes-daily --schedule="@daily" --include-namespaces wordpress
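Scheduled backups can be listed with the client, and a retention period can be set at schedule creation time (an additional example; the schedule name and the 720h TTL, i.e. 30 days, are my own illustrative choices):
velero schedule get
velero schedule create planes-daily-30d --schedule="@daily" --ttl 720h --include-namespaces wordpress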
Conclusion
In this blog post, I walked through the process of installing Velero in a Kubernetes cluster, including all of its required components, to support taking backups of Kubernetes resources. I also walked through the process of taking a backup, simulating a data loss scenario, and restoring that backup to the cluster.
I see at least a few issues from the very beginning.
For example:
1) right after `kubectl create -f minio-pvc.yaml` we can observe odd copy-paste of minio-storage-class.yaml.
2) `kubectl create -f minio-deploy.yaml` fails with:
- ValidationError(Ingress): unknown field "servicePort" in io.k8s.api.extensions.v1beta1.Ingress
- ValidationError(Ingress.spec.rules[0].http.paths[0].backend): missing required field "servicePort" in io.k8s.api.extensions.v1beta1.IngressBackend
@Nick, thank you for pointing out the copy pasta errors, just fixed those.
Regarding the minio-deploy.yaml. Are you attempting to deploy this on a Kubernetes cluster with an ingress controller deployed? If not, that may be why you are seeing the error.
The Velero team has also updated the documentation (and the minio-deployment.yaml) for Velero 1.2. The updated .yaml file can be found here: https://github.com/vmware-tanzu/velero/blob/master/examples/minio/00-minio-deployment.yaml.
I’ll update the article to reflect the change.
Hey, many thanks for the work you have done here. It’s excellent. I have one question; please excuse me if it’s already answered in your write-up. What does Restic provide that cannot be done with Velero alone? Thank you.
Hi Cyrus, thanks for the comment! Restic is actually used to take file-level backups of the persistent volumes. Velero has native support for snapshotting volumes, but not all CSI drivers support this feature. That’s where restic comes into play: it provides backup of persistent volumes when volume snapshotting is not supported by the underlying CSI driver.