Using CSE 4.1 with Terraform VCD Provider 3.11.0

The Terraform VMware Cloud Director Provider v3.11.0 now supports installing and managing Container Service Extension (CSE) 4.1, with a new set of improvements, the new vcd_rde_behavior_invocation data source, and updated guides for VMware Cloud Director users to deploy the required components.

In this blog post, we will be installing CSE 4.1 in an existing VCD and creating and managing a TKGm cluster.

Preparing the installation

First of all, we must make sure that all the prerequisites listed in the Terraform VCD Provider documentation are met. CSE 4.1 requires at least VCD 10.4.2; we can check our VCD version in the popup that shows up by clicking the About option inside the help "(?)" button next to our username in the top right corner:

Check that you also have ALB controllers available to be consumed from VMware Cloud Director, as the created clusters require them for load-balancing purposes.
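The installation steps below are run as a System administrator, so they assume a provider configuration along these lines (a minimal sketch with placeholder values):

provider "vcd" {
  url                  = "https://vcd.example.com/api" # the provider expects the /api endpoint
  org                  = "System"
  user                 = "administrator"
  password             = var.administrator_password # placeholder variable
  allow_unverified_ssl = true
}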

Step 1: Installing the prerequisites

The first step of the installation mimics the UI wizard step in which the prerequisites are created:

We will do this exact step programmatically with Terraform. To do that, let's clone the terraform-provider-vcd repository so we can download the required schemas, entities, and examples:

git clone https://github.com/vmware/terraform-provider-vcd.git
cd terraform-provider-vcd
git checkout v3.11.0
cd examples/container-service-extension/v4.1/install/step1

If we open 3.11-cse-install-2-cse-server-prerequisites.tf we can see that these configuration files create all the RDE framework components that CSE needs to work, consuming the schemas that are hosted in the GitHub repository, plus all the rights and roles that are needed. We won't customize anything inside these files, as they create the same items as the UI wizard step shown in the above screenshot, which doesn't allow customization either.

Now we open 3.11-cse-install-3-cse-server-settings.tf; this one is equivalent to the next UI wizard step:

We can observe that the UI wizard allows us to set some configuration parameters, and if we look at terraform.tfvars.example we will notice that the requested configuration values match.

Before applying all the Terraform configuration files that are available in this folder, we rename terraform.tfvars.example to terraform.tfvars and set the variables to correct values. The defaults that we can see in variables.tf and terraform.tfvars.example match those of the UI wizard, which should be good for CSE 4.1. In our case, our VMware Cloud Director has full Internet access, so we are not setting any custom Docker registry or certificates here.

We should also keep in mind that terraform.tfvars.example asks for a username and password to create a user that will be used to provision API tokens for the CSE Server to run. We leave these as they are, as we like the "cse_admin" username.
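As an illustration, a filled-in terraform.tfvars for this step could look roughly like the sketch below; the variable names and values are placeholders, and variables.tf in this folder holds the authoritative list:

# Illustrative values only; check variables.tf for the exact variable names.
vcd_url                = "https://vcd.example.com"
administrator_user     = "administrator"
administrator_password = "******"

cse_admin_username = "cse_admin"
cse_admin_password = "******"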

Once we review the configuration, we can safely complete this step by running:

terraform init
terraform apply

The plan should display all the elements that are going to be created. We complete the operation (by writing yes to the prompt) and the first step of the installation is done. This can be easily checked in the UI, as the wizard no longer asks us to complete this step; instead, it shows the CSE Server configuration we just applied:

Step 2: Configuring VMware Cloud Director and running the CSE Server

We move to the next step, which is located at examples/container-service-extension/v4.1/install/step2 of our cloned repository.

cd examples/container-service-extension/v4.1/install/step2

This step is the most customizable one, as it depends on our specific needs. Ideally, as the CSE documentation implies, there should be two Organizations: a Solutions Organization and a Tenant Organization, with Internet access so all the required Docker images and packages can be downloaded (or with access to an internal Docker registry if we had chosen a custom registry in the previous step).

We can inspect the different files available and change everything that doesn't match our needs. For example, if we already had the Organization VDCs created, we could switch from using resources to using data sources instead, as sketched below.
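For instance, a pre-existing Tenant Organization and VDC could be read with data sources instead of being created; a minimal sketch (the names are illustrative) would be:

# Sketch: read a pre-existing Tenant Organization and VDC instead of creating them.
data "vcd_org" "tenant_org" {
  name = "tenant_org"
}

data "vcd_org_vdc" "tenant_vdc" {
  org  = data.vcd_org.tenant_org.name
  name = "tenant_vdc"
}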

In our case, the VMware Cloud Director appliance where we are installing CSE 4.1 is empty, so we need to create everything from scratch. This is what the files in this folder do: they create a basic and minimal set of components to make CSE 4.1 work.

Same as before, we rename terraform.tfvars.example to terraform.tfvars and check the file contents so we can set the correct configuration. As mentioned, setting the variables of this step depends on our needs and on how we want to set up the networking, the NSX ALB, and which TKGm OVAs we want to provide to our tenants. We should also keep in mind that some constraints need to be met, like the VM Sizing Policies that CSE requires being published to the VDCs, so let's read and understand the installation guide for that purpose.

Once we review the configuration, we can complete this step by running:

terraform init
terraform apply

Now we should review that the plan is correct and matches what we want to achieve. It should create the two required Organizations, our VDCs, and, most importantly, the networking configuration should allow Internet traffic so the required packages can be retrieved and the TKGm clusters can be provisioned without issues (remember that in the previous step we didn't set any internal registry or certificates). We complete the operation (by writing yes to the prompt) and the second step of the installation is done.

We can also double-check that everything is correct in the UI, or do a connectivity test by deploying a VM and using the console to ping an outside-world website.

Cluster creation with Terraform

Given that we have finished the installation process and still have the cloned repository from the previous steps, we move to examples/container-service-extension/v4.1/cluster.

cd examples/container-service-extension/v4.1/cluster

The cluster is created by the configuration file 3.11-cluster-creation.tf, which also uses the RDE framework. We encourage readers to check both the vcd_rde documentation and the cluster management guide before proceeding, as it is important to know how this resource works in Terraform and, most importantly, how CSE 4.1 uses it.

We open 3.11-cluster-creation.tf and inspect it, and immediately see that it uses the JSON template located at examples/container-service-extension/v4.1/entities/tkgmcluster.json.template. This is the payload that the CSE 4.1 RDE requires to initialize a TKGm cluster. We can customize this JSON to our needs; for example, we will remove the defaultStorageClassOptions block from it, as we won't use storage in our clusters.

The initial JSON template tkgmcluster.json.template looks like this now:

{
  "apiVersion": "capvcd.vmware.com/v1.1",
  "kind": "CAPVCDCluster",
  "name": "${name}",
  "metadata": {
    "name": "${name}",
    "orgName": "${org}",
    "site": "${vcd_url}",
    "virtualDataCenterName": "${vdc}"
  },
  "spec": {
    "vcdKe": {
      "isVCDKECluster": true,
      "markForDelete": ${delete},
      "forceDelete": ${force_delete},
      "autoRepairOnErrors": ${auto_repair_on_errors},
      "secure": {
        "apiToken": "${api_token}"
      }
    },
    "capiYaml": ${capi_yaml}
  }
}

There is nothing else that we can customize there, so we leave it like that.
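To connect the dots, this is roughly how 3.11-cluster-creation.tf feeds the template into a vcd_rde resource. The sketch below is simplified (the RDE type lookup, the variable names, and the capi_yaml handling are assumptions made for illustration), so the file in the cloned repository remains the reference:

# Simplified sketch of how the JSON template is rendered into the cluster RDE.
resource "vcd_rde" "k8s_cluster_instance" {
  org         = var.cluster_organization
  name        = var.k8s_cluster_name
  rde_type_id = data.vcd_rde_type.capvcdcluster_type.id # the CAPVCD cluster RDE Type
  resolve     = false # the CSE Server, not Terraform, resolves the RDE

  input_entity = templatefile("../entities/tkgmcluster.json.template", {
    name                  = var.k8s_cluster_name
    org                   = var.cluster_organization
    vdc                   = var.cluster_vdc
    vcd_url               = var.vcd_url
    delete                = false
    force_delete          = false
    auto_repair_on_errors = var.auto_repair_on_errors
    api_token             = var.cluster_author_api_token                      # illustrative; the real file reads the generated token file
    capi_yaml             = jsonencode(file("cluster-template-v1.25.7.yaml")) # simplified; the real file also fills the YAML placeholders
  })
}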

The next thing we notice is that we need a valid CAPVCD YAML; we can download it from here. We will deploy a v1.25.7 Tanzu cluster, so we download that one to start preparing it.

We open it with our editor and add the required snippets as stated in the documentation. We start with the kind: Cluster blocks that are required by the CSE Server to provision clusters:

apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: ${CLUSTER_NAME}
  namespace: ${TARGET_NAMESPACE}
  labels: # We add this block
    cluster-role.tkg.tanzu.vmware.com/management: ""
    tanzuKubernetesRelease: ${TKR_VERSION}
    tkg.tanzu.vmware.com/cluster-name: ${CLUSTER_NAME}
  annotations: # We add this block
    TKGVERSION: ${TKGVERSION}
# ...

We added the two labels and annotations blocks, with the required placeholders TKR_VERSION, CLUSTER_NAME, and TKGVERSION. These placeholders are used to set the values via the Terraform configuration.

Now we add the Machine Health Check block, which allows us to use one of the new powerful features of CSE 4.1 that remediates nodes in a failed status by replacing them, enabling cluster self-healing:

apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: ${CLUSTER_NAME}
  namespace: ${TARGET_NAMESPACE}
  labels:
    clusterctl.cluster.x-k8s.io: ""
    clusterctl.cluster.x-k8s.io/move: ""
spec:
  clusterName: ${CLUSTER_NAME}
  maxUnhealthy: ${MAX_UNHEALTHY_NODE_PERCENTAGE}%
  nodeStartupTimeout: ${NODE_STARTUP_TIMEOUT}s
  selector:
    matchLabels:
      cluster.x-k8s.io/cluster-name: ${CLUSTER_NAME}
  unhealthyConditions:
    - type: Ready
      status: Unknown
      timeout: ${NODE_UNKNOWN_TIMEOUT}s
    - type: Ready
      status: "False"
      timeout: ${NODE_NOT_READY_TIMEOUT}s

Notice that the timeouts have an s suffix because the values introduced during the installation were in seconds. If we hadn't put the values in seconds, for example 15m, we could remove the s suffix from these block options.

Let's add the final parts, which are most relevant when custom certificates were specified during the installation process. In kind: KubeadmConfigTemplate we must add the preKubeadmCommands and useExperimentalRetryJoin blocks under the spec > users section:

preKubeadmCommands:
  - mv /etc/ssl/certs/custom_certificate_*.crt /usr/local/share/ca-certificates && update-ca-certificates
useExperimentalRetryJoin: true

In kind: KubeadmControlPlane we must add the preKubeadmCommands and controllerManager blocks inside the kubeadmConfigSpec section:

preKubeadmCommands:
  - mv /etc/ssl/certs/custom_certificate_*.crt /usr/local/share/ca-certificates && update-ca-certificates
controllerManager:
  extraArgs:
    enable-hostpath-provisioner: "true"

Once it is done, the resulting YAML should be similar to the one already provided in the examples/cluster folder, cluster-template-v1.25.7.yaml, as it uses the same version of Tanzu and has all of these additions already introduced. This is a good exercise to check whether our YAML is correct before proceeding further.

After we review the crafted YAML, let's create a tenant user with the Kubernetes Cluster Author role. This user will be required to provision clusters:

data "vcd_global_role" "k8s_cluster_author" {
  name = "Kubernetes Cluster Author"
}

resource "vcd_org_user" "cluster_author" {
  name     = "cluster_author"
  password = "dummyPassword" # This should probably be a sensible variable and a bit more secure.
  role     = data.vcd_global_role.k8s_cluster_author.name
}

Now we can complete the customization of the configuration file 3.11-cluster-creation.tf by renaming terraform.tfvars.example to terraform.tfvars and configuring the parameters of our cluster. Let's check ours:

vcd_url                 = "https://..."
cluster_author_user     = "cluster_author"
cluster_author_password = "dummyPassword"

cluster_author_token_file = "cse_cluster_author_api_token.json"

k8s_cluster_name       = "example"
cluster_organization   = "tenant_org"
cluster_vdc            = "tenant_vdc"
cluster_routed_network = "tenant_net_routed"

control_plane_machine_count = "1"
worker_machine_count        = "1"

control_plane_sizing_policy    = "TKG small"
control_plane_placement_policy = "\"\""
control_plane_storage_profile  = "*"

worker_sizing_policy    = "TKG small"
worker_placement_policy = "\"\""
worker_storage_profile  = "*"

disk_size     = "20Gi"
tkgm_catalog  = "tkgm_catalog"
tkgm_ova_name = "ubuntu-2004-kube-v1.25.7+vmware.2-tkg.1-8a74b9f12e488c54605b3537acb683bc"

pod_cidr     = "100.96.0.0/11"
service_cidr = "100.64.0.0/13"

tkr_version = "v1.25.7---vmware.2-tkg.1"
tkg_version = "v2.2.0"

auto_repair_on_errors = true

We can notice that control_plane_placement_policy = "\"\""; this is to avoid errors when we don't want to use a VM Placement Policy. We can check that the downloaded CAPVCD YAML forces us to put double quotes on this value when it is not used.

The tkr_version and tkg_version values were obtained from the values already provided in the documentation.

Once we are happy with the different options, we apply the configuration:

terraform init
terraform apply

Now we should review the plan as much as possible to prevent errors. It should create the vcd_rde resource with the elements we provided. We complete the operation (by writing yes to the prompt) and the cluster should start getting created. We can monitor the process either in the UI or with the two outputs provided as an example:

locals {
  k8s_cluster_computed = jsondecode(vcd_rde.k8s_cluster_instance.computed_entity)
  being_deleted        = tobool(jsondecode(vcd_rde.k8s_cluster_instance.input_entity)["spec"]["vcdKe"]["markForDelete"]) || tobool(jsondecode(vcd_rde.k8s_cluster_instance.input_entity)["spec"]["vcdKe"]["forceDelete"])
  has_status           = lookup(local.k8s_cluster_computed, "status", null) != null
}

output "computed_k8s_cluster_status" {
  value = local.has_status && !local.being_deleted ? local.k8s_cluster_computed["status"]["vcdKe"]["state"] : null
}

output "computed_k8s_cluster_events" {
  value = local.has_status && !local.being_deleted ? local.k8s_cluster_computed["status"]["vcdKe"]["eventSet"] : null
}

Then we can run terraform refresh as many times as we want, to monitor the events with:

terraform output computed_k8s_cluster_status
terraform output computed_k8s_cluster_events

Once computed_k8s_cluster_status states provisioned, this step is finished and the cluster is ready to use. Let's retrieve the Kubeconfig, which in CSE 4.1 is done completely differently than in 4.0, as we are required to invoke a Behavior to get it. In 3.11-cluster-creation.tf we can see a commented section that has a vcd_rde_behavior_invocation data source. If we uncomment it and do another terraform apply, we should be able to get the Kubeconfig by running:

terraform output kubeconfig

We can save it to a file to start interacting with our cluster through kubectl.
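For reference, the commented section relies on the new vcd_rde_behavior_invocation data source; a simplified sketch of its shape looks like this (the Behavior ID is kept as a placeholder variable here, since the exact value is already set in the example file):

# Simplified sketch of the Behavior invocation that returns the Kubeconfig.
data "vcd_rde_behavior_invocation" "get_kubeconfig" {
  rde_id      = vcd_rde.k8s_cluster_instance.id
  behavior_id = var.kubeconfig_behavior_id # placeholder; 3.11-cluster-creation.tf sets the real ID
}

Assuming the kubeconfig output exposes a plain string, it can be saved with:

terraform output -raw kubeconfig > kubeconfig.yaml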

Cluster update

Example use case: we realized that our cluster is too small, so we need to scale it up. We will set up 3 worker nodes.

To update it, we need to be sure that it is in a provisioned status. For that, we can use the same mechanism that we used when the cluster creation started:

terraform output computed_k8s_cluster_status

This should display provisioned. If that is the case, we can proceed with the update.

As with the cluster creation, we first need to understand how the vcd_rde resource works to avoid errors, so it is encouraged to check both the vcd_rde documentation and the cluster management guide before proceeding. The important idea is that we must update the input_entity argument with the information that CSE saves in the computed_entity attribute; otherwise, we could break the cluster.

To do that, we can use the following output, which returns the computed_entity attribute:

output "computed_k8s_cluster" {
  value = vcd_rde.k8s_cluster_instance.computed_entity # References the created cluster
}

Then we run a command along these lines to save it to a file named computed.json for easier reading:
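terraform output -raw computed_k8s_cluster > computed.json

(-raw strips the surrounding quotes so the file contains plain, readable JSON.)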

Let's open computed.json for inspection. We can easily see that it looks pretty much the same as tkgmcluster.json.template, but with the addition of a big "status" object that contains vital information about the cluster. This must be sent back on updates, so we copy the whole "status" object as it is and place it in the original tkgmcluster.json.template.

After that, we can change worker_machine_count = 1 to worker_machine_count = 3 in the existing terraform.tfvars, to complete the update process with:

terraform apply

Now it is crucial to verify and make sure that the output plan shows that the "status" object is being added to the input_entity payload. If that is not the case, we should stop the operation immediately and check what went wrong. If "status" is visible in the plan as being added, we can complete the update operation by writing yes to the prompt.

Cluster deletion

The main idea when deleting a TKGm cluster is that we should not use terraform destroy for that, even if that is the first idea we have in mind. The reason is that the CSE Server creates a lot of elements (VMs, Virtual Services, etc.) that would be left in an "orphan" state if we just deleted the cluster RDE. We need to let the CSE Server do the cleanup for us.

For that matter, the vcd_rde resource present in 3.11-cluster-creation.tf contains two special arguments that mimic the deletion option from the UI:

delete = false # Make this true to delete the cluster
force_delete = false # Make this true to forcefully delete the cluster

To trigger an asynchronous deletion process we should change them to true and execute terraform apply to perform an update. We must also introduce the latest "status" object to tkgmcluster.json.template when applying, pretty much like in the update scenario described in the previous section.

Final thoughts

We hope you enjoyed the process of installing CSE 4.1 in your VMware Cloud Director appliance. For a better understanding of the process, please read the existing installation and cluster management guides.
