Ops on Kops

In this tutorial you’ll see how Sugarkube can be used to create an ops-type cluster running Jenkins, Prometheus and Keycloak. We’ll run it both locally on Minikube and as a private Kops cluster on AWS, with a VPN created automatically so we can access it. After creating each cluster we’ll tear it down, deleting all the resources that were created for it.

We assume you’ve already installed Sugarkube.

Depending on your AWS account, following this tutorial may cost you some money. After tearing down the cluster, check the AWS console to confirm that all resources really were deleted.

We have a Docker image with most of the dependencies you need for this tutorial, except OpenVPN and Minikube (it doesn’t seem possible to run Minikube inside a Docker container). If you don’t want to install anything on your local machine, just run our image with docker run -it sugarkube/tutorial:0.10.0 /bin/bash (but note you won’t actually be able to start a Minikube cluster or use OpenVPN).


TLDR

If you’re in a hurry and don’t want to work through a long tutorial, just run the following commands. If you’ve got time or want to learn more about Sugarkube, skip this whole TLDR section.

git clone https://github.com/sugarkube/sample-project.git
cd sample-project
git checkout tutorial-0.10.0
sugarkube ws create stacks/ops.yaml local-ops workspaces/local-ops/
sugarkube kapps install stacks/ops.yaml local-ops workspaces/local-ops -x security:keycloak --one-shot --run-actions

If the last command tells you you’re missing dependencies, install them and rerun the command. Sugarkube should then create a Minikube cluster. Note: you can’t create a Minikube cluster if you’re using our Docker image.

When you’re done exploring it, delete the Minikube cluster with minikube delete as usual. Then repeat the process using Kops on AWS (this will create cloud resources and will probably cost you money).

Because S3 bucket names are globally unique and this is a tutorial you’ll need to explicitly name clusters to avoid collisions. In the following commands replace <UNIQUE_NAME> with a name of your choice for the demo cluster.

export AWS_ACCESS_KEY_ID=<your key>
export AWS_SECRET_ACCESS_KEY=<your secret>
export AWS_DEFAULT_REGION=eu-west-2
aws ec2 create-key-pair --key-name sugarkube-example | jq -r '.KeyMaterial' > ~/.ssh/sugarkube-example
chmod 0600 ~/.ssh/sugarkube-example
ssh-keygen -y -f ~/.ssh/sugarkube-example > ~/.ssh/sugarkube-example.pub
echo '127.0.0.1 kubernetes.default.svc.cluster.local' | sudo tee -a /etc/hosts
sugarkube ws create stacks/ops.yaml dev-ops workspaces/dev-ops/
sugarkube kapps install stacks/ops.yaml dev-ops workspaces/dev-ops/ --run-actions --one-shot -x security:keycloak -n <UNIQUE_NAME>

You should see an OpenVPN config file drop into your ~/Downloads directory. You can use that to connect to the cluster and access internal resources.

The installation of the bootstrap:nginx1 kapp may fail due to a bug in cert-manager. To work around it, delete the bootstrap:nginx1 kapp and reinstall it.
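A pair of commands along these lines should do it (here passing --connect so you don’t need to be on the VPN):

sugarkube kapps delete stacks/ops.yaml dev-ops workspaces/dev-ops -i bootstrap:nginx1 --one-shot --connect -n <UNIQUE_NAME>
sugarkube kapps install stacks/ops.yaml dev-ops workspaces/dev-ops -i bootstrap:nginx1 --one-shot --connect -n <UNIQUE_NAME>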

Finally, disconnect from the VPN and tear the cluster down with:

sugarkube kapps delete stacks/ops.yaml dev-ops workspaces/dev-ops --one-shot --connect --run-actions --ignore-errors -n <UNIQUE_NAME>
aws ec2 delete-key-pair --key-name sugarkube-example
rm ~/.ssh/sugarkube-example
rm ~/.ssh/sugarkube-example.pub

Grab the sample project

If you haven’t already got the sample project from a previous tutorial, clone it now:

git clone https://github.com/sugarkube/sample-project.git
cd sample-project
git checkout tutorial-0.10.0

This time we’ll be using the stacks/ops.yaml stack config file. If you open it you’ll see two clusters are declared - local-ops and dev-ops. Let’s just create workspaces for both of them now:

sugarkube ws create stacks/ops.yaml local-ops workspaces/local-ops/
sugarkube ws create stacks/ops.yaml dev-ops workspaces/dev-ops/

Let’s see what they both run by using the kapps graph command again to save us digging into manifests:

You won’t see an SVG if you’re using our Docker image. To avoid an error, you should also pass --no-open in that case.

sugarkube kapps graph stacks/ops.yaml local-ops

This should open your default SVG viewer and show you this:

local-ops dependencies

sugarkube kapps graph stacks/ops.yaml dev-ops

Should show you this:

dev-ops dependencies

OK, so we can see both clusters are basically the same, except the dev-ops one does some setup around buckets and hosted zones, and also creates a VPN. If you’re not familiar with Keycloak, it can be used as a Single Sign-On provider and an OAuth identity provider (IdP). It’s a monster Java application. To speed things up a bit, let’s show how we can use Sugarkube’s selectors to exclude it from being installed into the cluster, and then install it later when we choose.

There are two selectors we can use: -i/--include and -x/--exclude. Both can be repeated multiple times. Kapps are selected by their fully-qualified name, which is of the form <manifest ID>:<kapp ID>, e.g. security:keycloak. To select all kapps in a manifest, use a * in place of the kapp ID, e.g. security:* selects every kapp in the security manifest. Many Sugarkube commands support selectors.
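For example, to graph the local cluster while excluding every kapp in the security manifest, we could run the following (quoting the wildcard so the shell doesn’t try to expand it):

sugarkube kapps graph stacks/ops.yaml local-ops -x 'security:*'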

Let’s see what our dependency graphs look like if we use them:

sugarkube kapps graph stacks/ops.yaml local-ops -x security:keycloak

local-ops dependencies without keycloak

The security:keycloak kapp has been entirely removed from the graph because it was excluded and had no included children. By contrast, if we repeat the command and also exclude a kapp that does have children – e.g. bootstrap:nginx1 – we’ll see that just that node turns red:

sugarkube kapps graph stacks/ops.yaml local-ops -x bootstrap:nginx1 -x security:keycloak

local-ops dependencies without nginx1

That means its children will still be processed, but the bootstrap:nginx1 kapp itself won’t be installed or deleted. Its outputs will still be loaded though. It should be safe to generate a kapp’s outputs provided best practices are followed: generating outputs should be idempotent and shouldn’t perform any destructive changes or side effects.

A local cluster

OK, let’s spin up a local Minikube cluster. In the stack config file stacks/ops.yaml we can see the local-ops cluster is defined like this:

defaults: &defaults
  provider_vars_dirs:   
    - ../providers/  
  # ...

local-ops:        # local stack for ops-type work
  <<: *defaults
  cluster: ops
  provider: local
  provisioner: minikube
  profile: minikube

The Ops cluster runs some large applications, so we need to increase the spec of the Minikube cluster. It also installs Prometheus, which requires passing certain flags to the minikube binary.

To specify extra flags to pass to the minikube binary, we first need to understand how Sugarkube searches for config files to load.

Config loading

Sugarkube finds config files to load and merge by searching for directories or YAML files (with a .yaml extension) named after each of a stack’s parameters (cluster name, region, provisioner, provider, etc.), plus the special file name values.yaml and the literal strings profiles and clusters. The aws provider also searches for directories called accounts.

Values in files further down the tree (closer to leaf nodes of the directory tree) will take precedence over ones declared closer to the root of the configured provider_vars_dirs.

Let’s walk through how this can be used in practice. Here’s the directory tree under our providers directory:

aws 
├── ...
local
├── profiles
│   └── minikube
│       ├── clusters
│       │   ├── ops
│       │   │   └── values.yaml
│       │   └── standard
│       │       └── values.yaml
│       └── values.yaml
└── values.yaml     # 1

Since the local-ops cluster is configured to use the local provider, we can put settings into a directory called local or a file called local.yaml under one of the configured provider_vars_dirs. In our stack only one directory is configured for provider_vars_dirs, so most of the settings for this stack live in providers/local. However, the top-level values.yaml (#1) contains defaults that apply to all clusters, regardless of whether they run on AWS or locally.

Profiles are a way of sharing common configuration across individual clusters. In our example directory tree we only have a single minikube profile. Having a profiles directory is optional: we could just have a directory for clusters, or even directories for each of the cluster names ops and standard. To keep things simple for the local configs, we always just use files called values.yaml.

In fact, we could achieve the same result by just having this alternative layout:

├── aws
├── local
│   ├── ops.yaml
│   └── standard.yaml
├── minikube.yaml
└── values.yaml

The config loading documentation shows the order of precedence that files get loaded and merged in. First Sugarkube would load values.yaml, then values from minikube.yaml would override it. Next it’d enter the local directory because we’re using the local provider. Then, because the name of the local-ops cluster is ops, it’d use values from ops.yaml to override any existing values. Note: if the cluster was called aws it’d also enter the aws directory. In general, providing you create top-level directories based on the provider or provisioner (and don’t name clusters after them), you shouldn’t run into conflicts like this.

So that covers which files Sugarkube will load and merge values from. Let’s also cover the contents of those files.

Config merging

The file providers/local/profiles/minikube/values.yaml contains this:

kube_context: minikube

provisioner:
  # Values passed to `minikube start`. Underscores are converted to hyphens and
  # keys are prefixed with two dashes, so e.g. `disk_size` is passed as `--disk-size`.
  params:
    start:        # parameters for the `minikube start` command
      bootstrapper: kubeadm
      memory: 4096
      cpus: 4
      disk_size: 50g

YAML files in provider vars directories can contain arbitrary keys and values. Only 2 are actually reserved by Sugarkube:

  • kube_context
  • provisioner

All the parameters for the provisioner go under the provisioner key. Different provisioners take different values. In general the keys under the params subkey correspond to commands on the binary used by the provisioner. So in the above we can see that the start key under params is used to pass flags to the minikube start command. When converting key names to command line flags, underscores will be replaced by hyphens, and two leading hyphens are added (again, please refer to the docs for each provisioner).
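So, given the values above, the minikube start command Sugarkube ends up running should be roughly equivalent to this (the exact form may differ slightly):

minikube start --bootstrapper=kubeadm --memory=4096 --cpus=4 --disk-size=50g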

The standard cluster doesn’t override any of these values, so it’ll use these directly. However, the ops cluster does override the values - providers/local/profiles/minikube/clusters/ops/values.yaml contains this:

# a higher-specced local cluster  
provisioner:
  params:
    start:        # parameters for the `minikube start` command
      memory: 5120
      disk_size: 100g
      extra_config:       # special flag that can be repeated multiple times
        - kubelet.authentication-token-webhook=true   
        - kubelet.authorization-mode=Webhook
        - scheduler.address=0.0.0.0
        - controller-manager.address=0.0.0.0

Values from this config are merged with the values defined in values.yaml. The rule is simple: maps are merged, while lists are replaced wholesale rather than concatenated. So the final config for the ops cluster will use 5120 MB of RAM instead of 4096, but will still use 4 CPUs.
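As a minimal illustration of that rule (these snippets aren’t from the sample project), imagine a base file and an override file that both define keys under start:

# base values.yaml
provisioner:
  params:
    start:
      cpus: 4
      extra_config:
        - scheduler.address=0.0.0.0

# overriding values.yaml
provisioner:
  params:
    start:
      memory: 5120
      extra_config:
        - kubelet.authorization-mode=Webhook

# merged result - the maps are merged, but the override's list replaces the base's
provisioner:
  params:
    start:
      cpus: 4
      memory: 5120
      extra_config:
        - kubelet.authorization-mode=Webhook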

It can sometimes be a bit difficult to reason about what the final set of values will be for a cluster after merging values from various files, so Sugarkube provides a sugarkube cluster vars command. Let’s use it:

sugarkube cluster vars stacks/ops.yaml local-ops

This prints out all the provisioner variables for the stack:

...
provisioner:
  params:
    start:
      bootstrapper: kubeadm
      cpus: 4
      disk_size: 100g
      extra_config:
      - kubelet.authentication-token-webhook=true
      - kubelet.authorization-mode=Webhook
      - scheduler.address=0.0.0.0
      - controller-manager.address=0.0.0.0
      memory: 5120
...
sugarkube:
  defaultVars:
  - local
  - minikube
  - ""
  - minikube
  - ops
  - ""

We can see it has the expected number of CPUs and amount of RAM. The values under the sugarkube key show the different basenames used to search for .yaml files – so as explained above it’ll search for YAML files or directories called local, minikube or ops.

Creating the cluster

OK let’s create a local Minikube Ops cluster. For speed, let’s exclude Keycloak like before:

sugarkube kapps install stacks/ops.yaml local-ops workspaces/local-ops -x security:keycloak --one-shot --run-actions

After a while the cluster should come up. If you list the namespaces you should see that there isn’t one for keycloak:

$ kubectl get ns
NAME                  STATUS   AGE
cert-manager          Active   49m
default               Active   51m
jenkins               Active   4m4s
kube-node-lease       Active   51m
kube-public           Active   51m
kube-system           Active   51m
nginx1                Active   48m
prometheus-operator   Active   48m

To access Jenkins in your browser you’ll need to follow the same steps as in the Wordpress on Minikube tutorial.

Now, imagine you wanted to work on Keycloak after all. We’ve already mentioned it’s heavy, so let’s first of all delete Jenkins and Prometheus to free up some resources:

sugarkube kapps delete stacks/ops.yaml local-ops workspaces/local-ops -i ci-cd:jenkins -i 'monitoring:*' --one-shot

From the dependency graph we generated earlier, we can see Sugarkube should delete Jenkins before deleting its dependency Prometheus, and if you check the console, that’s what happens.

Free up resources

Now let’s just install Keycloak:

sugarkube kapps install stacks/ops.yaml local-ops workspaces/local-ops -i 'security:*' --one-shot --parents

Install Keycloak

Ordinarily when using an include selector (-i/--include) Sugarkube will only process directly selected kapps – in this case only security:keycloak. To make Sugarkube process all parents of selected kapps we can pass the --parents flag. This means we don’t need to manually inspect the dependency graph and individually select all the dependencies of the kapp(s) we want to install.

In this particular case all of Keycloak’s dependencies have already been installed. But if we had a kapp that, for example, used another kapp to create an RDS database (similar to the Wordpress on EKS dependencies), the --parents flag would mean we’d only need to select the Keycloak kapp and the kapp to create the database would also be installed. So in this case it’s not strictly necessary to pass --parents, but it’s a handy flag and worth pointing out.

OK so now if you have a look in your Minikube cluster you should see Keycloak is running. When you’re done you can nuke the Minikube cluster with minikube delete.

Into the Cloud

Let’s now create a private Kops cluster running the same applications as above. Creating private clusters is much more secure than just opening up everything to the Internet. We’ll create a private hosted zone on Route53 so DNS names will only resolve inside the VPC we create. This stops anyone from performing reconnaissance on the services we’re developing, and means we won’t need to worry about owning domain names in the same way as when we created an EKS cluster previously. On the flip side though, we’ll need to come up with a way to actually gain access to the cluster ourselves so we can use it and access the applications running in it.

Kops has a --bastion flag that will automatically create a bastion EC2 instance that can be used as a jumpbox into the cluster. It’s possible to whitelist the IP ranges that can connect to the bastion’s ELB, and once connected we can jump to master nodes in the Kubernetes cluster itself in order to interact with the K8s API server. A simple, convenient and secure way of doing this is with SSH port-forwarding via the bastion to make requests to the Kubernetes API server. Sugarkube has support for setting up and tearing down SSH port-forwarding for private Kops clusters so you don’t need to worry about the complexity of setting this up yourself.
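Conceptually this is the same as running something like the following yourself, although Sugarkube works out the real user, addresses and ports for you, so the exact command will differ:

ssh -i <path to your private key> -N \
    -L 8443:<internal API server address>:443 \
    <ssh user>@<bastion ELB address>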

However, port-forwarding has a couple of limitations. First, it’s harder to control access when we’re relying on SSH keys, since rotating keys on servers isn’t very convenient. Secondly, using it to access the services running in the private cluster (e.g. Jenkins, Keycloak, etc.) could soon become complicated and brittle.

Our solution is to automatically set up an AWS VPN. When we launch the Kops cluster, if you keep an eye on your ~/Downloads directory at some point you’ll see a new .ovpn file drop into it. This file is an OpenVPN config file. If you open it in an OpenVPN client (e.g. Tunnelblick on Macs, or natively on Linux) you should then be able to access the K8s cluster without requiring any SSH tunnelling, which is very cool.
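On Linux, for instance, you could connect from a terminal with something like this (the filename is whatever Sugarkube dropped into your ~/Downloads directory):

sudo openvpn --config ~/Downloads/<downloaded config>.ovpn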

Because we’ll initially be using SSH to set up port-forwarding, you’ll need to quickly set up your machine. This will probably be a one-time task, but it depends on how you decide to manage SSH keys within your team.

SSH port-forwarding prerequisites

First, we need to create an SSH keypair in AWS. You’ll need jq installed for the following command, or you can do it through the AWS console. Our demo is set up to run in eu-west-2 so it’d be easier to just create a key there. Export your AWS credentials and run:

export AWS_DEFAULT_REGION=eu-west-2
aws ec2 create-key-pair --key-name sugarkube-example | jq -r '.KeyMaterial' > ~/.ssh/sugarkube-example
chmod 0600 ~/.ssh/sugarkube-example

Next we need to generate a public key for it:

ssh-keygen -y -f ~/.ssh/sugarkube-example > ~/.ssh/sugarkube-example.pub

If you decide to call these keys anything other than sugarkube-example or to put them anywhere other than in ~/.ssh, you’ll need to modify the paths in providers/aws/kops.yaml. Edit the provisioner.ssh_private_key setting and the ssh_public_key setting under provisioner.params.create_cluster.
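As a rough sketch (check the actual file for the exact structure), the relevant settings look something like this, here using a hypothetical custom key called my-key:

provisioner:
  ssh_private_key: ~/.ssh/my-key
  params:
    create_cluster:
      ssh_public_key: ~/.ssh/my-key.pub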

The final change we need to make is to add an additional entry to /etc/hosts, so that the hostname used for the API server resolves to the local end of the SSH tunnel:

127.0.0.1 kubernetes.default.svc.cluster.local

OK that’s the one-time setup done. If you wanted to automate all of that as well you could create a kapp that performed those steps too. That could be quite nice – you’d end up with a different key per ephemeral cluster. Of course, you’d need to take care to manage keys securely for prod clusters though.

Launching a private Kops cluster

We’ve got everything set up so let’s launch a Kops cluster.

Kops doesn’t seem to pick up AWS credentials from a named profile. So make sure to export your AWS creds as environment variables, i.e.

export AWS_DEFAULT_REGION=eu-west-2
export AWS_ACCESS_KEY_ID=<your key>
export AWS_SECRET_ACCESS_KEY=<your secret>

Because S3 bucket names are globally unique and this is a tutorial you’ll need to explicitly name clusters to avoid collisions. In the following commands replace <UNIQUE_NAME> with a name of your choice for the demo cluster.

We’ll do the same as before and exclude Keycloak:

export AWS_DEFAULT_REGION=eu-west-2
sugarkube kapps install stacks/ops.yaml dev-ops workspaces/dev-ops -x security:keycloak --one-shot --run-actions -n <UNIQUE_NAME>

If Sugarkube tells you you’re missing any dependencies, make sure to install them before continuing. You shouldn’t be missing anything if you’re using our docker image.

You should see something like this:

Kops create

The command will take a while, so you can log into the AWS console and monitor its progress from there. Because creating the cluster is an action on the prelaunch:private-hosted-zone kapp, Sugarkube will tell you it’s waiting on that kapp while Kops creates the cluster.

Notice in the above output that Sugarkube prints something like this part way through:

Setting up SSH port forwarding via the bastion to the internal API server...
SSH port forwarding established. Use KUBECONFIG=/var/folders/wx/7y183scj2hs6s4tbp9wgjlc00000gn/T/kubeconfig-devops1-438797326

At this point it has set up SSH port-forwarding. You can confirm this by running ps aux | grep ssh – you should see the ssh command that set up the tunnel. Note that Sugarkube will probably set up port-forwarding before the Kubernetes API server is actually ready to serve requests.

To interact with the Kubernetes API server using kubectl you just need to export the path to this KUBECONFIG file:

KUBECONFIG=/var/folders/wx/7y183scj2hs6s4tbp9wgjlc00000gn/T/kubeconfig-devops1-438797326 kubectl get ns

If you get an error Unable to connect to the server: EOF, it means the API server isn’t ready yet. This is what Sugarkube polls to test for cluster readiness. Once Sugarkube tells you the server is online you should get responses from the above command. Sugarkube will then continue to poll until all pods in the kube-system namespace are ready. At that point the cluster is deemed ready and Sugarkube will continue installing the remaining selected kapps.
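If you want to watch that final stage yourself, you can list the pods in kube-system using the same KUBECONFIG file:

KUBECONFIG=/var/folders/wx/7y183scj2hs6s4tbp9wgjlc00000gn/T/kubeconfig-devops1-438797326 kubectl get pods -n kube-system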

Another thing to point out is that the ID of the AMI to use for Kops nodes is grabbed dynamically at runtime by the prelaunch:kops-image kapp. Generally it’s better to pin requirements to specific versions, but by default Kops will download updates, sometimes rebooting nodes to apply them. With only a single master that can make the API server inaccessible. So to pin to a specific Kops AMI version you’d need to disable this default behaviour in Kops. For simplicity, we just use a kapp to grab the latest AMI ID at runtime.

Connecting to the cluster

Once the cluster has been created you should be able to connect to it using OpenVPN and the .ovpn file in your ~/Downloads directory. Once the VPN has connected, you should be able to use kubectl to interact with the cluster as usual, and you should be able to access your sites, e.g. https://jenkins.devops1.k8s.sugarkube.io. You should also be able to install other kapps with Sugarkube as usual.

Now, please disconnect from the VPN if you’ve connected so we can go over another way to connect to the cluster. Just as Sugarkube set up SSH port-forwarding while creating the cluster, we can make it set up a connection again should we need it. We can use the cluster connect command, which will set up SSH port-forwarding like we saw above and print out the path to a KUBECONFIG file that can be used to access the cluster. If the path to that temporary KUBECONFIG file is exported as an environment variable, kubectl, helm and even sugarkube itself will use it. So one way of interacting with the API server in a private cluster is to run cluster connect in one shell, and in another export the path to the KUBECONFIG file before running commands as usual.
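For example, assuming cluster connect takes the same stack file, stack name and -n arguments as the other commands we’ve been running, you’d run this in one shell:

sugarkube cluster connect stacks/ops.yaml dev-ops -n <UNIQUE_NAME>

and then in a second shell export the path it prints before using kubectl as normal:

export KUBECONFIG=<path printed by cluster connect>
kubectl get ns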

Alternatively, we can make Sugarkube set up SSH port-forwarding for the duration of a command and tear it down again. The kapps install and kapps delete commands both accept a --connect flag which does just this.

Port forwarding will remain set up for as long as the cluster connect command stays running. If you also run kapps install or kapps delete and pass the --connect flag, port-forwarding may not work correctly as multiple SSH processes try to forward packets. If that happens, terminate both Sugarkube commands and make sure to kill ssh as well if ps aux | grep ssh shows it to be running.

Let’s show how we can install Keycloak into the cluster using the --connect flag. As mentioned above, make sure you’re not connected to the VPN.

sugarkube kapps install stacks/ops.yaml dev-ops workspaces/dev-ops -i 'security:*' --one-shot --parents --connect --run-actions -n <UNIQUE_NAME>

Kops install Keycloak

You could check it’s worked by connecting to the VPN and going to keycloak.devops1.k8s.sugarkube.io, or just by examining the keycloak namespace using kubectl.

Deleting the Kops cluster

Now we’ve seen how to create and use a private Kops cluster, let’s tear it all down again. Once again, make sure you’re not connected to the VPN because the AWS VPN endpoint will also be torn down. Just run:

sugarkube kapps delete stacks/ops.yaml dev-ops workspaces/dev-ops --one-shot --connect --run-actions -n <UNIQUE_NAME>
aws ec2 delete-key-pair --key-name sugarkube-example
rm ~/.ssh/sugarkube-example
rm ~/.ssh/sugarkube-example.pub

Kops delete

Notice how Sugarkube deletes both security:keycloak and ci-cd:jenkins in parallel because there are no dependencies between them. With a lot of applications, performing tasks in parallel like this can really speed up creating and deleting clusters.

Please verify that all AWS resources created by Sugarkube have in fact been deleted from your AWS account to avoid unexpected charges.

Summary

This tutorial has shown how Sugarkube can install and delete selected applications in a cluster. It then went on to show how Sugarkube can create a private Kops cluster on AWS. Next we covered different ways to connect to it, and finally tore it all down.

From here check out our other tutorials or learn more about Sugarkube’s concepts.