Troubleshooting AWS IAM Authenticator

We recently added the AWS IAM Authenticator to our custom-configured (non-EKS) Kubernetes clusters running in AWS. For an automated installation, the process involves pre-generating some config and certs, updating a line in the API Server manifest, and installing a daemonset.

In this blog I’ll detail how we set things up iteratively and provide some useful commands to help confirm each component works. These same commands can be used when troubleshooting issues later on.

Our motivation for installing the AWS IAM Authenticator was to open up kubectl access on clusters to different groups with more granular permissions. We already manage IAM users for everyone that requires access and decided to use this same identity for Kubernetes cluster access.

Installation

The documentation is really good, so I'd recommend reading through it and completing those steps manually first. Then use the information below to compare against when automating.

The first piece we automated was the pre-generation of the certs and kubeconfig. We do this to avoid needing to restart the API Server after the daemonset is installed. In Ansible our configuration looks like this.
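A minimal sketch of what those tasks might look like (task names, variables such as environment_name and the generated file names are assumptions based on the description that follows, not our exact playbook):

```yaml
# Pre-generate the aws-iam-authenticator certs and webhook kubeconfig on each master.
- name: Create aws-iam-authenticator directories
  file:
    path: "{{ item }}"
    state: directory
    mode: "0755"
  loop:
    - /var/aws-iam-authenticator
    - /etc/kubernetes/aws-iam-authenticator

- name: Initialise aws-iam-authenticator
  command: aws-iam-authenticator init -i {{ environment_name }}.{{ region }}.{{ environment_type }}
  args:
    chdir: /var/aws-iam-authenticator
    # only run once - skip if the cert has already been generated
    creates: /var/aws-iam-authenticator/cert.pem

- name: Copy the pre-generated webhook kubeconfig for the API Server
  copy:
    remote_src: yes
    # the file name written by init may differ slightly between versions
    src: /var/aws-iam-authenticator/aws-iam-authenticator.kubeconfig
    dest: /etc/kubernetes/aws-iam-authenticator/kubeconfig.yaml
```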

We set the cluster-id to environment_name.region.environment_type which maps to how we configure DNS for our clusters. The args: creates: on the initialise task means this only runs once.

Now we can run this and make sure the generated files are in /var/aws-iam-authenticator.

Also cat the kubeconfig file copied to /etc/kubernetes/aws-iam-authenticator/kubeconfig.yaml to make sure it has some yaml inside. You shouldn’t need to change this kubeconfig file after generation.
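A couple of quick checks on the master (exact file names may vary slightly between authenticator versions):

```bash
# init should have left a cert, key and pre-generated kubeconfig in the state dir
ls -l /var/aws-iam-authenticator

# the copy the API Server will use for webhook authentication should contain yaml
cat /etc/kubernetes/aws-iam-authenticator/kubeconfig.yaml
```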

Now it’s time to add some configuration to the Kubernetes API Server manifest. The API Server is configured to use the pre-generated kubeconfig for webhook token authentication.

Here’s what the manifest changes look like with Kubernetes 1.11.8.
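A sketch of the relevant parts of the kube-apiserver static pod manifest (only the additions are shown; keep your existing image, flags and mounts as they are):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - name: kube-apiserver
    command:
    - kube-apiserver
    # ... your existing flags ...
    - --authentication-token-webhook-config-file=/etc/kubernetes/aws-iam-authenticator/kubeconfig.yaml
    volumeMounts:
    # ... your existing mounts ...
    - name: aws-iam-authenticator
      mountPath: /etc/kubernetes/aws-iam-authenticator
      readOnly: true
  volumes:
  # ... your existing volumes ...
  - name: aws-iam-authenticator
    hostPath:
      path: /etc/kubernetes/aws-iam-authenticator
```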

According to the documentation the only additional settings to apply to your current config are --authentication-token-webhook-config-file=/etc/kubernetes/aws-iam-authenticator/kubeconfig.yaml and to mount /etc/kubernetes/aws-iam-authenticator into the API Server container.

As part of this work I also added audit-policy configuration. For now we’re just using the default policy given as an example in the Kubernetes docs.
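That example policy is just a catch-all that logs request metadata, along these lines (the audit.k8s.io API version depends on your Kubernetes version; v1beta1 was current for 1.11):

```yaml
apiVersion: audit.k8s.io/v1beta1
kind: Policy
rules:
# log request metadata (user, verb, resource, timestamp) for every request
- level: Metadata
```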

Changes made to /etc/kubernetes/manifests/kube-apiserver.yaml are automatically picked up by Kubelet and the API Server is restarted. You can troubleshoot any problems with the API Server configuration using journalctl -u kubelet -n --no-pager.

Next we install a daemonset that runs an AWS IAM Authenticator pod on each master node.

For your first attempt I’d recommend keeping this extremely simple so you can test and debug issues. An example daemonset and configmap are shown below.
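A cut-down sketch based on the example deploy in the aws-iam-authenticator repo (the image, labels, node selector and the placeholder ARN are all things to adjust for your own cluster):

```yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-iam-authenticator
  namespace: kube-system
  labels:
    k8s-app: aws-iam-authenticator
data:
  config.yaml: |
    # must match the cluster-id used when pre-generating the config
    clusterID: environment_name.region.environment_type
    server:
      mapUsers:
      # placeholder ARN - map your own IAM user to system:masters for the first test
      - userARN: arn:aws:iam::123456789012:user/your.user
        username: your.user
        groups:
        - system:masters
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: aws-iam-authenticator
  namespace: kube-system
  labels:
    k8s-app: aws-iam-authenticator
spec:
  selector:
    matchLabels:
      k8s-app: aws-iam-authenticator
  template:
    metadata:
      labels:
        k8s-app: aws-iam-authenticator
    spec:
      # run on the host network and only on the masters
      hostNetwork: true
      nodeSelector:
        node-role.kubernetes.io/master: ""
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: aws-iam-authenticator
        image: aws-iam-authenticator:latest   # placeholder - pin to the image/tag you actually use
        args:
        - server
        - --config=/etc/aws-iam-authenticator/config.yaml
        - --state-dir=/var/aws-iam-authenticator
        # use the pre-generated certs/kubeconfig rather than generating new ones
        - --kubeconfig-pregenerated=true
        volumeMounts:
        - name: config
          mountPath: /etc/aws-iam-authenticator/
        - name: state
          mountPath: /var/aws-iam-authenticator/
      volumes:
      - name: config
        configMap:
          name: aws-iam-authenticator
      - name: state
        hostPath:
          path: /var/aws-iam-authenticator/
```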

Some important things to verify: your clusterID must match the string you used to pre-generate the config, and the server args on the container must include --kubeconfig-pregenerated=true, otherwise the daemonset will attempt to generate new config on every restart.

The configmap shows I’m simply mapping my [email protected] IAM user account in AWS to the built-in system:masters Kubernetes group. Change this to whatever IAM user you want to test with.

This will let you test the end to end process using your own account. In future iterations you can remove this config and use another option like mapping AWS IAM roles to specific Kubernetes groups.

That should be all of the server-side configuration done. The last step is a kubeconfig file to use locally. Here’s an example.
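A minimal sketch (server address, cluster and user names are placeholders; the important detail is that the token command at the end is passed the same clusterID):

```yaml
apiVersion: v1
kind: Config
clusters:
- name: my-cluster
  cluster:
    server: https://api.my-cluster.example.com   # placeholder API Server address
    certificate-authority: ca.pem                # sits next to this file
contexts:
- name: my-cluster
  context:
    cluster: my-cluster
    user: your.user        # must match the username mapped in the configmap
current-context: my-cluster
users:
- name: your.user
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1alpha1
      command: aws-iam-authenticator
      args:
      - token
      - -i
      # must match the clusterID used when pre-generating the config
      - environment_name.region.environment_type
```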

For this config to work you need the ca.pem for the cluster in the same directory as the kubeconfig file. You’ll also need your IP address whitelisted in the API Server security group. Also, make sure your user matches what’s in the configmap mapping. Finally, the last line of the config needs to match the clusterID you specified when pre-generating the config.

I’ve automated the creation of kubeconfig files with a simple script that whitelists our IP, copies down the ca.pem from the cluster, generates the kubeconfig file and then prints an alias. We just change into the environment directory in our Terraform repo, run the script, and it does everything.

Copying and pasting that alias then sets all k commands to use the kubeconfig file. I find this quite handy: a new shell session can be configured for kubectl access with a single command run in the relevant environment directory.
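A heavily simplified sketch of that idea (the security group ID, host name, CA path and the generate-kubeconfig.sh helper are all assumptions for illustration, not our actual script):

```bash
#!/usr/bin/env bash
set -euo pipefail

# All of these values are assumptions - ours come from the Terraform
# environment directory the script is run from.
CLUSTER_ID="environment_name.region.environment_type"
API_HOST="api.my-cluster.example.com"
API_SG_ID="sg-0123456789abcdef0"

# whitelist the current public IP on the API Server security group
MY_IP="$(curl -s https://checkip.amazonaws.com)"
aws ec2 authorize-security-group-ingress \
  --group-id "${API_SG_ID}" --protocol tcp --port 443 --cidr "${MY_IP}/32"

# copy down the cluster CA next to the kubeconfig we're about to write
scp "admin@${API_HOST}:/etc/kubernetes/ssl/ca.pem" ./ca.pem

# generate-kubeconfig.sh is a hypothetical helper that writes a kubeconfig
# like the example above, pointing at API_HOST and using CLUSTER_ID
./generate-kubeconfig.sh "${CLUSTER_ID}" "${API_HOST}" > ./kubeconfig

# print an alias to paste into the current shell
echo "alias k='kubectl --kubeconfig=$(pwd)/kubeconfig'"
```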

When all goes well you can run kubectl commands against the cluster.
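For example, using the alias from earlier as a simple smoke test:

```bash
k get nodes
```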

And if you check the aws-iam-authenticator logs on the master you’ll see the token being verified and your IAM identity mapped to the configured username and groups.

Now you’re done, and you can focus on changing the settings in the configmap to iterate on how users are mapped to groups.

Troubleshooting

What steps should you take when it all goes wrong and you can’t work out why? Here’s a quick summary of how I’d systematically work through from client to daemonset to see where the problem lies.

  1. Turn on kubectl debug logs

I wasted a couple of days trying to work out why nothing was showing in my aws-iam-authenticator pod logs when I ran commands. The reason was that I’d forgotten to put user: into my kubeconfig context. This meant kubectl wasn’t sending requests with a bearer token. Adding --v=10 to my kubectl alias immediately showed my requests weren’t authenticating. I wish I’d done this sooner.

  2. Try to generate a token manually
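Assuming the aws-iam-authenticator binary is installed locally, generate a token with the same cluster ID (shown here as a placeholder):

```bash
aws-iam-authenticator token -i environment_name.region.environment_type
```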

This should print out a token. If this doesn’t work then you have a problem with your AWS IAM account permissions.

Now grab that k8s-aws-v1.somereallylongstring token from the output of the last command and try to use it on the master directly against the authenticator.
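A sketch of that check using curl; the endpoint to hit is the server address in the pre-generated kubeconfig on the master, which defaults to something like https://127.0.0.1:21362/authenticate (the cert is self-signed, hence -k):

```bash
# run on the master - the body is a standard TokenReview, the token is the
# k8s-aws-v1... string from the previous step
curl -k -X POST https://127.0.0.1:21362/authenticate \
  -H "Content-Type: application/json" \
  -d '{
        "apiVersion": "authentication.k8s.io/v1beta1",
        "kind": "TokenReview",
        "spec": { "token": "k8s-aws-v1.somereallylongstring" }
      }'
```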

If successful you’ll get some output like this.
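Roughly this shape, a TokenReview with the status filled in (the username, uid and groups below are placeholders):

```json
{
  "apiVersion": "authentication.k8s.io/v1beta1",
  "kind": "TokenReview",
  "status": {
    "authenticated": true,
    "user": {
      "username": "your.user",
      "uid": "aws-iam-authenticator:123456789012:AIDAEXAMPLEUSERID",
      "groups": ["system:masters"]
    }
  }
}
```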

If this works you know it’s not a token issue; if it doesn’t, you should get some kind of meaningful error to debug.

  3. Try a curl via the API Server directly

You’ll need to use a token again like you did with the previous command but this time specified in the header. If this works you’ll get some nice json output showing you were authenticated and some server version details.
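A sketch, with the API Server address as a placeholder and the token pasted in from earlier:

```bash
TOKEN="k8s-aws-v1.somereallylongstring"   # paste in a freshly generated token
curl --cacert ca.pem \
  -H "Authorization: Bearer ${TOKEN}" \
  https://api.my-cluster.example.com/version
```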

If this doesn’t work and you get an authentication error check to make sure you’re using the correct keys for the AWS account. It’s quite easy to generate tokens using the wrong AWS keys and then wonder why the API Server 401’s.

  4. Check the AWS IAM Authenticator daemonset container logs

The easiest way to do this is to SSH into each master and run the docker logs command to pull back the logs for the aws-iam-authenticator container.
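For example (assuming a Docker container runtime and that the container name includes aws-iam-authenticator):

```bash
# find the authenticator container and tail its recent logs
docker ps --filter name=aws-iam-authenticator --format '{{.ID}} {{.Names}}'
docker logs --tail 50 <container-id>
```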

In our logs we could see an unsuccessful login attempt by [email protected] (that user wasn’t mapped) followed by a successful login.

  5. Updating the configmap isn’t dynamic

Unfortunately the AWS IAM Authenticator binary running in the daemonset container doesn’t automatically pick up configmap changes. You’ll need to orchestrate a container restart when the configmap changes.
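One way to do that, assuming the k8s-app label from the daemonset sketch above, is to delete the pods and let the daemonset recreate them:

```bash
kubectl -n kube-system delete pod -l k8s-app=aws-iam-authenticator
```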

  6. Troubleshoot RBAC permissions

There’s a cool kubectl subcommand called auth can-i that you can use to verify permissions for your user.
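For example (the user and group names here are placeholders; --as and --as-group require impersonation permissions):

```bash
# can the current kubeconfig user list pods in kube-system?
kubectl auth can-i list pods --namespace kube-system

# check what another mapped user or group would be allowed to do
kubectl auth can-i create deployments --as your.user --as-group ops-team
```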

  7. Enable the API Server audit log

As you can see from the API Server configuration posted above I chose to also enable the audit log. It’s useful to check /var/log/apiserver/audit.log when very weird things are happening. I’ve not yet looked into tuning the policy to highlight errors more visibly but it’s on the backlog as a task.

  8. Ensure the correct kubectl and Kubernetes versions

You need to be running version 1.10.x or higher for both of these. Run kubectl version to check.

Wrapping up

I’ve not gone into a lot of detail surrounding the advanced mapping of users. At work we’ve moved to using a single AWS account solely for IAM identity purposes. Users in that account then assume roles into other accounts for different environment types. We’re still in the process of rolling this out fully, so I’ve not spent a massive amount of time working with AWS role mappings to custom Kubernetes groups.

Our goal is to granularly define what groups of users can access which types of clusters with well defined RBAC permissions and lock certain groups down to only certain namespaces. If there’s interest I’ll do a follow up blog when this work is complete.

Hopefully this is useful for anyone using the AWS IAM Authenticator in their own custom Kubernetes clusters in AWS. As always post any corrections or questions below and I’ll try to answer.
