Sharing a filesystem between multiple Kubernetes workers

Using EFS

Posted by Jeroen Rijks on April 12, 2020 · 8 mins read

Timeout errors and asynchronous jobs

If a client sends a request to a server, and doesn’t receive a response for a long time, it will typically throw a timeout error. This marks the request as failed, and is done to stop users from waiting indefinitely. These timeout errors can happen for a variety of reasons, such as network infrastructure errors or bad error handling on the server side. Sometimes, this error is thrown when a perfectly valid request simply takes too long to process server-side.

To cut down on timeout errors, compute-intensive tasks can be handled asynchronously. I’ve seen this implemented in two ways, and in both cases it starts with a synchronous response to the request, effectively saying “We’ve received your request, it’s valid, and we’re dealing with it”.

Polling

The last time I saw asynchronous PDF generation, the requester would then poll the server every second, asking whether the PDF was ready yet. The server would return 2xx responses meaning “No, it’s not ready, but keep asking me” until the PDF had been generated; the next polling request would then receive the PDF as its response.

Websockets

The solution that’s being implemented now uses websockets. A websocket keeps a connection open between the client and the server, allowing the server to send messages without first receiving a request. When the PDF is generated, the server sends a “PDF is ready” message to the client over the websocket. The client then sends an HTTPS request for the PDF, and the server responds synchronously.

Polling vs Websockets

Polling increases the total amount of network traffic, but the implementation I saw ran between two microservices in the same Kubernetes cluster, so the extra requests never left the cluster and were cheap. Websockets require keeping a connection open between every client and the server, and I don’t know how well that scales. Maybe it’ll be worth another post in the future.

Our problem

To avoid timeout errors in Passenger Assist, Transreport have made PDF report generation asynchronous. This works great in local development, where one machine runs both Rails and Sidekiq, so the Rails application can read the PDFs that Sidekiq generates. In Kubernetes, however, Rails runs in different pods to Sidekiq, which prevents the API from accessing the generated PDF.

To solve this, a developer suggested having Sidekiq write the PDFs to S3 and Rails read them back. However, that would add latency, since every PDF would have to make a round trip to the S3 API. Given that Rails and Sidekiq are already running in the same Kubernetes cluster, this felt like a wasted opportunity. My first thought was to simply mount a directory from each EC2 instance straight into all of the Sidekiq and Rails pods running on it, but because the pods are spread across multiple workers, that would fail whenever the cooperating Rails and Sidekiq pods weren’t running on the same machine. I therefore decided to use AWS EFS, a shared filesystem that can be mounted into all of our worker nodes, and from there into all of our Sidekiq and Rails pods.

This AWS blog post offers a solution for managing EFS volumes in EKS clusters.

Solution overview

The solution creates a Kubernetes Persistent Volume Claim (PVC), which is available to other resources in the cluster (RBAC permitting). An efs-provisioner pod (running a Docker image hosted on quay.io) backs this PVC with the EFS volume, so that any Kubernetes pod with access to the PVC can mount the EFS volume. The solution uses an existing EFS volume, so we’ll start with that.
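For orientation, here is a rough sketch of the Kubernetes objects this approach relies on: a StorageClass whose provisioner field matches the name the efs-provisioner pod registers under, and a PVC that requests storage from that class. The names below are illustrative placeholders, not our actual chart values.

# Illustrative sketch - resource names are placeholders, not our chart's actual values
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: aws-efs
provisioner: example.com/aws-efs   # must match the name the efs-provisioner registers under
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: efs-claim
spec:
  storageClassName: aws-efs
  accessModes:
    - ReadWriteMany    # EFS lets many pods read and write the same files at once
  resources:
    requests:
      storage: 1Mi     # the requested size is largely symbolic for EFS

ReadWriteMany is the important part here: it’s what lets the Rails and Sidekiq pods mount the same files at the same time.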

Implementation - AWS side

When working with unfamiliar tech, I usually get it to run manually first, and then import it into Terraform afterwards.

First, I created a general purpose, bursting filesystem in the EFS console. After specifying the EFS type, I was prompted to create mount targets, which are used to grant access to the filesystem. Therefore, I created a mount target in each EKS subnet. After translating this to Terraform, it looks like this:

resource "aws_efs_file_system" "efs" {
  creation_token    = "<name>-${terraform.workspace}"
  performance_mode  = "generalPurpose"
  throughput_mode   = "bursting"
  encrypted         = "true"
}

resource "aws_efs_mount_target" "efs-mt" {
  count = length(local.private_subnets[terraform.workspace])  # Create one mount target for each subnet
  file_system_id  = aws_efs_file_system.efs.id
  subnet_id = element(local.private_subnets[terraform.workspace], count.index)
}

However, when I tried creating the efs-provisioner pod, it stalled at ContainerCreating, and its events revealed that the EFS volume was failing to mount into the container:

Warning FailedMount 1m kubelet, <ec2-instance-name> MountVolume.SetUp failed for volume "pv-volume" : mount failed: exit status 32

Luckily, I’m not the first person to hit this issue, and the internet told me that the mount targets need to allow inbound TCP connections on port 2049 (the NFS port) from the EKS nodes. After adding a security group for this in Terraform, my efs.tf file looked like this:

resource "aws_efs_file_system" "efs" {
  creation_token    = "<name>-${terraform.workspace}"
  performance_mode  = "generalPurpose"
  throughput_mode   = "bursting"
  encrypted         = "true"
}

resource "aws_efs_mount_target" "efs-mt" {
  count = length(local.private_subnets[terraform.workspace])  # Create one mount target for each subnet
  file_system_id  = aws_efs_file_system.efs.id
  subnet_id = element(local.private_subnets[terraform.workspace], count.index)
  security_groups = [aws_security_group.ingress_efs.id]
}

resource "aws_security_group" "ingress_efs" {
  name        = "<name>-${terraform.workspace}-sg"
  description = "Allow EKS nodes to mount EFS storage volumes - Managed by Terraform"
  vpc_id      = local.vpc_id[terraform.workspace]

  ingress {
    security_groups = [local.eks_security_group]
    from_port = 2049
    to_port = 2049
    protocol = "tcp"
  }

  egress {
    security_groups = [local.eks_security_group]
    from_port = 2049
    to_port = 2049
    protocol = "tcp"
  }
}

Implementation - Kubernetes Side

The AWS blog post summarises the Kubernetes implementation, which is pretty simple. To get separate EFS volumes for each environment, I defined the EFS ID in environment-specific values files, and passed these into the configmap:

# charts/pa-config/templates/efs_configmap.yaml
...
data:
  file.system.id: <file-system-id>
  aws.region: <aws-region>
  provisioner.name: <provisioner-name>
  dns.name: "<file-system-id>.efs.<aws-region>.amazonaws.com"
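Our actual values files aren’t shown here, but as a rough illustration (the key names under efs are hypothetical), an environment-specific values file might carry something like this, which the configmap template then interpolates:

# values/<environment>.yaml - hypothetical key names, for illustration only
efs:
  enabled: true
  fileSystemId: <file-system-id>
  awsRegion: <aws-region>
  provisionerName: <provisioner-name>

# templated data block, assuming the hypothetical keys above
data:
  file.system.id: {{ .Values.efs.fileSystemId }}
  aws.region: {{ .Values.efs.awsRegion }}
  provisioner.name: {{ .Values.efs.provisionerName }}
  dns.name: "{{ .Values.efs.fileSystemId }}.efs.{{ .Values.efs.awsRegion }}.amazonaws.com"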

These values are consumed by the efs_deployment.yaml file, which links the Terraform-managed EFS volume to the PVC defined in efs_claim.yaml.
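I won’t reproduce our chart’s efs_deployment.yaml here, but it follows the efs-provisioner example from the AWS blog post: the provisioner container reads the configmap values as environment variables, and mounts the EFS volume itself over NFS. A minimal sketch, in which the metadata, labels and configmap name are placeholders:

# Illustrative sketch of an efs-provisioner deployment; names are placeholders
kind: Deployment
apiVersion: apps/v1
metadata:
  name: efs-provisioner
spec:
  replicas: 1
  selector:
    matchLabels:
      app: efs-provisioner
  template:
    metadata:
      labels:
        app: efs-provisioner
    spec:
      containers:
        - name: efs-provisioner
          image: quay.io/external_storage/efs-provisioner:latest  # pin a real tag in practice
          env:
            - name: FILE_SYSTEM_ID
              valueFrom:
                configMapKeyRef:
                  name: efs-provisioner   # placeholder configmap name
                  key: file.system.id
            - name: AWS_REGION
              valueFrom:
                configMapKeyRef:
                  name: efs-provisioner
                  key: aws.region
            - name: PROVISIONER_NAME
              valueFrom:
                configMapKeyRef:
                  name: efs-provisioner
                  key: provisioner.name
          volumeMounts:
            - name: pv-volume
              mountPath: /persistentvolumes   # the provisioner creates per-PV directories here
      volumes:
        - name: pv-volume
          nfs:
            server: <file-system-id>.efs.<aws-region>.amazonaws.com
            path: /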

Finally, to share the volume between the application pods, both the Rails and Sidekiq pod specs reference the same PVC:

volumes:
- name: efs-pvc
  persistentVolumeClaim:
    claimName: "efs-pa-<environment>"

and the pod containers mount the volume too:

volumeMounts:
- name: efs-pvc
  mountPath: "/<mount-path>"

Conclusion

Now, using kubectl exec -it -n <namespace> <pod-name> <command>, we can test the solution. First, exec-ing into the PDF-creating service, I created a file under the mount path (echo file_contents >> /<mount-path>/test_file). Then, exec-ing into the PDF-sending service, ls /<mount-path> shows test_file, and cat /<mount-path>/test_file shows file_contents. Pushing the new Helm charts to our Helm repository means that future releases will work for any environment, so long as efs.enabled=true.


