In this blog, we will demonstrate how to deploy an EKS cluster across AWS regions/Local Zones with LoxiLB as an auto-scalable, in-cluster network load-balancer. We will provide the complete set of steps to bring up such a cluster, fully automated with Terraform. We will further elaborate on the various benefits of such a deployment from an end-user perspective and the compelling business value it can create.
Deployment Architecture and its Benefits
Let's take a quick look at the benefits of this use-case with LoxiLB:
Cut costs and gain flexibility with LoxiLB and auto-scaling node-groups
LoxiLB can reduce your costs by running in-cluster, as compared to ELB. ELB services operate independently of your Kubernetes cluster. In this use-case, LoxiLB is scheduled on an auto-scaling node-group which runs as part of the cluster. With fine-grained policies, the LB nodes can be scaled up or down in line with business needs.
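As an illustration (the cluster and node-group names below are placeholders, not the ones created by the Terraform scripts in this blog), the LoxiLB node-group could be resized on demand with eksctl:
$ eksctl scale nodegroup --cluster=<your-cluster> --name=<loxilb-nodegroup> \
    --nodes=3 --nodes-min=2 --nodes-max=4 --region=us-east-1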
Optimized for multi-homed networking
For clusters using Multus (secondary network interfaces), LoxiLB becomes even more effective, as it manages traffic routing across multiple networks within the cluster. Many workloads, such as telco apps or KubeVirt-based apps, need multi-networked pods.
Highly Performant
LoxiLB is already highly performant thanks to its efficient eBPF implementation. Here, it provides further optimization by utilizing the EKS VPC CNI's feature which makes pod IPs directly reachable inside a VPC. Hence, we are able to streamline traffic entering the EKS cluster by bypassing unnecessary Kubernetes networking layers.
Not all Local Zones have managed ELB support
AWS Local Zones are a type of AWS infrastructure deployment that places compute, storage, database, and other select AWS services closer to large population centers, industries, and IT hubs. The primary goal of AWS Local Zones is to provide ultra-low latency access to applications and services, improving performance for specific use cases such as real-time gaming, video streaming, augmented/virtual reality (AR/VR), and machine learning at the edge. However, not all Local Zones offer ELB services such as NLB. For such Local Zones, this approach provides much relief to users who need load-balancing for their workloads.
Full integration with Route53
This use-case is based on an active-active HA model. The services created in Kubernetes can get directly updated in Route53 records. Since an instance's elastic IP lives outside EKS, there has been no straightforward way to integrate it into EKS. We have done extensive integration/automation with LoxiLB, external-dns and Route53 to achieve this.
Last but not least, one also gets an on-prem style LB in their EKS deployments. The overall deployment topology will be similar to the following figure:
Prerequisites before starting
Make sure you have the latest versions of the awscli, eksctl, kubectl and terraform tools configured on the host. The host should also have sufficient IAM privileges to perform cluster operations, among others.
Create an EKS cluster
We will create an EKS cluster in the main AWS regional zone, with 3 worker node-groups. One will be created in the AWS main region (with one node). The other two node-groups (with two nodes each) will be used to run LoxiLB and workload pods respectively. This has been completely automated with Terraform. The Terraform scripts set up the cluster as well as the IAM roles/K8s service accounts necessary for cluster access from inside the cluster using an OIDC based scheme. Terraform variables are set to create a cluster in the "us-east-1" region. Please feel free to check the GitHub repo and change them as per your need.
$ git clone https://github.com/loxilb-io/demo-examples
$ cd demo-examples/terraform/eks-inclb
$ terraform init
$ terraform apply
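Once Terraform finishes, point kubectl at the new cluster (the cluster name below is a placeholder; use the name set in the Terraform variables):
$ aws eks update-kubeconfig --region us-east-1 --name <your-cluster-name>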
Span the cluster across LocalZone (Optional)
This cluster can also span an AWS Local Zone. If you want to set up LoxiLB in a local zone, the local AZ in your region needs to be enabled (e.g. "us-east-1-atl-2a" is a local zone in the "us-east-1" region). The terraform script in this blog does not create a NodeGroup in the local zone. However, one can follow other examples such as the one found here.
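Local zones are opted out by default. For example, the zones available in your region can be listed and a zone group opted in from the CLI (verify the exact zone-group name for your region first):
$ aws ec2 describe-availability-zones --region us-east-1 --all-availability-zones \
    --filters Name=zone-type,Values=local-zone
$ aws ec2 modify-availability-zone-group --region us-east-1 \
    --group-name us-east-1-atl-2 --opt-in-status opted-in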
Check the EKS cluster status
Let's get back to our original cluster we created. At this point, we can check the status of the cluster and its nodes :
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
ip-192-168-68-85.ec2.internal Ready <none> 84m v1.31.0-eks-a737599
ip-192-168-73-205.ec2.internal Ready <none> 71m v1.31.0-eks-a737599
ip-192-168-81-126.ec2.internal Ready <none> 71m v1.31.0-eks-a737599
ip-192-168-85-237.ec2.internal Ready <none> 125m v1.31.0-eks-a737599
ip-192-168-90-199.ec2.internal Ready <none> 84m v1.31.0-eks-a737599
Create LoxiLB CRD
$ kubectl apply -f https://raw.githubusercontent.com/loxilb-io/kube-loxilb/refs/heads/main/manifest/crds/loxilb-url-crd.yaml
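Once applied, the CRD registration can be quickly verified:
$ kubectl get crd | grep -i loxi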
Deploy LoxiLB in-cluster
$ kubectl apply -f yaml/loxilb.yaml
daemonset.apps/loxilb-lb created
service/loxilb-lb-service created
This is pretty straightforward apart from the fact that it uses an InitContainer to get instance metadata to populate K8s CRDs.
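For illustration only (the actual logic lives in yaml/loxilb.yaml and may differ), an InitContainer can obtain this node-level information from the EC2 instance metadata service (IMDSv2) with calls like:
$ TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
      -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
$ curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
      http://169.254.169.254/latest/meta-data/public-ipv4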
Deploy kube-loxilb component (LoxiLB's operator)
$ kubectl apply -f yaml/kube-loxilb.yaml
serviceaccount/kube-loxilb created
clusterrole.rbac.authorization.k8s.io/kube-loxilb created
clusterrolebinding.rbac.authorization.k8s.io/kube-loxilb created
deployment.apps/kube-loxilb created
Check LoxiLB CRD-driven node public-IP registration in EKS
By now, LoxiLB would have updated its nodes' public IPs to Kubernetes via loxi CRDs, as can be verified below:
$ kubectl describe loxiurl | grep "Loxi URL"
Loxi URL: 54.234.13.xxx
Loxi URL: 34.229.17.xxx
External-DNS/Route53 Setup
The following steps need to be followed to make sure external-dns is able to communicate with Route53.
Setup IAM Permissions
Create a policy (route53_policy.json) to set up IAM permissions that will allow ExternalDNS to update Route53 DNS records:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "route53:ChangeResourceRecordSets"
      ],
      "Resource": [
        "arn:aws:route53:::hostedzone/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "route53:ListHostedZones",
        "route53:ListResourceRecordSets",
        "route53:ListTagsForResource"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}
Use AWS CLI to create the policy with the following command:
$ aws iam create-policy --policy-name "AllowExternalDNSUpdates" --policy-document file://route53_policy.json
{
    "Policy": {
        "PolicyName": "AllowExternalDNSUpdates",
        "PolicyId": "ANPA4CF3XA2FPM25QT3TB",
        "Arn": "arn:aws:iam::829322364554:policy/AllowExternalDNSUpdates",
        "Path": "/",
        "DefaultVersionId": "v1",
        "AttachmentCount": 0,
        "PermissionsBoundaryUsageCount": 0,
        "IsAttachable": true,
        "CreateDate": "2024-10-17T07:14:35+00:00",
        "UpdateDate": "2024-10-17T07:14:35+00:00"
    }
}
$ export POLICY_ARN=$(aws iam list-policies \
--query 'Policies[?PolicyName==`AllowExternalDNSUpdates`].Arn' --output text)
Create an IAM role bound to service account
$ eksctl create iamserviceaccount \
--cluster demo \
--region us-east-1 \
--name "external-dns" \
--namespace "default" \
--attach-policy-arn $POLICY_ARN \
--approve
Before deploying ExternalDNS, we need to check if RBAC is enabled in your cluster with the following command:
$ kubectl api-versions | grep rbac.authorization.k8s.io
rbac.authorization.k8s.io/v1
If RBAC is turned on, get the eks role-arn:
$ kubectl describe sa external-dns
Name: external-dns
Namespace: default
Labels: app.kubernetes.io/managed-by=eksctl
Annotations: eks.amazonaws.com/role-arn: arn:aws:iam::829322364554:role/eksctl-eks-loxilb-lz-cluster-addon-iamserviceaccou-Role1-hHpB9SHbHTUu
Image pull secrets: <none>
Mountable secrets: <none>
Tokens: <none>
Events: <none>
Then, use the manifest file in yaml/external-dns-with-rbac.yaml after replacing the role-arn to deploy ExternalDNS.
If RBAC is not enabled:
Then, we need to use the manifest file yaml/external-dns-with-no-rbac.yaml
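For reference, in the RBAC variant the ServiceAccount section of the manifest carries the role-arn annotation that needs to be replaced (the values below are placeholders):
apiVersion: v1
kind: ServiceAccount
metadata:
  name: external-dns
  annotations:
    # Replace with the role-arn shown by 'kubectl describe sa external-dns'
    eks.amazonaws.com/role-arn: arn:aws:iam::<account-id>:role/<external-dns-role>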
Create the externalDNS deployment
$ kubectl apply -f yaml/external-dns-xxx.yaml
Verify the externalDNS deployment
$ kubectl get deployments
NAME READY UP-TO-DATE AVAILABLE AGE
external-dns 1/1 1 1 13h
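To confirm that ExternalDNS is able to talk to Route53, its logs can be checked as well:
$ kubectl logs deployment/external-dns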
Test !!!
We use a test nginx pod and service with the following yaml:
apiVersion: v1
kind: Service
metadata:
  name: nginx-lb1
  annotations:
    external-dns.alpha.kubernetes.io/hostname: www.multi-xxx-domain.com
    loxilb.io/usepodnetwork: "yes"
spec:
  externalTrafficPolicy: Local
  loadBalancerClass: loxilb.io/loxilb
  selector:
    what: nginx-test
  ports:
    - port: 80
      targetPort: 80
  type: LoadBalancer
---
apiVersion: v1
kind: Pod
metadata:
  name: nginx-test
  labels:
    what: nginx-test
spec:
  nodeSelector:
    node: wlznode02
  containers:
    - name: nginx-test
      image: nginx
      imagePullPolicy: Always
      ports:
        - containerPort: 80
We need to note a couple of annotations here:
external-dns.alpha.kubernetes.io/hostname: This annotation is used by external-dns to pull details of the service, but only when it is in Ready state.
loxilb.io/usepodnetwork: This annotation signals LoxiLB to reach the pod directly using its pod IP, as discussed before.
And apply it :
$ kubectl apply -f yaml/nginx.yaml
service/nginx-lb1 created
pod/nginx-test created
Kindly note that one would need to edit the cluster security groups to allow the traffic (this is not handled by Terraform). From the AWS console: EKS -> Clusters -> <Name> -> Networking -> ClusterSecurityGroup.
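The same can also be done with the AWS CLI (the security group ID below is a placeholder; the port should match the service being exposed):
$ aws ec2 authorize-security-group-ingress --group-id <cluster-security-group-id> \
    --protocol tcp --port 80 --cidr 0.0.0.0/0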
Let's check the created K8s services:
$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.100.0.1 <none> 443/TCP 49m
nginx-lb1 LoadBalancer 10.100.144.82 34.229.17.XX,52.90.160.XX 80:31800/TCP 28m
So, we are able to list the public IPs of all the nodes that have LoxiLB scheduled. We can reach the service via each of these external IPs, or use the domain name, which is set to auto-failover for an active-active HA setup:
Test Access with Domain-Name
$ curl http://www.multi-xxx-domain.com
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
Test Access with PublicIP
$ curl http://34.229.17.XX
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
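Optionally, we can also verify that external-dns created the corresponding Route53 record for the domain (the hosted zone ID below is a placeholder):
$ aws route53 list-resource-record-sets --hosted-zone-id <hosted-zone-id> \
    --query "ResourceRecordSets[?Name=='www.multi-xxx-domain.com.']"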
Performance
For testing the performance, we used the same setup as above. Additionally, we launched an EC2 host in the same zone/subnet as the loxilb and worker nodes and ran a series of tests. We measured the performance of LoxiLB vs NodePort exposed by EKS/Kubernetes. NodePort (although not a production option) was chosen as a baseline because it is available inside the Kubernetes cluster via kube-proxy and is supposed to give the best numbers possible for comparison.
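The exact benchmarking tools and parameters are not covered here, but a comparable requests-per-second and latency measurement could be taken from the EC2 test host with a standard HTTP load generator, for example:
$ wrk -t4 -c100 -d30s --latency http://<loxilb-node-ip>/
$ wrk -t4 -c100 -d30s --latency http://<worker-node-ip>:<nodeport>/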
All EKS worker nodes used the m5a.xlarge instance type for these tests.
As seen from the above charts, LoxiLB-based workloads performed better than or on par with NodePort in almost all the tests. LoxiLB performed exceptionally well in requests per second as well as overall request latency, which is crucial for various applications.
Conclusion
In this blog, we learned how LoxiLB deployed within an auto-scaled node group in AWS region/Local Zones, integrated with Route 53, offers a robust and scalable solution for low-latency, high-performance applications. This kind of setup ensures seamless traffic distribution and dynamic scaling, allowing your infrastructure to efficiently handle fluctuating workloads. The integration with Route 53 enables intelligent routing and global DNS management, further enhancing application availability and performance.
Additionally, by leveraging AWS Local Zones for proximity to end-users and LoxiLB’s efficient load balancing capabilities, this architecture delivers improved responsiveness, cost optimization through autoscaling, and a highly resilient infrastructure for modern, demanding applications.
Credits
Special thanks to Saravanan Shanmugan for his collaboration, unwavering support, and invaluable feedback on this post. As a Hybrid Cloud and Networking Expert leading innovative solutions at Amazon Web Services, his insights and expertise made this possible.