If you completed part 1 of this guide and created the Managed Airflow environment on AWS, you are ready for the next level:
Part 2 - Enabling EKS scheduling
In Airflow we often use the KubernetesPodOperator class to run a task in its own pod: Airflow contacts the Kubernetes API server and asks it to spawn a pod that performs the task. In this part of the tutorial you will connect MWAA to an EKS cluster in the same VPC and schedule an example pod on it.
Prerequisites:
You need an EKS cluster. If you are familiar with Terraform, you can create one in one click using my EKS module.
IAM Considerations:
We will add one more permission, eks:DescribeCluster on the target cluster, to the MWAA execution role from part 1.
locals {
  region           = "xx-xxxx-x"
  account_id       = "xxxxxxxxxx"
  eks_cluster_name = "xxxxxxxxxx"
}

resource "aws_iam_policy" "amazon_mwaa_eks_scheduling_policy" {
  name = "amazon_mwaa_eks_scheduling_policy"
  path = "/"

  policy = <<POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "eks:DescribeCluster"
      ],
      "Resource": "arn:aws:eks:${local.region}:${local.account_id}:cluster/${local.eks_cluster_name}"
    }
  ]
}
POLICY
}
Add aws_iam_policy.amazon_mwaa_eks_scheduling_policy.arn to the managed_policy_arns list in the execution role:
resource "aws_iam_role" "mwaa_role" {
  ...
  managed_policy_arns = [
    aws_iam_policy.amazon_mwaa_policy.arn,
    # Add the new policy:
    aws_iam_policy.amazon_mwaa_eks_scheduling_policy.arn,
  ]
}
The complete Terraform source can be found here.
EKS Considerations:
MWAA runs outside the cluster. For managed Airflow to reach the Kubernetes API server, its execution role needs an entry in the aws-auth ConfigMap in the kube-system namespace. We are basically registering the MWAA execution role as a Kubernetes user:
userarn = "arn:aws:iam::<account_id>:role/airflow-mwaa-role"
username = "mwaa-service"
groups = ["system:masters"]
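For readers who edit the aws-auth ConfigMap by hand instead of going through a Terraform module, role mappings live under the mapRoles key and use the rolearn field. The Python sketch below renders such an entry from the same three values as the snippet above. The function name render_map_roles_entry is hypothetical, and the account ID is left as the original placeholder.

```python
def render_map_roles_entry(role_arn, username, groups):
    """Render one aws-auth mapRoles list item as YAML text."""
    lines = [
        f"- rolearn: {role_arn}",
        f"  username: {username}",
        "  groups:",
    ]
    # One list item per Kubernetes group the role is mapped to.
    lines += [f"    - {group}" for group in groups]
    return "\n".join(lines)


entry = render_map_roles_entry(
    "arn:aws:iam::<account_id>:role/airflow-mwaa-role",  # placeholder from the snippet above
    "mwaa-service",
    ["system:masters"],
)
print(entry)
```

The printed text would go under mapRoles in the ConfigMap's data section; how it gets there (kubectl edit, eksctl, or a Terraform module) is left open here.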
You will also need to kubectl apply the following RBAC Role and RoleBinding:
kind: Role
apiVersion: rbac.authorization.k8s.io/v1  # v1beta1 was removed in Kubernetes 1.22
metadata:
  name: mwaa-role
  namespace: default
rules:
  - apiGroups:
      - ""
      - "apps"
      - "batch"
      - "extensions"
    resources:
      - "jobs"
      - "pods"
      - "pods/attach"
      - "pods/exec"
      - "pods/log"
      - "pods/portforward"
      - "secrets"
      - "services"
    verbs:
      - "create"
      - "delete"
      - "describe"
      - "get"
      - "list"
      - "patch"
      - "update"
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: mwaa-role-binding
  namespace: default
subjects:
  - kind: User
    name: mwaa-service
roleRef:
  kind: Role
  name: mwaa-role
  apiGroup: rbac.authorization.k8s.io
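A quick way to sanity-check the verbs in a Role is to compare them against the standard Kubernetes API request verbs. The Python sketch below does that for the verb list used in the Role above. The names STANDARD_VERBS and nonstandard_verbs are hypothetical, and the set covers only the common resource verbs, not special ones such as impersonate or bind.

```python
# Common Kubernetes API request verbs that RBAC rules match against.
STANDARD_VERBS = {
    "get", "list", "watch", "create",
    "update", "patch", "delete", "deletecollection",
}

# Verbs copied from the mwaa-role Role above.
ROLE_VERBS = ["create", "delete", "describe", "get", "list", "patch", "update"]


def nonstandard_verbs(verbs):
    """Return the verbs that are not standard API request verbs, sorted."""
    return sorted(set(verbs) - STANDARD_VERBS)


print(nonstandard_verbs(ROLE_VERBS))  # ['describe']
```

The check flags "describe": kubectl describe is a client-side command built on get and list requests, so that entry most likely has no effect and appears to be harmless.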
Final Step - Kubeconfig:
Now the last task is to give MWAA a kubeconfig to use. Generate one with:
aws eks update-kubeconfig \
  --region your-region \
  --kubeconfig ./kube_config.yaml \
  --name mwaa-eks \
  --alias aws
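Then upload the generated file to the source bucket, under the same prefix as the DAGs. Note the trailing slash, which makes the CLI place the file inside dags/ instead of creating an object named dags:
aws s3 cp kube_config.yaml s3://my-mwaa-source/mwaa_source_example/dags/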
You will pass this config file path to every DAG that uses KubernetesPodOperator. View the example.
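As a rough illustration of how the kubeconfig path ends up in a DAG, the Python sketch below collects the KubernetesPodOperator arguments that matter for this setup. It assumes MWAA syncs the dags/ prefix of the source bucket to /usr/local/airflow/dags on its workers, and that the context name is aws, matching the --alias flag used when the kubeconfig was generated. The helper name, task id, pod name, and image are hypothetical.

```python
# Assumption: MWAA syncs the S3 dags/ prefix to this folder on its workers.
DAGS_FOLDER = "/usr/local/airflow/dags"


def pod_operator_kwargs(dags_folder=DAGS_FOLDER):
    """Keyword arguments for a KubernetesPodOperator that targets the EKS cluster."""
    return {
        "task_id": "pod-task",                # hypothetical task id
        "name": "mwaa-pod-test",              # hypothetical pod name
        "namespace": "default",               # namespace covered by the Role
        "image": "ubuntu:18.04",              # hypothetical image
        "cmds": ["bash", "-cx"],
        "arguments": ["echo hello from EKS"],
        "in_cluster": False,                  # MWAA runs outside the cluster
        "cluster_context": "aws",             # matches --alias aws
        "config_file": f"{dags_folder}/kube_config.yaml",
        "get_logs": True,                     # stream pod logs into the task log
    }


print(pod_operator_kwargs()["config_file"])
```

Inside a DAG file, the dictionary would be unpacked into the operator, for example KubernetesPodOperator(**pod_operator_kwargs()). Keeping the path in one helper means a change to the kubeconfig location touches a single line.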
All Done!
Head over to the MWAA Airflow UI, then enable and trigger the kubernetes_pod_example DAG. Once the DAG has started, inspect the created pod:
kubectl get po -n default -w
You will see the pod being spawned by Airflow.
It didn’t work? Troubleshoot