AWS Container series – Blog 3 – Troubleshooting ECS Containers using ECS Exec
Introduction
Troubleshooting Docker containers locally is as simple as executing commands in, or opening a shell directly on, the Docker container. Within AWS the process becomes more complex, and how much more depends on whether you host your Docker containers on traditional Amazon EC2 instances or on serverless AWS Fargate.
Debugging Fargate containers is notoriously tricky because you cannot access the underlying instance or container. A brand new feature released by AWS today makes accessing Fargate containers possible by allowing commands to be executed in a running container, much as docker exec does locally.
Amazon ECS Exec
Amazon ECS Exec uses AWS Systems Manager Session Manager to run single commands or open a shell in a running container for troubleshooting. Session Manager has been established for a while as a way to open a shell on EC2 instances directly from the AWS console. Unfortunately, console access for ECS Exec is still in development, so for now it is available only via the SDK and CLI.
This new feature removes the key reason why some developers resent AWS Fargate: although it is serverless and reduces management overhead, including patching and scaling, the inability to access the container itself is off-putting for many.
AWS best practice strongly recommends against accessing containers directly via SSH to debug and troubleshoot; instead, developers should implement monitoring and log analysis, keeping engineers at a distance from data. ECS Exec does not replace this best practice; it complements it by ensuring that, when direct access is needed, it is secure, encrypted, and uses temporary credentials.
The ability to access a running container within AWS is especially useful for debugging containers early in their lifecycle. It allows container analysis and retrieval of data that feeds into future development, and lets production-critical errors be fully investigated and resolved more efficiently than by redeploying.
As a new feature, Amazon ECS Exec requires a few prerequisites:
- Linux instances; Windows is currently not supported
- The latest ECS-optimised container AMI
- Container agent version 1.50.2 or later, or Fargate platform version 1.4.0 or later
- The SSM Session Manager plugin installed on the machine that runs the AWS CLI
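The client-side tooling can be checked before anything else is attempted. The following is a minimal sketch, assuming the AWS CLI is on the PATH as aws and the Session Manager plugin as session-manager-plugin; the helper name missing_tools is hypothetical.

```shell
# Print the name of every listed tool that cannot be found on PATH.
# Prints nothing when all of them are present.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# Example: report anything ECS Exec needs on the machine running the CLI.
# missing_tools aws session-manager-plugin
```

An empty result suggests the CLI side is ready; the agent and platform versions still have to be confirmed on the ECS side.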
Security Considerations
Accessing production Docker containers naturally raises questions, especially around the security implications. AWS has closely integrated IAM roles and policies to tightly lock down who can execute ECS Exec commands. Access can be restricted by three attributes: ECS resource tag values, container name, and ECS cluster ARN. For example, the diagram above outlines the user access process alongside a role that denies access to the MySQL container but allows access to NGINX and API. In this instance, the MySQL container is locked down to keep users away from data.
Additionally, a new ECS action, ecs:ExecuteCommand, has been created. An example IAM policy is shown below:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ecs:ExecuteCommand"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:ResourceTag/tag-key": "<tag-value>",
                    "ecs:container-name": "<container_name>"
                }
            },
            "Resource": "arn:aws:ecs:<region>:<aws_account_id>:cluster/<cluster_name>"
        }
    ]
}
Developers can lock down access to a specific Amazon ECS cluster and to each container running within it, allowing fine-grained control at a container-by-container level. All Amazon ECS Exec commands are logged via CloudTrail for auditing.
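The deny rule for the MySQL container described earlier could be expressed as a separate policy statement. This is a minimal sketch rather than the policy used in the example: the function name exec_deny_policy and the container name passed to it are hypothetical, and the resource is left as "*" on the assumption that it will be narrowed to specific clusters or tasks.

```shell
# Emit an IAM policy document that denies ECS Exec into one named container.
# $1 = container name as defined in the task definition (hypothetical value).
# Resource "*" is an assumption for illustration; narrow it for real use.
exec_deny_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Action": ["ecs:ExecuteCommand"],
      "Resource": "*",
      "Condition": {
        "StringEquals": { "ecs:container-name": "$1" }
      }
    }
  ]
}
EOF
}

# Example: write the document to a file for review before attaching it.
# exec_deny_policy mysql > deny-mysql-exec.json
```

Because an explicit deny overrides any allow in IAM, a statement like this keeps the named container closed even for users whose other policies grant ecs:ExecuteCommand.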
Logging
Amazon ECS Exec supports logging commands to an S3 bucket, a CloudWatch log group, or both, for analysis, archiving, and auditing purposes, as outlined in the diagram above. Logging options are set in the execute command configuration, which defines the logging destination, the encryption settings, and the S3 key prefix.
Opening a shell by passing /bin/bash logs both the commands and their output to the chosen logging location, whereas executing a single command logs only the command output.
Logging has three modes: it can be disabled so that nothing is sent to S3 or CloudWatch (NONE), it can use the existing awslogs configuration in the task definition (DEFAULT), or it can override that with the configuration specified in the execute command configuration (OVERRIDE). The configuration is supplied when creating the ECS cluster via the CLI, as detailed below.
executeCommandConfiguration={kmsKeyId=string,\
    logging=string,\
    logConfiguration={cloudWatchLogGroupName=string,\
        cloudWatchEncryptionEnabled=boolean,\
        s3BucketName=string,\
        s3EncryptionEnabled=boolean,\
        s3KeyPrefix=string}}
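As an illustration of how that configuration might be passed in practice, the sketch below wraps aws ecs create-cluster with the OVERRIDE logging mode. The function name, the log group, the bucket, and the exec-output key prefix are all assumptions chosen for the example, and the log group and bucket are assumed to exist already.

```shell
# Create a cluster whose ECS Exec sessions are logged to CloudWatch Logs and S3.
# $1 = cluster name, $2 = CloudWatch log group, $3 = S3 bucket (all hypothetical).
# The key prefix "exec-output" is an arbitrary choice for this sketch.
create_exec_cluster() {
  aws ecs create-cluster \
    --cluster-name "$1" \
    --configuration "executeCommandConfiguration={logging=OVERRIDE,logConfiguration={cloudWatchLogGroupName=$2,s3BucketName=$3,s3KeyPrefix=exec-output}}"
}

# Example:
# create_exec_cluster ecs-exec-demo-cluster /aws/ecs/ecs-exec-demo my-exec-log-bucket
```

Either destination can be left out of logConfiguration if only one of CloudWatch Logs or S3 is wanted.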
Executing Commands
As previously mentioned, commands can only be executed via the AWS CLI or SDK, with console access planned for a future release. Once the Amazon ECS cluster, task definition, service, and task are running with the prerequisites met, a final IAM role is required to allow AWS Systems Manager Session Manager access and to create and put logs to S3 and CloudWatch Logs.
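The permissions for that role might look roughly like the following. This is a sketch under stated assumptions: the four ssmmessages actions are the ones Session Manager uses to open its channels, the logging actions are a plausible minimum for writing to CloudWatch Logs and S3, and every resource is left as "*" for brevity when it would normally be scoped to the specific log group and bucket. The function name is hypothetical.

```shell
# Emit a policy document for the ECS task role used with ECS Exec.
# Assumption: Resource "*" is used only to keep the sketch short;
# scope the logging statement to the real log group and bucket.
ecs_exec_task_role_policy() {
  cat <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ssmmessages:CreateControlChannel",
        "ssmmessages:CreateDataChannel",
        "ssmmessages:OpenControlChannel",
        "ssmmessages:OpenDataChannel"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "logs:DescribeLogGroups",
        "logs:CreateLogStream",
        "logs:DescribeLogStreams",
        "logs:PutLogEvents",
        "s3:PutObject"
      ],
      "Resource": "*"
    }
  ]
}
EOF
}

# Example: save the document so it can be attached to the task role.
# ecs_exec_task_role_policy > ecs-exec-task-role-policy.json
```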
Running Single Commands
Executing a single command logs only the command output to S3 or CloudWatch Logs. For example, if a simple pwd is run, the container's working directory will be printed. Similarly, to list the directory contents, the following command can be executed:
aws ecs execute-command \
    --region $AWS_REGION \
    --cluster ecs-exec-demo-cluster \
    --task 1234567890123456789 \
    --container nginx \
    --command "ls" \
    --interactive
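A command like this is only accepted for tasks that were launched with ECS Exec enabled. The helpers below are a minimal sketch of how that might be switched on for an existing service and then checked for a given task; the function names and the cluster, service, and task values passed to them are hypothetical.

```shell
# Turn on ECS Exec for an existing service and roll out new tasks,
# since the setting only applies to tasks started after the change.
# $1 = cluster name, $2 = service name (hypothetical values).
enable_exec_on_service() {
  aws ecs update-service \
    --cluster "$1" \
    --service "$2" \
    --enable-execute-command \
    --force-new-deployment
}

# Report whether a task was launched with ECS Exec enabled.
# $1 = cluster name, $2 = task ID or ARN (hypothetical values).
exec_enabled_for_task() {
  aws ecs describe-tasks \
    --cluster "$1" \
    --tasks "$2" \
    --query 'tasks[0].enableExecuteCommand' \
    --output text
}

# Example:
# enable_exec_on_service ecs-exec-demo-cluster nginx-service
# exec_enabled_for_task ecs-exec-demo-cluster 1234567890123456789
```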
Creating a Shell
Creating a shell gives full CLI access to the container, making it possible to navigate, troubleshoot, and resolve any container errors. Accessing a shell is similar to running a single command such as ls, except that /bin/bash is passed instead. Importantly, in this mode, Amazon ECS will log all container commands and outputs to S3 or CloudWatch.
aws ecs execute-command \
    --region $AWS_REGION \
    --cluster ecs-exec-demo-cluster \
    --task 1234567890123456789 \
    --container nginx \
    --command "/bin/bash" \
    --interactive
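Task IDs change with every deployment, so it can be convenient to look one up rather than paste it in. The wrapper below is a sketch of that idea, assuming the container image ships /bin/bash and that any running task of the service will do; the function name and its arguments are hypothetical.

```shell
# Open a shell in the first running task of a service.
# $1 = cluster name, $2 = service name, $3 = container name (hypothetical values).
exec_shell_in_service() {
  # Ask ECS for the ARN of one running task belonging to the service.
  task_arn="$(aws ecs list-tasks \
    --cluster "$1" \
    --service-name "$2" \
    --desired-status RUNNING \
    --query 'taskArns[0]' \
    --output text)"

  # With text output the CLI prints "None" when the query matches nothing.
  if [ -z "$task_arn" ] || [ "$task_arn" = "None" ]; then
    echo "no running task found for service $2" >&2
    return 1
  fi

  aws ecs execute-command \
    --cluster "$1" \
    --task "$task_arn" \
    --container "$3" \
    --command "/bin/bash" \
    --interactive
}

# Example:
# exec_shell_in_service ecs-exec-demo-cluster nginx-service nginx
```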
CirrusHQ
CirrusHQ has a wide range of experience in provisioning, deploying, managing, and troubleshooting container instances within AWS. If you would like to find out more from CirrusHQ regarding your container deployments, feel free to contact us via our contact page.