Deploying OpenTelemetry Collector as sidecar on AWS Fargate
Observability is a critical aspect of managing and optimizing application performance in today’s cloud-native landscape. Integrating OpenTelemetry with AWS services provides a powerful solution to achieve comprehensive observability. In this blog post, I will show you how to deploy the OpenTelemetry collector as a sidecar on Fargate using AWS CDK, and how to configure the collector to push metrics and traces to AWS CloudWatch.
So what is OpenTelemetry? OpenTelemetry is an open-source observability framework that standardizes the collection, instrumentation, and export of telemetry data from various sources, including applications and infrastructure components. It provides a vendor-agnostic and language-agnostic approach, making it easier to collect metrics, traces, and logs across distributed systems. For more info, look at OpenTelemetry’s website.
I assume you are already familiar with CDK and have a deployed infrastructure and a running ASP.NET Core Web API.
So how do you collect these metrics and traces? There are two ways to do this:
- Using a sidecar deployment
- Using a service deployment
The difference is that with a sidecar, you run an instance of the collector alongside your application. This is generally a little easier, since you don’t have to deal with extra security-related concerns. The downside is that the collector consumes resources of your Fargate task, so keep that in mind. In a service deployment, you run one or more instances of the collector, accessed by the different applications in your system. This is more complex, since you need to deal with security and networking.
Adding the configuration to the container #
In this blog post, we will add the AWS OpenTelemetry collector as a sidecar and configure the web API to push metrics and traces to the collector. The AWS OpenTelemetry collector is a fork of the OpenTelemetry collector with some additional features, optimized for use on AWS. For more info, have a look at the AWS OpenTelemetry collector documentation.
Prerequisites #
- Have AWS CDK installed
- Have an AWS Fargate cluster running
- Have an ASP.NET Core Web API running
The first thing to do is add the collector to your current task definition.
You can do this by adding the following code:
const collectorContainer = taskDefinition.addContainer('OpenTelemetryCollector', {
containerName: 'OpenTelemetryCollector',
image: ecs.ContainerImage.fromRegistry('public.ecr.aws/aws-observability/aws-otel-collector:latest'),
command: ['--config=/etc/ecs/ecs-cloudwatch-xray.yaml', '--set=service.telemetry.logs.level=DEBUG'],
environment: {
// Environment variables for OpenTelemetry Collector
},
// Other container configuration options
});
This pulls the latest version of the AWS OpenTelemetry collector from the AWS registry and adds it to the task definition. It will start the collector with the default ecs-cloudwatch-xray configuration.
The next thing to do is add the port mappings to the container, so the collector can start receiving data from the application. This is done by adding the port mapping configuration to the container:
collectorContainer.addPortMappings(
  { containerPort: 13133, hostPort: 13133, protocol: ecs.Protocol.TCP, appProtocol: ecs.AppProtocol.http, name: 'health' },
  { containerPort: 4317, hostPort: 4317, protocol: ecs.Protocol.TCP, appProtocol: ecs.AppProtocol.grpc, name: 'grpc' },
  { containerPort: 4318, hostPort: 4318, protocol: ecs.Protocol.TCP, appProtocol: ecs.AppProtocol.http, name: 'http' },
);
This opens the default ports for the collector (4317 for OTLP over gRPC, 4318 for OTLP over HTTP, and 13133 for the health check extension). You are free to change the ports to your liking; just remember to use the matching ports in the application configuration.
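Since the collector exposes its health_check extension on port 13133, you can optionally let ECS gate your application container on the collector being healthy. A minimal sketch, under the assumptions that your application container handle is called appContainer (a hypothetical name, not defined above) and that the collector image ships a curl binary (verify this against the image before relying on it):

```typescript
import * as cdk from 'aws-cdk-lib';
import * as ecs from 'aws-cdk-lib/aws-ecs';

// Assumption: same addContainer call as above, extended with an ECS health
// check that probes the collector's health_check extension on port 13133.
const collectorContainer = taskDefinition.addContainer('OpenTelemetryCollector', {
  containerName: 'OpenTelemetryCollector',
  image: ecs.ContainerImage.fromRegistry('public.ecr.aws/aws-observability/aws-otel-collector:latest'),
  healthCheck: {
    // Assumption: curl is present in the image; swap for wget or a bundled
    // healthcheck binary if it is not.
    command: ['CMD-SHELL', 'curl -f http://localhost:13133/ || exit 1'],
    interval: cdk.Duration.seconds(10),
    retries: 3,
  },
});

// Start the application container only after the collector reports healthy,
// so no telemetry is dropped during startup.
appContainer.addContainerDependencies({
  container: collectorContainer,
  condition: ecs.ContainerDependencyCondition.HEALTHY,
});
```

This ordering matters mostly at task startup; once both containers run, the dependency has no further effect.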
Changing the configuration #
Changing the configuration of the collector is a little tricky. You need to put the actual configuration in the SSM Parameter Store and then reference it through an environment variable on the container. This is done by extending the code above.
First, add the configuration to the SSM Parameter Store:
const configString = `extensions:
  health_check:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch/traces:
    timeout: 1s
    send_batch_size: 50
  batch/metrics:
    timeout: 60s
  resourcedetection:
    detectors:
      - env
      - ecs
      - ec2
  resource:
    attributes:
      - key: TaskDefinitionFamily
        from_attribute: aws.ecs.task.family
        action: insert
      - key: aws.ecs.task.family
        action: delete
      - key: InstanceId
        from_attribute: host.id
        action: insert
      - key: host.id
        action: delete
      - key: TaskARN
        from_attribute: aws.ecs.task.arn
        action: insert
      - key: aws.ecs.task.arn
        action: delete
      - key: TaskDefinitionRevision
        from_attribute: aws.ecs.task.revision
        action: insert
      - key: aws.ecs.task.revision
        action: delete
      - key: LaunchType
        from_attribute: aws.ecs.launchtype
        action: insert
      - key: aws.ecs.launchtype
        action: delete
      - key: ClusterARN
        from_attribute: aws.ecs.cluster.arn
        action: insert
      - key: aws.ecs.cluster.arn
        action: delete
      - key: cloud.provider
        action: delete
      - key: cloud.platform
        action: delete
      - key: cloud.account.id
        action: delete
      - key: cloud.region
        action: delete
      - key: cloud.availability_zone
        action: delete
      - key: aws.log.group.names
        action: delete
      - key: aws.log.group.arns
        action: delete
      - key: aws.log.stream.names
        action: delete
      - key: host.image.id
        action: delete
      - key: host.name
        action: delete
      - key: host.type
        action: delete

exporters:
  logging:
    verbosity: detailed
  awsxray:
  awsemf/application:
    namespace: ECS/AWSOTel/MyApplication
    log_group_name: '/aws/ecs/application/metrics'
    dimension_rollup_option: NoDimensionRollup
    resource_to_telemetry_conversion:
      enabled: true

service:
  extensions: [health_check]
  pipelines:
    traces:
      receivers: [otlp]
      processors: [resourcedetection, batch/traces]
      exporters: [logging, awsxray]
    metrics/application:
      receivers: [otlp]
      processors: [resourcedetection, resource, batch/metrics]
      exporters: [awsemf/application]
`;
const myParameterStore = new ssm.StringParameter(this, 'myParameterStore', {
  parameterName: 'ecs-cloudwatch-xray.yaml',
  stringValue: configString,
});
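YAML embedded in a string is easy to break with a stray tab or uneven indent, and the collector will only complain at runtime in the sidecar's logs. Before deploying, you can run a quick sanity check at synth time. This is a minimal sketch; checkYamlIndentation is a hypothetical helper (not part of CDK) that only catches tabs and odd indentation in a two-space-indented config, not full YAML errors:

```typescript
// Minimal indentation linter for a two-space-indented collector config.
// Returns human-readable problems; an empty array means nothing was found.
function checkYamlIndentation(yamlText: string): string[] {
  const problems: string[] = [];
  yamlText.split('\n').forEach((line, i) => {
    if (line.includes('\t')) {
      problems.push(`line ${i + 1}: tab character (YAML indentation must use spaces)`);
    }
    const indent = line.length - line.trimStart().length;
    if (line.trim() !== '' && indent % 2 !== 0) {
      problems.push(`line ${i + 1}: odd indentation of ${indent} spaces`);
    }
  });
  return problems;
}

// Usage (configString is the collector config defined above):
// const problems = checkYamlIndentation(configString);
// if (problems.length > 0) {
//   throw new Error(`Invalid collector config:\n${problems.join('\n')}`);
// }
```

Failing the synth is much cheaper than debugging a crash-looping sidecar after deployment.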
Then retrieve the configuration from the SSM Parameter Store:
const config = ssm.StringParameter.fromStringParameterAttributes(this, 'config', {
parameterName: 'ecs-cloudwatch-xray.yaml',
});
And then, you need to add the environment variable to the container. The task definition will look like this:
const collectorContainer = taskDefinition.addContainer('OpenTelemetryCollector', {
  containerName: 'OpenTelemetryCollector',
  image: ecs.ContainerImage.fromRegistry('public.ecr.aws/aws-observability/aws-otel-collector:latest'),
  command: ['--config=/etc/ecs/ecs-cloudwatch-xray.yaml', '--set=service.telemetry.logs.level=DEBUG'],
  environment: {
    'AOT_CONFIG_CONTENT': config.stringValue
  },
  // port mappings
  portMappings: [
    { containerPort: 13133, hostPort: 13133, protocol: ecs.Protocol.TCP, appProtocol: ecs.AppProtocol.http, name: 'health' },
    { containerPort: 4317, hostPort: 4317, protocol: ecs.Protocol.TCP, appProtocol: ecs.AppProtocol.grpc, name: 'grpc' },
    { containerPort: 4318, hostPort: 4318, protocol: ecs.Protocol.TCP, appProtocol: ecs.AppProtocol.http, name: 'http' },
  ],
});
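As an aside, instead of resolving the parameter into a plain environment variable at synth time, you can let ECS inject it when the container starts by using the task definition's secrets. A sketch under the assumption that config is the parameter reference retrieved above:

```typescript
import * as ecs from 'aws-cdk-lib/aws-ecs';

// Hypothetical variant: ECS reads AOT_CONFIG_CONTENT from SSM at container
// start, instead of embedding the full YAML in the task definition.
const collectorContainer = taskDefinition.addContainer('OpenTelemetryCollector', {
  containerName: 'OpenTelemetryCollector',
  image: ecs.ContainerImage.fromRegistry('public.ecr.aws/aws-observability/aws-otel-collector:latest'),
  secrets: {
    AOT_CONFIG_CONTENT: ecs.Secret.fromSsmParameter(config),
  },
  // ...port mappings as above...
});
```

With this variant, CDK grants the task execution role read access to the parameter, and the rendered task definition contains only a reference instead of the whole config.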
Adding the correct IAM permissions #
Adding the IAM permissions to the task definition is the last step to get the sidecar collector working. First, retrieve the task role; then assign the correct policies. This is done by adding the following code:
const taskRole = FargateDefaultComponents.getDefaultTaskRole(this);
taskRole.addToPrincipalPolicy(new PolicyStatement({
actions: [
'logs:PutLogEvents',
'logs:CreateLogGroup',
'logs:CreateLogStream',
'logs:DescribeLogStreams',
'logs:DescribeLogGroups',
'xray:PutTraceSegments',
'xray:PutTelemetryRecords',
'xray:GetSamplingRules',
'xray:GetSamplingTargets',
'xray:GetSamplingStatisticSummaries'
],
resources: ['*'],
}));
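If you prefer AWS managed policies over an inline statement, roughly equivalent permissions are commonly granted with the following two managed policies (assuming taskRole is an iam.Role as above; verify the policies against your security requirements, as they are broader than the inline statement):

```typescript
import * as iam from 'aws-cdk-lib/aws-iam';

// Alternative: attach AWS managed policies instead of the inline statement.
// AWSXrayWriteOnlyAccess covers the X-Ray actions, CloudWatchAgentServerPolicy
// covers the CloudWatch Logs actions used by the EMF exporter.
taskRole.addManagedPolicy(iam.ManagedPolicy.fromAwsManagedPolicyName('AWSXrayWriteOnlyAccess'));
taskRole.addManagedPolicy(iam.ManagedPolicy.fromAwsManagedPolicyName('CloudWatchAgentServerPolicy'));
```

The inline statement from the snippet above remains the tighter option, since it grants only the actions the collector actually needs.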
Now you can deploy the stack, and the sidecar collector should be up and running. To make the metrics and traces visible, you need to adjust the web API to send its telemetry to the sidecar collector. Depending on your programming language, this can be done in different ways; see the documentation on configuring the OpenTelemetry exporters. For an ASP.NET Core Web API, for example, we have to do the following.
Adjust the web API to use the sidecar collector #
The first thing to do is install the following NuGet packages:
dotnet add package OpenTelemetry
dotnet add package OpenTelemetry.Exporter.OpenTelemetryProtocol
dotnet add package OpenTelemetry.Instrumentation.AspNetCore --prerelease
After that, modify the Program.cs file to use the OpenTelemetry collector:
// ...
// Add OpenTelemetry
builder.Services.AddOpenTelemetry()
    .ConfigureResource(resource => resource.AddService("my-service"))
    .WithTracing(tracing => tracing
        .AddSource("MyCompany.MyProduct.MyCategory")
        .AddAspNetCoreInstrumentation()
        .AddOtlpExporter(exporter => exporter.Endpoint = new Uri("http://localhost:4317")))
    .WithMetrics(metrics => metrics
        .AddMeter("MyCompany.MyProduct.MyMeter")
        .AddAspNetCoreInstrumentation()
        .AddOtlpExporter(exporter => exporter.Endpoint = new Uri("http://localhost:4317")));
// ...
Next, create a new controller and add the following code:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Diagnostics.Metrics;
using System.Linq;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.Logging;
namespace MyWebApi.Controllers
{
    [ApiController]
    [Route("[controller]")]
    public class WeatherForecastController : ControllerBase
    {
        // The meter name must match the one registered with AddMeter in Program.cs
        private static readonly Meter MyMeter = new Meter("MyCompany.MyProduct.MyMeter");
        private static readonly Counter<long> RequestsCounter = MyMeter.CreateCounter<long>(
            "requests_total",
            description: "The total number of requests");
        private static readonly Histogram<double> RequestsDurationHistogram = MyMeter.CreateHistogram<double>(
            "requests_duration_seconds",
            unit: "s",
            description: "The duration of requests in seconds");
        // The source name must match the one registered with AddSource in Program.cs
        private static readonly ActivitySource MyActivitySource = new ActivitySource("MyCompany.MyProduct.MyCategory");
        private static readonly string[] Summaries =
        {
            "Freezing", "Bracing", "Chilly", "Cool", "Mild", "Warm", "Balmy", "Hot", "Sweltering", "Scorching"
        };
        private readonly ILogger<WeatherForecastController> _logger;
        public WeatherForecastController(ILogger<WeatherForecastController> logger)
        {
            _logger = logger;
        }
        [HttpGet]
        public IEnumerable<WeatherForecast> Get()
        {
            using var activity = MyActivitySource.StartActivity("WeatherForecastController.Get");
            activity?.SetTag("method", "GET");
            var stopwatch = Stopwatch.StartNew();
            RequestsCounter.Add(1,
                new KeyValuePair<string, object?>("method", "GET"),
                new KeyValuePair<string, object?>("status", "200"));
            var rng = new Random();
            var result = Enumerable.Range(1, 5).Select(index => new WeatherForecast
            {
                Date = DateTime.Now.AddDays(index),
                TemperatureC = rng.Next(-20, 55),
                Summary = Summaries[rng.Next(Summaries.Length)]
            })
            .ToArray();
            RequestsDurationHistogram.Record(stopwatch.Elapsed.TotalSeconds,
                new KeyValuePair<string, object?>("method", "GET"),
                new KeyValuePair<string, object?>("status", "200"));
            return result;
        }
    }
}
After you deploy the stack and run the application, you should see the metrics in CloudWatch and the traces in X-Ray.
In the AWS console, head over to CloudWatch and select Metrics. In the left pane, select All metrics and then choose the namespace you used in the OpenTelemetry collector configuration: search for ECS/AWSOTel/MyApplication, as defined in the awsemf exporter. You should see the metrics there.
For traces, head over to X-Ray and select Traces. Search for service(id(name: "my-service", type: "AWS::ECS::Fargate")), matching the service name configured in the web API, and the traces should show up in the list.