OpenTelemetry
What is OpenTelemetry?
A short explanation of what OpenTelemetry is, and is not.
OpenTelemetry is an Observability framework and toolkit designed to create and manage telemetry data such as traces, metrics, and logs. Crucially, OpenTelemetry is vendor- and tool-agnostic, meaning that it can be used with a broad variety of Observability backends, including open source tools like Jaeger and Prometheus, as well as commercial offerings. OpenTelemetry is a Cloud Native Computing Foundation (CNCF) project.
OpenTelemetry is the mechanism by which application code is instrumented to help make a system observable.
Telemetry refers to data emitted from a system, about its behavior. The data can come in the form of traces, metrics, and logs.
Distributed Tracing: A Mental Model
Microservices provide a powerful architecture, but not without challenges of their own, especially when it comes to debugging and observing distributed transactions across complex networks.
Distributed tracing as described in Google’s Dapper paper includes anomaly detection, diagnosing steady-state problems, distributed profiling, resource attribution, and workload modeling of microservices.
- Trace: The description of a transaction as it moves through a distributed system.
- Span: A named, timed operation representing a piece of the workflow. Spans accept key:value tags as well as fine-grained, timestamped, structured logs attached to the particular span instance.
- Span context: Trace information that accompanies the distributed transaction, including when it passes from service to service over the network or through a message bus. The span context contains the trace identifier, span identifier, and any other data that the tracing system needs to propagate to the downstream service.
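As a minimal sketch of these concepts, the snippet below uses the OpenTelemetry Java API to start a span, attach a key:value tag, and record a timestamped event; the class, service, and operation names are illustrative, not part of the model above.

    import io.opentelemetry.api.GlobalOpenTelemetry;
    import io.opentelemetry.api.trace.Span;
    import io.opentelemetry.api.trace.Tracer;
    import io.opentelemetry.context.Scope;

    public class CheckoutService {
        // Tracer obtained from the globally registered OpenTelemetry instance
        private static final Tracer tracer =
            GlobalOpenTelemetry.getTracer("checkout-service");

        void processOrder(String orderId) {
            // Each span is one named, timed operation inside the overall trace
            Span span = tracer.spanBuilder("process-order").startSpan();
            try (Scope scope = span.makeCurrent()) {
                span.setAttribute("order.id", orderId);   // key:value tag on the span
                span.addEvent("payment-authorized");      // timestamped, structured log on the span
                // Work done here (including child spans and outgoing calls) shares this
                // span's context, which propagators carry to downstream services.
            } finally {
                span.end();
            }
        }
    }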
Components
The main components that make up OpenTelemetry
Specification
Describes the cross-language requirements and expectations for all implementations. Beyond a definition of terms, the specification defines the following:
- API: Defines data types and operations for generating and correlating tracing, metrics, and logging data.
- SDK: Defines requirements for a language-specific implementation of the API. Configuration, data processing, and exporting concepts are also defined here.
- Data: Defines the OpenTelemetry Protocol (OTLP) and vendor-agnostic semantic conventions that a telemetry backend can provide support for.
For more information, see the specifications.
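As an example of where the API, SDK, and OTLP pieces meet, the hedged sketch below wires the opentelemetry-java SDK to batch spans and export them over OTLP gRPC; the endpoint and service name are assumptions.

    import io.opentelemetry.api.common.AttributeKey;
    import io.opentelemetry.api.common.Attributes;
    import io.opentelemetry.exporter.otlp.trace.OtlpGrpcSpanExporter;
    import io.opentelemetry.sdk.OpenTelemetrySdk;
    import io.opentelemetry.sdk.resources.Resource;
    import io.opentelemetry.sdk.trace.SdkTracerProvider;
    import io.opentelemetry.sdk.trace.export.BatchSpanProcessor;

    public final class OtelSetup {
        public static OpenTelemetrySdk init() {
            // Resource attributes describe the entity producing the telemetry
            Resource resource = Resource.getDefault().merge(
                Resource.create(Attributes.of(
                    AttributeKey.stringKey("service.name"), "petclinic")));

            // SDK tracer provider: batching processor + OTLP/gRPC exporter (endpoint assumed)
            SdkTracerProvider tracerProvider = SdkTracerProvider.builder()
                .setResource(resource)
                .addSpanProcessor(BatchSpanProcessor.builder(
                    OtlpGrpcSpanExporter.builder()
                        .setEndpoint("http://localhost:4317")
                        .build()).build())
                .build();

            // Register globally so GlobalOpenTelemetry.getTracer(...) uses this configuration
            return OpenTelemetrySdk.builder()
                .setTracerProvider(tracerProvider)
                .buildAndRegisterGlobal();
        }
    }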
Collector
The OpenTelemetry Collector is a vendor-agnostic proxy that can receive, process, and export telemetry data. It supports receiving telemetry data in multiple formats (for example, OTLP, Jaeger, and Prometheus) and sending it to one or more backends.
Instrumentation
Automatic instrumentation lets you collect telemetry from an application without modifying its source code.
For finer control over what is captured, you can also manually instrument your applications by coding against the OpenTelemetry APIs.
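For a Java service, for example, automatic instrumentation is typically attached as a Java agent at startup; the jar path, service name, and endpoint below are assumptions:

    java -javaagent:./opentelemetry-javaagent.jar \
         -Dotel.service.name=petclinic \
         -Dotel.exporter.otlp.endpoint=http://localhost:4317 \
         -jar spring-petclinic.jar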
Exporters
To visualize and analyze your telemetry, you need to export your data to an OpenTelemetry Collector or to a backend such as Jaeger, Zipkin, Prometheus, or a vendor-specific one.
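As a sketch, most SDKs and the Java agent read standard environment variables to choose an exporter and point it at a Collector; the collector hostname below is an assumption:

    export OTEL_TRACES_EXPORTER=otlp
    export OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317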
OpenTelemetry Operator for Kubernetes
An implementation of a Kubernetes Operator that manages collectors and auto-instrumentation of workloads using OpenTelemetry instrumentation libraries.
The operator manages:
- OpenTelemetry Collector
- auto-instrumentation of the workloads using OpenTelemetry instrumentation libraries
1. Install cert-manager
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.10.0/cert-manager.yaml
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.2/cert-manager.crds.yaml
2. Install the Kubernetes OpenTelemetry Operator
kubectl apply -f https://github.com/open-telemetry/opentelemetry-operator/releases/latest/download/opentelemetry-operator.yaml
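Before continuing, it is worth confirming that both installs are healthy; the opentelemetry-operator-system namespace is assumed to be the one created by the operator manifest above:

kubectl get pods -n cert-manager
kubectl get pods -n opentelemetry-operator-system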
3. Deploy the OpenTelemetry Collector
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otel-collector
  namespace: opentelemetry
  labels:
    app: opentelemetry
    component: otel-collector
spec:
  mode: deployment
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
      otlp/jaeger:
        protocols:
          grpc:
          http:
    exporters:
      logging:
        loglevel: debug
      otlphttp:
        endpoint: jaeger-all-in-one:14250
        tls:
          insecure: true
          insecure_skip_verify: true
      otlp:
        endpoint: jaeger-all-in-one:4317
        tls:
          insecure: true
          insecure_skip_verify: true
      prometheus:
        endpoint: ":9090"
    processors:
      batch:
      resource:
        attributes:
          - key: test.key
            value: "test-value"
            action: insert
    extensions:
      health_check:
      zpages:
        endpoint: :55679
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch, resource]
          exporters: [logging, otlp]
        metrics:
          receivers: [otlp]
          processors: [batch, resource]
          exporters: [prometheus, logging]
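The manifest above targets an opentelemetry namespace that earlier steps do not create, so create it before applying; the file name is an assumption:

kubectl create namespace opentelemetry
kubectl apply -f otel-collector.yaml
kubectl get pods -n opentelemetry   # the operator creates a deployment named otel-collector-collector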
4. Enable Instrumentation
Instrumentation is language-specific. Currently, Apache HTTPD, .NET, Go, Java, Nginx, Node.js, and Python are supported.
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: java-instrumentation
spec:
  propagators:
    - tracecontext
    - baggage
    - b3
  sampler:
    type: always_on
  java:
    env:
      - name: OTEL_TRACES_EXPORTER
        value: otlp
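Apply the Instrumentation resource in the namespace where the application will run (file name assumed); pods opt in to injection through the instrumentation.opentelemetry.io/inject-java annotation shown in step 6:

kubectl apply -f java-instrumentation.yaml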
5. Sidecar
# sidecar.yml
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: sidecar
spec:
  mode: sidecar
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    processors:
      batch:
    exporters:
      logging:
      otlp:
        endpoint: "http://otel-collector-collector.opentelemetry.svc.cluster.local:4317"
        tls:
          insecure: true
    service:
      telemetry:
        logs:
          level: "debug"
      pipelines:
        traces:
          receivers: [otlp]
          processors: []
          exporters: [otlp]
        metrics:
          receivers: [otlp]
          processors: []
          exporters: [otlp]
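Apply the sidecar definition in the same namespace as the application pods; the sidecar.opentelemetry.io/inject annotation in step 6 selects it by name:

kubectl apply -f sidecar.yml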
6. Pet Clinic Application
apiVersion: apps/v1
kind: Deployment
metadata:
  name: petclinic
  labels:
    app: petclinic
spec:
  replicas: 1
  selector:
    matchLabels:
      app: petclinic
  template:
    metadata:
      annotations:
        instrumentation.opentelemetry.io/inject-java: 'true'
        sidecar.opentelemetry.io/inject: 'sidecar'
      labels:
        app: petclinic
    spec:
      containers:
        - name: petclinic
          image: arey/springboot-petclinic
          ports:
            - containerPort: 8080
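Apply the deployment and wait until the pod reports two ready containers, which confirms the sidecar collector was injected; the file name is an assumption. Then forward the application port (replace *** with the actual pod name suffix):

kubectl apply -f petclinic.yaml
kubectl get pods -l app=petclinic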
kubectl port-forward petclinic-*** 8080:8080
7. Jaeger
# jaeger.yml
apiVersion: v1
kind: Service
metadata:
  name: jaeger-all-in-one
  namespace: opentelemetry
  labels:
    app: opentelemetry
    component: otel-collector
spec:
  ports:
    - name: collector
      port: 14250
      protocol: TCP
      targetPort: 14250
    - name: otlpcollector
      port: 4317
      protocol: TCP
      targetPort: 4317
  selector:
    component: otel-collector
---
apiVersion: v1
kind: Service
metadata:
  name: jaeger-all-in-one-ui
  namespace: opentelemetry
  labels:
    app: opentelemetry
    component: otel-collector
spec:
  ports:
    - name: jaeger
      port: 16686
      protocol: TCP
      targetPort: 16686
  selector:
    component: otel-collector
  type: LoadBalancer
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: jaeger-all-in-one
  namespace: opentelemetry
  labels:
    app: opentelemetry
    component: otel-collector
spec:
  replicas: 1
  selector:
    matchLabels:
      app: opentelemetry
      component: otel-collector
  template:
    metadata:
      labels:
        app: opentelemetry
        component: otel-collector
    spec:
      containers:
        - image: jaegertracing/all-in-one
          name: jaeger
          ports:
            - containerPort: 16686
            - containerPort: 14268
            - containerPort: 14250
            - containerPort: 4317
            - containerPort: 4318
          env:
            - name: COLLECTOR_OTLP_ENABLED
              value: "true"
8. Prometheus
---
apiVersion: v1
kind: Namespace
metadata:
  name: monitoring
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: monitoring
  labels:
    name: prometheus-config
data:
  prometheus.yml: |-
    global:
      scrape_interval: 10s
    scrape_configs:
      - job_name: 'sample-job'
        static_configs:
          - targets: ['otel-collector-collector.opentelemetry.svc.cluster.local:9090']
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
  namespace: monitoring
  labels:
    app: prometheus
spec:
  replicas: 1
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9090"
    spec:
      containers:
        - name: prometheus
          image: prom/prometheus
          args:
            - '--storage.tsdb.retention=6h'
            - '--storage.tsdb.path=/prometheus'
            - '--config.file=/etc/prometheus/prometheus.yml'
          ports:
            - name: web
              containerPort: 9090
          volumeMounts:
            - name: prometheus-config-volume
              mountPath: /etc/prometheus
      restartPolicy: Always
      volumes:
        - name: prometheus-config-volume
          configMap:
            defaultMode: 420
            name: prometheus-config
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus-service
  namespace: monitoring
  annotations:
    prometheus.io/scrape: 'true'
    prometheus.io/port: '9090'
spec:
  selector:
    app: prometheus
  type: NodePort
  ports:
    - port: 8080
      targetPort: 9090
      nodePort: 30000
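Apply the Prometheus manifests (file name assumed) and open the UI either through NodePort 30000 or a local port-forward against the service port 8080:

kubectl apply -f prometheus.yaml
kubectl port-forward svc/prometheus-service -n monitoring 8080:8080

Prometheus should then show the sample-job target scraping the collector's :9090 metrics endpoint.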