72 changes: 72 additions & 0 deletions content/en/ninja-workshops/19-agent-primer/1-types-of-telemetry.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,72 @@
---
title: Types of Telemetry
linkTitle: 1. Types of Telemetry
weight: 1
time: 5 minutes
---

## Supported telemetry

OpenTelemetry (OTel) officially supports four types of telemetry, called signals: metrics, traces, logs, and baggage.

We'll use the definitions provided by the [OTel docs](https://opentelemetry.io/docs/concepts/signals/):

### Metrics

> A metric is a measurement of a service captured at runtime. The moment of capturing a measurement is known as a metric event, which consists not only of the measurement itself, but also the time at which it was captured and associated metadata.

> Application and request metrics are important indicators of availability and performance. Custom metrics can provide insights into how availability indicators impact user experience or the business. Collected data can be used to alert of an outage or trigger scheduling decisions to scale up a deployment automatically upon high demand.

Good examples of metrics are:
- System
  - CPU utilization
  - memory utilization
  - disk usage
- Application Performance
  - response time
  - error count
- Business
  - total sales
  - count of customers
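
The quoted definition describes a metric event as a measurement plus the time it was captured and its associated metadata. A minimal Python sketch of that shape follows, using only the standard library. The `MetricEvent` class, its field names, and the example values are hypothetical illustrations, not part of any OpenTelemetry API.

```python
from dataclasses import dataclass, field
import time


@dataclass
class MetricEvent:
    """Hypothetical container: one measurement, when it was taken, and its metadata."""
    name: str
    value: float
    unit: str
    timestamp: float = field(default_factory=time.time)  # capture time, seconds since epoch
    attributes: dict = field(default_factory=dict)       # associated metadata


# Example values are made up for illustration.
event = MetricEvent(
    name="cpu.utilization",
    value=0.42,
    unit="1",
    attributes={"host.name": "web-01"},
)
print(event.name, event.value, event.attributes)
```

The attributes are what allow a backend to filter or group the same metric by host, service, or environment.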

### Traces

> Traces give us the big picture of what happens when a request is made to an application. Whether your application is a monolith with a single database or a sophisticated mesh of services, traces are essential to understanding the full “path” a request takes in your application.

Traces are made up of spans, each representing one operation within the request. They are typically shown as a waterfall, which makes it easy to see where time is spent in a given service or method.
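
To make the waterfall idea concrete, here is a small Python sketch that renders a few spans as text rows, offset by start time and sized by duration. The `Span` class, `render_waterfall` function, and the trace data are all hypothetical and heavily simplified; real spans carry IDs, parent references, and attributes.

```python
from dataclasses import dataclass


@dataclass
class Span:
    """Hypothetical, simplified span: a named operation with a start time and duration."""
    name: str
    start_ms: int
    duration_ms: int
    depth: int  # nesting level under the root span


def render_waterfall(spans, ms_per_char=10):
    """Return one text row per span, offset by start time and sized by duration."""
    rows = []
    for span in sorted(spans, key=lambda s: s.start_ms):
        offset = " " * (span.start_ms // ms_per_char)
        bar = "#" * max(1, span.duration_ms // ms_per_char)
        label = "  " * span.depth + span.name
        rows.append(f"{label:<24}|{offset}{bar}")
    return rows


# Made-up trace: one request that calls a database and a payment service.
trace = [
    Span("GET /checkout", 0, 200, 0),
    Span("SELECT cart", 20, 50, 1),
    Span("POST /payment", 80, 100, 1),
]
for row in render_waterfall(trace):
    print(row)
```

Reading the rows top to bottom shows which child operation accounts for most of the root span's duration.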

### Logs

> A log is a timestamped text record, either structured (recommended) or unstructured, with optional metadata. Of all telemetry signals, logs have the biggest legacy. Most programming languages have built-in logging capabilities or well-known, widely used logging libraries.

Structured logs have many benefits, and in many ways a span acts as a very structured log. Application and host logs can contain metadata that allows them to be easily related to the other signals.

On the flip side, logs can also be unstructured. Splunk is well recognized as a solution that can collect any kind of data.
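
The difference between the two styles can be sketched in a few lines of Python. The field names (`severity`, `message`, `trace_id`) and the values are illustrative assumptions rather than a required schema.

```python
import json
from datetime import datetime, timezone

# Unstructured: readable by a person, but fields must be parsed out of free text.
unstructured = "2024-01-01T12:00:00Z ERROR payment failed for order 1234"


def structured_log(severity, message, **attributes):
    """Build a JSON log line; the field names here are illustrative, not a standard."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "severity": severity,
        "message": message,
        **attributes,
    }
    return json.dumps(record)


# trace_id is a made-up value; carrying it in the record is what lets a log
# be related to the trace it belongs to.
line = structured_log("ERROR", "payment failed", order_id=1234, trace_id="abc123")
print(line)
```

With the structured form, a backend can filter on `order_id` or join on `trace_id` without any text parsing.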

### Baggage

> In OpenTelemetry, Baggage is contextual information that resides next to context. Baggage is a key-value store, which means it lets you propagate any data you like alongside context.

> Baggage means you can pass data across services and processes, making it available to add to traces, metrics, or logs in those services.

We often don't think about baggage because the instrumentation handles it for us. However, it's important to understand this telemetry type, because there are times when we need to work with it directly to solve a problem that the out-of-the-box capabilities cannot.
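
As a rough illustration of how key-value data can travel between services, the Python sketch below serializes and parses a header value modeled loosely on the W3C Baggage format (comma-separated `key=value` members with percent-encoded values). The function names and the example keys are hypothetical, and the sketch ignores member properties and size limits, so treat it as a simplification rather than a compliant implementation.

```python
from urllib.parse import quote, unquote


def encode_baggage(entries):
    """Serialize key-value pairs as a comma-separated list of key=value members."""
    return ",".join(f"{key}={quote(str(value), safe='')}" for key, value in entries.items())


def decode_baggage(header):
    """Parse the header value back into a dict, ignoring empty members."""
    entries = {}
    for member in header.split(","):
        if "=" not in member:
            continue
        key, value = member.split("=", 1)
        entries[key.strip()] = unquote(value.strip())
    return entries


# Service A attaches baggage to an outgoing request (keys and values are made up).
header = encode_baggage({"customer.tier": "gold", "session.id": "a b"})
print(header)

# Service B reads it back and can add the values to its own spans, metrics, or logs.
print(decode_baggage(header))
```

In practice the instrumentation libraries do this propagation for you; the sketch only shows what is being carried.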

## Other Telemetry

Two other signals, events and profiles, are still at the proposal stage in OpenTelemetry. Splunk Observability nevertheless uses and supports both today.

### Events

> Events are OpenTelemetry’s standardized format for LogRecords. All semantic conventions defined for logs SHOULD be formatted as Events. Requirements and details for the Event format can be found in the semantic conventions.

> Events are intended to be used by OpenTelemetry instrumentation. It is not a requirement that all LogRecords are formatted as Events.

Splunk Observability Cloud can receive events via APIs, and these can be particularly useful in displaying data on charts.

### Profiles

> A profile is a collection of samples and associated metadata that shows where applications consume resources during execution. A sample records values encountered in some program context (typically a stack trace), optionally augmented with auxiliary information like the trace ID corresponding to a higher-level request.

> The moment of capturing a sample is known as a sample event and consists not only of the observation data point, but also the time at which it was captured.

Splunk Observability Cloud uses profiles to get a deeper understanding of where time is being spent in services, particularly large services that have many more lines of code than a typical microservice.
50 changes: 50 additions & 0 deletions content/en/ninja-workshops/19-agent-primer/2-what-is-an-agent.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,50 @@
---
title: What is an Agent
linkTitle: 2. What is an Agent
weight: 2
time: 5 minutes
---

## Agent Language

Before we go into the different types of agents, it's important to establish the terminology we'll use, so as to avoid confusion.

## OpenTelemetry

OpenTelemetry uses several different components to collect data from a system.

### Collector

OpenTelemetry defines the collector as:
> The OpenTelemetry Collector offers a vendor-agnostic implementation of how to receive, process and export telemetry data. It removes the need to run, operate, and maintain multiple agents/collectors. This works with improved scalability and supports open source observability data formats (e.g. Jaeger, Prometheus, Fluent Bit, etc.) sending to one or more open source or commercial backends.
- Source: [OpenTelemetry Docs](https://opentelemetry.io/docs/collector/)

**Is a collector an agent itself?** In a sense, yes. It contains **receivers**, which collect data; **processors**, which (you guessed it) process it; and **exporters**, which send that data to one or more destinations, such as gateways for routing or observability backends.

#### Receivers

Despite their name, **receivers** can work by either pull or push. For example, they can pull the host's CPU, memory, and disk information by scraping it. Or they can expose an endpoint that other systems push information into.
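
The two modes can be contrasted with a toy Python model. The `PullReceiver` and `PushReceiver` classes, the `read_cpu` function, and the sample values are hypothetical; they are not the collector's actual interfaces, only a sketch of the idea.

```python
class PullReceiver:
    """Hypothetical scraper: calls a source function each time collect() runs."""

    def __init__(self, scrape_fn):
        self.scrape_fn = scrape_fn

    def collect(self):
        return [self.scrape_fn()]


class PushReceiver:
    """Hypothetical endpoint: other systems call receive(); collect() drains the buffer."""

    def __init__(self):
        self._buffer = []

    def receive(self, item):
        self._buffer.append(item)

    def collect(self):
        items, self._buffer = self._buffer, []
        return items


def read_cpu():
    # Stand-in for reading host statistics; the value is made up.
    return {"metric": "cpu.utilization", "value": 0.42}


host_receiver = PullReceiver(read_cpu)
app_receiver = PushReceiver()
app_receiver.receive({"metric": "orders.count", "value": 3})  # an application pushes data in

print(host_receiver.collect())
print(app_receiver.collect())
```

Either way, the rest of the collector sees the same thing: a stream of telemetry to process and export.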

#### Processors

**Processors** modify or act on the data as it passes through the collector. For example, processors can:
* [redact](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/redactionprocessor)
* [filter](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/filterprocessor)
* [sample (tail)](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/tailsamplingprocessor)
* [sample (probabilistic)](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/probabilisticsamplerprocessor)
* [transform](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/metricstransformprocessor)

A few core processors are commonly used, such as the [memory limiter processor](https://github.com/open-telemetry/opentelemetry-collector/tree/main/processor/memorylimiterprocessor) (which can help mitigate out-of-memory situations) and the [batch processor](https://github.com/open-telemetry/opentelemetry-collector/tree/main/processor/batchprocessor) (which groups telemetry into batches, so that data compresses better and is sent over fewer connections).
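
The core idea behind batching can be sketched in Python as size-based grouping. The `batch` function and the span names are hypothetical, and this is not the batch processor's implementation; in particular, the sketch does not model flushing a partial batch after a timeout.

```python
def batch(items, max_batch_size):
    """Group items into lists of at most max_batch_size, preserving order."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    return [items[i:i + max_batch_size] for i in range(0, len(items), max_batch_size)]


# Seven made-up spans sent in batches of three need three sends instead of seven.
spans = [f"span-{n}" for n in range(7)]
batches = batch(spans, max_batch_size=3)
print(len(batches), [len(b) for b in batches])
```

Fewer, larger payloads mean fewer outgoing requests and give compression more repeated content to work with.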

#### Exporters

Now we need to send the data somewhere, and the **exporters** do that. Logs sent to Splunk Platform typically use the [Splunk HEC exporter](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter/splunkhecexporter). Other telemetry types use different exporters to reach their destinations.

#### Pipelines

These receivers, processors, and exporters are all brought together in [pipelines](https://opentelemetry.io/docs/collector/configuration/#basics).
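
A pipeline's data flow (receive, then process in order, then export) can be modeled in a few lines of Python. Everything here is a hypothetical stand-in: `run_pipeline`, the component functions, and the `sent` list acting as a backend. A real collector declares its pipelines in its configuration file rather than in code.

```python
def run_pipeline(receivers, processors, exporters):
    """Collect from every receiver, apply processors in order, hand the result to every exporter."""
    data = [item for receive in receivers for item in receive()]
    for process in processors:
        data = process(data)
    for export in exporters:
        export(data)
    return data


# Hypothetical components, written as plain functions for brevity.
def host_metrics():
    return [{"name": "cpu.utilization", "value": 0.42},
            {"name": "debug.counter", "value": 1}]


def drop_debug(items):
    return [item for item in items if not item["name"].startswith("debug.")]


def add_environment(items):
    return [{**item, "environment": "demo"} for item in items]


sent = []  # stands in for a backend

result = run_pipeline(
    receivers=[host_metrics],
    processors=[drop_debug, add_environment],
    exporters=[sent.extend],
)
print(result)
```

The order of the processors matters: filtering before enriching avoids doing work on data that will be dropped anyway.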

### Instrumentation Agents

So the collector provides a great backbone for collection, but how do we then collect data from the application side? That's where instrumentation agents come in.

34 changes: 34 additions & 0 deletions content/en/ninja-workshops/19-agent-primer/_index.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
---
title: Splunk Observability Agent Primer
linkTitle: Splunk Observability Agent Primer
weight: 19
archetype: chapter
time: 2 minutes
authors: ["Bill Grant", "Others TBD"]
description: This workshop provides a backdrop for understanding the different agents used within Splunk, then takes a deeper dive into OpenTelemetry. It may be a good primer to use before going into other ninja workshops that tackle a specific challenge.
draft: false
hidden: true
---

**Splunk** has a number of agents that represent the different ways to collect data for **Observability**.

In some sense an observability backend can be impartial; as long as the data is provided in a format that is expected, the backend technically doesn't care what agent sent the data.

Knowing which agent to use is an important decision, and one that is often made at the beginning of an Observability implementation.

Splunk has been the largest contributor to OpenTelemetry, which sets out to democratize the collection of data. When more vendors and customers standardize on it, the users of Observability win by benefiting from the collective development of software. It also means users can switch between vendors with less disruption to their business.

**OpenTelemetry** is certainly the most common technology **Splunk** recommends for observability use cases. But is it the only one? And which **OpenTelemetry** agent (or agents)? This primer will help to make sense of that.

This primer is accurate as of April 2026, but the landscape changes, so make sure you have the latest recommendations from **Splunk**.

This workshop will provide a lens into these different agents and when to use them.

The first few pages offer a primer on signals and OpenTelemetry. You can skim past these sections if you are already familiar with these concepts.

{{% notice title="Tip" style="primary" icon="lightbulb" %}}
The easiest way to navigate through this workshop is by using:

* the left/right arrows (**<** | **>**) on the top right of this page
* the left (◀️) and right (▶️) cursor keys on your keyboard
{{% /notice %}}