Original file line number Diff line number Diff line change
@@ -0,0 +1,64 @@
---
title: Deploy Elasticsearch on Azure Cobalt 100 Arm virtual machines

description: Learn how to deploy Elasticsearch on an Azure Cobalt 100 Arm virtual machine, validate the service, and run a baseline ESRally benchmark.

draft: true
cascade:
    draft: true

minutes_to_complete: 30

who_is_this_for: This Learning Path is for developers who want to deploy and benchmark Elasticsearch on Azure Cobalt 100 Arm virtual machines.

learning_objectives:
- Provision an Arm-based Azure Cobalt 100 virtual machine using the Azure portal
- Install and validate Elasticsearch on the Cobalt 100 VM
- Run a baseline ESRally benchmark and interpret key performance metrics

prerequisites:
- A [Microsoft Azure](https://azure.microsoft.com/) account with access to Cobalt 100 instances (Epdsv6)
- Basic familiarity with SSH
- Familiarity with Elasticsearch and ESRally


author: Doug Anson

### Tags
skilllevels: Introductory
subjects: Performance and Architecture
cloud_service_providers:
- Microsoft Azure

armips:
- Neoverse

tools_software_languages:
- Elasticsearch
- ESRally
- Bash

operatingsystems:
- Linux

# ================================================================================
# FIXED, DO NOT MODIFY
# ================================================================================
further_reading:
    - resource:
        title: Azure Virtual Machines documentation
        link: https://learn.microsoft.com/en-us/azure/virtual-machines/
        type: documentation
    - resource:
        title: Elasticsearch documentation
        link: https://www.elastic.co/docs/reference/elasticsearch
        type: documentation
    - resource:
        title: ESRally documentation
        link: https://esrally.readthedocs.io/en/stable/index.html
        type: documentation

weight: 1 # _index.md always has weight of 1 to order correctly
layout: "learningpathall" # All files under learning paths have this same wrapper
learning_path_main_page: "yes"
---
@@ -0,0 +1,8 @@
---
# ================================================================================
# FIXED, DO NOT MODIFY THIS FILE
# ================================================================================
weight: 21 # Set to always be larger than the content in this path to be at the end of the navigation.
title: "Next Steps" # Always the same, html page title.
layout: "learningpathall" # All files under learning paths have this same wrapper for Hugo processing.
---
@@ -0,0 +1,37 @@
---
title: Understand Azure Cobalt 100 VMs and Elasticsearch benchmarking with ESRally

weight: 2

layout: "learningpathall"
---

## Learning Path overview

In this Learning Path, you deploy Elasticsearch on an Arm-based Azure Cobalt 100 virtual machine and run a baseline benchmark with ESRally. You then review key latency and throughput metrics so you can assess initial performance on Arm.

## What you will do

You will complete one end-to-end developer task:

1. Create an Azure Cobalt 100 Arm virtual machine.
2. Install Elasticsearch and ESRally.
3. Run the geonames track and review benchmark results.

## Azure Cobalt 100 Arm-based processor

Azure’s Cobalt 100 is Microsoft’s first-generation, in-house Arm-based processor. Built on Arm Neoverse N2, Cobalt 100 is a 64-bit CPU that delivers strong performance and energy efficiency for cloud-native, scale-out Linux workloads such as web and application servers, data analytics, open-source databases, and caching systems. Running at 3.4 GHz, Cobalt 100 allocates a dedicated physical core for each vCPU, which helps ensure consistent and predictable performance.

To learn more, see the Microsoft blog [Announcing the preview of new Azure VMs based on the Azure Cobalt 100 processor](https://techcommunity.microsoft.com/blog/azurecompute/announcing-the-preview-of-new-azure-vms-based-on-the-azure-cobalt-100-processor/4146353).

## Elasticsearch

Elasticsearch is a distributed search and analytics engine built on Apache Lucene that is used to index, store, search, and analyze large volumes of structured and unstructured data in near real time. It is commonly used for full-text search, log and event analytics, observability, security workloads, and increasingly as a vector database for AI-powered retrieval use cases.

## ESRally benchmarking tool for Elasticsearch

ESRally is Elastic's benchmarking tool for Elasticsearch, designed to measure indexing, query, and cluster performance under realistic workloads so teams can compare configurations, detect regressions, and evaluate tuning changes. It works by running repeatable benchmark races against Elasticsearch using predefined or custom tracks, making it useful for both local testing and larger-scale performance validation.

## What you've learned and what's next

In this section, you learned about the Azure Cobalt 100 processor, Elasticsearch, and ESRally. Next, you'll create an Azure Cobalt 100 VM and prepare it for installing Elasticsearch and the ESRally benchmarking tool.
@@ -0,0 +1,73 @@
---
title: Benchmark Elasticsearch using ESRally on the Cobalt 100 virtual machine instance

weight: 5

layout: "learningpathall"
---

## Introduction

ESRally is designed to benchmark Elasticsearch instances. It uses a race model where a selected track exercises specific indexing and query patterns. In this section, you run the geonames track to benchmark your Elasticsearch deployment.

## The geonames track in ESRally

The geonames track is a general-purpose Elasticsearch workload based on the GeoNames dataset. ESRally indexes millions of location records and runs representative search operations so you can measure indexing throughput and query latency under mixed ingest and search conditions.

## Run the benchmark

Open an SSH shell on the virtual machine and run the benchmark command:

```bash
esrally race --distribution-version=9.3.0 --track=geonames --kill-running-processes
```

With `--distribution-version=9.3.0`, ESRally downloads Elasticsearch 9.3.0, provisions it for the race, and then runs through the geonames "racetrack".

{{% notice Note %}}
The benchmark takes about 15-20 minutes to complete. Do not interrupt or pause it, because doing so skews the results.
{{% /notice %}}
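
To keep results for later comparison, ESRally can write the summary report to a file with the `--report-file` and `--report-format` options. The following sketch is an optional addition, not part of the original procedure. The helper name `metric_value`, the file paths, and the two sample rows are assumptions for illustration (the sample values are the rounded figures quoted later in this section), and it assumes the CSV report uses a `Metric,Task,Value,Unit` header.

```bash
# Assumed invocation (not run here): write the summary as CSV.
#   esrally race --distribution-version=9.3.0 --track=geonames \
#     --kill-running-processes --report-format=csv --report-file="$HOME/geonames.csv"

# metric_value FILE METRIC TASK
# Hypothetical helper: print the Value column for one metric/task pair.
metric_value() {
  awk -F',' -v metric="$2" -v task="$3" \
    '$1 == metric && $2 == task { print $3; exit }' "$1"
}

# Illustrative sample standing in for a real report file.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
Metric,Task,Value,Unit
Mean Throughput,index-append,27530,docs/s
50th percentile latency,scroll,199,ms
EOF

metric_value "$sample" "Mean Throughput" "index-append"
metric_value "$sample" "50th percentile latency" "scroll"
```

On the VM, point `metric_value` at your own report file instead of the sample, and check the metric and task names against the header and rows that your ESRally version actually writes.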

## Interpreting the benchmark results

The following sample output shows a baseline geonames run on an Azure Cobalt 100 E4pds_v6 virtual machine. Indexing
sustained about 27,530 docs/s with 0% errors, while common read-path workloads such as default search, term
search, phrase search, and cached aggregation stayed in an excellent 3-4 ms p50 range. The system also completed
the run without any old-generation garbage collections, suggesting healthy JVM behavior under this benchmark. The
main latency costs appeared in heavier workloads such as uncached aggregation, scroll, expression queries, and
script-based scoring, which is consistent with Elasticsearch performance expectations for compute-intensive query
patterns.

### Example performance summary

![Screenshot of ESRally geonames summary output with throughput and latency metrics for a Cobalt 100 Arm VM. Use this table to confirm your run completed and produced baseline results.#center](images/performance-1.png "Performance summary for the E4pds_v6 Arm64 Cobalt 100 VM")

### Example detailed metrics

![Screenshot of detailed ESRally geonames metrics including query types, p50 latency, and GC data. Review these values to identify the heaviest operations in your workload.#center](images/performance-2.png "Performance details on the E4pds_v6 Arm64 Cobalt 100 VM")

## Key findings

1. Indexing throughput averaged 27,530 docs/s and completed without reported errors.
2. Common search workloads were consistently fast, with default, term, and phrase queries all clustered around 3-4
ms p50 latency.
3. Caching made a major difference for aggregation: cached country aggregation was about 32.7x faster than
uncached at p50 latency.
4. Scripted scoring remained expensive: field_value_script_score was about 1.33x slower than
field_value_function_score at p50 latency, and painless_static was one of the slowest tasks in the run.
5. Scroll and uncached aggregation were the most notable non-script latency costs at about 199 ms and 103 ms p50
latency, respectively.
6. JVM behavior looked stable because 844 young-generation collections consumed only 5.76 seconds total and
there were no old-generation collections.
7. Merge work totaled 5.76 minutes with 2.36 minutes of throttle time, indicating some ingest-side background
pressure but not a throughput collapse.
8. The final store footprint matched the dataset size at 2.68 GB, suggesting low additional storage overhead in this
run.
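
The ratios in the findings above are simple quotients of p50 latencies. A minimal sketch of that arithmetic follows. The helper name `speedup` and the input values are hypothetical placeholders, not figures from this run, so substitute the p50 values from your own report.

```bash
# speedup SLOW_MS FAST_MS
# Hypothetical helper: print how many times larger the first latency is than the second.
speedup() {
  awk -v slow="$1" -v fast="$2" 'BEGIN {
    if (fast <= 0) { print "fast latency must be positive" > "/dev/stderr"; exit 1 }
    printf "%.1fx\n", slow / fast
  }'
}

# Placeholder values in milliseconds.
speedup 100 4    # prints 25.0x
```

Comparing ratios rather than absolute values makes it easier to see whether a tuning change helped one query type more than another.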

## Conclusions

This benchmark result supports the view that Azure Cobalt 100 E4pds_v6 is capable of delivering strong Elasticsearch baseline performance for the geonames track, especially for ordinary search, sort, and cached aggregation paths. The run also suggests good operational stability under this workload because the benchmark completed successfully with zero errors, no old-generation GC, and sustained ingest throughput. The main practical limitation is query complexity: scripting, score computation, scroll, and uncached aggregation create a clear latency step-up relative to fast-path queries, so these workloads should be isolated, cached, or minimized when low latency matters.

## What you've learned and what's next

In this section, you ran ESRally on Elasticsearch and interpreted the main throughput and latency indicators. Next, use the related Learning Paths on the Next Steps page to continue tuning or evaluating Arm-based deployments.
@@ -0,0 +1,68 @@
---
title: Create an Azure Cobalt 100 Arm virtual machine for Elasticsearch deployment
weight: 3
### FIXED, DO NOT MODIFY
layout: learningpathall
---

## Prerequisites and setup

There are several common ways to create an Arm-based Cobalt 100 virtual machine, and you can choose the method that best fits your workflow or requirements:

- The Azure portal
- The Azure CLI
- An infrastructure as code (IaC) tool

In this section, you will use the Azure portal to create a virtual machine with the Arm-based Azure Cobalt 100 processor.

This Learning Path focuses on memory-optimized virtual machines in the Epdsv6 series. For more information, see the [Microsoft Azure guide for the Epdsv6 size series](https://learn.microsoft.com/en-us/azure/virtual-machines/sizes/memory-optimized/epdsv6-series?tabs=sizebasic).

While the steps to create this instance are included here for convenience, you can also refer to the [Deploy a Cobalt 100 virtual machine on Azure Learning Path](/learning-paths/servers-and-cloud-computing/cobalt/).
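
If you prefer the Azure CLI option listed above, the following sketch shows roughly what an equivalent `az vm create` call could look like. It is a hedged example rather than a tested part of this Learning Path. The resource group, VM name, region, and username are placeholders, and the image URN is an assumption (a standard Ubuntu Server 24.04 Arm64 image rather than Ubuntu Pro) that you should confirm with `az vm image list`. The script only prints the command so that you can review it first.

```bash
# All values below are placeholders; change them for your subscription.
RESOURCE_GROUP="${RESOURCE_GROUP:-my-cobalt-rg}"
VM_NAME="${VM_NAME:-cobalt-es-vm}"
LOCATION="${LOCATION:-eastus}"
# Assumed Arm64 Ubuntu 24.04 image URN; confirm it before use.
IMAGE="${IMAGE:-Canonical:ubuntu-24_04-lts:server-arm64:latest}"

# Hypothetical helper: print the az command instead of running it.
build_vm_create_cmd() {
  printf '%s ' az vm create \
    --resource-group "$RESOURCE_GROUP" \
    --name "$VM_NAME" \
    --location "$LOCATION" \
    --image "$IMAGE" \
    --size Standard_E4pds_v6 \
    --admin-username azureuser \
    --generate-ssh-keys
  printf '\n'
}

# Print the command for review; run it yourself once the values look right.
build_vm_create_cmd
```

The resource group must already exist before you run the printed command, and the chosen region must offer the E4pds_v6 size.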

## Create an Arm-based Azure virtual machine

Creating a virtual machine on Azure Cobalt 100 follows the standard Azure VM flow. You specify basic settings, select an operating system image, configure authentication, and set up networking and security options.

For more information, see the [Azure VM creation documentation](https://learn.microsoft.com/en-us/azure/virtual-machines/linux/quick-create-portal).

To create a VM using the Azure portal, follow these steps:

- In the Azure portal, go to **Virtual machines**.

- Select **Create**, then choose **Virtual machine** from the drop-down.

- On the **Basics** tab, enter **Virtual machine name** and **Region**.

- Under **Image**, choose your OS (for example, *Ubuntu Pro 24.04 LTS*) and set **Architecture** to **Arm64**.

- In **Size**, select **See all sizes**, choose the **E-series v6** series, then select **E4pds_v6**.

![Screenshot of Azure portal VM creation options showing the Arm64 architecture and E4pds_v6 size selection for Cobalt 100. Confirm these fields so your VM runs on Arm hardware.#center](images/instance.png "Select the Epdsv6 series and E4pds_v6")

- Under **Authentication type**, choose **SSH public key**. Azure can generate a key pair and store it for future use. For **SSH key type**, **ED25519** is recommended (RSA is also supported).

- Enter the **Administrator username**.

- If generating a new key, select **Generate new key pair**, choose **ED25519** (or **RSA**), and provide a **Key pair name**.

- In **Inbound port rules**, select **HTTP (80)** and **SSH (22)**.

![Screenshot of Azure networking settings showing inbound rules for SSH on port 22 and HTTP on port 80. Verify these rules to allow remote access and service checks during setup.#center](images/instance1.png "Allow inbound port rules")

- Select **Review + create** and review your configuration. It should look similar to:

![Screenshot of the Review and create page for an Arm64 Ubuntu VM on Cobalt 100. Check this summary to confirm region, image, architecture, and VM size before deployment.#center](images/ubuntu-pro.png "Review and create an Arm64 VM on Cobalt 100")

When you’re ready, select **Create**, then **Download private key and create resources**.

![Screenshot of the key pair prompt during VM creation. Download the private key at this step so you can connect to the VM over SSH after deployment.#center](images/instance4.png "Download private key and create resources")

Your virtual machine should be ready in a few minutes. You can then connect over SSH using your private key and the VM public IP address.

![Screenshot of the Azure portal deployment result showing the VM in a running state with public IP information. Use this page to confirm deployment completed successfully before continuing.#center](images/final-vm.png "VM deployment confirmation in the Azure portal")

{{% notice Note %}}To learn more about Arm-based virtual machines on Azure, see the section *Getting Started with Microsoft Azure* within the Learning Path [Get started with Arm-based cloud instances](/learning-paths/servers-and-cloud-computing/csp/azure).{{% /notice %}}
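
After you connect over SSH, a quick check can confirm that the VM reports an Arm architecture. This is a minimal sketch. The function name `check_arch` is hypothetical, and it assumes `uname -m` reports `aarch64` on Arm64 Linux.

```bash
# check_arch MACHINE
# Hypothetical helper: succeed only when the machine string is an Arm64 name.
check_arch() {
  case "$1" in
    aarch64|arm64) echo "Arm64" ;;
    *) echo "not Arm64: $1"; return 1 ;;
  esac
}

# Report the architecture of the current machine without aborting the shell.
check_arch "$(uname -m)" || true
```

If the check reports anything other than Arm64, revisit the **Architecture** and **Size** selections on the **Basics** tab.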

## What you've learned and what's next

In this section, you created an Arm-based Azure Cobalt 100 virtual machine and confirmed the deployment details needed for SSH access. Next, you will install Elasticsearch and ESRally on the VM.
@@ -0,0 +1,97 @@
---
title: Install Elasticsearch and ESRally on the Cobalt 100 virtual machine instance

weight: 4

layout: "learningpathall"
---

## Introduction

In this section, you prepare your Cobalt 100 virtual machine for Elasticsearch and ESRally. After the base packages are installed, you install Elasticsearch and the benchmarking tool.

## Prepare the virtual machine

Start by updating the virtual machine and installing required dependencies.

```bash
# Refresh package lists and install build tools, Python headers, and OpenJDK 21
sudo apt update
sudo apt install -y build-essential net-tools curl wget python3-dev python3-venv python3-pip openjdk-21-jdk apt-transport-https ca-certificates software-properties-common
sudo apt -y dist-upgrade
sudo apt -y autoremove
sudo apt -y autoclean
# Add the deadsnakes PPA and install Python 3.10
sudo add-apt-repository -y ppa:deadsnakes/ppa
sudo apt update
sudo apt install -y python3.10-dev python3.10 python3.10-venv
# Create the virtual environment in your home directory and activate it at login
python3.10 -m venv "$HOME/rally"
echo "source $HOME/rally/bin/activate" >> "$HOME/.bashrc"
echo "export JAVA21_HOME=/usr/lib/jvm/java-21-openjdk-arm64" >> "$HOME/.bashrc"
# Reboot so that upgraded packages take effect
sudo reboot
```

After the VM reboots, open a new SSH shell and confirm that Java is available and points to OpenJDK 21.

```bash
which java
java --version
```

Output should be similar to:

```output
/usr/bin/java

openjdk 21.0.10 2026-01-20
OpenJDK Runtime Environment (build 21.0.10+7-Ubuntu-124.04)
OpenJDK 64-Bit Server VM (build 21.0.10+7-Ubuntu-124.04, mixed mode, sharing)
```
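
If you script this check, you can extract the major version from the first line of the `java --version` output. The sketch below assumes the first line has the form shown above (`openjdk <version> <date>`). The helper name `java_major` is hypothetical.

```bash
# java_major: read `java --version` output on stdin and print the major version.
java_major() {
  awk 'NR == 1 { split($2, parts, "."); print parts[1] }'
}

# Sample first line copied from the output above.
echo "openjdk 21.0.10 2026-01-20" | java_major    # prints 21
```

On the VM, run `java --version | java_major` and expect `21`.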

## Install Elasticsearch and ESRally

Now install Elasticsearch from the official package repository, then install ESRally into the virtual environment.

```bash
curl -fsSL https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elasticsearch-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/elasticsearch-keyring.gpg] https://artifacts.elastic.co/packages/9.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-9.x.list
sudo apt update && sudo apt install elasticsearch -y
sudo systemctl enable elasticsearch
sudo systemctl start elasticsearch
pip install --upgrade pip
pip install esrally
```

Confirm that Elasticsearch is installed and running.

```bash
sudo systemctl status elasticsearch
```

Output should be similar to:

```output
elasticsearch.service - Elasticsearch
Loaded: loaded (/usr/lib/systemd/system/elasticsearch.service; enabled; preset: enabled)
Active: active (running) since Thu 2026-04-23 15:18:13 UTC; 2min 3s ago
Docs: https://www.elastic.co
Main PID: 722 (java)
Tasks: 92 (limit: 38379)
Memory: 16.5G (peak: 16.5G)
CPU: 43.282s
CGroup: /system.slice/elasticsearch.service
```
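
If you automate these steps, a small retry helper can make the status check more tolerant of slow starts, for example right after a reboot. This is a sketch under stated assumptions. The helper name `wait_for` and the `WAIT_SECONDS` variable are hypothetical, and the `systemctl` call is shown only as a comment so that the snippet has no side effects.

```bash
# wait_for TRIES COMMAND [ARGS...]
# Hypothetical helper: rerun COMMAND until it succeeds or TRIES attempts are used.
wait_for() {
  local tries="$1"
  shift
  local attempt=1
  while [ "$attempt" -le "$tries" ]; do
    if "$@" > /dev/null 2>&1; then
      return 0
    fi
    attempt=$((attempt + 1))
    # Pause between attempts; override WAIT_SECONDS to change the delay.
    sleep "${WAIT_SECONDS:-2}"
  done
  return 1
}

# Example on the VM (commented out here):
#   wait_for 30 systemctl is-active --quiet elasticsearch && echo "Elasticsearch is active"
```

The helper returns a non-zero status when every attempt fails, so a calling script can stop before it starts the benchmark.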

Confirm that ESRally is installed.

```bash
esrally --version
```

Output should be similar to:

```output
esrally 2.13.0
```

## What you've learned and what's next

In this section, you prepared the VM and installed both Elasticsearch and ESRally. In the next section, you will run the geonames benchmark and review the baseline performance results.