4 changes: 3 additions & 1 deletion platform-enterprise_docs/enterprise-sidebar.json
@@ -58,7 +58,9 @@
"enterprise/advanced-topics/custom-launch-container",
"enterprise/advanced-topics/firewall-configuration",
"enterprise/advanced-topics/seqera-container-images",
"enterprise/advanced-topics/content-security-policy"
"enterprise/advanced-topics/content-security-policy",
"enterprise/advanced-topics/jvm-memory-tuning",
"enterprise/advanced-topics/monitoring"
]
},
"enterprise/general_troubleshooting"
@@ -0,0 +1,66 @@
---
title: "JVM memory tuning"
description: Configure JVM memory parameters for Seqera Platform Enterprise deployments
date created: "2025-12-17"
tags: [configuration, jvm, memory, tuning]
---

# JVM memory tuning

:::warning
JVM memory tuning is an advanced topic. Incorrect settings can cause instability and performance issues.
:::

Seqera Platform scales its memory allocation based on the resources allocated to the application container. So that the JVM can detect the memory available to it, set memory requests and limits on your deployments: resource requests and limits in Kubernetes manifests, or `deploy.resources` reservations and limits in Docker Compose. We recommend increasing the container memory allocation before manually configuring JVM settings.
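
The following is a minimal sketch of setting memory requests and limits on a Kubernetes Deployment for the backend container; the Deployment name, labels, and sizes are illustrative and should be matched to your own manifests. For Docker Compose, the equivalent is the `deploy.resources` `limits` and `reservations` keys on the backend service.

```yaml
# Sketch only: memory request and limit for the Platform backend container.
# Names and sizes are illustrative; align them with your own manifests.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend
spec:
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
    spec:
      containers:
        - name: backend
          image: cr.seqera.io/private/nf-tower-enterprise/backend:v25.3.0
          resources:
            requests:
              memory: "2Gi"   # baseline allocation used for scheduling
            limits:
              memory: "4Gi"   # hard cap that the JVM must stay within
```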

**Member:**

"increasing memory allocation" -- I assume this means requests / limits in the K8s manifests? Vertical scaling on a docker compose node?

**Contributor Author:**

This applies to Docker Compose as well.

We should be setting these values:

backend:
    image: cr.seqera.io/private/nf-tower-enterprise/backend:v25.3.0
    platform: linux/amd64
    command: -c '/wait-for-it.sh db:3306 -t 60; /tower.sh'
    networks:
      - frontend
      - backend
    expose:
      - 8080
    deploy:
      resources:
        limits:
          memory: 4G        # <---- Limit 
        reservations:
          memory: 2G        # <---- Reservations
    restart: always
    depends_on:
      - db
      - redis
      - cron

**Member (@gwright99, Dec 17, 2025):**

Not currently defined in the docker-compose template (maybe we should add it so things are aligned?)

# https://docs.seqera.io/assets/files/docker-compose-0655848af8f21b6e6211d1a9c8ebc702.yml
  backend:
    image: cr.seqera.io/private/nf-tower-enterprise/backend:v25.3.0
    platform: linux/amd64
    command: -c '/wait-for-it.sh db:3306 -t 60; /tower.sh'
    networks:
      - frontend
      - backend
    expose:
      - 8080
    volumes:
      - $PWD/tower.yml:/tower.yml
      # Data studios RSA key is required for the data studios functionality. Uncomment the line below to mount the key.
      #- $PWD/data-studios-rsa.pem:/data-studios-rsa.pem
    env_file:
      # Seqera environment variables — see https://docs.seqera.io/platform-enterprise/enterprise/configuration/overview for details
      - tower.env
    environment:
      # Micronaut environments are required. Do not edit these values
      - MICRONAUT_ENVIRONMENTS=prod,redis,ha
    restart: always
    depends_on:
      - db
      - redis
      - cron

**Member:**

Is there a scenario when we expect a client would need to start tinkering with the JVM settings? When is it? How would it be identified?

**Contributor Author:**

Some of this is answered by #954 where both of these will need revisions.

JVM monitoring links back to overall system monitoring.


If you wish to manually configure JVM memory, use the following baseline recommendations.

## Memory parameters

Set JVM memory parameters using the `JAVA_OPTS` environment variable, for example in `tower.env` for Docker Compose deployments or as a container environment variable in Kubernetes manifests. The following parameters control memory allocation (a combined example follows the table):

**Member:**

Where is this being set? Given what's been pulled out, looks like as an environment variable in tower.env / an env in the K8s manifests.

**Contributor Author:**

Something like

JVM memory parameters can be configured using the JAVA_OPTS environment variable. The following parameters control memory allocation:


**Member:**

Is there an external source we can direct folks to for more fulsome examples? Or is the idea "you should NOT be messing with this if you don't understand what you are doing"?

**Contributor Author:**

I could not find an easily accessible external source which did not go deep into compiler and runtime behaviors.

| Parameter | Description |
| -------------------------- | ------------------------------------------------------------------------------------------------------------------------------------ |
| `-Xms` / `-Xmx` | Set the initial (`Xms`) and maximum (`Xmx`) heap size. The heap stores Java objects and should be 50-70% of total allocated memory. |
| `-XX:MaxDirectMemorySize` | Set the maximum direct (off-heap) memory. Used for NIO operations, network buffers, and file I/O. |
| `-XX:ActiveProcessorCount` | Set the number of CPUs available to the JVM. Should match the number of vCPUs allocated to the container. |
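
As a minimal sketch, assuming a Docker Compose deployment, the three parameters can be combined into a single `JAVA_OPTS` entry in the backend service's `environment` block (or equivalently in `tower.env`, which the service loads via `env_file`). The values below correspond to the 2 vCPU / 4 GB row of the example configurations table later on this page; adjust them to your own sizing.

```yaml
# Sketch only: passing JAVA_OPTS to the backend service in docker-compose.yml.
services:
  backend:
    image: cr.seqera.io/private/nf-tower-enterprise/backend:v25.3.0
    environment:
      # Micronaut environments are required. Do not edit these values
      - MICRONAUT_ENVIRONMENTS=prod,redis,ha
      # ActiveProcessorCount should match the vCPUs available to the service
      - JAVA_OPTS=-XX:ActiveProcessorCount=2 -Xms1000M -Xmx2000M -XX:MaxDirectMemorySize=800m
    deploy:
      resources:
        limits:
          memory: 4G        # total JVM memory must stay below this limit
        reservations:
          memory: 2G
```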

## Resource allocation guidelines

- **Heap (`-Xmx`)**: 50-70% of total allocated memory
- **Direct memory**: 10-20% of total allocated memory
- **Overhead** (metaspace, thread stacks, native memory): ~10% of total allocated memory

Ensure total JVM memory (heap + direct memory + overhead) does not exceed container memory limits.
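
For example, with a 4 GB container limit, a split of roughly 2.5 GB heap (about 60%), 800 MB direct memory (about 20%), and around 400 MB of overhead (about 10%) totals approximately 3.7 GB, which stays under the 4 GB limit. This matches the 1 vCPU / 4 GB row in the example configurations below.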

**Member:**

I understand what this means for K8s. Less sure on first glance how I'd do it with docker compose.

**Contributor Author:**

Answered above: we can configure limits on Compose. It is a recommended pattern, especially when running other services on the cluster.

As a general Compose best practice, we should be allocating memory limits to all services and ensuring that the total limit leaves enough headroom for the host OS, for example 0.5-2 GB.


## Example configurations

The following table provides example configurations for common deployment sizes. These are starting points and may need to be tuned based on your specific usage patterns.

**Member:**

See earlier question about "What scenario would trigger a need to tinker?"


| vCPU | RAM | Heap (`-Xmx`) | Direct Memory | `JAVA_OPTS` |
| :--: | :---: | :-----------: | :-----------: | ------------------------------------------------------------------------------- |
| 1 | 2 GB | 1 GB | 512 MB | `-XX:ActiveProcessorCount=1 -Xms500M -Xmx1000M -XX:MaxDirectMemorySize=512m` |
| 1 | 4 GB | 2.5 GB | 800 MB | `-XX:ActiveProcessorCount=1 -Xms1000M -Xmx2500M -XX:MaxDirectMemorySize=800m` |
| 2 | 2 GB | 1 GB | 512 MB | `-XX:ActiveProcessorCount=2 -Xms500M -Xmx1000M -XX:MaxDirectMemorySize=512m` |
| 2 | 4 GB | 2 GB | 800 MB | `-XX:ActiveProcessorCount=2 -Xms1000M -Xmx2000M -XX:MaxDirectMemorySize=800m` |
| 2 | 8 GB | 5 GB | 1.5 GB | `-XX:ActiveProcessorCount=2 -Xms2000M -Xmx5000M -XX:MaxDirectMemorySize=1500m` |
| 3 | 2 GB | 1 GB | 512 MB | `-XX:ActiveProcessorCount=3 -Xms500M -Xmx1000M -XX:MaxDirectMemorySize=512m` |
| 3 | 4 GB | 2 GB | 800 MB | `-XX:ActiveProcessorCount=3 -Xms1000M -Xmx2000M -XX:MaxDirectMemorySize=800m` |
| 3 | 8 GB | 5 GB | 1.5 GB | `-XX:ActiveProcessorCount=3 -Xms2000M -Xmx5000M -XX:MaxDirectMemorySize=1500m` |
| 3 | 16 GB | 11 GB | 2.5 GB | `-XX:ActiveProcessorCount=3 -Xms4000M -Xmx11000M -XX:MaxDirectMemorySize=2500m` |
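
To apply one of these rows on Kubernetes, the same `JAVA_OPTS` string can be set as a container environment variable alongside the resource limits. The fragment below is a sketch of the backend container spec (it slots under the `containers:` key of a Deployment such as the earlier sketch) and applies the 2 vCPU / 8 GB row; the container name and request sizes are illustrative.

```yaml
# Sketch only: backend container spec applying the 2 vCPU / 8 GB row above.
# Place this under the containers: key of your backend Deployment.
- name: backend
  image: cr.seqera.io/private/nf-tower-enterprise/backend:v25.3.0
  env:
    - name: JAVA_OPTS
      value: "-XX:ActiveProcessorCount=2 -Xms2000M -Xmx5000M -XX:MaxDirectMemorySize=1500m"
  resources:
    requests:
      cpu: "2"
      memory: "4Gi"
    limits:
      cpu: "2"
      memory: "8Gi"
```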

## When to adjust memory settings

**Member:**

Ok, it's at the bottom. Based on my questions, I think this would be more useful nearer to the top.


Adjust your JVM memory settings if you observe the following issues in your deployment:

**Increase heap memory (`-Xmx`)** if you see:

- `OutOfMemoryError: Java heap space` errors in logs
- Garbage collection pauses affecting performance

**Member:**

How are we expecting these metrics to be visible? IIRC you don't get memory metrics with the standard EC2 monitoring package. Do we expect the client to upgrade their monitoring system / be using an aggregating agent like Datadog?

**Contributor Author:**

As above, they would need to be monitoring via Prometheus or another agent and monitoring JVM stats.

- Steadily growing memory usage under sustained load

**Increase direct memory (`MaxDirectMemorySize`)** if you see:

- `OutOfMemoryError: Direct buffer memory` errors in logs

**Member:**

Increase relative to what?

  • Grant more memory at the expense of heap?
  • Grant more memory at the expense of overhead?
  • Something else?

**Contributor Author:**

It should not be at the expense of the other.

There is a typical expected ratio of heap vs direct memory.

If the heap is hitting 100% usage, it can be scaled on its own. You can then review your direct memory usage and opt to reduce it if you have overhead, or increase the memory allocated to the pod.

- High concurrent workflow launch rates (more than 100 simultaneous workflows)

**Member:**

Is 100 a known pain point when using the default options or was this just chosen because it's a nice number?

**Contributor Author:**

Yes. "Workflows" is the wrong word here; it's not Nextflow related, it's Java task allocation related.

- Large configuration payloads or extensive API usage

**Member:**

"Large" ==?
"Extensive" == ?

**Contributor Author:**

Happy to drop these
