Fix EGL context creation on headless NVIDIA (EGL_BAD_ACCESS) by sam-kpm · Pull Request #13332 · Comfy-Org/ComfyUI

sam-kpm · 2026-04-09T00:47:09Z

Problem

On headless Linux with an NVIDIA GPU and no display server (no $DISPLAY / $WAYLAND_DISPLAY), the GLSL shader node fails with:

RuntimeError: Failed to create OpenGL context.
Backend errors:
  GLFW: glfw.create_window() failed
  EGL: EGLError(err=EGL_BAD_ACCESS, baseOperation=eglInitialize, ...)
  OSMesa: 'GLXPlatform' object has no attribute 'OSMesa'

This is a common setup: cloud VMs, remote GPU servers, and Docker containers with NVIDIA GPUs typically have no display server.

Root cause: initializing the display returned by eglGetDisplay(EGL_DEFAULT_DISPLAY) requires a running X or Wayland compositor. On a bare headless system, NVIDIA's EGL fails eglInitialize with EGL_BAD_ACCESS. The approach intended for headless GPU rendering is the EGL_EXT_platform_device extension: enumerate EGL devices and obtain a display from a specific device handle.

There are two additional complications:

- eglInitialize raises EGLError rather than returning False in some PyOpenGL version/EGL vendor combinations; the original code only checked the return value.
- PyOpenGL's egl_get_devices() wrapper does not reliably resolve the eglQueryDevicesEXT function pointer in headless NVIDIA scenarios, so the fallback must call libEGL.so.1 directly via ctypes.

Fix

When eglInitialize(EGL_DEFAULT_DISPLAY) fails (either by returning False or raising EGLError), fall back to device enumeration:

1. Load eglQueryDevicesEXT and eglGetPlatformDisplayEXT directly from libEGL.so.1 via ctypes (bypassing PyOpenGL's broken wrapper)
2. Enumerate available EGL devices
3. Obtain a display via eglGetPlatformDisplayEXT(EGL_PLATFORM_DEVICE_EXT, device, NULL)
4. Proceed with normal EGL context setup
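The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: the function names `_load_ext` and `enumerate_device_displays` are hypothetical, the entry points are resolved through `eglGetProcAddress`, and the library is located with `ctypes.util.find_library("EGL")`. The sketch stops after obtaining the displays and does not call eglInitialize. It returns an empty list when libEGL or the extensions are unavailable.

```python
import ctypes
import ctypes.util

# Enum value defined by the EGL_EXT_platform_device extension.
EGL_PLATFORM_DEVICE_EXT = 0x313F


def _load_ext(lib, name, restype, argtypes):
    """Resolve an EGL extension entry point, or return None if unavailable."""
    lib.eglGetProcAddress.restype = ctypes.c_void_p
    lib.eglGetProcAddress.argtypes = [ctypes.c_char_p]
    addr = lib.eglGetProcAddress(name)
    if not addr:
        return None
    return ctypes.CFUNCTYPE(restype, *argtypes)(addr)


def enumerate_device_displays():
    """Return EGLDisplay handles (as ints), one per device that yields a display."""
    path = ctypes.util.find_library("EGL")
    if path is None:
        return []
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        return []

    # EGLBoolean eglQueryDevicesEXT(EGLint max, EGLDeviceEXT *devs, EGLint *num)
    query = _load_ext(
        lib, b"eglQueryDevicesEXT", ctypes.c_uint32,
        [ctypes.c_int32, ctypes.POINTER(ctypes.c_void_p),
         ctypes.POINTER(ctypes.c_int32)],
    )
    # EGLDisplay eglGetPlatformDisplayEXT(EGLenum, void *native, const EGLint *attribs)
    get_display = _load_ext(
        lib, b"eglGetPlatformDisplayEXT", ctypes.c_void_p,
        [ctypes.c_uint32, ctypes.c_void_p, ctypes.POINTER(ctypes.c_int32)],
    )
    if query is None or get_display is None:
        return []

    # First pass: ask only for the device count.
    count = ctypes.c_int32(0)
    if not query(0, None, ctypes.byref(count)) or count.value <= 0:
        return []

    # Second pass: fetch exactly that many device handles.
    devices = (ctypes.c_void_p * count.value)()
    if not query(count.value, devices, ctypes.byref(count)):
        return []

    displays = []
    for i in range(count.value):
        display = get_display(EGL_PLATFORM_DEVICE_EXT, devices[i], None)
        if display:  # NULL (EGL_NO_DISPLAY) comes back as None
            displays.append(display)
    return displays


if __name__ == "__main__":
    print(f"found {len(enumerate_device_displays())} device display(s)")
```

Each returned handle would still need to be passed to eglInitialize before a context can be created on it.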

Testing

Verified on Ubuntu 24.04, NVIDIA driver 580.65.06, no display server, using the built-in GLSL shader node.

On headless Linux with NVIDIA GPUs and no display server, eglInitialize() with EGL_DEFAULT_DISPLAY fails with EGL_BAD_ACCESS. The fix falls back to EGL_EXT_platform_device: enumerate EGL devices and obtain a display via eglGetPlatformDisplayEXT(EGL_PLATFORM_DEVICE_EXT, ...). PyOpenGL's egl_get_devices() wrapper doesn't reliably resolve the eglQueryDevicesEXT function pointer in this scenario, so both functions are called directly from libEGL.so.1 via ctypes. Also handles the case where eglInitialize raises EGLError rather than returning False, which varies by PyOpenGL version and EGL vendor.

coderabbitai · 2026-04-09T00:52:01Z

📝 Walkthrough


Added ctypes and a new helper _egl_device_display(eglInitialize) that uses eglGetProcAddress to load eglQueryDevicesEXT and eglGetPlatformDisplayEXT from libEGL, enumerates EGL devices, obtains an EGLDisplay per device via EGL_EXT_platform_device, and attempts eglInitialize on each device, returning the first successful (display, major, minor). _init_egl() now tries eglGetDisplay(EGL_DEFAULT_DISPLAY) and eglInitialize first but treats failures as non‑fatal and falls back to _egl_device_display; debug logging was added for missing entry points, empty device lists, and per‑device init outcomes.

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately describes the main fix: resolving EGL context creation failures on headless NVIDIA systems with the EGL_BAD_ACCESS error.
Description check	✅ Passed	The description comprehensively explains the problem, root cause, solution approach, and testing verification, all directly related to the changeset.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.


coderabbitai

🧹 Nitpick comments (1)

comfy_extras/nodes_glsl.py (1)
256-259: Consider using c_uint32 instead of c_bool for the EGLboolean return type of eglQueryDevicesEXT.

EGLboolean is defined as unsigned int (32-bit) in the EGL specification, whereas ctypes.c_bool maps to C99 _Bool (typically 1 byte). While this works in practice due to calling conventions, using c_uint32 is more semantically correct and matches the actual EGL header definition.
🔧 Suggested fix
             _query_devices = ctypes.CFUNCTYPE(
-                ctypes.c_bool,
+                ctypes.c_uint32,
                 ctypes.c_int32, ctypes.POINTER(ctypes.c_void_p), ctypes.POINTER(ctypes.c_int32),
             )(_query_devices_ptr)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@comfy_extras/nodes_glsl.py` around lines 256 - 259, The EGL function wrapper
_query_devices currently uses ctypes.c_bool for the EGLboolean return, but
EGLboolean is a 32-bit unsigned int; update the CFUNCTYPE signature for
_query_devices (and any similar wrappers like eglQueryDevicesEXT) to use
ctypes.c_uint32 as the return type instead of ctypes.c_bool so the ctypes
signature matches the EGL header and avoids size/mapping mismatches.
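The width difference behind this nitpick can be checked directly with ctypes. The snippet below is only an illustration: the helper `read_as` is hypothetical, and reinterpreting a buffer is not the same thing as how ctypes reads a function's return register. It only shows that `c_bool` looks at fewer bytes than `c_uint32`, so a non-zero 32-bit value whose first byte is zero would read as False.

```python
import ctypes
import sys


def read_as(ctype, value):
    """Reinterpret the leading bytes of a 32-bit unsigned value as `ctype`."""
    return ctype.from_buffer_copy(ctypes.c_uint32(value)).value


# c_bool is typically 1 byte; c_uint32 is always 4 bytes.
print("sizeof(c_bool)   =", ctypes.sizeof(ctypes.c_bool))
print("sizeof(c_uint32) =", ctypes.sizeof(ctypes.c_uint32))

# 0x100 is non-zero as a 32-bit value, but on a little-endian machine
# its first byte is 0x00, so a 1-byte bool view of it is False.
print("as c_uint32:", read_as(ctypes.c_uint32, 0x100))
print("as c_bool:  ", read_as(ctypes.c_bool, 0x100), f"({sys.byteorder}-endian)")
```

EGL_TRUE is 1, so the two types agree for the values EGL actually returns, which matches the comment's note that `c_bool` works in practice.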



📥 Commits

Reviewing files that changed from the base of the PR and between b615af1 and 8e0558c.

📒 Files selected for processing (1)

comfy_extras/nodes_glsl.py

- Extract device enumeration into _egl_device_display() helper
- Use ctypes.util.find_library("EGL") instead of hardcoded libEGL.so.1
- Make an eglGetDisplay(EGL_DEFAULT_DISPLAY) failure also fall through to device enumeration (previously it raised immediately, skipping the fallback)
- Two-pass eglQueryDevicesEXT to avoid an arbitrary device count cap

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@comfy_extras/nodes_glsl.py`:
- Around line 245-254: The current code uses raw_devices[0] and only attempts
the first EGL device; change this to iterate over raw_devices and try each
device in turn by calling _get_platform_display(EGL_PLATFORM_DEVICE_EXT, device,
None), casting result to ctypes.c_void_p as display, then calling
eglInitialize(display, major, minor) for each until one returns true; on first
successful eglInitialize stop and use that display, and if none succeed raise a
RuntimeError indicating initialization failed for all enumerated devices
(include device info if available) — update references in this block for
EGL_PLATFORM_DEVICE_EXT, _get_platform_display, raw_devices, display, and
eglInitialize.


📥 Commits

Reviewing files that changed from the base of the PR and between 8e0558c and e24d0f0.

📒 Files selected for processing (1)

comfy_extras/nodes_glsl.py

- Use c_uint32 for EGLboolean return type (unsigned int per EGL spec, not _Bool)
- Try all enumerated EGL devices in order rather than only the first; skip devices where eglGetPlatformDisplayEXT or eglInitialize fails

coderabbitai

🧹 Nitpick comments (1)

comfy_extras/nodes_glsl.py (1)
200-264: Well-implemented EGL device enumeration fallback.

The implementation correctly addresses the headless NVIDIA issue by:

Loading EGL extensions directly via ctypes when PyOpenGL's wrapper is unreliable

Using c_uint32 for EGLboolean return type per EGL spec

Iterating all enumerated devices rather than just the first (addresses the prior review feedback)

One minor suggestion: the exception handler at lines 260-261 silently discards the exception. Logging it at debug level would help diagnose edge cases where eglInitialize raises instead of returning False.

Optional: Log caught exception for debugging
         try:
             if eglInitialize(display, major, minor):
                 logger.debug(f"_egl_device_display: device {i} succeeded, EGL version {major.value}.{minor.value}")
                 return display, major, minor
-        except Exception:
-            pass
+        except Exception as e:
+            logger.debug(f"_egl_device_display: device {i} eglInitialize raised {type(e).__name__}: {e}")
         logger.debug(f"_egl_device_display: device {i} eglInitialize failed, skipping")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@comfy_extras/nodes_glsl.py` around lines 200 - 264, In _egl_device_display,
don't silently swallow exceptions from the eglInitialize call; change the except
block that currently catches Exception and passes to log the exception at debug
level (include the device index and exception info) so you can diagnose failures
where eglInitialize raises instead of returning False; update the except
Exception handler around the call to eglInitialize(display, major, minor) to
capture the exception as e and call logger.debug (or logger.debug(...,
exc_info=True)) with a short message referencing the device index and the
exception.


📥 Commits

Reviewing files that changed from the base of the PR and between e24d0f0 and 9e28569.

📒 Files selected for processing (1)

comfy_extras/nodes_glsl.py

sam-kpm requested review from Kosinkadink, comfyanonymous and guill as code owners April 9, 2026 00:47


Address CodeRabbit review comments

9e28569


Fix EGL context creation on headless NVIDIA (EGL_BAD_ACCESS) #13332
sam-kpm wants to merge 3 commits into Comfy-Org:master from sam-kpm:fix/egl-headless-nvidia-device-enumeration
