JS Translator and gremlin-mcp improvements#3336
Conversation
The translator only converts canonical Gremlin to variants.
Some cleanup after rebase from gremlin-javascript http driver changes
|
|
||
| visitStringLiteral(ctx: any): void { | ||
| // Preserve original quote style and escape Groovy GString interpolation | ||
| this.sb.push(ctx.getText().replace(/\$/g, '\\$')); |
Check failure
Code scanning / CodeQL
Incomplete string escaping or encoding High
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 1 day ago
In general, to fix this kind of issue you must ensure that when you introduce backslash-based escaping for some meta-character (here $), you also escape any existing backslashes in the input first. That way, an input like "\$foo" does not turn into "\\$foo" in a way that changes which character is actually escaped.
Concretely, for this file, we should update the two places where ctx.getText().replace(/\$/g, '\\$') is used (in visitStringLiteral and visitStringNullableLiteral) to escape backslashes before dollar signs. A minimal, behavior-preserving modification is:
- First replace every single backslash
\with a double backslash\\using a global regex/\\/g. - Then replace every
$with\$using the existing/\$/g.
So ctx.getText().replace(/\$/g, '\\$') becomes ctx.getText().replace(/\\/g, '\\\\').replace(/\$/g, '\\$') in both methods. No new imports or helper functions are required, and this only changes escaping behavior to be more robust while maintaining the original goal of preventing Groovy GString interpolation.
| @@ -133,16 +133,18 @@ | ||
| } | ||
|
|
||
| visitStringLiteral(ctx: any): void { | ||
| // Preserve original quote style and escape Groovy GString interpolation | ||
| this.sb.push(ctx.getText().replace(/\$/g, '\\$')); | ||
| // Preserve original quote style and escape Groovy GString interpolation. | ||
| // First escape backslashes, then escape '$' so that any existing escapes are preserved safely. | ||
| this.sb.push(ctx.getText().replace(/\\/g, '\\\\').replace(/\$/g, '\\$')); | ||
| } | ||
|
|
||
| visitStringNullableLiteral(ctx: any): void { | ||
| if (ctx.getText() === 'null') { | ||
| this.sb.push('null'); | ||
| } else { | ||
| // Preserve original quote style and escape Groovy GString interpolation | ||
| this.sb.push(ctx.getText().replace(/\$/g, '\\$')); | ||
| // Preserve original quote style and escape Groovy GString interpolation. | ||
| // First escape backslashes, then escape '$' so that any existing escapes are preserved safely. | ||
| this.sb.push(ctx.getText().replace(/\\/g, '\\\\').replace(/\$/g, '\\$')); | ||
| } | ||
| } | ||
|
|
| this.sb.push('null'); | ||
| } else { | ||
| // Preserve original quote style and escape Groovy GString interpolation | ||
| this.sb.push(ctx.getText().replace(/\$/g, '\\$')); |
Check failure
Code scanning / CodeQL
Incomplete string escaping or encoding High
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 1 day ago
In general, to fix incomplete escaping when targeting a specific metacharacter like $, you must also correctly handle backslashes that participate in escape sequences, so that you don’t create strings where a backslash is consumed while leaving the metacharacter active. Practically, that means either using a robust, well-tested escaping library, or carefully ordering and structuring your replacements so that backslashes are handled before the metacharacter, and that you escape all occurrences, not just the first.
For this specific code, the goal is to emit a Groovy string literal that preserves the original quote style while disabling Groovy interpolation ($ sequences). The current implementation:
ctx.getText().replace(/\$/g, '\\$')turns every $ into \$, but leaves existing backslashes untouched. To make the escaping robust while retaining behavior, we can enhance the replacement to also escape any backslash that appears immediately before a $, so that each original \$ becomes \\\$ in the output Groovy code. This ensures that when Groovy processes the string, the $ is always treated as a literal character, and we avoid cases where a backslash is interpreted as escaping the $ rather than being preserved. A minimal and clear fix is to use a single regex that matches an optional backslash followed by $ and rewrites it to a safe, literal \$ form.
Concretely, in gremlin-javascript/src/main/javascript/gremlin-javascript/lib/language/translator/GroovyTranslateVisitor.ts, update visitStringLiteral and visitStringNullableLiteral so that both use the same improved escaping expression. One suitable approach is:
ctx.getText().replace(/\\?\$/g, match => match === '$' ? '\\$' : '\\\\$')This leaves all other backslashes untouched and only modifies those adjacent to $, ensuring that every $ ends up escaped and that any original escape backslash is preserved rather than consumed. No new imports or helper methods are needed; the change is local to the two replace calls.
| @@ -133,16 +133,22 @@ | ||
| } | ||
|
|
||
| visitStringLiteral(ctx: any): void { | ||
| // Preserve original quote style and escape Groovy GString interpolation | ||
| this.sb.push(ctx.getText().replace(/\$/g, '\\$')); | ||
| // Preserve original quote style and escape Groovy GString interpolation, | ||
| // ensuring that backslashes immediately preceding `$` are preserved. | ||
| this.sb.push( | ||
| ctx.getText().replace(/\\?\$/g, (match: string) => match === '$' ? '\\$' : '\\\\$') | ||
| ); | ||
| } | ||
|
|
||
| visitStringNullableLiteral(ctx: any): void { | ||
| if (ctx.getText() === 'null') { | ||
| this.sb.push('null'); | ||
| } else { | ||
| // Preserve original quote style and escape Groovy GString interpolation | ||
| this.sb.push(ctx.getText().replace(/\$/g, '\\$')); | ||
| // Preserve original quote style and escape Groovy GString interpolation, | ||
| // ensuring that backslashes immediately preceding `$` are preserved. | ||
| this.sb.push( | ||
| ctx.getText().replace(/\\?\$/g, (match: string) => match === '$' ? '\\$' : '\\\\$') | ||
| ); | ||
| } | ||
| } | ||
|
|
Adds
Translatorinfrastructure to gremlin-javascript and adds that capability to gremlin-mcp. Might be best to take the changes one commit at at time for review as they build on each other mostly. I'm hoping that this change is temporary and that a follow-on refactor during code freeze week will clean that up and make the javascript setup much cleaner. If that does not get done there will be some extra steps required during release.Here's the big ticket changes to consider in review:
Ten new TypeScript files under lib/language/translator/ implement a complete ANTLR-visitor-based translation pipeline that converts canonical Gremlin to JavaScript, Python, Go, Java, Groovy, .NET, and an anonymized form. This
is the largest and most novel code surface. Reviewers should check: correctness of each language's output formatting (especially DotNetTranslateVisitor.ts at 891 lines), the translations.json corpus (44,031 lines, generated
from Groovy test data), and whether the corpus test coverage actually exercises edge cases meaningfully.
The translate_gremlin_query MCP tool now has a two-stage pipeline: mechanical normalization (stripping gremlingo. prefixes, PascalCase→camelCase, etc.) followed by LLM normalization via MCP sampling. The LLM fallback silently
swallows errors and falls back to the mechanically-normalized input. Reviewers should scrutinize: the silent catch in normalizeAndTranslate (could mask translation failures), the LLM system prompt's canonical-format rules for
completeness and accuracy, and whether there are injection risks in passing user-supplied Gremlin directly to the LLM.
The server now checks GREMLIN_MCP_ENDPOINT at startup and dynamically imports connection.ts (and thus the Gremlin driver) only when an endpoint is configured. When offline, a stub GremlinClient layer is used that fails every
operation immediately. This is a correctness-critical change — the memory file notes this was still broken as of last session. Reviewers should verify: that the dynamic import actually prevents the static module graph from
loading the driver, that the GremlinClientOfflineLive stub surfaces errors cleanly to MCP tool callers rather than crashing the server, and that config.ts properly models GREMLIN_MCP_ENDPOINT as optional.
VOTE +1