[FEATURE] Reduce boilerplate for adding new PPL commands and functions #4960

**Is your feature request related to a problem?**

Adding new PPL commands or functions requires extensive boilerplate code across multiple layers of the codebase, making it error-prone and time-consuming. Developers must manually:

1. Update ANTLR grammar files (lexer and parser)
2. Create AST node classes and implement AST/RelNode visitor pattern methods
3. Register functions in multiple locations (PPLBuiltinOperators, PPLFuncImpTable, BuiltinFunctionName)
4. Create unit tests that validate the translation pipeline and the generated Spark SQL
5. Create multiple integration test classes (IT, Yaml IT, ExplainIT, AnonymousIT, CrossClusterIT)
6. Remember engine-specific setup (e.g., `enableCalcite()` for V3-only functions)
7. Update documentation following the latest doc structure

This repetitive process leads to:
- **Human errors**: It is easy to miss required steps or files; some are subtle even when you reference similar PRs.
- **Inconsistency**: Different developers may structure code differently.
- **Slow onboarding**: New contributors face a steep learning curve.
- **Maintenance burden**: Changes to architecture require updating many files.

**What solution would you like?**

Below are several ideas that address the problem from different levels (tooling, specs, and architecture):

1. **Interactive scaffolding tool**: Add a Gradle task that generates the required files and code insertions from a few prompts, with a strict “expected files changed” checklist to prevent missed steps (human or AI) and copy-paste drift.

2. **Spec-driven code generation**: Define a YAML/Markdown spec as the single source of truth for commands/functions (metadata, signatures, engine support, tests/docs) and generate the repetitive glue at build time. A structured spec is also AI-friendly by design.

3. **Better abstractions**: Introduce a small DSL for AST → RelNode translation and shared test base classes/mixins to standardize setup. This reduces long-term maintenance and makes both scaffolding and codegen simpler and safer.
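As a rough illustration of how ideas 1 and 2 could fit together, the sketch below derives an "expected files changed" checklist from a small spec and reports what a change still misses. It is not a proposed implementation: `FunctionSpec`, its fields, the checklist labels, and both helper functions are hypothetical names made up for this example, and Python is used only for brevity (a real tool would more likely be a Gradle task).

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionSpec:
    """Hypothetical spec for one PPL function; field names are illustrative."""
    name: str
    v3_only: bool = False  # assumption: True means the function needs Calcite (V3)


def expected_changes(spec: FunctionSpec) -> list[str]:
    """Derive the checklist of touch points from the spec."""
    items = [
        "grammar: lexer and parser rules",
        "registration: BuiltinFunctionName",
        "registration: PPLBuiltinOperators",
        "registration: PPLFuncImpTable",
        "tests: unit",
        "tests: integration",
        "docs",
    ]
    if spec.v3_only:
        # Engine-specific setup is added automatically instead of remembered.
        items.append("tests: enableCalcite() setup")
    return items


def missing_items(spec: FunctionSpec, done: set[str]) -> list[str]:
    """Return checklist items not yet covered, in checklist order."""
    return [item for item in expected_changes(spec) if item not in done]


if __name__ == "__main__":
    spec = FunctionSpec(name="my_func", v3_only=True)
    # Pretend the contributor completed everything except the engine setup.
    done = set(expected_changes(FunctionSpec(name="my_func")))
    for item in missing_items(spec, done):
        print(f"missing: {item}")
```

Because the engine requirement lives in the spec, a forgotten `enableCalcite()` shows up as a missing checklist item before review rather than as a failure after merge.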

**What alternatives have you considered?**

- Better documentation only: Documenting the process more thoroughly helps, but it doesn't prevent human errors.
- Copy-paste from examples: This is the current approach, and it leads to inconsistencies and forgotten steps.
- Code review checklists: Helpful but reactive; they catch errors only after they're made.

**Do you have any additional context?**

- This issue was motivated by the backport PR failure below, which was caused by a missing `enableCalcite()` call. More broadly, it highlights a recurring friction point in PPL development: adding new PPL features requires too much scattered, easy-to-miss boilerplate, and we should improve the end-to-end developer workflow to make contributions safer and more consistent.
    - #4947
    - #4959
- Success metrics may include:
    - Time to add new PPL command/function reduced by X%+
    - New contributors can onboard and ship first PR within X days
    - Consistent code structure across all commands/functions
