- This file is for code contributors to
pyegeria(SDK + CLI), not end users. - Start with
CONTRIBUTING.md, then use this as a repo-specific implementation map.
pyegeria/: foundational Python SDK for Egeria; includes transport/auth/config, OMVS clients, and report generation/spec registry.commands/(hey_egeria): CLI + simple textual UI layer that invokespyegeriaclients and reporting utilities.md_processing/+commands/cat/dr_egeria.py(Dr.Egeria): markdown language/parser/dispatcher/processors that executepyegeriaoperations and produce reports.- v2 engine (the only engine):
md_processing/v2/— async-first, spec-driven.
- v2 engine (the only engine):
- Explicit dependency flow:
pyegeria->commands/(hey_egeria)pyegeria->md_processing/+commands/cat/dr_egeria.py(Dr.Egeria)
- Change impact (
pyegeria/): SDK API, config, report formatting/spec changes can break bothhey_egeriaand Dr.Egeria flows. - Change impact (
commands/): CLI/TUI argument and output behavior changes must remain compatible with existing command names and snake_case params. - Change impact (
md_processing/): markdown attribute/routing changes must stay aligned withmd_processing/data/compact_commandsand generated report specs.
pyegeria/core/_base_platform_client.py→_base_server_client.py→_server_client.py: layered transport/auth/GUID-resolution stack used by all OMVS clients.pyegeria/core/config.py: Pydantic-settings–based config; precedence = explicit args > OS env >.env>config.json> defaults.pyegeria/egeria_tech_client.py: lazy facade (__getattr__) across OMVS subclients; token propagation happens viaset_bearer_token().pyegeria/egeria_client.py(Egeria) /pyegeria/egeria_cat_client.py(EgeriaCat): additional role-based facades;EgeriaCatuses MI overProjectManager,GlossaryManager,AssetCatalog,MyProfile.pyegeria/omvs/*: service implementations (40+ modules); new API behavior should usually be added here first.pyegeria/models/: Pydantic models for Egeria API request/response bodies (models.py,collection_models.py).pyegeria/view/_output_format_models.py: Pydantic models (Column/Format/FormatSet/ActionParameter) backing the report spec registry.pyegeria/view/output_formatter.py+pyegeria/view/base_report_formats.py: materialization/rendering and report spec registry; generated specs live inbase_report_formats.pyundergenerated_format_sets.pyegeria/view/format_set_executor.py: side-effect-free FormatSet runner used by the MCP adapter and theViewProcessorin Dr.Egeria.pyegeria/core/mcp_server.py+pyegeria/core/mcp_adapter.py: MCP server entry point and its thin adapter helpers (list_reports,describe_report,run_report,run_find_report_specs); keep adapter logic inmcp_adapter.py, server wiring inmcp_server.py.commands/cli/*:hey_egeriacommand groups and TUI entrypoints over SDK operations.md_processing/+commands/cat/dr_egeria.py: Dr.Egeria v2 parser/dispatcher/processor pipeline over SDK operations;commands/cat/dr_egeria.pyis the CLI wrapper only.md_processing/v2/extraction.py(UniversalExtractor): identifies DrE command blocks (## Verb Object) and their### Labelattributes; handles horizontal-rule delimiters (---,___).md_processing/v2/parsing.py:AttributeFirstParser.parse()is asynchronous and utilizes a class-level cache (_valid_values_cache) for metadata validation results from Egeria.md_processing/v2/dispatcher.py(V2Dispatcher): routesDrECommandobjects toAsyncBaseCommandProcessorsubclasses; uses fuzzy preposition-stripping and dynamic fallback toCollectionManagerProcessor/ProjectProcessorfor registered subtypes.md_processing/v2/processors.py(AsyncBaseCommandProcessor): base class for all v2 processors; standard flow is Parse → Validate → Fetch-As-Is → Action Dispatch.md_processing/v2/rewriters.py(CommandRewriter): invoked byV2Dispatcherbefore the processor fires; rewritesCreate→Updatewhen the element already exists andUpdate→Createwhen it is absent.md_processing/md_processing_utils/md_processing_constants.py:COLLECTION_SUBTYPESandPROJECT_SUBTYPESlists drive dynamic dispatcher fallback; adding a subtype here enables it automatically. Also defines standardized verb groups:LINK_VERBS = ("Link", "Attach", "Add", "Detach", "Unlink", "Remove"),CREATION_LINK_VERBS, andREMOVAL_LINK_VERBS.
- Dev setup:
uv sync, activate.venv, run tests from repo root. - Tests are split into three sub-folders:
tests/functional-tests/(per-OMVS live tests),tests/scenario-tests/(lifecycle scenario tests),tests/micro-tests/(unit/formatter/MCP tests). - Default tests are unit/fake; live mode is opt-in via
pytest --live-egeriaorPYEG_LIVE_EGERIA=1(tests/micro-tests/conftest_full.py). - Pytest markers in
tests/pytest.ini:unit,integration,slow,auth,format_sets,mcp. - For command/report schema changes, regenerate and merge specs with
hey_egeria tech gen-report-specs md_processing/data/compact_commands --merge. - Validate compatibility by exercising at least one report in
DICT,LIST, andREPORToutput modes after formatter/spec edits. - Generate Markdown command templates:
gen_md_cmd_templates(basic) orgen_md_cmd_templates --advanced→ outputs tosample-data/templates/. - Generate Dr.Egeria help docs:
gen_dr_help→ writes to configured Egeria Inbox path. - Validate compact command JSON specs:
validate_compact_specs. - Debug Dr.Egeria API calls:
dr_egeria <file> --process --debug(monkey-patches_async_make_requestfor the duration of the call; auto-restored on exit). - Dr.Egeria directive shortcuts:
--validateand--processare shorthand for--directive validate/process;--summary-onlysuppresses per-command diagnostic output;--advancedshows non-Basic attributes. Resolution order:--process>--validate>--directive> default ("validate"). - Async tests:
asyncio_mode = autointests/pytest.ini— writeasync def test_*functions without explicit event-loop boilerplate. - Docstrings: use NumPy/SciPy format; every public OMVS method must have a
Notessection with a sample JSON body where the wire format is non-obvious.
- Prefer snake_case for new report
Column.keyvalues even though camelCase is supported. - For nested Egeria responses with repeated names, use namespaced extractor keys (for example
template_display_name,target_display_name). - Report specs (FormatSets) are now Pydantic models (
FormatSet/Format/Columnfrompyegeria/view/_output_format_models.py); define new specs using these models, not raw dicts. - Generated report specs follow the
{Type}-DrE-{Basic|Advanced}naming pattern and carry an optionalfamilystring (e.g.,"Data Designer","Governance Officer") used for discovery viareport_spec_list(show_family=True)andreport_specs.filter_by_family(...). - Mermaid output is normalized by default (
PYEGERIA_NORMALIZE_MERMAID=true); set tofalseto preserve native Egeria graph syntax for modern renderers. - Compact command specs in
md_processing/data/compact_commands/useattribute_definitions+bundles(single-inheritance via"inherits") +commands; drop a new*.jsonfile there to add a family — picked up automatically at startup without code changes (seecompact_loader.py+parse_compact_export.py). Control loading viaUSE_COMPACT_RESOURCES/COMPACT_FAMILIESinmd_processing_constants.py. - Use
attr_labelsin compact spec attribute definitions to declare user-facing label synonyms that map to the canonical attribute name. - Link-verb commands use standardized groups from
md_processing_constants.py; do not hardcode raw verb strings in processors — compare againstLINK_VERBS,CREATION_LINK_VERBS, orREMOVAL_LINK_VERBSinstead. - See
High-signal rulesfor async-wrapper, snake_case param, and lazy-loading requirements.
- [HIGH SIGNAL] New OMVS operations should follow async-core + sync-wrapper pattern (
_async_*+ public sync method). - [HIGH SIGNAL] Keep CLI/report params in snake_case (
page_size,start_from); adapters handle wire camelCase. - [HIGH SIGNAL] Preserve lazy-loading behavior in facade clients; do not eagerly instantiate all subclients unless clearly justified.
- [HIGH SIGNAL] For config-sensitive changes, respect precedence: explicit args > env vars >
.env> JSON config > defaults. - [HIGH SIGNAL] Use the hyphen (
-) as the default Markdown bullet character for all generated output and documentation to ensure compatibility across editors. - [HIGH SIGNAL] When changing command attributes, verify both markdown processing (
dr_egeria) and generated report specs still agree on keys/labels. - [HIGH SIGNAL] Use the fuzzy-matching preposition stripping rule in
v2/dispatcher.pyto allow natural Markdown headers without adding redundantalternate_namesto the JSON spec. - [HIGH SIGNAL] To prevent command mis-mapping, always check that the header verb matches its intended verb group (CREATE, LINK, VIEW) in
find_alternate_names. - [HIGH SIGNAL] To add a new collection or project subtype, add it to
COLLECTION_SUBTYPESorPROJECT_SUBTYPESinmd_processing/md_processing_utils/md_processing_constants.py; the dispatcher will automatically route it toCollectionManagerProcessororProjectProcessorwithout a new registration. - [HIGH SIGNAL] New "View" commands need only a report spec (FormatSet) — the
ViewProcessorauto-routes anyViewverb to the report engine; no custom processor required. - [HIGH SIGNAL]
CommandRewriterauto-rewritesCreate→Updatewhen an element already exists andUpdate→Createwhen it is absent, before the processor fires; do not re-implement this upsert logic inside individual processors. - [HIGH SIGNAL] Processors that create new elements must register their GUID in
context["planned_elements"]so subsequent commands in the samedispatch_batchcall can resolve them without requiring a multi-pass workflow. - [HIGH SIGNAL] Use the relationship mapping layer in
md_processing/v2/glossary.py(and similar processors) to bridge user-friendly Markdown aliases (likeISA,HASA) to canonical Egeria relationship names.
- Many flows assume a live HTTPS Egeria view server; self-signed localhost behavior is common in this repo.
- Dr. Egeria v2 depends on compact command JSON in
md_processing/data/compact_commands; routing is data-driven through dispatcher logic. - Reporting and Dr. Egeria are coupled by design: generated report specs should stay aligned with compact command labels/attributes.
pyegeria/core/mcp_server.pyexposes report discovery/run interfaces (pyegeria-mcp), so report contract changes affect MCP clients.pyegeria/core/mcp_adapter.pyis the programmatic API layer betweenmcp_server.pyand pyegeria internals; keep adapter logic inmcp_adapter.pyand server wiring inmcp_server.py— do not intermingle them.
- Prefer extending existing specs/processors over introducing parallel abstractions.
- Keep CLI compatibility in
commands/unless migration is explicit and documented. - Apply the config-precedence and command-attribute alignment requirements from
High-signal rules.