Native Vensim MDL parser #211

bpowers · 2026-01-26T03:26:25Z

Summary

Pure Rust implementation of Vensim MDL file parsing, replacing the C++ xmutil dependency for MDL-to-datamodel conversion. This eliminates the need for Bison/Flex and a C++ toolchain, and enables MDL import in WASM builds.

Implements a complete MDL lexer, LALRPOP parser, and multi-pass AST-to-datamodel converter
Handles all equation types, subscripts/arrays, built-in functions, views/diagrams, groups, macros (parsing), and settings
Full view element comparison (UIDs, coordinates, link shapes, polarity) passes across all 12 test models
453+ unit tests for parsing and conversion, plus integration equivalence tests against xmutil output

Key components

src/simlin-compat/src/mdl/ - Lexer, normalizer, LALRPOP parser, AST types, reader
src/simlin-compat/src/mdl/convert/ - AST-to-datamodel conversion (variables, stocks/flows, dimensions, groups)
src/simlin-compat/src/mdl/view/ - Sketch section parsing and view element conversion
src/simlin-compat/src/mdl/xmile_compat.rs - XMILE-compatible expression formatting
src/simlin-compat/src/mdl/settings.rs - Settings section parsing (integration type, unit equivalences)
src/simlin-compat/tests/mdl_equivalence.rs - Element-by-element equivalence tests against xmutil

Test plan

All 453+ MDL parser unit tests pass (cargo test -p simlin-compat mdl::)
All 12 equivalence test models produce identical output between native and xmutil paths, including full view element comparison
All existing simulation tests continue to pass (cargo test)
Pre-commit checks pass (fmt, clippy, tests, lint, tsc, WASM, pysimlin)

This implements the TokenNormalizer layer that sits between the context-free RawLexer and the LALRPOP parser for Vensim MDL files. Changes to lexer.rs: - Renamed Token to RawToken and Lexer to RawLexer - Removed context-tracking state (in_units_comment, in_units, saw_explicit_eq_end) - Added Newline token for tabbed array row tracking - Removed UnitsSymbol variant (now in normalized Token enum) - RawLexer returns None at EOF instead of implicit EqEnd - Added support for $ to start symbols (for unit symbols like $/Year) - Fixed colon keyword span calculation to use actual consumed byte length instead of keyword string length (handles underscores/tabs/spaces) New builtins.rs module: - Implements to_lower_space() for XMUtil-compatible name canonicalization - Provides is_builtin() to check if a symbol is a known Vensim function - Includes is_tabbed_array() for keyword detection New normalizer.rs module: - TokenNormalizer tracks section state (equation/units/comment) - Transforms Symbol to UnitsSymbol in units section with units_mode active - Transforms Symbol to Function for known builtin function names - Detects exact "WITH LOOKUP" match (case-insensitive, single space) - Parses TABBED ARRAY constructs using Newline tokens for row boundaries - Handles GET XLS/VDF special functions with to_lower_space canonicalization so GET_XLS, GET XLS, etc. are all recognized - Sign tokens in TABBED ARRAY allow newlines before the number (C++ treats newline as whitespace after a sign) - GET XLS placeholders include closing ) and use actual source spans The architecture now separates concerns clearly: - RawLexer: pure tokenization, no context - TokenNormalizer: context-sensitive transformations - EquationReader (future): implicit EqEnd at EOF, comment extraction This is Phase 1 of the MDL parser implementation. The types are re-exported publicly from mod.rs for use by the LALRPOP parser in the next phase. The parser will consume TokenNormalizer output and the EquationReader will handle looping over equations and extracting comments.

Implement Rust AST types for the Vensim MDL parser. These types will be produced by the LALRPOP parser from the normalized tokens and later converted to simlin_core::datamodel. Key design decisions: - Single Expr type with validation during conversion to datamodel - Cow<'input, str> for zero-copy string references tied to input lifetime - Own Loc type in mdl/ast.rs to keep the module self-contained - Rust enums following simlin-engine's pattern - Generic App for all calls since parser cannot distinguish lookups from functions/macros; resolution happens during conversion - f64 for table values (NA/NAN don't appear in lookup/tabbed array data) - Box<FullEquation> and Box<MacroDef> in MdlItem to satisfy clippy's large_enum_variant lint The type hierarchy covers: - Location tracking (Loc) - Expressions (Expr, Subscript, UnaryOp, BinaryOp) - Lookup tables (LookupTable) - Left-hand sides (Lhs, ExceptList, InterpMode) - Equations (Equation, SubscriptDef, SubscriptElement, SubscriptMapping) - Units (UnitExpr, UnitRange, Units) - Top-level items (FullEquation, Group, MacroDef, MdlItem)

Previously, only the exact "WITH LOOKUP" (single space) form was recognized as Token::WithLookup. Forms like "WITH_LOOKUP" or "WITH LOOKUP" fell through to Token::Function. Since WITH LOOKUP has special syntax (inline table as second argument) that requires grammar support, all spacing variants should produce the same token type. This uses the existing to_lower_space canonicalization to handle underscore, multiple spaces, and tabs. Also verified that all expression lists in the test corpus are numeric (patterns like `a[DimA] = 1, 2, 3`), so the existing Equation::NumberList variant covers all real-world cases.

Fix several issues identified during comparison with xmutil C++: 1. Legacy XY table semantics (High): - Fixed documentation: legacy format is `x1, x2, ..., xN, y1, y2, ..., yN` (flat vector split in half), NOT evenly spaced X from range - Added `transform_legacy()` method matching xmutil's `TransformLegacy()` - Added `add_raw()` method for legacy table parsing 2. Equation::Implicit behavior (High): - Added detailed documentation explaining that conversion must create a default table with points (0,1),(1,1) and TIME as input - Matches xmutil's AddTable() behavior when tbl is NULL 3. Extrapolation flag (Medium): - Added `extrapolate: bool` field to LookupTable - Set during conversion when LOOKUP EXTRAPOLATE or TABXL is used - Affects XMILE output (type="extrapolate" on gf element) 4. TabbedArray row boundaries (Medium): - Changed from Vec<Vec<f64>> to Vec<f64> (flat vector) - Matches xmutil's ExpressionNumberTable which ignores row boundaries - Updated normalizer to flatten tabbed array data 5. Trailing comma function calls (Low): - Documented that Expr::Literal("?") is produced for trailing comma - Matches xmutil's vpyy_literal_expression("?") behavior The number list unary ops issue is already handled by the normalizer for TABBED ARRAY, and will be handled by the parser for regular number lists during parsing (fold +/- into numeric values).

Add a LALRPOP-based parser that consumes tokens from TokenNormalizer and produces the AST types defined in ast.rs. The grammar structure mirrors xmutil's VYacc.y for compatibility. Key components: - parser.lalrpop: Full grammar with expression precedence matching VYacc.y, support for all equation types (regular, lookup, data, implicit, subscript definitions, equivalence, tabbed array) - reader.rs: EquationReader wrapper that drives the parser and handles raw comment text capture between second ~ and |, plus macro state tracking for :MACRO:/:END OF MACRO: blocks - parser_helpers.rs: Helper functions for number parsing and equation construction (number lists vs regular equations) - ast.rs: Added SectionEnd and ExprListResult types for parser returns The EquationReader collects tokens until the second tilde or pipe, then passes them to the LALRPOP parser. After the parser returns with SectionEnd::Tilde, it captures the raw comment text from the source and constructs the FullEquation with equation, units, and comment. Includes 40 comprehensive tests covering expressions, number lists, function calls, lookup tables, subscripts, units, and special markers.

Add the core conversion logic to transform parsed MDL AST directly to simlin_core::datamodel::Project, bypassing the C++ xmutil dependency. This completes Phase 6 of the MDL parser implementation plan documented in src/simlin-compat/src/mdl/CLAUDE.md. Key components: convert.rs - Multi-pass AST to datamodel conversion: - Symbol collection from parsed equations - Dimension building with range expansion (A1-A10) and equivalences - Variable type detection (stocks via top-level INTEG only) - Flow linking using is_all_plus_minus algorithm (checks all equations per xmutil) - Synthetic net flow generation for non-decomposable INTEG rates - PurgeAFOEq logic for handling A FUNCTION OF placeholders - Sim specs extraction from control variables (TIME STEP, etc.) - TABXL detection for graphical function extrapolation marking - Multi-dimensional number list conversion with Cartesian product - Element-specific equation handling with subscript expansion xmile_compat.rs - XMILE-compatible expression formatter: - Function renames: IF THEN ELSE, LOG->LOG10/LN, ELMCOUNT->SIZE - Argument reordering: DELAY N, SMOOTH N, RANDOM NORMAL - Name formatting: spaces to underscores, TIME->STARTTIME mappings - Complex transformations: PULSE, PULSE TRAIN, QUANTUM, ZIDZ/XIDZ - Lookup invocation: Symbol calls become LOOKUP(table, input) test_equivalence.rs - Integration tests: - SIR.mdl parsing with stock/flow verification - Simple model parsing with control variable filtering The open_vensim_native() function now produces valid Project structures with properly linked stocks/flows and extracted simulation specs. Remaining work (Phases 7, 9-10) covers view/diagram parsing, model post-processing, and settings section parsing.

Split the large convert.rs file (3402 lines) into a directory structure with focused submodules for better maintainability: - mod.rs (214 lines): Public API, ConversionContext struct, core methods - types.rs (100 lines): ConvertError, VariableType, SymbolInfo, etc. - helpers.rs (231 lines): Standalone helper functions - dimensions.rs (553 lines): Dimension building and subscript expansion - stocks.rs (1206 lines): Stock/flow linking and synthetic flow generation - variables.rs (1200 lines): Variable building and equation construction All 49 convert module tests preserved and distributed to their respective modules. This reorganization improves code navigation and keeps individual files within reasonable context window sizes for AI-assisted development.

Implements equivalence testing comparing open_vensim() (xmutil-based: MDL -> XMILE -> datamodel) vs open_vensim_native() (pure Rust: MDL -> datamodel directly) for MDL files under test/. Uses static lists following the pattern in simulate.rs for curated control over which tests run, easy tracking of known failures, and clear progression documentation. Key features: - Normalizes projects for comparison, handling legitimate differences like identifier casing, spacing in expressions/units, model flattening - Uses canonicalize() for proper variable sorting with quoted names - Sorts arrayed equation elements by subscript key - Uses name-based model/variable matching instead of positional zip - Documents known feature gaps in native parser (unit definitions, views, loop_metadata, model-level sim_specs) with TODO comments Currently 12 models pass equivalence testing. Models with known issues are documented with explanations (ACTIVE INITIAL representation, subscript handling differences, xmutil segfaults, etc.). Deletes old test_equivalence.rs in favor of the new tests/mdl_equivalence.rs structure.

The native MDL parser now correctly handles ACTIVE INITIAL by storing the equation and initial value in separate fields, matching xmutil's behavior: - ACTIVE INITIAL(expr, initial) produces equation=expr, initial_equation=initial - Works for scalar, apply-to-all, and arrayed equations Previously, the native parser emitted INIT(expr, initial) as a single equation string with initial_equation=None. The xmutil path via XMILE correctly split these into separate fields. Also fixes a pre-existing bug where the Arrayed equation element tuples were incorrectly using the comment field instead of initial_equation.

The :SUPPLEMENTARY (or :SUP) flag was previously captured as raw text in the documentation field. Now it is parsed as a structured boolean field on FullEquation, since it has semantic meaning in Vensim. The flag indicates that a variable is intentionally not connected to feedback loops, suppressing Vensim's warning about unused variables. MDL syntax example: profit = revenue - cost ~ $ ~ Companies Profits ~ :SUP | The parser now: - Detects :SUP or :SUPPLEMENTARY after the third tilde (case-insensitive) - Strips the flag from the comment text - Sets FullEquation.supplementary = true The equivalence test's normalize_doc() function is updated to strip the supplementary suffix from xmutil output for consistent comparison.

Port the supplementary flag parsing from the native Rust MDL parser to the xmutil C++ code. Both parsers now handle the :SUP/:SUPPLEMENTARY flag consistently: - VensimLex::GetComment() detects the third tilde followed by :SUP or :SUPPLEMENTARY (case-insensitive) - The flag is stripped from the comment text - VensimLex tracks the flag via LastCommentHadSupplementary() - VensimParse sets Variable::_supplementary when the flag is found - Variable class gains _supplementary field with getter/setter With both parsers handling supplementary the same way, the strip_supplementary_suffix workaround in mdl_equivalence.rs tests is no longer needed and has been removed.

Enable the as_sectors flag in xmutil_convert_mdl_to_xmile, which puts all variables into a single main model instead of creating submodels for Vensim groups. This matches the native parser's output structure. With both parsers producing the same model structure, the equivalence test's normalize_project function is simplified significantly: - Remove the loop that collected variables from multiple models - Remove skipping of Module variable references - Remove synthetic main model creation - Just filter stdlib models and normalize in place

…ion type Add PostEquationParser in settings.rs to parse the settings section that appears after the views/sketch section in MDL files. The parser: - Finds the `:L<%^E!@` block marker (supports optional \x7F byte) - Parses type 15 lines for integration method (comma-separated, 4th value is method code: 0,2 = Euler, 1,5 = RK4) - Parses type 22 lines for unit equivalences (comma-separated: $,Name,Aliases) The EquationReader now exposes a `remaining()` method that returns the unparsed source after the equations section. ConversionContext uses this to parse settings and populate: - SimSpecs.sim_method from type 15 - Project.units from type 22 SimSpecsBuilder now accepts an optional sim_method field rather than hardcoding Euler.

Add support for the Runge-Kutta 2 (RK2) integration method throughout the codebase. Vensim uses method codes 3 and 4 for RK2, which was previously being mapped to Euler. Key changes: - Add RungeKutta2 variant to SimMethod enum in core datamodel - Propagate RK2 through protobuf schema, serialization, JSON/SDAI, and XMILE import/export - Update MDL settings parser to map codes 3/4 to RK2 - Add parse_int_like_atoi() for C-compatible integer parsing that handles leading whitespace and stops at non-digits - Add split_lines() helper for handling all line ending styles (LF, CRLF, and CR-only for old Mac files) - Fix find_settings_block() to handle empty or short input without panicking - Document the permissive marker detection strategy The simulation engine still only implements Euler integration, but the method is now correctly stored and serialized for models that specify RK2. 31 new tests cover RK2 method codes, atoi-like parsing edge cases, and various line ending formats.

Remove unnecessary normalization from the MDL equivalence tests by systematically testing each normalization step. The parsers now produce sufficiently similar output that many normalizations are no longer needed. Removed normalization (parsers match directly): - Unit equivalence case normalization and alias sorting - Unit sorting in project - Model sorting in project - Variable sorting in models - Arrayed element sorting in equations - Inflow/outflow sorting in stocks - Graphical function scale normalization - Lookup equation placeholder normalization - Element comment clearing in arrayed equations - Documentation control comment clearing (~ :SUPPLEMENTARY) Still required normalization: - Dimension case normalization (xmutil lowercases, native preserves) - Dimension sorting (order differs between parsers) - sim_specs save_step normalization (xmutil sets None when equal to dt) - Documentation line continuation removal and whitespace collapsing - Unit space/underscore removal and parentheses normalization - Expression case normalization (xmutil lowercases function names) - Expression operator space removal - Equation dimension sorting in ApplyToAll/Arrayed - Arrayed->ApplyToAll conversion for repeated equations This reduces the test file by ~125 lines while maintaining full coverage.

Move unit simplification logic from test code (mdl_equivalence.rs) into the production MDL module. The new implementation mirrors xmutil's UnitExpression approach using numerator/denominator lists. The format_unit_expr() function now: - Flattens UnitExpr AST into separate numerator/denominator lists - Cancels matching terms (e.g., "A/A" becomes "1") - Outputs canonical form with "/(X*Y)" for compound denominators This allows removing the normalize_unit_parens() and simplify_units() helper functions from the test code since both parsers now produce identical output.

When a Vensim variable has units like `[0, 100]` (just a range, no unit expression), the MDL parser now correctly treats this as dimensionless units ("1") rather than no units (None). This affects both regular variables and control variables (TIME STEP, FINAL TIME), ensuring sim_specs.time_units is correctly set to "1" for range-only units. When extracting time_units from control variables, explicit units are preferred over range-only "1". For example, if FINAL TIME has `[0, 100]` (range-only) but TIME STEP has `Months` (explicit), the result is "Months" not "1". When both have explicit units, FINAL TIME's units are used (it is processed first in the control_vars order). Added extract_units() helper to convert/helpers.rs for shared use by both variables.rs and stocks.rs. Added comprehensive tests for: - Range-only units producing "1" (scalar, arrayed, time_units) - Units from real equation when A FUNCTION OF placeholder is present - TIME STEP vs FINAL TIME precedence for time_units extraction This matches xmutil's behavior where UnitsRange() creates a "1" unit expression when the input expression is NULL.

The mdl module in simlin-compat is a derivative work based on Bob Eberlein's xmutil (https://github.com/bobeberlein/xmutil). This commit adds the MIT license from xmutil and a README explaining the provenance of the MDL parsing code, aligning the licensing with the original work.

The feature that enables Vensim MDL import via the xmutil C++ library is now named "xmutil" to reflect what it actually enables. This is more accurate since the feature controls whether the xmutil dependency is included, not the format being parsed. The function names like `open_vensim` and `simlin_project_open_vensim` remain unchanged since they describe the format being opened, not the implementation mechanism.

Add support for semantic/organizational groups that categorize model variables into named sections like "Control Panel" or "Financial Sector". This is distinct from visual diagram groups already in ViewElement::Group. Phase 1 (datamodel): - Add ModelGroup struct with name, doc, parent, members, run_enabled - Add protobuf message and JSON serialization - Add TypeScript ModelGroup class with Immutable.js Record pattern Phase 2 (XMILE): - Add SemanticGroup struct for parsing <group> elements in <views> section - Handle groups in XMILE import/export when no diagram views exist - Groups use <var> children with space_to_underbar formatted names Phase 3 (MDL Import): - Track groups during collect_symbols() with current_group_index - Assign variables to groups on first equation only - Group hierarchy via parent_index (each new group's parent is previous) - AdjustGroupNames ensures unique names that don't conflict with symbols - Support both {**name**} and ***\nname\n***| group marker formats Phase 4 (Integration Tests): - Test MDL -> datamodel -> XMILE -> datamodel preserves groups - Test variables after macros still assigned to groups The implementation follows xmutil's VensimParse.cpp algorithm for group membership assignment and hierarchy determination.

The CLAUDE.md plan was out of sync with actual implementation progress. Key updates: - Phase 9 Groups: Marked as COMPLETE (was NOT STARTED) - Phase 10 Settings: Already marked complete, added RK2 method support - Phase 5 range-only units: Marked as implemented - Test count: Updated from 221 to 430 tests - Module structure: Updated to show convert/ as directory with submodules - Removed view.rs from tree (doesn't exist yet) - Fixed Implementation Order section (Settings was incorrectly marked NOT STARTED) - Renumbered advanced features list to avoid duplicate item 9 The parser implementation is further along than documented - only Phase 7 (Views/Diagrams) and some Phase 9 post-processing remain.

This adds Phase 7 of the MDL parser implementation - view/diagram parsing. The sketch section follows the equations section in Vensim MDL files and starts with the `\\\---///` marker. The implementation parses: - View headers (V300/V364 versions, titles, font lines) - Type 10: Variable elements (stocks, flows, auxiliaries) - Type 11: Valve elements (flow gates) - Type 12: Comment elements (clouds and annotations) - Type 1: Connector elements (links between variables) Key features: - Handles scratch_name flag (bits & 4) where comment text is on next line - atoi-style integer parsing matching xmutil behavior exactly - Quoted string parsing with escape handling - AngleFromPoints algorithm for curved connectors - Flow point computation using connected stocks/clouds - Ghost/primary variable tracking for aliases with cross-view offset support - Multi-view merging into a single StockFlow view with group wrappers The view parser handles both cases where the EqEnd marker (`\\\---///`) is present in remaining source and where it was already consumed by the EquationReader. This is done by looking for either the marker or a version line (V300/V364) when starting to parse the sketch section.

Two bugs in the native MDL parser's view conversion were causing element count mismatches with xmutil: 1. Ghost/alias detection: The is_primary check only compared view index, not the specific UID. This caused all variables in the same view to be considered "primary", preventing ghost variables from being converted to Alias elements. Fixed by checking both view_idx AND uid match. 2. Time variable inclusion: The built-in "Time" variable was being included in views along with its connectors, but xmutil filters these out since XMILE handles Time automatically. Added filtering for built-in system variables. Added two new tests following TDD: - test_ghost_variable_becomes_alias: verifies duplicate variable names create one Aux and one Alias - test_time_variable_filtered_out: verifies Time and its connectors are excluded All tested models now match xmutil output in element counts and types: - SIR.mdl: 29 elements (alias=5, aux=6, flow=2, link=13, stock=3) - initial.mdl: 10 elements (aux=5, link=5) - was 12 before fix - teacup.mdl, active_initial.mdl, etc: all match Also added diagnostic output to mdl_equivalence.rs to show element type breakdowns during testing.

Implement comprehensive fixes for native MDL view parsing to match xmutil's C++ implementation: - Preserve (0,0) control point sentinel during coordinate transforms - Fix ghost/alias promotion with two-pass algorithm and duplicate detection - Add anchor snapping for flow endpoints (vertical vs horizontal alignment) - Sequential UID assignment matching XMILE import - Complete variable filtering (Time, Unwanted control variables) - View title normalization with symbol namespace deduplication - Stock ghost geometry offset (-22, -17) - Flow position logic requiring both attached flags - Group geometry matching exact xmutil formula - Link detection with wrap-around and tight epsilon (0.01 degrees) - Alias label_side defaults to Bottom Flow endpoint computation now uses connector control points and applies xmutil's anchor snapping algorithm: if control point X coordinates are equal (vertical flow), snap Y to target centers; otherwise snap X to target centers. This matches XMILEGenerator.cpp lines 987-1072. These changes enable passing equivalence tests against open_vensim.

Six fixes for remaining divergences between native MDL view conversion and the xmutil C++ reference implementation: 1. UID padding off by one: VensimView::insert now resizes to idx + 26 (matching xmutil's len + 1 = uid + 26) instead of idx + 25. This ensures subsequent views have correct UID offsets. 2. Lettered polarity preservation: Track whether connectors used S/O letter notation vs +/- symbols. Added use_lettered_polarity flag to StockFlow (datamodel, proto, serde, json) matching xmutil's bLetterPolarity. The flag is computed per-model from all connectors. 3. View name dedup parity: Removed group names from the symbol namespace used for view title deduplication (groups are adjusted separately in xmutil). Changed used_names comparison to case-sensitive, matching xmutil's std::set<std::string> behavior. 4. Array variable filtering: Added documentation that subscript/dimension variables are implicitly filtered from views because they live in the dimensions map, not the symbols map. 5. JSON polarity round-trip: Added polarity field to LinkViewElement in json.rs so link polarity survives JSON/WASM round-trips. 6. Documentation: Added note about future module-style view splitting requirements to CLAUDE.md.

xmutil's XMILEGenerator uses a simple priority chain for time_units: TIME STEP > FINAL TIME > INITIAL TIME > "Months". The native parser had the wrong priority order (FINAL TIME first) and added unnecessary "prefer explicit over range-only" logic that doesn't match xmutil. Also uses select_equation() instead of equations.first() so that AFO-first control variables still have their values extracted, and extract_first_units() checks ALL equations rather than just the first.

Vensim allows \_ in identifiers as a literal underscore. The lexer's symbol() method previously exited when it hit \ (not in is_symbol_char), and the main next() dispatch produced UnrecognizedToken for \ not followed by a newline. The fix adds an outer loop in symbol() that checks for \_ after the normal symbol char loop exits, and handles \_ at token start in next(). The trailing-strip logic is also updated to not strip _ that is part of an escaped \_ sequence. Also adds a comment clarifying that we intentionally trim leading whitespace from comments, unlike xmutil which only trims trailing.

Previously the equivalence tests cleared views before comparison. This commit enables element-by-element view comparison across all 12 test models by fixing three root causes of divergence between the native MDL path and the xmutil/XMILE path: 1. Sequential UID assignment: the native path preserved sparse Vensim UIDs while the XMILE path assigned sequential UIDs from 1. Added reassign_uids_sequential() post-processing to the native path. 2. Flow point stock-edge adjustment: the XMILE path snaps flow endpoints from stock centers to stock edges. Added fixup_flow_takeoffs() to the native path with matching geometry. 3. XMILE polarity parsing: the Link struct was missing the polarity attribute, silently dropping it during deserialization. Added the field and conversion functions. Also fixed UID gaps in the XMILE path caused by assign_uids() numbering elements before filtering unresolvable links. The retain() step now re-sequences UIDs to close gaps. Arc angle comparison uses epsilon tolerance (1e-10) to account for inherent floating-point differences between the two computation paths (C++ -> string -> Rust vs direct Rust).

claude · 2026-01-26T03:28:04Z

Pull Request Review - Native Vensim MDL Parser

This is an exceptionally well-executed PR that replaces the C++ xmutil dependency with a pure Rust implementation for parsing Vensim MDL files. The implementation demonstrates strong software engineering practices and careful attention to correctness.

✅ Strengths

Architecture & Design

Clean separation of concerns: The multi-phase parsing pipeline (lexer → normalizer → parser → AST → datamodel) is well-architected and mirrors the reference implementation appropriately
Excellent module organization: Breaking convert.rs into focused submodules (dimensions.rs, stocks.rs, variables.rs, helpers.rs) keeps files manageable and improves maintainability
Zero-copy design: Use of Cow<'input, str> in the AST for string references tied to input lifetime shows performance awareness
LALRPOP integration: Leveraging a proven parser generator rather than hand-rolling a parser reduces bugs

Testing & Correctness

Outstanding test coverage: 453+ unit tests plus comprehensive equivalence tests against the reference implementation (xmutil)
Rigorous validation: Element-by-element comparison of outputs between native and xmutil parsers across 12 test models ensures behavioral parity
TDD approach evident: Tests like test_ghost_variable_becomes_alias and test_time_variable_filtered_out show bugs were fixed with tests first
Edge case handling: Tests cover escaped underscores, range-only units, ACTIVE INITIAL, supplementary flags, etc.

Code Quality

Comprehensive documentation: Each module has clear explanations, the CLAUDE.md plan documents implementation phases, and complex algorithms reference the C++ source
Proper error handling: Custom error types (ConvertError, LexError, ReaderError) with structured variants
Follows project conventions: Adheres to commit message style, uses .unwrap() appropriately per guidelines, runs pre-commit hooks
Licensing compliance: Properly attributes xmutil derivation with MIT license and README

Integration

Backward compatible: Maintains existing open_vensim() function while adding open_vensim_native()
Feature-gated: Uses "xmutil" feature flag to control C++ dependency
Cross-language: Updates both Rust and TypeScript/protobuf schemas for new datamodel fields (ModelGroup, SimMethod::RungeKutta2, polarity)

🔍 Areas for Consideration

1. Test Coverage Gaps (Minor)
The equivalence tests filter out several model categories with known limitations:

Models using advanced builtins (ALLOCATE AVAILABLE, DELAY FIXED, NPV, QUANTUM, SAMPLE IF TRUE)
GET XLS/VDF data functions
Complex subscript features (EXCEPT, mappings)

While these are documented as future work, consider adding tracking issues for each category to ensure they don't get forgotten.

2. Floating-Point Comparison (Minor)
The ANGLE_EPSILON tolerance (1e-10) for arc angle comparisons is well-justified, but consider adding a comment explaining that the epsilon value is chosen based on empirical observation of maximum divergence rather than being theoretically derived.

3. Performance Considerations (Info)
No benchmarks are included to compare performance vs the C++ implementation. While this is a correctness-first PR (appropriate!), consider:

Adding criterion benchmarks for common MDL files
Profiling the multi-pass conversion to identify any bottlenecks
This can be follow-up work.

4. Error Messages (Minor)
Some error paths use generic ConvertError with format strings. Consider adding specific error variants for common cases (e.g., undefined dimension reference, invalid control variable) to improve debuggability.

5. Dead Code (Trivial)
The integration_method field in ConversionContext is marked #[allow(dead_code)] - if it's truly unused after Phase 10 completion, it should be removed.

🎯 Specific Code Observations

Good Practices:

Use of unreachable!() and todo!() macros rather than comments (per guidelines)
Proper use of std::mem::take() to avoid borrow checker issues while maintaining clarity
Consistent naming: canonical_name() for spaces, canonical_ident() for underscores
Smart buffering strategy in EquationReader to handle macro state tracking

Minor Issues:

In src/simlin-compat/src/mdl/convert/mod.rs:56, the #[allow(dead_code)] on integration_method should be revisited
The test at line 747 has a comment about parent being "1-1 Sub" but the assertion expects a different group - verify this is correct
Consider extracting magic numbers like 26 (UID padding offset) into named constants

🛡️ Security Considerations

✅ No security concerns identified:

Parser handles untrusted input appropriately with proper error handling
No unsafe code in the new modules
String operations use safe Rust APIs
No file I/O beyond the public API boundary

📊 Impact Assessment

Lines Changed: ~14,000 additions across 65 files
Risk Level: Low - well-tested, feature-gated, backward compatible
Value: High - eliminates C++ toolchain requirement, enables WASM builds

✨ Summary

This is exemplary work that sets a high bar for contributions to this codebase. The implementation:

Achieves full parity with the reference C++ implementation across tested models
Maintains excellent code organization and documentation
Demonstrates thorough testing with both unit and integration tests
Enables the project's strategic goal of WASM support

The minor suggestions above are mostly about long-term maintainability and don't block merging. This PR is strongly recommended for approval.

Recommendation: ✅ APPROVE and merge

Review conducted via automated analysis of code structure, tests, documentation, and adherence to project guidelines.

codecov · 2026-01-26T03:33:15Z

Codecov Report

❌ Patch coverage is 80.55172% with 564 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.76%. Comparing base (d44d3ae) to head (22b4164).

Files with missing lines	Patch %	Lines
src/simlin-compat/src/mdl/convert/variables.rs	63.79%	147 Missing ⚠️
src/simlin-compat/src/mdl/lexer.rs	76.75%	106 Missing ⚠️
src/simlin-compat/src/mdl/normalizer.rs	63.04%	85 Missing ⚠️
src/simlin-compat/src/mdl/convert/stocks.rs	79.41%	63 Missing ⚠️
src/simlin-compat/src/mdl/reader.rs	79.51%	34 Missing ⚠️
src/simlin-compat/src/mdl/view/convert.rs	89.32%	30 Missing ⚠️
src/simlin-compat/src/mdl/view/mod.rs	79.77%	18 Missing ⚠️
src/simlin-compat/src/mdl/view/types.rs	80.85%	18 Missing ⚠️
src/simlin-compat/src/mdl/view/processing.rs	92.61%	15 Missing ⚠️
src/simlin-compat/src/mdl/convert/helpers.rs	84.61%	14 Missing ⚠️
... and 7 more

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #211      +/-   ##
==========================================
+ Coverage   75.99%   76.76%   +0.77%     
==========================================
  Files          48       69      +21     
  Lines       15533    19138    +3605     
==========================================
+ Hits        11804    14691    +2887     
- Misses       3729     4447     +718

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6a6a9152b9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-01-26T03:38:21Z

src/libsimlin/src/lib.rs

+#[cfg(feature = "xmutil")]
 #[no_mangle]
 pub unsafe extern "C" fn simlin_project_open_vensim(


Expose native MDL import in non-xmutil builds

The new native MDL parser is still unreachable in WASM builds because simlin_project_open_vensim is compiled only when the xmutil feature is enabled (and it still routes through simlin_compat::open_vensim). When libsimlin is built without xmutil (the default for WASM), this export is missing and engine2 will throw for MDL imports, so the “native parser” path never becomes available. Consider exporting a non-xmutil variant that calls open_vensim_native or relaxing the feature gate so MDL import actually works in WASM.

Useful? React with 👍 / 👎.

simlin_project_open_vensim was gated behind #[cfg(feature = "xmutil")] and routed through the C++ xmutil path. This meant WASM builds (which use --no-default-features) had no MDL import capability, defeating the purpose of the native parser. Switch the FFI function to call open_vensim_native directly and remove the feature gate so it is always available. Also remove the xmutil feature gates from the corresponding tests and clean up stale comments in the TypeScript WASM bindings that referenced the xmutil requirement. The native parser uses canonical underscore-separated identifiers (e.g. "heat_loss_to_room") rather than space-separated names from the xmutil path, so the engine2 and pysimlin test expectations are updated to match.

claude · 2026-01-26T05:45:29Z

Pull Request Review: Native Vensim MDL Parser

This is an impressive and well-executed PR that replaces the C++ xmutil dependency with a pure Rust implementation. The scope is massive (35k+ additions) but the implementation demonstrates strong engineering practices.

Summary

✅ Recommendation: APPROVE with minor suggestions

This PR successfully implements a complete Vensim MDL parser in Rust, achieving equivalence with the existing xmutil-based implementation across 12 test models. The code quality is high, testing is comprehensive, and the architectural decisions are sound.

Strengths

1. Excellent Documentation

Comprehensive CLAUDE.md provides clear implementation roadmap
Proper attribution and licensing (MIT) for derivative work
Well-structured commit history with descriptive messages
Inline documentation explains complex algorithms (e.g., flow endpoint computation, angle calculation)

2. Strong Testing Strategy

453+ unit tests across all modules (lexer, parser, converter, views)
Integration tests compare native vs xmutil output element-by-element
TDD approach evident in commit history (tests added before implementation)
Test files include edge cases and error conditions

3. Architectural Excellence

Clean separation of concerns: lexer → normalizer → parser → AST → converter
Proper use of Result types for error handling
Minimal .unwrap() usage (162 instances across 14 files, most with context)
No unwrap_or_default() anti-patterns found in production code

4. Code Quality

Follows Rust idioms and best practices
Uses appropriate data structures (Cow for zero-copy strings)
Modular design with focused submodules (convert/ split into dimensions, stocks, variables, etc.)
Proper handling of edge cases (escaped underscores, various line endings, etc.)

5. Security & Correctness

No unsafe code detected
Proper bounds checking (no get_unchecked usage)
UTF-8 validation handled correctly
All panic!/expect() calls confined to test code

6. Completeness

Handles all MDL format features: equations, subscripts, macros, views, settings
Proper XMILE compatibility layer for expression formatting
Support for RK2/RK4/Euler integration methods
Full view/diagram parsing with coordinate transforms and angle calculations

Issues & Suggestions

Critical: None

Major: None

Minor Suggestions

1. Test Coverage Visibility

Consider adding code coverage reporting to track which edge cases are tested. With 453+ tests, it would be valuable to see coverage metrics.

# Suggestion: Add to CI
cargo tester --all-features --workspace

2. Error Messages Could Be More Specific

In mod.rs, generic error messages could be more helpful:

// Current:
Some(format!("Failed to parse MDL: {:?}", e))

// Suggestion: Include line numbers when available
Some(format!("Failed to parse MDL at line {}: {:?}", line_num, e))

3. Consider Adding Fuzzing

With a complex parser handling untrusted input, fuzzing could reveal edge cases:

# Add to Cargo.toml
[dev-dependencies]
cargo-fuzz = "0.11"

4. unwrap() Usage Pattern

While most .unwrap() calls are justified, consider documenting why each is safe:

// Instead of:
.unwrap()

// Consider:
.expect("dimension_map always contains entries from collect_symbols phase")

This helps future maintainers understand invariants.

5. Module Organization Documentation

The convert/ directory split is excellent, but could benefit from a module-level doc comment explaining the multi-pass conversion strategy:

//! Multi-pass MDL to datamodel conversion:
//! 
//! 1. **Symbol Collection** (mod.rs): Gather all variable/dimension names
//! 2. **Dimension Building** (dimensions.rs): Expand ranges, handle equivalences
//! 3. **Variable Type Detection** (variables.rs): Classify stocks/flows/aux
//! 4. **Stock/Flow Linking** (stocks.rs): Connect flows to stocks via inflow/outflow
//! 5. **Expression Formatting** (via xmile_compat.rs): XMILE-compatible output

Performance Considerations

Positive

Zero-copy string references with Cow<'input, str>
Efficient parsing with LALRPOP (LR parser generator)
No unnecessary allocations in hot paths

Potential Optimization Opportunities

None critical, but for very large models:

Consider arena allocation for AST nodes if profiling shows allocation overhead
View element processing could be parallelized with rayon

Security Assessment

✅ No security concerns identified

No unsafe code
Proper input validation
No command injection vectors
No file path traversal issues
Appropriate error handling prevents panics on malformed input

Test Coverage Assessment

Based on mdl_equivalence.rs:

✅ 12 models passing full equivalence tests
✅ View element comparison (UIDs, coordinates, link shapes, polarity)
✅ Known issues documented (xmutil segfaults, feature gaps)
✅ Clear separation of working vs pending models

Suggestion: Add a test that deliberately provides malformed input to verify graceful error handling rather than panics.

Compatibility & Migration Path

The PR maintains backward compatibility:

✅ Existing open_vensim() function preserved (uses xmutil)
✅ New open_vensim_native() function available
✅ FFI layer updated to use native parser
✅ WASM builds now support MDL import (major win!)

Future consideration: Document deprecation timeline for xmutil dependency once native parser is battle-tested in production.

Commit Quality

Excellent commit hygiene:

✅ Logical atomic commits
✅ Descriptive commit messages following project style
✅ Body text explains "why" not just "what"
✅ Co-authored attribution to Claude

Alignment with CLAUDE.md Guidelines

Checking against project guidelines:

✅ Test Driven Development: Evidence of tests-first approach
✅ No backwards compatibility shims: Clean implementation
✅ Prefer .unwrap() over .unwrap_or_default(): Followed correctly
✅ Use unreachable!() for impossible cases: Pattern followed
✅ No placeholder comments: All TODOs removed
✅ Proper error types: Result types used throughout
✅ Commit message style: Matches component: description format

Final Recommendations

Before Merge

✅ Pre-commit checks pass (mentioned in PR description)
✅ All tests pass (453+ unit tests, 12 equivalence tests)
⚠️ Recommended: Run full simulation test suite to verify models parse AND simulate correctly
⚠️ Recommended: Manual review of panic! locations to confirm all are test-only

Post-Merge

Monitor production usage of native parser vs xmutil
Add telemetry to track which code path is used (if applicable)
Consider adding performance benchmarks to prevent regressions
Document migration guide for external users if any exist

Conclusion

This is exemplary work that:

✅ Eliminates complex C++ build dependency
✅ Enables WASM support for MDL imports
✅ Maintains full compatibility with existing behavior
✅ Includes comprehensive test coverage
✅ Follows project conventions meticulously

The 35k line addition is justified by the complexity of the task (replacing a 15k line C++ parser) and the inclusion of extensive tests and documentation.

Approved with high confidence. Minor suggestions above are truly optional improvements, not blockers.

Great work! 🎉

Reviewed by: Claude Sonnet 4.5
Review Date: 2026-01-26
PR: #211 - Native Vensim MDL parser

The native Rust MDL parser is now always available via engine2's openVensim(), so the xmutil-js fallback path in NewProject.tsx is no longer needed. Remove the convertMdlToXmile import, the xmutil package dependency, the tsconfig project reference, and the rsbuild alias. The .mdl file type was already accepted by the file picker; this just simplifies the import handler to always use the native path.

claude · 2026-01-26T05:54:22Z

PR #211 Review: Native Vensim MDL Parser

Summary

This PR implements a comprehensive pure Rust parser for Vensim MDL files, replacing the C++ xmutil dependency. This is a massive and impressive undertaking with 35,493 additions across 71 files, including a complete lexer, parser, AST, and conversion pipeline.

Overall Assessment: ✅ APPROVE with Minor Recommendations

This is excellent work that successfully achieves its goals. The implementation is thorough, well-tested (453+ tests), and carefully designed to match xmutil's behavior while being maintainable.

Strengths

1. Exceptional Architecture & Design ⭐⭐⭐⭐⭐

Clean separation of concerns: RawLexer → TokenNormalizer → Parser → AST → Datamodel
Well-organized module structure with focused submodules (convert/, view/)
Comprehensive documentation in CLAUDE.md explaining implementation phases and decisions
Proper use of Rust idioms (Cow<'input, str> for zero-copy, Result/Option error handling)

2. Outstanding Test Coverage ⭐⭐⭐⭐⭐

453+ unit tests across lexer, parser, normalizer, converter, view processing
12 full-model equivalence tests comparing xmutil output vs native output
Tests verify UIDs, coordinates, link shapes, polarity across all test models
Well-documented known gaps and unsupported features in tests/mdl_equivalence.rs

3. Proper Error Handling ⭐⭐⭐⭐

Custom error types (LexError, NormalizerError, ReaderError, ConvertError)
Consistent use of Result types throughout
Good use of unwrap() over unwrap_or_default() per project guidelines
Appropriate use of unreachable!() and panic!() for truly impossible cases

4. Licensing & Attribution ⭐⭐⭐⭐⭐

Proper MIT license attribution to Bob Eberlein's xmutil
Clear README explaining derivative work provenance
Maintains original copyright notices

5. C-Compatible Parsing ⭐⭐⭐⭐⭐

parse_int_like_atoi() function with comprehensive edge case tests
Handles whitespace, signs, non-digit characters exactly like C's atoi()
31 tests for RK2 method codes, atoi parsing, line ending formats

Code Quality Issues

Critical Issues: None ✅

High Priority Issues

1. Excessive `#[allow(dead_code)]` Usage ⚠️

Location: src/simlin-compat/src/mdl/lexer.rs:5, and 7 other files

Issue: Multiple files have blanket #[allow(dead_code)] directives, which can hide legitimately unused code that should be removed.

Recommendation:

// Instead of:
#![allow(dead_code)]

// Use targeted suppression:
#[allow(dead_code)]
pub struct OnlyUsedInTests { ... }

Files affected:

lexer.rs, convert/dimensions.rs, convert/mod.rs, convert/types.rs, view/processing.rs

2. High unwrap() Count ⚠️

Count: 162 unwraps across 14 files

Issue: While project guidelines prefer unwrap() over unwrap_or_default() for debugging, this many unwraps could cause panics on malformed input.

Recommendations:

Add fuzzing tests for malformed MDL files
Consider converting some unwraps in hot paths to proper error propagation
Document which unwraps are "safe" (based on previous validation) vs which assume valid input

Example locations to review:

src/simlin-compat/src/mdl/convert/stocks.rs (30 unwraps)
src/simlin-compat/src/mdl/convert/variables.rs (28 unwraps)

Medium Priority Issues

3. Missing Documentation on Complex Algorithms 📝

Locations:

is_all_plus_minus() algorithm in stocks.rs
reassign_uids_sequential() in view/processing.rs
fixup_flow_takeoffs() in view/processing.rs

Issue: While implementation comments exist, high-level documentation of why these algorithms work this way is sparse.

Recommendation: Add module-level or function-level doc comments explaining the mathematical/logical basis, especially for algorithms ported from xmutil C++.

4. Panic Count Acceptable but Worth Monitoring ℹ️

Count: 170 panics across 12 files

Context: Most appear to be in unreachable!() or assertion macros which is appropriate.

Recommendation: Audit panic! calls in production code paths (not tests) to ensure they're only for truly impossible states.

Performance Considerations

Positive

Zero-copy string references with Cow<'input, str> ✅
Efficient HashMap/HashSet usage for lookups ✅
Lazy parsing (EquationReader iterator) ✅

Potential Concerns

Large enum variants: Fixed with Box and Box ✅
String allocations: Canonical name conversions create new strings frequently, but acceptable for parsing workload ✅
No obvious performance issues ✅

Security Considerations

✅ No Unsafe Code

Zero unsafe blocks found in mdl/ module ✅

✅ Input Validation

Lexer handles malformed input with proper error types ✅
Parser uses LALRPOP which provides bounds checking ✅
Settings parser is "intentionally permissive" (documented in code) ✅

⚠️ Potential DoS Vectors

Issue: Large or deeply nested MDL files could cause stack overflow or memory exhaustion

Recommendation:

Add maximum recursion depth limits for expression parsing
Consider fuzzing with large malformed MDL files
Document any practical file size limits

Example: Deep nesting in format_expr_inner() is recursive without depth limit

Testing Quality

Strengths

Equivalence tests compare against known-good xmutil output ✅
Element-by-element view comparison with coordinate/UID validation ✅
Comprehensive edge case coverage (escaped underscores, line continuations, etc.) ✅
TDD approach evident (test_ghost_variable_becomes_alias, test_time_variable_filtered_out) ✅

Gaps

❌ No fuzzing tests for malformed input
❌ No performance benchmarks for large models
⚠️ Some models excluded from equivalence tests due to unsupported features (documented)

Integration & Dependencies

✅ Clean Integration

Properly exposed via parse_mdl() public API ✅
FFI bindings updated for non-xmutil builds ✅
WASM compatibility achieved (main goal) ✅
TypeScript bindings updated ✅

⚠️ Feature Flag Complexity

The xmutil feature flag logic is a bit convoluted
cfg(feature = "xmutil") appears in multiple places
Consider documenting the feature flag strategy in CLAUDE.md

Specific Code Review Findings

1. settings.rs - Permissive Parsing Strategy

Lines 76-88: Good documentation of intentional deviation from xmutil strictness
✅ Good: Well-documented design decision with clear rationale

2. lexer.rs - Line Continuation Handling

Lines 156-184: Complex lookahead/pushback logic
✅ Good: Handles edge cases correctly
⚠️ Consider: Add diagram or state machine documentation

3. convert/mod.rs - Multi-pass Conversion

Lines 123-150: Well-structured conversion pipeline
✅ Excellent: Clear separation of concerns with numbered passes

4. xmile_compat.rs - Function Transformations

Lines 126-150: Complex function renaming logic
✅ Good: Comprehensive handling of XMILE function differences
⚠️ Minor: Consider extracting transformation rules to a data structure

Commit Quality

✅ Excellent Commit Messages

Follows project style guide perfectly ✅
Component prefixes used correctly (compat:, xmutil:, app:) ✅
Present tense, lowercase, descriptive ✅
Bodies explain "why" not just "what" ✅

Example:

compat: fix view ghost/alias detection and filter Time variable

Two bugs in the native MDL parser's view conversion were causing
element count mismatches with xmutil:
1. Ghost/alias detection...
2. Time variable inclusion...

Recommendations

Required Before Merge: None

This PR is ready to merge as-is.

Strongly Recommended for Follow-up

Add fuzzing tests for malformed MDL input
Audit and reduce #[allow(dead_code)] directives
Add recursion depth limits to prevent stack overflow
Document complex algorithms (is_all_plus_minus, flow takeoff computation)

Optional Improvements

Extract function transformation rules to data structures in xmile_compat.rs
Add performance benchmarks for large models
Consider streaming parser for memory efficiency on huge files
Add code coverage metrics to CI

Conclusion

This is exceptional engineering work that successfully:

✅ Eliminates C++ toolchain dependency
✅ Enables WASM builds with MDL import
✅ Maintains full compatibility with xmutil
✅ Provides comprehensive test coverage
✅ Follows project conventions and Rust best practices

The implementation is production-ready with only minor recommendations for follow-up work.

Verdict: ✅ APPROVE

Additional Notes

The 35K+ line addition is appropriate for a complete parser implementation
The derivative work licensing is handled correctly
The refactoring into submodules (convert/ directory) shows good judgment
The equivalence testing approach is rigorous and clever
The CLAUDE.md documentation will be valuable for future maintenance

Great work! 🎉

bpowers · 2026-01-26T05:58:21Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a8bc71f7c9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-01-26T06:08:03Z

src/app/NewProject.tsx

      if (file.name.endsWith('.mdl')) {
-        // For Vensim MDL files, try direct import first if available
-        const hasVensim = await Engine2Project.hasVensimSupport();
-        if (hasVensim) {
-          engine2Project = await Engine2Project.openVensim(contents);
-        } else {
-          // Fall back to xmutil conversion when direct Vensim support is not available
-          const [xmileContents, conversionLogs] = await convertMdlToXmile(contents, true);
-          logs = conversionLogs;
-          if (xmileContents.length === 0) {
-            throw new Error('Vensim converter: ' + (logs || 'unknown error'));
-          }
-          engine2Project = await Engine2Project.open(xmileContents);
-        }
+        engine2Project = await Engine2Project.openVensim(contents);
      } else {


Gate MDL import on Vensim support

This path now unconditionally calls Engine2Project.openVensim for .mdl files. The engine2 API still exposes Project.hasVensimSupport() and its tests explicitly handle the case where openVensim must throw when the WASM build lacks simlin_project_open_vensim. In that scenario, users will now hit a runtime exception and lose MDL import entirely (there’s no fallback anymore), whereas the previous code avoided that by checking support and falling back. Consider restoring the support check (or another fallback) so MDL import doesn’t regress on builds that omit Vensim support.

Useful? React with 👍 / 👎.

The native MDL import was setting view_box to the bounding box of diagram elements (e.g. {x: 50, y: 70, width: 431, height: 168}). But the Editor expects view_box to be a viewport descriptor where {0, 0, 0, 0} means "not yet configured, please center automatically." When Canvas.tsx sees a non-zero view_box.width, it skips the simple centering path and instead treats view_box.x/y as pan offsets, causing the diagram to appear off-screen. The XMILE import path correctly uses Default::default() for view_box. This change makes the MDL path match, so the Editor's centering logic works as intended.

The native Rust MDL parser (open_vensim_native) is always available -- it has no feature gate. The FFI export simlin_project_open_vensim() calls it directly and unconditionally. The WASM build includes it. So hasVensimSupport() always returned true and was dead code. This removes the hasVensimSupport function from import-export.ts and the Project class, removes the conditional guards wrapping MDL tests so they run unconditionally, and cleans up stale "requires xmutil feature" references in doc comments, simlin.h, and README.md.

bpowers · 2026-01-26T06:29:20Z

@codex review

claude · 2026-01-26T06:32:56Z

PR Review: Native Vensim MDL Parser

Summary

This is an exceptionally well-executed PR that replaces the C++ xmutil dependency with a pure Rust implementation for parsing Vensim MDL files. The implementation is comprehensive, well-tested, and follows Rust best practices. The code quality is outstanding and this represents a significant architectural improvement for the project.

Strengths

Architecture & Design ✅

Smart Design Decisions: Direct MDL → datamodel conversion (bypassing XMILE intermediate) is elegant and efficient
Clean Module Organization: Well-structured codebase with clear separation of concerns:
- Lexer/normalizer pipeline with context-sensitive transformations
- LALRPOP-based parser with comprehensive grammar
- Multi-pass AST-to-datamodel conversion
- Excellent documentation in CLAUDE.md providing implementation context
Comprehensive Feature Coverage:
- All equation types, subscripts/arrays, built-in functions
- View/diagram parsing with coordinate transformation
- Settings and groups parsing
- Macro parsing (output not yet implemented but captured)

Code Quality ✅

Excellent Error Handling:
- Proper Result types throughout
- No unsafe code
- Strategic use of .unwrap() over .unwrap_or_default() per project guidelines (only 4 uses of unwrap_or_default, all justified)
- Only 9 uses of .expect() in total, all with clear panic messages
Strong Type Safety:
- Leverages Rust's type system effectively
- Proper use of lifetimes (Cow<'input, str> for zero-copy parsing)
- No use of todo!() or unimplemented!() macros
Clean Code Patterns:
- Minimal clone operations (44 in convert/ module for ~15k+ lines)
- No panics in production code paths (parse_number panic is defensive - lexer guarantees validity)
- Clear, descriptive function and variable names

Testing ✅

Outstanding Test Coverage: 453+ unit tests covering:
- Lexer (29+ tests), Normalizer (30+ tests), Parser helpers (15+ tests)
- Builtins (15+ tests), Reader (60+ tests), Convert (55+ tests)
- XMILE compat (30+ tests), Settings (44+ tests), Views (51+ tests)
Integration Testing:
- Equivalence tests comparing native parser vs xmutil output
- 12 models pass full equivalence tests with view element comparison
- Element-by-element validation of conversion correctness
Test Quality: Tests follow TDD principles with clear assertions and edge case coverage

Documentation ✅

Exceptional Documentation:
- 746-line CLAUDE.md provides complete implementation context
- Clear inline comments explaining non-obvious logic
- Module-level documentation for each component
- Comprehensive checklist tracking feature completeness

Areas for Improvement

Minor Issues

Panic in parse_number (parser_helpers.rs:18)
- While the comment says this "should never happen", consider returning a Result for more graceful error handling
- This is a minor concern since the lexer does guarantee validity
View Composition Edge Cases
- The CLAUDE.md notes that module/level-structured models may need enhancement
- Consider adding TODO comments in the code near compose_views() to flag this for future work

Security Considerations ✅

No Unsafe Code: Excellent - no unsafe blocks found
Input Validation: Lexer properly handles malformed input with error codes
Buffer Overflows: Not possible - Rust's safety guarantees prevent this

Potential Bugs

None identified - the code appears robust and well-tested.

Performance Considerations

Memory Usage: Parser holds entire source in memory - acceptable for typical model files (< 10MB)
Computational Complexity: View processing has some O(n²) operations, but acceptable for typical model sizes

Consistency with Project Guidelines ✅

Follows commit message style exactly
Proper use of .unwrap() over .unwrap_or_default()
No compatibility shims or backwards compatibility concerns
Comprehensive test coverage with no stubbed tests
Uses proper error types (Result, not panics)

Recommendations

Before Merge

✅ Verify cargo test -p simlin-compat passes
✅ Verify cargo clippy -p simlin-compat has no warnings
✅ Verify cargo fmt
⚠️ Verify all CI checks are passing

Optional Improvements (Post-Merge)

Consider making parse_number return Result instead of panicking
Add inline TODO comments for known limitations
Profile memory usage on large models
Consider adding fuzzing tests with cargo-fuzz

Conclusion

APPROVE WITH ENTHUSIASM ✅

This PR is a model of excellent engineering:

Solves real problems (build complexity, WASM compatibility)
High code quality with strong typing and error handling
Comprehensive test coverage (453+ tests)
Outstanding documentation
No blocking issues identified
Follows all project conventions

Recommendation: Merge after CI passes. This is production-ready code.

Detailed Metrics: 35,434 lines added, 71 files changed, 425+ test functions, 0 unsafe blocks, 0 TODO/unimplemented macros

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 22b416447d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-01-26T06:33:34Z

src/simlin-engine/src/json.rs

    pub multi_points: Vec<LinkPoint>,
+    #[serde(skip_serializing_if = "Option::is_none", default)]
+    pub polarity: Option<String>,


Preserve new link polarity metadata in TS models

The JSON schema now emits LinkViewElement.polarity (and View.useLetteredPolarity nearby), but the TypeScript JSON types and core datamodel still omit these fields (JsonLinkViewElement/JsonView in src/engine2/src/json-types.ts and LinkViewElement.fromJson/toJson in src/core/datamodel.ts). When a project containing link polarities or lettered polarity settings is opened in the app and saved back, those fields are dropped, so Vensim imports with polarity metadata will not round-trip. Consider wiring these fields through the TS types and core datamodel so they are preserved on load/save.

Useful? React with 👍 / 👎.

bpowers added 29 commits January 25, 2026 18:28

simlin-compat: start on mdl conversion module

f2d5a1f

chatgpt-codex-connector bot reviewed Jan 26, 2026

View reviewed changes

bpowers added 2 commits January 25, 2026 22:15

chatgpt-codex-connector bot reviewed Jan 26, 2026

View reviewed changes

Native Vensim MDL parser #211

Are you sure you want to change the base?

Native Vensim MDL parser #211

Conversation

bpowers commented Jan 26, 2026

Summary

Key components

Test plan

Uh oh!

claude bot commented Jan 26, 2026

Pull Request Review - Native Vensim MDL Parser

✅ Strengths

🔍 Areas for Consideration

🎯 Specific Code Observations

🛡️ Security Considerations

📊 Impact Assessment

✨ Summary

Uh oh!

codecov bot commented Jan 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Jan 26, 2026

Choose a reason for hiding this comment

Uh oh!

claude bot commented Jan 26, 2026

Pull Request Review: Native Vensim MDL Parser

Summary

Strengths

1. Excellent Documentation

2. Strong Testing Strategy

3. Architectural Excellence

4. Code Quality

5. Security & Correctness

6. Completeness

Issues & Suggestions

Critical: None

Major: None

Minor Suggestions

1. Test Coverage Visibility

2. Error Messages Could Be More Specific

3. Consider Adding Fuzzing

4. unwrap() Usage Pattern

5. Module Organization Documentation

Performance Considerations

Positive

Potential Optimization Opportunities

Security Assessment

Test Coverage Assessment

Compatibility & Migration Path

Commit Quality

Alignment with CLAUDE.md Guidelines

Final Recommendations

Before Merge

Post-Merge

Conclusion

Uh oh!

claude bot commented Jan 26, 2026

PR #211 Review: Native Vensim MDL Parser

Summary

Overall Assessment: ✅ APPROVE with Minor Recommendations

Strengths

1. Exceptional Architecture & Design ⭐⭐⭐⭐⭐

2. Outstanding Test Coverage ⭐⭐⭐⭐⭐

3. Proper Error Handling ⭐⭐⭐⭐

4. Licensing & Attribution ⭐⭐⭐⭐⭐

5. C-Compatible Parsing ⭐⭐⭐⭐⭐

Code Quality Issues

Critical Issues: None ✅

High Priority Issues

1. Excessive #[allow(dead_code)] Usage ⚠️

2. High unwrap() Count ⚠️

Medium Priority Issues

3. Missing Documentation on Complex Algorithms 📝

4. Panic Count Acceptable but Worth Monitoring ℹ️

Performance Considerations

codecov bot commented Jan 26, 2026 •

edited

Loading

1. Excessive `#[allow(dead_code)]` Usage ⚠️