Groundwork for DB Version 3.1 and Unit Checking #124

jeff-ws · 2025-09-23T03:22:37Z

This PR sets some groundwork for version 3.1 of the DB and includes a unit checking module

Establishes v3.1 schema which adds 'units' to many tables
Provides a migration tool from v3.0 to v3.1 of the database
Updates many of the requirements to current versions, including adding pint as a requirement for future use with units
Adds a simple comparison tool to utilities that compares a DB to an arbitrary Temoa schema that is helpful in migration tasks

Also included is the basic Unit Checking package in temoa/temoa_model/unit_checking

Unit checks can be run on a DB by:

Use the utility to convert the DB to version 3.1
Select unit checking in the config file (or leave it blank, currently default is True)
Run in perfect foresight or check mode

Summary by CodeRabbit

New Features
- Added full and minimal v3.1 database schemas, DB schema comparison and v3→v3.1 migration tools, a unit registry, unit-screening workflow with report, and a config flag to enable unit checks.
Bug Fixes
- Fixed season foreign-key for curtailment outputs; renamed capacity-share column max_proportion → min_proportion across example and test DBs.
Refactor
- Standardized output writes with explicit column lists and integrated unit checks into sequencer flows.
Tests
- Added unit-checker tests.
Documentation
- Added "Units Checking" docs, README and sample config updates.
Chores
- Added pint and updated dependency pins; adjusted migration CLI default schema path.

- adding timing to post-processing

- queues are now swapping data vs. models - table_writer.py broken up into data pulling portion/writing portion

…s at 20 threads ea.

- commented some functions - removed some erroneous "sizeof" measures

- was hanging while putting in shutdown signals

- works with Utopia - logging cleaned up

Added additional QA steps during data transfer.

coderabbitai

Actionable comments posted: 9

🧹 Nitpick comments (5)

temoa/temoa_model/unit_checking/table_checker.py (1)
72-74: Prefer fullmatch for character validation.

Use re.fullmatch instead of search with anchors for clarity.
-        valid_chars = re.search(ACCEPTABLE_CHARACTERS, expr)
+        valid_chars = re.fullmatch(ACCEPTABLE_CHARACTERS, expr)
temoa/temoa_model/unit_checking/screener.py (1)
34-34: Top-level import of PROJECT_ROOT breaks library import.

Move from definitions import PROJECT_ROOT into the __main__ block to avoid import errors when consuming as a package.
-from definitions import PROJECT_ROOT
In the main block:
 if __name__ == '__main__':
+    from definitions import PROJECT_ROOT
     db_path = Path(PROJECT_ROOT) / 'data_files/mike_US/US_9R_8D_v3_stability_v3_1.sqlite'
     screen(db_path, report_path=Path(PROJECT_ROOT) / 'output_files/')
temoa/temoa_model/unit_checking/entry_checker.py (1)
83-87: Guard against SQL injection via table name (whitelist).

Validate table against an allowlist before string-interpolating into SQL. The call sites use internal constants; enforce that here.
-from temoa.temoa_model.unit_checking.common import (
-    UnitsFormat,
-)
+from temoa.temoa_model.unit_checking.common import (
+    UnitsFormat,
+    tables_with_units,
+)
@@
 def gather_from_table(conn: sqlite3.Connection, table: str) -> dict[str, list[int]]:
     """gather all unique "units" entries from a table and collect the row indices"""
 
     res = defaultdict(list)
     with conn:
         cur = conn.cursor()
+        if table not in tables_with_units:
+            raise ValueError(f'Unexpected table name: {table}')
         cur.execute(f'SELECT units FROM {table}')
Please confirm no external/user input reaches table. If it does, we should harden further (quote identifiers or map to known names). Based on learnings
temoa/temoa_model/unit_checking/relation_checker.py (2)
170-176: Prefer logging.exception to capture traceback on SQL failure.
-    except sqlite3.OperationalError:
-        logger.error('failed to process query: %s when processing table %s', query, table_name)
+    except sqlite3.OperationalError:
+        logger.exception('failed to process query: %s when processing table %s', query, table_name)
Also consider asserting table_name from an allowlist to mitigate S608. Based on hints

250-256: Prefer logging.exception on query failure.
-        except sqlite3.OperationalError:
-            logger.error(
+        except sqlite3.OperationalError:
+            logger.exception(
                 'failed to process query: %s when processing table %s', query, ct.table_name
             )

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 32628e3 and 74c12a8.

📒 Files selected for processing (13)

docs/source/Documentation.rst (1 hunks)
temoa/extensions/modeling_to_generate_alternatives/mga_sequencer.py (3 hunks)
temoa/extensions/myopic/myopic_sequencer.py (2 hunks)
temoa/temoa_model/temoa_config.py (3 hunks)
temoa/temoa_model/temoa_sequencer.py (4 hunks)
temoa/temoa_model/unit_checking/__init__.py (1 hunks)
temoa/temoa_model/unit_checking/common.py (1 hunks)
temoa/temoa_model/unit_checking/entry_checker.py (1 hunks)
temoa/temoa_model/unit_checking/relation_checker.py (1 hunks)
temoa/temoa_model/unit_checking/screener.py (1 hunks)
temoa/temoa_model/unit_checking/table_checker.py (1 hunks)
temoa/temoa_model/unit_checking/temoa_units.txt (1 hunks)
tests/test_unit_checker.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (7)

temoa/extensions/modeling_to_generate_alternatives/mga_sequencer.py (1)

temoa/temoa_model/unit_checking/screener.py (1)

screen (54-231)

temoa/temoa_model/unit_checking/table_checker.py (2)

temoa/temoa_model/unit_checking/common.py (1)

consolidate_lines (166-173)

temoa/temoa_model/unit_checking/entry_checker.py (3)

validate_units_expression (64-74)

validate_units_format (49-61)

gather_from_table (77-89)

temoa/temoa_model/unit_checking/screener.py (2)

temoa/temoa_model/unit_checking/relation_checker.py (5)

check_efficiency_table (88-152)

make_commodity_lut (53-65)

check_inter_table_relations (155-225)

check_cost_tables (228-341)

make_c2a_lut (68-80)

temoa/temoa_model/unit_checking/table_checker.py (1)

check_table (54-142)

temoa/temoa_model/unit_checking/entry_checker.py (1)

temoa/temoa_model/unit_checking/common.py (1)

UnitsFormat (144-146)

temoa/temoa_model/temoa_sequencer.py (1)

temoa/temoa_model/unit_checking/screener.py (1)

screen (54-231)

temoa/temoa_model/unit_checking/relation_checker.py (2)

temoa/temoa_model/unit_checking/common.py (1)

consolidate_lines (166-173)

temoa/temoa_model/unit_checking/entry_checker.py (2)

validate_units_format (49-61)

validate_units_expression (64-74)

temoa/extensions/myopic/myopic_sequencer.py (1)

temoa/temoa_model/unit_checking/screener.py (1)

screen (54-231)

🪛 LanguageTool

temoa/temoa_model/unit_checking/temoa_units.txt

[grammar] ~1-~1: There might be a mistake here.
Context: ...ew additions needed to for completeness in Temoa model # the units here AUGMENT th...

(QB_NEW_EN)

[grammar] ~4-~4: There might be a mistake here.
Context: ...ion for more info. passenger = [person] seat = [object] vehicle = [object] # se...

(QB_NEW_EN)

[grammar] ~5-~5: There might be a mistake here.
Context: ...o. passenger = [person] seat = [object] vehicle = [object] # see pint's notes a...

(QB_NEW_EN)

[grammar] ~8-~8: There might be a mistake here.
Context: ...at = [object] vehicle = [object] # see pint's notes about currency and conversion bef...

(QB_NEW_EN)

[grammar] ~11-~11: There might be a mistake here.
Context: ...ition to USD. dollar = [currency] = USD euro = [currency] = EUR # Temoa uses et...

(QB_NEW_EN)

🪛 Ruff (0.14.0)

temoa/temoa_model/unit_checking/table_checker.py

141-141: Avoid specifying long messages outside the exception class

(TRY003)

temoa/temoa_model/unit_checking/screener.py

67-67: Avoid specifying long messages outside the exception class

(TRY003)

temoa/temoa_model/unit_checking/entry_checker.py

72-72: Consider moving this statement to an else block

(TRY300)

83-83: Possible SQL injection vector through string-based query construction

(S608)

temoa/temoa_model/unit_checking/relation_checker.py

163-165: Possible SQL injection vector through string-based query construction

(S608)

168-168: Possible SQL injection vector through string-based query construction

(S608)

173-173: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

246-249: Avoid specifying long messages outside the exception class

(TRY003)

250-250: Possible SQL injection vector through string-based query construction

(S608)

254-256: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

tests/test_unit_checker.py

59-59: Unused function argument: expected

(ARG001)

🔇 Additional comments (1)

temoa/temoa_model/unit_checking/common.py (1)

166-173: LGTM: helpful line consolidation utility.

Clear and constrained output length; used consistently.

coderabbitai · 2025-10-17T00:53:09Z

temoa/temoa_model/unit_checking/relation_checker.py

+from mypy.checkexpr import defaultdict
+from mypy.message_registry import NamedTuple
+from pint.registry import Unit


⚠️ Potential issue | 🔴 Critical

Wrong imports: using mypy internals breaks runtime.

Import from stdlib modules.

-from mypy.checkexpr import defaultdict -from mypy.message_registry import NamedTuple +from collections import defaultdict +from typing import NamedTuple

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

from mypy.checkexpr import defaultdict

from mypy.message_registry import NamedTuple

from pint.registry import Unit

from collections import defaultdict

from typing import NamedTuple

from pint.registry import Unit

🤖 Prompt for AI Agents

In temoa/temoa_model/unit_checking/relation_checker.py around lines 34–36, the code imports mypy internals and pint.registry which breaks runtime; replace those with standard-library and public-package imports: use "from collections import defaultdict", "from typing import NamedTuple", and import Unit from pint's public API (e.g. "from pint.unit import Unit") so the module uses stable, supported symbols at runtime.

temoa/temoa_model/unit_checking/relation_checker.py

temoa/temoa_model/unit_checking/screener.py

temoa/temoa_model/unit_checking/table_checker.py

coderabbitai · 2025-10-17T00:53:10Z

temoa/temoa_model/unit_checking/table_checker.py

+                ref = {expr: converted_units[0] / converted_units[1]}
+                res.update(ref)
+            else:
+                raise ValueError('Unknown units format: %s', format_type)


⚠️ Potential issue | 🟡 Minor

Incorrect ValueError formatting; message not interpolated.

Use f-string; also satisfies TRY003 hint.

- raise ValueError('Unknown units format: %s', format_type) + raise ValueError(f'Unknown units format: {format_type}')

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

raise ValueError('Unknown units format: %s', format_type)

raise ValueError(f'Unknown units format: {format_type}')

🧰 Tools

🪛 Ruff (0.14.0)

141-141: Avoid specifying long messages outside the exception class

(TRY003)

🤖 Prompt for AI Agents

In temoa/temoa_model/unit_checking/table_checker.py around line 141, the ValueError is raised using comma-separated arguments which doesn't interpolate the format_type; change it to raise ValueError(f"Unknown units format: {format_type}") using an f-string (satisfies TRY003) so the message includes the actual format_type value.

tests/test_unit_checker.py

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (3)

temoa/temoa_model/unit_checking/relation_checker.py (3)

36-36: Wrong Unit import (Pint public API).

Import Unit from pint.unit, not pint.registry. Otherwise, runtime ImportError risk.

Apply:

-from pint.registry import Unit
+from pint.unit import Unit

116-129: Avoid KeyError when commodity missing; improve diagnostics.

Use .get() and handle unknown commodities in messages.

-        invalid_input_flag = input_units != comm_units[ic]
-        invalid_output_flag = output_units != comm_units[oc]
+        expected_in = comm_units.get(ic)
+        expected_out = comm_units.get(oc)
+        invalid_input_flag = expected_in is None or input_units != expected_in
+        invalid_output_flag = expected_out is None or output_units != expected_out
@@
-            if invalid_input_flag:
-                msg += f'\n    Invalid input units: {comm_units[ic]}'
-            if invalid_output_flag:
-                msg += f'\n    Invalid output units: {comm_units[oc]}'
+            if invalid_input_flag:
+                msg += f'\n    Invalid input units: {expected_in if expected_in is not None else "<unknown commodity>"}'
+            if invalid_output_flag:
+                msg += f'\n    Invalid output units: {expected_out if expected_out is not None else "<unknown commodity>"}'

260-336: Cost table: wrong identifier used for lookups and messages.

When commodity_reference is set, the first selected column is the commodity value, not the column name; also the loop var is misnamed. Fix lookups and labels; use the row value consistently. Adjust C2A tech key accordingly.

-        for idx, (tech, raw_units_expression) in enumerate(rows, start=1):
+        for idx, (ref, raw_units_expression) in enumerate(rows, start=1):
@@
-            if ct.commodity_reference:
-                commodity_units = commodity_lut.get(ct.commodity_reference)
+            if ct.commodity_reference:
+                # `ref` is the commodity value from the selected column
+                commodity_units = commodity_lut.get(ref)
                 if not commodity_units:
-                    label = f'{ct.table_name}:  Unprocessed row (unknown commodity: {ct.commodity_reference}) '
+                    label = f'{ct.table_name}:  Unprocessed row (unknown commodity: {ref}) '
                     table_grouped_errors[label].append(idx)
                     continue
             else:
-                tech_io = tech_lut.get(tech)
+                tech_io = tech_lut.get(ref)
                 if tech_io:
                     commodity_units = tech_io.output_units
                 else:
-                    label = f'{ct.table_name}:  Unprocessed row (unknown tech: {tech}) '
+                    label = f'{ct.table_name}:  Unprocessed row (unknown tech: {ref}) '
                     table_grouped_errors[label].append(idx)
                     continue
@@
-            if ct.capacity_based:
-                c2a_units = c2a_lut.get(tech, ureg.dimensionless)  # default is dimensionless
+            if ct.capacity_based:
+                c2a_units = c2a_lut.get(ref, ureg.dimensionless)  # default is dimensionless
                 # apply to convert
                 measure_units *= c2a_units * ureg.year
@@
-            if not matched:
-                tech_reference = ct.commodity_reference if ct.commodity_reference else tech
+            if not matched:
+                tech_reference = ref
                 label = (
                     f'{ct.table_name}:  Non-matching measure unit found in cost denominator for tech/commodity {tech_reference}: {raw_units_expression}'
                     f'\n    Commodity units: {commodity_units}, Discovered (after conversions): {measure_units}'
                     f'\n    Conversions:  c2a units: {c2a_units*ureg.year if c2a_units else "N/A"}{", `per period` removed" if ct.period_based else ""}\n   '
                 )

🧹 Nitpick comments (6)

temoa/temoa_model/unit_checking/relation_checker.py (6)
88-91: Fix type hint for commodity units.

These are pint Units, not strings.
-def check_efficiency_table(
-    conn: sqlite3.Connection, comm_units: dict[str, str]
-) -> tuple[dict[str, IOUnits], list[str]]:
+def check_efficiency_table(
+    conn: sqlite3.Connection, comm_units: dict[str, Unit]
+) -> tuple[dict[str, IOUnits], list[str]]:
163-169: Dynamic SQL identifiers: validate/whitelist to mitigate injection (S608).

table_name and identifiers should be validated against a known set (schema constants) before interpolation. Add an allowlist check.

Based on hints

Example helper (add once, then use before building the query):
def assert_known_identifier(name: str, allowed: set[str]) -> str:
    if name not in allowed:
        raise ValueError(f'Unknown identifier: {name}')
    return name
Ensure table_name comes from an allowlist (e.g., schema constants) and not user input, then keep the f-string.

173-175: Prefer logger.exception to preserve traceback.

Use exception() in except blocks.

Based on hints
-        logger.error('failed to process query: %s when processing table %s', query, table_name)
+        logger.exception('Failed to process query: %s when processing table %s', query, table_name)
250-250: Dynamic SQL identifiers: validate/whitelist (S608).

ct.table_name and ct.commodity_reference must be validated (allowlist) before interpolation.

Based on hints

Validate identifiers from CostTableData construction (known schema), then use the f-string safely. If user-controlled, reject early.

254-256: Preserve traceback on DB errors.

Switch to logger.exception for better debugging.

Based on hints
-            logger.error(
+            logger.exception(
                 'failed to process query: %s when processing table %s', query, ct.table_name
             )
75-80: Variable naming nit: use tech for C2A LUT clarity.

The first column is tech; rename local var for readability.
-    for comm, units in rows:
+    for tech, units in rows:
         valid, group = validate_units_format(units, SINGLE_ELEMENT)
         if valid:
             valid, units = validate_units_expression(group[0])
-            res[comm] = units
+            res[tech] = units

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 74c12a8 and 1d85dba.

📒 Files selected for processing (1)

temoa/temoa_model/unit_checking/relation_checker.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

temoa/temoa_model/unit_checking/relation_checker.py (2)

temoa/temoa_model/unit_checking/common.py (1)

consolidate_lines (166-173)

temoa/temoa_model/unit_checking/entry_checker.py (2)

validate_units_format (49-61)

validate_units_expression (64-74)

🪛 Ruff (0.14.0)

temoa/temoa_model/unit_checking/relation_checker.py

163-165: Possible SQL injection vector through string-based query construction

(S608)

168-168: Possible SQL injection vector through string-based query construction

(S608)

173-173: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

246-249: Avoid specifying long messages outside the exception class

(TRY003)

250-250: Possible SQL injection vector through string-based query construction

(S608)

254-256: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (1)

temoa/utilities/db_migration_v3_to_v3_1.py (1)
193-199: Fix severity label: "WARNING" should be "ERROR".

Past review indicated this mismatch label should be corrected. Since the script closes connections and exits, this is a fatal error, not a warning. The same pattern at lines 231-237 correctly uses "ERROR".

Apply this diff for consistency:
-            print(f'WARNING: Column mismatch in {table_name}')
+            print(f'ERROR: Column mismatch in {table_name}')

🧹 Nitpick comments (2)

temoa/utilities/db_migration_v3_to_v3_1.py (2)
178-178: Use positive exit codes for portability.

The script exits with -1 on errors. POSIX conventions expect positive exit codes (1-255), and -1 may be interpreted as 255 on some systems. For consistency with standard practice, use sys.exit(1) for error conditions.

Apply this diff:
-    sys.exit(-1)
+    sys.exit(1)
This same pattern appears in lines 199, 237, and should be updated throughout for consistency.

272-272: Prefer list unpacking over concatenation.

The static analyzer suggests using unpacking syntax for better readability.

Apply this diff:
-        if set(old_cols + ['units']) != set(new_cols):
+        if set([*old_cols, 'units']) != set(new_cols):

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1d85dba and ec89d26.

📒 Files selected for processing (1)

temoa/utilities/db_migration_v3_to_v3_1.py (1 hunks)

🧰 Additional context used

🪛 Ruff (0.14.0)

temoa/utilities/db_migration_v3_to_v3_1.py

203-203: Possible SQL injection vector through string-based query construction

(S608)

216-216: Possible SQL injection vector through string-based query construction

(S608)

245-245: Possible SQL injection vector through string-based query construction

(S608)

255-255: Possible SQL injection vector through string-based query construction

(S608)

272-272: Consider [*old_cols, 'units'] instead of concatenation

Replace with [*old_cols, 'units']

(RUF005)

283-283: Possible SQL injection vector through string-based query construction

(S608)

295-295: Possible SQL injection vector through string-based query construction

(S608)

🔇 Additional comments (1)

temoa/utilities/db_migration_v3_to_v3_1.py (1)

300-319: LGTM! Solid finalization sequence.

The commit, vacuum, FK re-enabling, and FK verification sequence is well-structured. The error handling for FK checks provides clear diagnostics, and connection cleanup is properly done.

…e log setup when importing IamDataFrame, causing failure of Temoa logging.

… in relation checker

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (5)

temoa/utilities/db_migration_v3_to_v3_1.py (1)

208-212: Inconsistent severity label: fatal condition labeled as WARNING.

Line 209 prints "WARNING" but then calls exit_on_failure, which is fatal. This is inconsistent with line 246, which correctly labels the same condition as "ERROR" in the modified-transfers block.

Apply this diff:
         if set(old_cols) != set(new_cols):
-            print(f'WARNING: Column mismatch in {table_name}')
+            print(f'ERROR: Column mismatch in {table_name}')
             print(f'Old columns: {old_cols}')
             print(f'New columns: {new_cols}')
             exit_on_failure(f'Column mismatch in {table_name}')

temoa/temoa_model/unit_checking/screener.py (1)

81-83: DB version check uses raw metadata; cast to int or it may fail at runtime.

MetaData values are strings; current comparisons can raise TypeError or always fail. Cast safely.

Apply:

-            major = meta_data.get('DB_MAJOR', 0)
-            minor = meta_data.get('DB_MINOR', 0)
+            try:
+                major = int(meta_data.get('DB_MAJOR', '0'))
+                minor = int(meta_data.get('DB_MINOR', '0'))
+            except (TypeError, ValueError):
+                major, minor = 0, 0

temoa/temoa_model/unit_checking/relation_checker.py (3)

116-129: Guard commodity lookups; avoid KeyError and clarify messages.

-        invalid_input_flag = input_units != comm_units[ic]
-        invalid_output_flag = output_units != comm_units[oc]
+        expected_in = comm_units.get(ic)
+        expected_out = comm_units.get(oc)
+        invalid_input_flag = expected_in is None or input_units != expected_in
+        invalid_output_flag = expected_out is None or output_units != expected_out
@@
-            if invalid_input_flag:
-                msg += f'\n    Invalid input units: {comm_units[ic]}'
-            if invalid_output_flag:
-                msg += f'\n    Invalid output units: {comm_units[oc]}'
+            if invalid_input_flag:
+                msg += f'\n    Invalid input units: {expected_in if expected_in is not None else "<unknown commodity>"}'
+            if invalid_output_flag:
+                msg += f'\n    Invalid output units: {expected_out if expected_out is not None else "<unknown commodity>"}'

256-341: Cost table lookups use the column name, not the row value; causes false errors.

Use the selected field’s row value for lookups and C2A conversion.

-        query = f'SELECT {ct.commodity_reference if ct.commodity_reference else 'tech'}, units FROM {ct.table_name}'
+        query = f'SELECT {ct.commodity_reference if ct.commodity_reference else "tech"}, units FROM {ct.table_name}'
@@
-        for idx, (tech, raw_units_expression) in enumerate(rows, start=1):
+        for idx, (ref, raw_units_expression) in enumerate(rows, start=1):
@@
-            if ct.commodity_reference:
-                commodity_units = commodity_lut.get(ct.commodity_reference)
+            if ct.commodity_reference:
+                # `ref` holds the commodity from the selected column
+                commodity_units = commodity_lut.get(ref)
                 if not commodity_units:
-                    label = f'{ct.table_name}:  Unprocessed row (unknown commodity: {ct.commodity_reference}) '
+                    label = f'{ct.table_name}:  Unprocessed row (unknown commodity: {ref}) '
                     table_grouped_errors[label].append(idx)
                     continue
             else:
-                tech_io = tech_lut.get(tech)
+                tech_io = tech_lut.get(ref)
                 if tech_io:
                     commodity_units = tech_io.output_units
                 else:
-                    label = f'{ct.table_name}:  Unprocessed row (unknown tech: {tech}) '
+                    label = f'{ct.table_name}:  Unprocessed row (unknown tech: {ref}) '
                     table_grouped_errors[label].append(idx)
                     continue
@@
-            if ct.capacity_based:
-                c2a_units = c2a_lut.get(tech, ureg.dimensionless)  # default is dimensionless
+            if ct.capacity_based:
+                c2a_units = c2a_lut.get(ref, ureg.dimensionless)  # default is dimensionless
                 # apply to convert
                 measure_units *= c2a_units * ureg.year
@@
-            if not matched:
-                tech_reference = ct.commodity_reference if ct.commodity_reference else tech
+            if not matched:
+                tech_reference = ref
                 label = (
                     f'{ct.table_name}:  Non-matching measure unit found in cost denominator for tech/commodity {tech_reference}: {raw_units_expression}'
                     f'\n    Commodity units: {commodity_units}, Discovered (after conversions): {measure_units}'
                     f'\n    Conversions:  c2a units: {c2a_units*ureg.year if c2a_units else "N/A"}{", `per period` removed" if ct.period_based else ""}\n   '
                 )

36-36: Use Pint’s public Unit type, not internal pint.registry.

pint.registry.Unit is internal and may change; prefer pint.unit.Unit (compatible with Pint ≥0.25).

Apply:

-from pint.registry import Unit
+from pint.unit import Unit

Based on learnings.

🧹 Nitpick comments (10)

requirements.in (1)

46-47: Consider adding a minimum version constraint for pint.

Pint 0.25.0 requires Python >= 3.11, which aligns with your Python 3.12 environment. However, adding an explicit minimum version constraint would ensure reproducible builds and prevent issues if the unit-checking code relies on features from newer Pint versions.

Based on learnings

Apply this diff to add a minimum version constraint:
 # for database v3.1
-pint
+pint >= 0.23
Note: Adjust the minimum version (e.g., 0.23, 0.24, or 0.25) based on which Pint features your unit-checking code actually requires.

temoa/utilities/db_migration_v3_to_v3_1.py (3)

150-152: Use standard Unix exit code.

Exit code -1 is non-standard; use 1 to indicate a general error per Unix conventions.

Apply this diff:
 if new_db_path.exists():
     print(f'ERROR: destination database already exists: {new_db_path}. Exiting.')
-    sys.exit(-1)
+    sys.exit(1)
155-163: Standardize exit code in helper function.

The exit code should be 1 for consistency with Unix conventions.

Apply this diff:
 def exit_on_failure(msg):
     print(
         'Transition failed for reason below.  This issue is fatal and must be remediated.  Exiting.'
     )
     print(msg)
     con_old.close()
     con_new.close()
     new_db_path.unlink()
-    sys.exit(-1)
+    sys.exit(1)
284-284: Optional: Prefer unpacking syntax for list concatenation.

Ruff suggests using the unpacking operator for combining lists.

Apply this diff:
-        if set(old_cols + ['units']) != set(new_cols):
+        if set([*old_cols, 'units']) != set(new_cols):

temoa/temoa_model/unit_checking/screener.py (3)

90-97: Don’t return inside the per-DB loop on version failure; process all DBs.

Keep scanning remaining DBs, mark overall failure, and write one report at the end.

-                msg = 'Units Check 1 (DB Version):  Failed.  DB must be v3.1 or greater for units checking'
+                msg = 'Units Check 1 (DB Version):  Failed.  DB must be v3.1 or greater for units checking'
                 report_entries.extend((msg, '\n'))
                 logger.warning(msg)
-                # we are non-viable, write the (very short) report and return
-                _write_report(report_path, report_entries)
-                if verbose:
-                    print(msg)
-                return False
+                if verbose:
+                    print(msg)
+                all_clear = False
+                report_entries.append('\n')
+                logger.info('Skipping further checks for this DB due to version mismatch.')
+                continue

34-34: Avoid importing project-only symbols at module import time.

Move definitions import into the __main__ block to keep this module importable as a library.

-from definitions import PROJECT_ROOT
@@
-if __name__ == '__main__':
-    db_path = Path(PROJECT_ROOT) / 'data_files/mike_US/US_9R_8D_v3_stability_v3_1.sqlite'
-    screen(db_path, report_path=Path(PROJECT_ROOT) / 'output_files/')
+if __name__ == '__main__':
+    from definitions import PROJECT_ROOT
+    db_path = Path(PROJECT_ROOT) / 'data_files/mike_US/US_9R_8D_v3_stability_v3_1.sqlite'
+    screen(db_path, report_path=Path(PROJECT_ROOT) / 'output_files/')

Also applies to: 245-247

63-65: Remove unused table_units accumulator.

It’s never read; drop it and the stray string literal.

-    table_units = {}
-    """Table name : {tech | commodity: units}"""
@@
-                relations, table_errors = check_table(conn, table)
-                table_units[table] = relations
+                relations, table_errors = check_table(conn, table)

Also applies to: 109-111

temoa/temoa_model/unit_checking/relation_checker.py (3)

174-176: Prefer logger.exception to capture stack traces.

Improves diagnosability when queries fail.

-    except sqlite3.OperationalError:
-        logger.error('failed to process query: %s when processing table %s', query, table_name)
+    except sqlite3.OperationalError:
+        logger.exception('failed to process query: %s when processing table %s', query, table_name)

-        except sqlite3.OperationalError:
-            logger.error(
-                'failed to process query: %s when processing table %s', query, ct.table_name
-            )
+        except sqlite3.OperationalError:
+            logger.exception(
+                'failed to process query: %s when processing table %s', query, ct.table_name
+            )

Also applies to: 260-262

203-205: Remove f-strings with no placeholders (F541).

-        if not valid_table_units:
-            grouped_errors[f'Unprocessed row (invalid units--see earlier tests)'].append(idx)
-        if not c2a_valid:
-            grouped_errors[f'Unprocessed row (invalid c2a units--see earlier tests)'].append(idx)
+        if not valid_table_units:
+            grouped_errors['Unprocessed row (invalid units--see earlier tests)'].append(idx)
+        if not c2a_valid:
+            grouped_errors['Unprocessed row (invalid c2a units--see earlier tests)'].append(idx)

160-170: Defend against SQL injection by whitelisting table names.

table_name is interpolated into SQL; assert it’s from a trusted list.

 def check_inter_table_relations(
     conn: sqlite3.Connection, table_name, tech_lut: dict[str, IOUnits], capacity_based: bool
 ) -> list[str]:
     """check the tech and units in the given table vs. baseline (expected) values for the tech"""
-    grouped_errors = defaultdict(list)
+    grouped_errors = defaultdict(list)
+    # Defensive: ensure only known tables are queried
+    from temoa.temoa_model.unit_checking.common import activity_based_tables, capacity_based_tables
+    allowed = set(activity_based_tables) | set(capacity_based_tables)
+    if table_name not in allowed:
+        return [f'Unexpected table name: {table_name}']

If table_name already guaranteed by the caller, feel free to skip.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b494104 and 3853c00.

📒 Files selected for processing (5)

requirements.in (2 hunks)
requirements.txt (14 hunks)
temoa/temoa_model/unit_checking/relation_checker.py (1 hunks)
temoa/temoa_model/unit_checking/screener.py (1 hunks)
temoa/utilities/db_migration_v3_to_v3_1.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (2)

temoa/temoa_model/unit_checking/screener.py (2)

temoa/temoa_model/unit_checking/relation_checker.py (5)

check_efficiency_table (89-153)

make_commodity_lut (53-65)

check_inter_table_relations (156-231)

check_cost_tables (234-345)

make_c2a_lut (68-80)

temoa/temoa_model/unit_checking/table_checker.py (1)

check_table (54-142)

temoa/temoa_model/unit_checking/relation_checker.py (2)

temoa/temoa_model/unit_checking/common.py (1)

consolidate_lines (166-173)

temoa/temoa_model/unit_checking/entry_checker.py (2)

validate_units_format (49-61)

validate_units_expression (64-74)

🪛 Ruff (0.14.0)

temoa/temoa_model/unit_checking/screener.py

67-67: Avoid specifying long messages outside the exception class

(TRY003)

temoa/temoa_model/unit_checking/relation_checker.py

164-166: Possible SQL injection vector through string-based query construction

(S608)

169-169: Possible SQL injection vector through string-based query construction

(S608)

174-174: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

203-203: f-string without any placeholders

Remove extraneous f prefix

(F541)

205-205: f-string without any placeholders

Remove extraneous f prefix

(F541)

252-255: Avoid specifying long messages outside the exception class

(TRY003)

256-256: Possible SQL injection vector through string-based query construction

(S608)

260-262: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

temoa/utilities/db_migration_v3_to_v3_1.py

216-216: Possible SQL injection vector through string-based query construction

(S608)

229-229: Possible SQL injection vector through string-based query construction

(S608)

257-257: Possible SQL injection vector through string-based query construction

(S608)

267-267: Possible SQL injection vector through string-based query construction

(S608)

284-284: Consider [*old_cols, 'units'] instead of concatenation

Replace with [*old_cols, 'units']

(RUF005)

295-295: Possible SQL injection vector through string-based query construction

(S608)

307-307: Possible SQL injection vector through string-based query construction

(S608)

🔇 Additional comments (6)

requirements.in (1)

25-25: Version 3.1.0 of pyam-iamc exists, and no known security vulnerabilities were found. The version constraint is valid.

temoa/utilities/db_migration_v3_to_v3_1.py (3)

216-216: Static analysis SQL injection warnings are false positives.

Ruff flags these lines as potential SQL injection vectors, but the construction is safe because:

table_name comes exclusively from hardcoded lists (direct_transfer_tables, transfer_with_mod.keys(), add_units_tables)

cols_str is built from column names retrieved via PRAGMA table_info, which reflects the actual schema structure

No user-controlled input enters these queries

The parameterized WHERE clause at line 257 correctly handles user-like data and was addressed in previous reviews.

Also applies to: 229-229, 257-257, 267-267, 295-295, 307-307

246-249: LGTM: Correct error labeling and parameterized query.

The error label correctly indicates a fatal condition (line 246), and the parameterized WHERE clause (lines 254-259) properly prevents SQL injection by separating the query structure from values.

Also applies to: 254-259

312-340: LGTM: Comprehensive post-migration validation.

The final sequence properly commits the transaction, vacuums the database, re-enables foreign key enforcement, validates FK integrity, and performs a sanity check on the TimePeriod table. The error handling for FK check failures is appropriate.

requirements.txt (2)

2-5: Python version consistency issue resolved.

Both requirements.in and requirements.txt now consistently specify Python 3.12. The past mismatch (3.11 vs 3.12) has been corrected. pint==0.25 is present (line 349) and compatible with Python 3.12.

349-357: Verify that your project supports Python 3.11+; pint 0.25's specific breaking changes do not affect this codebase.

Pint 0.25 introduces a minimum Python requirement of 3.11+ and fixes the round() and .to() return types. However, analysis of the codebase shows:

No round() or .to() method calls on pint quantities exist in the four unit-checking modules

Quantity operations use only safe APIs: multiplication/division (*, /), parse_units(), and compatible_units()

Tests exist for unit validation (created 9/22/25)

Confirm your project's Python version requirement is already 3.11+ to match pint 0.25's minimum.

temoa/temoa_model/unit_checking/relation_checker.py

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (5)

temoa/temoa_model/unit_checking/relation_checker.py (4)

90-93: Fix comm_units type annotation to actual contents (Unit).

-def check_efficiency_table(
-    conn: sqlite3.Connection, comm_units: dict[str, str]
+def check_efficiency_table(
+    conn: sqlite3.Connection, comm_units: dict[str, Unit]
 ) -> tuple[dict[str, IOUnits], list[str]]:

117-131: Avoid KeyError when commodities are missing; improve messages.

-        invalid_input_flag = input_units != comm_units[ic]
-        invalid_output_flag = output_units != comm_units[oc]
+        expected_in = comm_units.get(ic)
+        expected_out = comm_units.get(oc)
+        invalid_input_flag = expected_in is None or input_units != expected_in
+        invalid_output_flag = expected_out is None or output_units != expected_out
@@
-            if invalid_input_flag:
-                msg += f'\n    Invalid input units: {comm_units[ic]}'
-            if invalid_output_flag:
-                msg += f'\n    Invalid output units: {comm_units[oc]}'
+            if invalid_input_flag:
+                msg += f'\n    Invalid input units: {expected_in if expected_in is not None else "<unknown commodity>"}'
+            if invalid_output_flag:
+                msg += f'\n    Invalid output units: {expected_out if expected_out is not None else "<unknown commodity>"}'

290-375: Cost-table checks use column name instead of row value; fix identifier handling throughout.

The first selected column is the row’s identifier (commodity or tech). Use it for lookups, C2A, and messages.

-        query = f'SELECT {ct.commodity_reference if ct.commodity_reference else 'tech'}, units FROM {ct.table_name}'
+        query = f"SELECT {ct.commodity_reference if ct.commodity_reference else 'tech'}, units FROM {ct.table_name}"
@@
-        for idx, (tech, raw_units_expression) in enumerate(rows, start=1):
+        for idx, (ref, raw_units_expression) in enumerate(rows, start=1):
@@
-            if ct.commodity_reference:
-                commodity_units = commodity_lut.get(ct.commodity_reference)
+            if ct.commodity_reference:
+                # `ref` is the commodity value from the selected column
+                commodity_units = commodity_lut.get(ref)
                 if not commodity_units:
-                    label = f'{ct.table_name}:  Unprocessed row (unknown commodity: {ct.commodity_reference}) '
+                    label = f'{ct.table_name}:  Unprocessed row (unknown commodity: {ref}) '
                     table_grouped_errors[label].append(idx)
                     continue
             else:
-                tech_io = tech_lut.get(tech)
+                tech_io = tech_lut.get(ref)
                 if tech_io:
                     commodity_units = tech_io.output_units
                 else:
-                    label = f'{ct.table_name}:  Unprocessed row (unknown tech: {tech}) '
+                    label = f'{ct.table_name}:  Unprocessed row (unknown tech: {ref}) '
                     table_grouped_errors[label].append(idx)
                     continue
@@
-            if ct.capacity_based:
-                c2a_units = c2a_lut.get(tech, ureg.dimensionless)  # default is dimensionless
+            if ct.capacity_based:
+                c2a_units = c2a_lut.get(ref, ureg.dimensionless)  # default is dimensionless
                 # apply to convert
                 measure_units *= c2a_units * ureg.year
@@
-            if not matched:
-                tech_reference = ct.commodity_reference if ct.commodity_reference else tech
+            if not matched:
+                tech_reference = ref
                 label = (
                     f'{ct.table_name}:  Non-matching measure unit found in cost denominator for tech/commodity {tech_reference}: {raw_units_expression}'
-                    f'\n    Expecting commodity units: {commodity_units}. Discovered (after conversions applied): {measure_units}'
+                    f'\n    Expecting commodity units: {commodity_units}. Discovered (after conversions applied): {measure_units}'
                     f'\n    Conversions:  c2a units: {c2a_units*ureg.year if c2a_units else "N/A"}{", `per period` removed" if ct.period_based else ""}\n   '
                 )

36-36: Import Unit from Pint public API.

pint.registry.Unit isn’t a stable public path. Use pint.unit.Unit.

-from pint.registry import Unit
+from pint.unit import Unit

temoa/temoa_model/unit_checking/screener.py (1)

83-99: DB version check compares strings to ints; always fails on string values.

Cast MetaData values to int before comparison to avoid false failures.

-            major = meta_data.get('DB_MAJOR', 0)
-            minor = meta_data.get('DB_MINOR', 0)
-            if major == 3 and minor >= 1:
+            try:
+                major = int(meta_data.get('DB_MAJOR', '0'))
+                minor = int(meta_data.get('DB_MINOR', '0'))
+            except (TypeError, ValueError):
+                major, minor = 0, 0
+            if major == 3 and minor >= 1:

🧹 Nitpick comments (7)

temoa/temoa_model/unit_checking/common.py (1)

155-161: Clarify allowed characters in comments to match the regex.

The comment lists only “*” and “_”, but the patterns also allow “^, /, (), digits, spaces”. Update the comment for accuracy.

Also applies to: 169-169
temoa/temoa_model/unit_checking/screener.py (3)
34-34: Avoid importing PROJECT_ROOT at module import; move under main.

Prevents hard dependency when importing this module as a library.
-from definitions import PROJECT_ROOT
@@
-if __name__ == '__main__':
-    db_path = Path(PROJECT_ROOT) / 'data_files/mike_US/US_9R_8D_v3_stability_v3_1.sqlite'
-    screen(db_path, report_path=Path(PROJECT_ROOT) / 'temp/')
+if __name__ == '__main__':
+    from definitions import PROJECT_ROOT  # localize to script use
+    db_path = Path(PROJECT_ROOT) / 'data_files/mike_US/US_9R_8D_v3_stability_v3_1.sqlite'
+    screen(db_path, report_path=Path(PROJECT_ROOT) / 'temp/')
Also applies to: 270-272

65-66: Reset per‑DB state and minor readability nits.

Reinitialize table_units inside the per‑DB loop; current dict persists across DBs.

Optional: track a per‑DB errors flag to avoid shadowing “errors”.
-    table_units = {}
-    """Table name : {tech | commodity: units}"""
-    for db_path in db_paths:
+    for db_path in db_paths:
+        table_units = {}  # Table name : {tech | commodity: units}
Also applies to: 67-73, 101-129

259-267: Ensure report directory exists; support directory paths consistently.

If report_path is a directory (possibly not yet created) or a file in a non‑existent directory, writing will fail. Create parents as needed.
 def _write_report(report_path: Path, report_entries: list[str]):
     """Write the report to file"""
     if not report_path:
         return
-    if report_path.is_dir():
-        # augment with a default filename
-        report_path /= 'units_check.txt'
+    if report_path.suffix == '':
+        # treat as directory path; create and append default filename
+        report_path.mkdir(parents=True, exist_ok=True)
+        report_path = report_path / 'units_check.txt'
+    else:
+        report_path.parent.mkdir(parents=True, exist_ok=True)
     with open(report_path, 'w', encoding='utf-8') as report_file:
         report_file.writelines(report_entries)
temoa/temoa_model/unit_checking/relation_checker.py (3)
183-185: Log the exception with traceback for failed queries.

Use logger.exception to capture context.
-    except sqlite3.OperationalError:
-        logger.error('failed to process query: %s when processing table %s', query, table_name)
+    except sqlite3.OperationalError:
+        logger.exception('failed to process query: %s when processing table %s', query, table_name)
170-179: Dynamic SQL table names are safe-by-construction; add a whitelist assert to satisfy S608.

Since table_name/ct.table_name originate from curated sets, assert membership before execution.
# before building the query
ALLOWED_TABLES = {'CapacityToActivity', 'commodity', 'efficiency', 'CostInvest', 'CostEmission', 'CostFixed', 'CostVariable'}
assert table_name in ALLOWED_TABLES  # or for cost_tables: assert ct.table_name in {t.table_name for t in cost_tables}
Also applies to: 290-293

251-257: Message wording for COMMODITY relations.

For COMMODITY checks, “for output of tech …” is misleading. Consider neutral phrasing.
-                f' for output of tech {tech_or_comm}'
+                f' for reference {tech_or_comm}'

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 88295b6 and 06d83fa.

📒 Files selected for processing (3)

temoa/temoa_model/unit_checking/common.py (1 hunks)
temoa/temoa_model/unit_checking/relation_checker.py (1 hunks)
temoa/temoa_model/unit_checking/screener.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (2)

temoa/temoa_model/unit_checking/screener.py (3)

temoa/temoa_model/unit_checking/common.py (1)

RelationType (143-146)

temoa/temoa_model/unit_checking/relation_checker.py (5)

check_efficiency_table (90-154)

make_commodity_lut (54-66)

check_inter_table_relations (157-265)

check_cost_tables (268-379)

make_c2a_lut (69-81)

temoa/temoa_model/unit_checking/table_checker.py (1)

check_table (54-142)

temoa/temoa_model/unit_checking/relation_checker.py (2)

temoa/temoa_model/unit_checking/common.py (2)

consolidate_lines (172-179)

RelationType (143-146)

temoa/temoa_model/unit_checking/entry_checker.py (2)

validate_units_format (49-61)

validate_units_expression (64-74)

🪛 Ruff (0.14.1)

temoa/temoa_model/unit_checking/screener.py

69-69: Avoid specifying long messages outside the exception class

(TRY003)

temoa/temoa_model/unit_checking/relation_checker.py

170-172: Possible SQL injection vector through string-based query construction

(S608)

175-175: Possible SQL injection vector through string-based query construction

(S608)

177-177: Possible SQL injection vector through string-based query construction

(S608)

179-179: Avoid specifying long messages outside the exception class

(TRY003)

183-183: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

210-210: Avoid specifying long messages outside the exception class

(TRY003)

286-289: Avoid specifying long messages outside the exception class

(TRY003)

290-290: Possible SQL injection vector through string-based query construction

(S608)

294-296: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

coderabbitai · 2025-10-21T00:02:29Z

temoa/temoa_model/unit_checking/common.py

+RATIO_ELEMENT = UnitsFormat(
+    format=r'^\s*([A-Za-z0-9\*\/\^\_\s]+?)\s*\/\s*\(\s*([A-Za-z0-9\*\^\/\(\)\_\s]+?)\s*\)\s*$',
+    groups=2,
+)


⚠️ Potential issue | 🟠 Major

RATIO regex blocks parentheses in numerator; contradicts comments and likely valid inputs.

Expressions like "(kg*m)/(s)" won’t match because the numerator character class omits "()". Allow parentheses in the numerator too.

RATIO_ELEMENT = UnitsFormat( - format=r'^\s*([A-Za-z0-9\*\/\^\_\s]+?)\s*\/\s*$\s*([A-Za-z0-9\*\^\/\($\_\s]+?)\s*\)\s*$', + format=r'^\s*([A-Za-z0-9\*\^\/\_\s]+?)\s*\/\s*$\s*([A-Za-z0-9\*\^\/\($\_\s]+?)\s*\)\s*$', groups=2, )

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

RATIO_ELEMENT = UnitsFormat(

format=r'^\s*([A-Za-z0-9\*\/\^\_\s]+?)\s*\/\s*$\s*([A-Za-z0-9\*\^\/\($\_\s]+?)\s*\)\s*$',

groups=2,

)

RATIO_ELEMENT = UnitsFormat(

format=r'^\s*([A-Za-z0-9\*\^\/\_\s]+?)\s*\/\s*$\s*([A-Za-z0-9\*\^\/\($\_\s]+?)\s*\)\s*$',

groups=2,

)

🤖 Prompt for AI Agents

In temoa/temoa_model/unit_checking/common.py around lines 163 to 166, the RATIO_ELEMENT regex currently excludes parentheses from the numerator character class which blocks valid inputs like "(kg*m)/(s)"; update the numerator character class to allow parentheses (i.e., include '$' and '$' inside the first bracketed class in the raw regex string) so the pattern accepts parenthesized numerators while preserving the rest of the pattern and groups.

coderabbitai · 2025-10-21T00:02:29Z

temoa/temoa_model/unit_checking/relation_checker.py

+                io_units = tech_lut[tech_or_comm]
+                if not io_units:
+                    grouped_errors[
+                        f'Unprocessed row (missing reference for tech "{tech_or_comm}" --see earlier tests)'
+                    ].append(idx)
+                    continue
+                expected_units = io_units.output_units


⚠️ Potential issue | 🔴 Critical

KeyError risk: use .get() for ACTIVITY lookups.

Indexing tech_lut[tech_or_comm] will raise if missing; use .get() as in the CAPACITY case.

- case RelationType.ACTIVITY: - io_units = tech_lut[tech_or_comm] - if not io_units: + case RelationType.ACTIVITY: + io_units = tech_lut.get(tech_or_comm) + if not io_units: grouped_errors[ f'Unprocessed row (missing reference for tech "{tech_or_comm}" --see earlier tests)' ].append(idx) continue expected_units = io_units.output_units

🤖 Prompt for AI Agents

In temoa/temoa_model/unit_checking/relation_checker.py around lines 200 to 206, the code directly indexes tech_lut[tech_or_comm] which can raise KeyError for missing ACTIVITY entries; change the lookup to use tech_lut.get(tech_or_comm) (like the CAPACITY case), check if the result is falsy, append the same grouped_errors message and continue, and only access expected_units = io_units.output_units when io_units is present.

…nit checker is and how to execute with it. Update to 2 config files to include unit checking

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

README.md (1)
67-76: Improve phrasing to eliminate split infinitive and reduce redundancy.

Lines 68–69 contain a split infinitive ("are advised to carefully review") and redundant phrasing ("Users who use this tool"). Static analysis caught the split infinitive concern. The current wording can be tightened for clarity.

Consider this revision:
-Converting legacy db's to Version 3.0 can be done with the included database migration tool.  Users who use this tool are advised to carefully review the console outputs during conversion to ensure accuracy and check the 
-converted database carefully.  The migration tool will build an empty new Version 3.0 database and move data from
+Converting legacy databases to Version 3.0 can be done with the included database migration tool. We advise reviewing console outputs carefully during conversion to ensure accuracy and checking the 
+converted database carefully. The migration tool will build an empty new Version 3.0 database and move data from
(Minor touches: "db's" → "databases", "Users who use this tool are advised to" → "We advise", moved "carefully" after the verb, and light punctuation adjustments for consistency.)

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 06d83fa and 6a7f347.

📒 Files selected for processing (5)

README.md (3 hunks)
data_files/my_configs/config_sample.toml (1 hunks)
data_files/my_configs/morris_utopia.toml (1 hunks)
docs/source/Documentation.rst (1 hunks)
temoa/temoa_model/temoa_sequencer.py (4 hunks)

✅ Files skipped from review due to trivial changes (1)

data_files/my_configs/morris_utopia.toml

🚧 Files skipped from review as they are similar to previous changes (2)

temoa/temoa_model/temoa_sequencer.py
docs/source/Documentation.rst

🧰 Additional context used

🪛 LanguageTool

README.md

[style] ~68-~68: Try moving the adverb to make the sentence clearer.
Context: ...tion tool. Users who use this tool are advised to carefully review the console outputs during conversion to ensure acc...

(SPLIT_INFINITIVE)

🔇 Additional comments (3)

data_files/my_configs/config_sample.toml (1)

41-43: Unit check option is well-documented and properly placed.

The new unit_check configuration option is clearly commented and positioned logically within the DATA/MODEL CHECKS section. The explanatory text effectively conveys its purpose and output location.

README.md (2)

81-84: Unit Checking documentation is clear and well-integrated.

The new section on unit checking functionality is well-written, correctly describes the v3.1 migration requirement, and clearly explains that the v3.1 utility expects v3.0 source databases. This aligns seamlessly with the new configuration option and PR objectives.

98-98: Config table entry for Unit Checking is clear and consistent.

The new row accurately describes the unit checking feature and maintains consistency with other entries in the table and the corresponding configuration option in config_sample.toml.

…ustments

coderabbitai

Actionable comments posted: 3

♻️ Duplicate comments (6)

temoa/temoa_model/unit_checking/common.py (1)

163-166: Allow parentheses in RATIO numerator.

Valid inputs like “(kg*m)/(s)” should pass.

 RATIO_ELEMENT = UnitsFormat(
-    format=r'^\s*([A-Za-z0-9\*\/\^\_\s]+?)\s*\/\s*\(\s*([A-Za-z0-9\*\^\/\(\)\_\s]+?)\s*\)\s*$',
+    format=r'^\s*([A-Za-z0-9\*\^\/\(\)\_\s]+?)\s*\/\s*\(\s*([A-Za-z0-9\*\^\/\(\)\_\s]+?)\s*\)\s*$',
     groups=2,
 )

temoa/temoa_model/unit_checking/relation_checker.py (4)

199-206: Use .get() for ACTIVITY tech lookups to prevent KeyError.

-            case RelationType.ACTIVITY:
-                io_units = tech_lut[tech_or_comm]
-                if not io_units:
+            case RelationType.ACTIVITY:
+                io_units = tech_lut.get(tech_or_comm)
+                if not io_units:
                     grouped_errors[
                         f'Unprocessed row (missing reference for tech "{tech_or_comm}" --see earlier tests)'
                     ].append(idx)
                     continue
                 expected_units = io_units.output_units

117-131: Avoid KeyError on missing commodities; improve messages.

Use .get() for lookups and handle unknowns in output.

-        invalid_input_flag = input_units != comm_units[ic]
-        invalid_output_flag = output_units != comm_units[oc]
+        expected_in = comm_units.get(ic)
+        expected_out = comm_units.get(oc)
+        invalid_input_flag = expected_in is None or input_units != expected_in
+        invalid_output_flag = expected_out is None or output_units != expected_out
@@
-            if invalid_input_flag:
-                msg += f'\n    Invalid input units: {comm_units[ic]}'
-            if invalid_output_flag:
-                msg += f'\n    Invalid output units: {comm_units[oc]}'
+            if invalid_input_flag:
+                msg += f'\n    Invalid input units: {expected_in if expected_in is not None else "<unknown commodity>"}'
+            if invalid_output_flag:
+                msg += f'\n    Invalid output units: {expected_out if expected_out is not None else "<unknown commodity>"}'

90-92: Correct type annotation for comm_units.

It maps commodity name to Unit, not str.

-def check_efficiency_table(
-    conn: sqlite3.Connection, comm_units: dict[str, str]
+def check_efficiency_table(
+    conn: sqlite3.Connection, comm_units: dict[str, Unit]
 ) -> tuple[dict[str, IOUnits], list[str]]:

290-375: Cost tables: use row identifier for lookups; fix c2a key and messages.

When commodity_reference is set, the first selected column is the commodity value, not the column name. Also use the same row identifier for C2A when capacity‑based.

-        for idx, (tech, raw_units_expression) in enumerate(rows, start=1):
+        for idx, (ref, raw_units_expression) in enumerate(rows, start=1):
@@
-            if ct.commodity_reference:
-                commodity_units = commodity_lut.get(ct.commodity_reference)
+            if ct.commodity_reference:
+                # `ref` is the commodity value from the selected column
+                commodity_units = commodity_lut.get(ref)
                 if not commodity_units:
-                    label = f'{ct.table_name}:  Unprocessed row (unknown commodity: {ct.commodity_reference}) '
+                    label = f'{ct.table_name}:  Unprocessed row (unknown commodity: {ref}) '
                     table_grouped_errors[label].append(idx)
                     continue
             else:
-                tech_io = tech_lut.get(tech)
+                tech_io = tech_lut.get(ref)
                 if tech_io:
                     commodity_units = tech_io.output_units
                 else:
-                    label = f'{ct.table_name}:  Unprocessed row (unknown tech: {tech}) '
+                    label = f'{ct.table_name}:  Unprocessed row (unknown tech: {ref}) '
                     table_grouped_errors[label].append(idx)
                     continue
@@
-            if ct.capacity_based:
-                c2a_units = c2a_lut.get(tech, ureg.dimensionless)  # default is dimensionless
+            if ct.capacity_based:
+                c2a_units = c2a_lut.get(ref, ureg.dimensionless)  # default is dimensionless
                 # apply to convert
                 measure_units *= c2a_units * ureg.year
@@
-            if not matched:
-                tech_reference = ct.commodity_reference if ct.commodity_reference else tech
+            if not matched:
+                tech_reference = ref
                 label = (
                     f'{ct.table_name}:  Non-matching measure unit found in cost denominator for tech/commodity {tech_reference}: {raw_units_expression}'
                     f'\n    Expecting commodity units: {commodity_units}. Discovered (after conversions applied): {measure_units}'
-                    f'\n    Conversions:  c2a units: {c2a_units if c2a_units else "N/A"}{", `per period` removed" if ct.period_based else ""}\n   '
+                    f'\n    Conversions:  c2a units: {c2a_units if c2a_units else "N/A"}{", `per period` removed" if ct.period_based else ""}\n   '
                 )

temoa/temoa_model/unit_checking/screener.py (1)

81-99: Fix: DB version values are strings; cast before comparison.

Current comparison always risks false negatives. Parse MetaData values to int with a safe fallback.

-            data = conn.execute('SELECT element, value FROM MetaData').fetchall()
-            meta_data = dict(data)
-            major = meta_data.get('DB_MAJOR', 0)
-            minor = meta_data.get('DB_MINOR', 0)
-            if major == 3 and minor >= 1:
+            data = conn.execute('SELECT element, value FROM MetaData').fetchall()
+            meta_data = dict(data)
+            try:
+                major = int(meta_data.get('DB_MAJOR', '0'))
+                minor = int(meta_data.get('DB_MINOR', '0'))
+            except (TypeError, ValueError):
+                major, minor = 0, 0
+            if major == 3 and minor >= 1:
                 msg = 'Units Check 1 (DB Version):  Passed'
                 report_entries.extend((msg, '\n'))
                 logger.info(msg)
                 if verbose:
                     print(msg)
             else:
                 msg = 'Units Check 1 (DB Version):  Failed.  DB must be v3.1 or greater for units checking'
                 report_entries.extend((msg, '\n'))
                 logger.warning(msg)
-                # we are non-viable, write the (very short) report and return
-                _write_report(report_path, report_entries)
+                # we are non-viable for this DB; continue to process others (if any)
                 if verbose:
                     print(msg)
-                return False
+                all_clear = False
+                continue

If you prefer hard‑fail on first invalid DB, keep the early return but still cast.

🧹 Nitpick comments (4)

temoa/temoa_model/unit_checking/screener.py (2)

34-35: Defer PROJECT_ROOT import to main guard.

Prevents import-time dependency for library consumers.

-from definitions import PROJECT_ROOT
@@
-if __name__ == '__main__':
-    db_path = Path(PROJECT_ROOT) / '<path to db from project ROOT>'
+if __name__ == '__main__':
+    from definitions import PROJECT_ROOT
+    db_path = Path(PROJECT_ROOT) / '<path to db from project ROOT>'
     screen(db_path, report_path=Path(PROJECT_ROOT) / 'temp')

Also applies to: 275-278

65-66: Remove unused table_units aggregation (dead state).

table_units is populated but never read. Drop it to reduce memory and noise.

-    table_units = {}
-    """Table name : {tech | commodity: units}"""
+    # Removed unused table_units aggregation
@@
-                relations, table_errors = check_table(conn, table)
-                table_units[table] = relations
+                _, table_errors = check_table(conn, table)

Also applies to: 110-120

temoa/temoa_model/unit_checking/relation_checker.py (2)

181-185: Prefer logger.exception to capture trace on SQL errors.

-    except sqlite3.OperationalError:
-        logger.error('failed to process query: %s when processing table %s', query, table_name)
+    except sqlite3.OperationalError:
+        logger.exception('failed to process query: %s when processing table %s', query, table_name)

1-10: Optional: validate table_name against a whitelist to mitigate S608.

Since identifiers are interpolated, assert membership in known sets to avoid abuse and quiet linters.

-from temoa.temoa_model.unit_checking.common import (
+from temoa.temoa_model.unit_checking.common import (
     RATIO_ELEMENT,
     SINGLE_ELEMENT,
     consolidate_lines,
     CostTableData,
     RelationType,
+    activity_based_tables,
+    capacity_based_tables,
+    commodity_based_tables,
 )
@@
 def check_inter_table_relations(
@@
 ) -> list[str]:
     """check the tech and units in the given table vs. baseline (expected) values for the tech"""
+    allowed = set(activity_based_tables + capacity_based_tables + commodity_based_tables)
+    if table_name not in allowed:
+        raise ValueError(f'Unexpected table: {table_name}')

Also applies to: 150-178

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0ad6378 and 558dbef.

📒 Files selected for processing (3)

temoa/temoa_model/unit_checking/common.py (1 hunks)
temoa/temoa_model/unit_checking/relation_checker.py (1 hunks)
temoa/temoa_model/unit_checking/screener.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (2)

temoa/temoa_model/unit_checking/screener.py (3)

temoa/temoa_model/unit_checking/common.py (1)

RelationType (143-146)

temoa/temoa_model/unit_checking/relation_checker.py (5)

check_efficiency_table (90-154)

make_commodity_lut (54-66)

check_inter_table_relations (157-265)

check_cost_tables (268-379)

make_c2a_lut (69-81)

temoa/temoa_model/unit_checking/table_checker.py (1)

check_table (54-142)

temoa/temoa_model/unit_checking/relation_checker.py (2)

temoa/temoa_model/unit_checking/common.py (2)

consolidate_lines (172-179)

RelationType (143-146)

temoa/temoa_model/unit_checking/entry_checker.py (2)

validate_units_format (49-61)

validate_units_expression (64-74)

🪛 Ruff (0.14.1)

temoa/temoa_model/unit_checking/screener.py

69-69: Avoid specifying long messages outside the exception class

(TRY003)

temoa/temoa_model/unit_checking/relation_checker.py

170-172: Possible SQL injection vector through string-based query construction

(S608)

175-175: Possible SQL injection vector through string-based query construction

(S608)

177-177: Possible SQL injection vector through string-based query construction

(S608)

179-179: Avoid specifying long messages outside the exception class

(TRY003)

183-183: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

210-210: Avoid specifying long messages outside the exception class

(TRY003)

286-289: Avoid specifying long messages outside the exception class

(TRY003)

290-290: Possible SQL injection vector through string-based query construction

(S608)

294-296: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

🔇 Additional comments (1)

temoa/temoa_model/unit_checking/common.py (1)
125-131: Fix CostTableData definition; current NamedTuple call is invalid.

Use class-syntax or functional tuple-of-tuples syntax.
-CostTableData = NamedTuple(
-    'CostTableData',
-    table_name=str,
-    commodity_reference=str | None,
-    capacity_based=bool,
-    period_based=bool,
-)
+class CostTableData(NamedTuple):
+    table_name: str
+    commodity_reference: str | None
+    capacity_based: bool
+    period_based: bool
Likely an incorrect or invalid review comment.

coderabbitai · 2025-10-23T16:44:03Z

temoa/temoa_model/unit_checking/relation_checker.py

+"""
+Tools for Energy Model Optimization and Analysis (Temoa):
+An open source framework for energy systems optimization modeling
+
+Copyright (C) 2015,  NC State University
+
+This program is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2 of the License, or
+(at your option) any later version.
+
+This program is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+A complete copy of the GNU General Public License v2 (GPLv2) is available
+in LICENSE.txt.  Users uncompressing this from an archive may not have
+received this license file.  If not, see <http://www.gnu.org/licenses/>.
+
+
+Written by:  J. F. Hyink
+jeff@westernspark.us
+https://westernspark.us
+Created on:  9/22/25
+
+A systematic check of expected relationships between tables to ensure units are consistent
+
+"""
+import dataclasses
+import logging
+import sqlite3
+from collections import defaultdict
+from collections.abc import Iterable
+
+from pint.registry import Unit
+
+from temoa.temoa_model.unit_checking import ureg
+from temoa.temoa_model.unit_checking.common import (
+    RATIO_ELEMENT,
+    SINGLE_ELEMENT,
+    consolidate_lines,
+    CostTableData,
+    RelationType,
+)
+from temoa.temoa_model.unit_checking.entry_checker import (
+    validate_units_format,
+    validate_units_expression,
+)
+
+logger = logging.getLogger(__name__)
+
+
+def make_commodity_lut(conn: sqlite3.Connection) -> dict[str, Unit]:
+    """Get a dictionary of the units for each commodity entry"""
+    res = {}
+    cursor = conn.cursor()
+    query = 'SELECT name, units FROM commodity'
+    cursor.execute(query)
+    rows = cursor.fetchall()
+    for comm, units in rows:
+        valid, group = validate_units_format(units, SINGLE_ELEMENT)
+        if valid:
+            valid, units = validate_units_expression(group[0])
+            res[comm] = units
+    return res
+
+
+def make_c2a_lut(conn: sqlite3.Connection) -> dict[str, Unit]:
+    """Get a dictionary of the units for each capacity to activity entry"""
+    res = {}
+    cursor = conn.cursor()
+    query = 'SELECT tech, units FROM CapacityToActivity'
+    cursor.execute(query)
+    rows = cursor.fetchall()
+    for comm, units in rows:
+        valid, group = validate_units_format(units, SINGLE_ELEMENT)
+        if valid:
+            valid, units = validate_units_expression(group[0])
+            res[comm] = units
+    return res
+
+
+@dataclasses.dataclass(frozen=True)
+class IOUnits:
+    input_units: Unit
+    output_units: Unit
+
+
+def check_efficiency_table(
+    conn: sqlite3.Connection, comm_units: dict[str, str]
+) -> tuple[dict[str, IOUnits], list[str]]:
+    """
+    Check the technology units for Efficiency table entries
+
+    Returns a dictionary of technology : IOUnits and a list of error messages
+
+    """
+
+    query = 'SELECT tech, input_comm, output_comm, units FROM efficiency'
+    rows = conn.execute(query).fetchall()
+    res = {}
+    error_msgs = []
+    invalid_rows = []
+    for idx, (tech, ic, oc, units) in enumerate(rows, start=1):
+        input_units, output_units = None, None
+        valid, located_units = validate_units_format(units, RATIO_ELEMENT)
+        if valid:
+            valid, output_units = validate_units_expression(located_units[0])
+        if valid:
+            valid, input_units = validate_units_expression(located_units[1])
+        if not valid:
+            invalid_rows.append(idx)
+            # we give up early.  The specifics of why this failed should be evident in earlier tests
+            continue
+
+        # check that our tech matches the units of the connected commodities
+        invalid_input_flag = input_units != comm_units[ic]
+        invalid_output_flag = output_units != comm_units[oc]
+        if invalid_input_flag or invalid_output_flag:
+            logger.warning(
+                'Efficiency units conflict with associated commodity for Technology %s near row %d',
+                tech,
+                idx,
+            )
+            msg = f"\n  Expected:  {f'{ic} [{input_units}]' :^25} ----> {tech :^20} ----> {f'{oc} [{output_units}]': ^25}"
+            if invalid_input_flag:
+                msg += f'\n    Invalid input units: {comm_units[ic]}'
+            if invalid_output_flag:
+                msg += f'\n    Invalid output units: {comm_units[oc]}'
+            error_msgs.append(msg)
+
+        # check that the output of this technology is consistent in units with other instances of same tech
+        if tech in res:
+            if res[tech].output_units != output_units:
+                logger.warning(
+                    'Efficiency units conflict with same-name tech for Technology %s near row %d',
+                    tech,
+                    idx,
+                )
+                msg = f"\n  Found:  {f'{ic} [{input_units}]' :^25} ----> {tech :^20} ----> {f'{oc} [{output_units}]': ^25}"
+                msg += f'\n    Conflicting output units: {res[tech].output_units} vs {output_units}'
+                error_msgs.append(msg)
+
+        else:
+            res[tech] = IOUnits(input_units, output_units)
+
+    # we gather all non-processed rows in one statement here due to size of table vs. individual reporting
+    if invalid_rows:
+        listed_lines = consolidate_lines(invalid_rows)
+        line_error_msg = f'Non-processed rows (see earlier tests): {listed_lines}'
+        error_msgs.append(line_error_msg)
+
+    return res, error_msgs
+
+
+def check_inter_table_relations(
+    conn: sqlite3.Connection,
+    table_name,
+    tech_lut: dict[str, IOUnits],
+    comm_lut: dict[str, Unit],
+    relation_type: RelationType,
+) -> list[str]:
+    """check the tech and units in the given table vs. baseline (expected) values for the tech"""
+    grouped_errors = defaultdict(list)
+    match relation_type:
+        case RelationType.CAPACITY:
+            # we make a query to join on the C2A units to pick those up
+            query = (
+                f'SELECT {table_name}.tech, {table_name}.units, ca.units '
+                f'FROM {table_name} JOIN CapacityToActivity ca '
+                f'ON {table_name}.tech = ca.tech AND {table_name}.region = ca.region'
+            )
+        case RelationType.ACTIVITY:
+            query = f'SELECT tech, units, NULL FROM {table_name}'
+        case RelationType.COMMODITY:
+            query = f'SELECT commodity, units, NULL FROM {table_name}'
+        case _:
+            raise ValueError(f'Unexpected relation type: {relation_type}')
+    try:
+        rows = conn.execute(query).fetchall()
+    except sqlite3.OperationalError:
+        logger.error('failed to process query: %s when processing table %s', query, table_name)
+        msg = f'Failed to process table {table_name}.  See log for failed query.'
+        return [msg]
+
+    # process the rows
+    for idx, (tech_or_comm, table_units, c2a_units) in enumerate(rows, start=1):
+        expected_units = None
+        match relation_type:
+            case RelationType.CAPACITY:
+                io_units = tech_lut.get(tech_or_comm)
+                if not io_units:
+                    grouped_errors[
+                        f'Unprocessed row (missing reference for tech "{tech_or_comm}" --see earlier tests)'
+                    ].append(idx)
+                    continue
+                expected_units = io_units.output_units
+            case RelationType.ACTIVITY:
+                io_units = tech_lut[tech_or_comm]
+                if not io_units:
+                    grouped_errors[
+                        f'Unprocessed row (missing reference for tech "{tech_or_comm}" --see earlier tests)'
+                    ].append(idx)
+                    continue
+                expected_units = io_units.output_units
+            case RelationType.COMMODITY:
+                expected_units = comm_lut.get(tech_or_comm)
+            case _:
+                raise ValueError(f'Unexpected relation type: {relation_type}')
+        if not expected_units:
+            grouped_errors[
+                f'Unprocessed row (missing reference for tech "{tech_or_comm}" --see earlier tests)'
+            ].append(idx)
+            continue
+
+        # validate the units in the table...
+        entry_format_valid, units_data = validate_units_format(table_units, SINGLE_ELEMENT)
+        if entry_format_valid:
+            _, valid_table_units = validate_units_expression(units_data[0])
+        else:
+            valid_table_units = None
+
+        # validate the c2a units, if needed
+        if c2a_units:
+            c2a_valid, units_data = validate_units_format(c2a_units, SINGLE_ELEMENT)
+            if c2a_valid:
+                # further ensure the conversion is valid and retain the validity
+                c2a_valid, valid_c2a_units = validate_units_expression(units_data[0])
+            else:
+                valid_c2a_units = None
+        else:  # we are in a valid state: no C2A units provided/needed
+            c2a_valid = True
+            valid_c2a_units = None
+
+        if not valid_table_units:
+            grouped_errors['Unprocessed row (invalid units--see earlier tests)'].append(idx)
+        if not c2a_valid:
+            grouped_errors['Unprocessed row (invalid c2a units--see earlier tests)'].append(idx)
+        if not valid_table_units or not c2a_valid:
+            continue
+
+        # if we have valid c2a units, combine them to get the units of activity
+        if valid_c2a_units:
+            res_units = valid_table_units * (valid_c2a_units * ureg.year)
+
+        else:
+            res_units = valid_table_units
+
+        # check that the res_units match the expectation from the tech
+        if expected_units != res_units:
+            msg = (
+                f'Units mismatch from expected reference. Table Entry: {valid_table_units}, '
+                f'{f" C2A Entry: {valid_c2a_units}, " if valid_c2a_units else ""}'
+                f'expected: {expected_units}'
+                f' for output of tech {tech_or_comm}'
+            )
+            grouped_errors[msg].append(idx)
+
+    # gather into list format
+    res = []
+    for msg, line_nums in grouped_errors.items():
+        res.append(f'{msg} at rows: {consolidate_lines(line_nums)}')
+
+    return res
+
+
+def check_cost_tables(
+    conn: sqlite3.Connection,
+    cost_tables: Iterable[CostTableData],
+    tech_lut: dict[str, IOUnits],
+    c2a_lut: dict[str, Unit],
+    commodity_lut: dict[str, Unit],
+) -> list[str]:
+    """
+    Check all cost tables for (a) alignment of units to tech output (the denominator) and (b) 100% commonality
+    in the cost units (numerator)
+    Note:  we'll *assume* the first passing entry in the first table establishes the common cost units and
+           check for consistency
+    """
+    common_cost_unit = None  # Expectation:  MUSD.  Something with a prefix and currency dimension
+    error_msgs = []
+    for ct in cost_tables:
+        table_grouped_errors = defaultdict(list)
+        if ct.commodity_reference and ct.capacity_based:
+            raise ValueError(
+                f'Table that is "capacity based" {ct.table_name} flagged as '
+                'having commodity field--expecting tech field.  Check data.'
+            )
+        query = f'SELECT {ct.commodity_reference if ct.commodity_reference else 'tech'}, units FROM {ct.table_name}'
+        try:
+            rows = conn.execute(query).fetchall()
+        except sqlite3.OperationalError:
+            logger.error(
+                'failed to process query: %s when processing table %s', query, ct.table_name
+            )
+            msg = f'Failed to process table {ct.table_name}.  See log for failed query.'
+            error_msgs.append(msg)
+            continue
+        for idx, (tech, raw_units_expression) in enumerate(rows, start=1):
+            # convert to pint expression
+            cost_units, measure_units = None, None
+            # screen for empty/missing raw inputs
+            if not raw_units_expression:
+                label = f'{ct.table_name}:  Unprocessed row (missing units): {raw_units_expression}'
+                table_grouped_errors[label].append(idx)
+                continue
+            valid, (elements) = validate_units_format(raw_units_expression, RATIO_ELEMENT)
+            if valid:
+                cost_valid, cost_units = validate_units_expression(elements[0])
+                units_valid, measure_units = validate_units_expression(elements[1])
+            else:
+                cost_valid, units_valid = False, False
+            if not (cost_valid and units_valid):
+                label = f'{ct.table_name}:  Unprocessed row (invalid units--see earlier tests): {raw_units_expression}'
+                table_grouped_errors[label].append(idx)
+                continue
+
+            # Test 1: Look for cost commonality
+            if common_cost_unit is None:
+                # try to establish it
+                # check that what we have captured is in the currency dimension == "clean"
+                # dev note:  multiplying by 1 allows us to use the check_units_expression() function
+                if (1 * cost_units).check('[currency]'):
+                    common_cost_unit = cost_units
+                else:
+                    # something is wrong, hopefully it was just this entry?
+                    # mark it, dump it, and try again...
+                    error_msgs.append(
+                        f'{ct.table_name}:  Unprocessed row (unreducible cost units): {cost_units} at row: {idx}'
+                    )
+                    continue
+            else:  # use the common units to test
+                if cost_units != common_cost_unit:
+                    label = f'{ct.table_name}:  Non-standard cost found (expected common cost units of {common_cost_unit}) got: {cost_units}'
+                    table_grouped_errors[label].append(idx)
+
+            # Test 2:  Check the units of measure to ensure alignment with the tech's output units
+            # find the referenced commodity units from the tech or commodity depending on table structure...
+            if ct.commodity_reference:
+                commodity_units = commodity_lut.get(ct.commodity_reference)
+                if not commodity_units:
+                    label = f'{ct.table_name}:  Unprocessed row (unknown commodity: {ct.commodity_reference}) '
+                    table_grouped_errors[label].append(idx)
+                    continue
+            else:
+                tech_io = tech_lut.get(tech)
+                if tech_io:
+                    commodity_units = tech_io.output_units
+                else:
+                    label = f'{ct.table_name}:  Unprocessed row (unknown tech: {tech}) '
+                    table_grouped_errors[label].append(idx)
+                    continue
+
+            # pull the C2A factor if this table is capacity-based and determine the "match units" which should
+            # match the commodity units in the table, after removing the "per period" time factor, if it exists
+            c2a_units = None
+            if ct.capacity_based:
+                c2a_units = c2a_lut.get(tech, ureg.dimensionless)  # default is dimensionless
+                # apply to convert
+                measure_units *= c2a_units * ureg.year
+
+            if ct.period_based:
+                measure_units /= ureg.year  # remove the "per year" from this element
+
+            matched = measure_units == commodity_units
+
+            if not matched:
+                tech_reference = ct.commodity_reference if ct.commodity_reference else tech
+                label = (
+                    f'{ct.table_name}:  Non-matching measure unit found in cost denominator for tech/commodity {tech_reference}: {raw_units_expression}'
+                    f'\n    Expecting commodity units: {commodity_units}. Discovered (after conversions applied): {measure_units}'
+                    f'\n    Conversions:  c2a units: {c2a_units if c2a_units else "N/A"}{", `per period` removed" if ct.period_based else ""}\n   '
+                )
+                table_grouped_errors[label].append(idx)
+
+        for label, listed_lines in table_grouped_errors.items():
+            error_msgs.append(f'{label} at rows: {consolidate_lines(listed_lines)}')


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Quick scan helper to verify fixes (lookups and annotations).

🏁 Script executed:

#!/bin/bash # Fail on direct dict indexing that can KeyError and check annotation rg -n -C1 --type=py $'comm_units\\[' temoa/temoa_model/unit_checking/relation_checker.py rg -n -C1 --type=py $'tech_lut\\[' temoa/temoa_model/unit_checking/relation_checker.py rg -n -C1 --type=py $'def\\s+check_efficiency_table\\(.*comm_units:\\s*dict\\[str,\\s*Unit\\]' temoa/temoa_model/unit_checking/relation_checker.py

Length of output: 671

Fix type annotation and unsafe direct dict indexing.

The code has three related issues:

Type annotation mismatch (line 80): Parameter comm_units: dict[str, str] should be dict[str, Unit] since make_commodity_lut() returns dict[str, Unit] and the function compares values against Unit objects.

Unsafe direct indexing on comm_units (lines 118–119, 128–130): Direct access via comm_units[ic] and comm_units[oc] can raise KeyError. Use .get() with appropriate error handling instead, consistent with other defensive checks in the function.

Inconsistent dict access pattern (line 200): In check_inter_table_relations(), the ACTIVITY case uses direct indexing tech_lut[tech_or_comm] while the CAPACITY case (line 188) safely uses .get(). Standardize to .get().

🧰 Tools

🪛 Ruff (0.14.1)

170-172: Possible SQL injection vector through string-based query construction

(S608)

175-175: Possible SQL injection vector through string-based query construction

(S608)

177-177: Possible SQL injection vector through string-based query construction

(S608)

179-179: Avoid specifying long messages outside the exception class

(TRY003)

183-183: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

210-210: Avoid specifying long messages outside the exception class

(TRY003)

286-289: Avoid specifying long messages outside the exception class

(TRY003)

290-290: Possible SQL injection vector through string-based query construction

(S608)

294-296: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

🤖 Prompt for AI Agents

In temoa/temoa_model/unit_checking/relation_checker.py lines 1-378: change the type annotation of check_efficiency_table's comm_units parameter from dict[str, str] to dict[str, Unit]; replace all direct dict indexing that can KeyError with safe .get() lookups and handle missing keys (specifically replace comm_units[ic] and comm_units[oc] checks around lines ~118-130 with comm_units.get(ic)/comm_units.get(oc) and add error handling/logging and early-continue when a commodity is missing); and make the ACTIVITY branch in check_inter_table_relations (around lines ~188-205) use tech_lut.get(tech_or_comm) like the CAPACITY branch and handle the missing-io_units case consistently by recording the unprocessed-row error and continuing.

coderabbitai · 2025-10-23T16:44:03Z

temoa/temoa_model/unit_checking/relation_checker.py

+                f'SELECT {table_name}.tech, {table_name}.units, ca.units '
+                f'FROM {table_name} JOIN CapacityToActivity ca '
+                f'ON {table_name}.tech = ca.tech AND {table_name}.region = ca.region'
+            )


⚠️ Potential issue | 🟠 Major

Don’t drop CAPACITY rows without C2A; LEFT JOIN and flag missing C2A.

Inner join hides unmatched rows. Use LEFT JOIN and treat missing C2A as invalid.

- query = ( - f'SELECT {table_name}.tech, {table_name}.units, ca.units ' - f'FROM {table_name} JOIN CapacityToActivity ca ' - f'ON {table_name}.tech = ca.tech AND {table_name}.region = ca.region' - ) + query = ( + f'SELECT {table_name}.tech, {table_name}.units, ca.units ' + f'FROM {table_name} LEFT JOIN CapacityToActivity ca ' + f'ON {table_name}.tech = ca.tech AND {table_name}.region = ca.region' + ) @@ - if c2a_units: + if c2a_units: c2a_valid, units_data = validate_units_format(c2a_units, SINGLE_ELEMENT) if c2a_valid: # further ensure the conversion is valid and retain the validity c2a_valid, valid_c2a_units = validate_units_expression(units_data[0]) else: valid_c2a_units = None - else: # we are in a valid state: no C2A units provided/needed - c2a_valid = True - valid_c2a_units = None + else: + # Require C2A for CAPACITY; optional otherwise + c2a_valid = relation_type != RelationType.CAPACITY + valid_c2a_units = None

Also applies to: 224-235

🧰 Tools

🪛 Ruff (0.14.1)

170-172: Possible SQL injection vector through string-based query construction

(S608)

🤖 Prompt for AI Agents

In temoа/temoa_model/unit_checking/relation_checker.py around lines 170-173 (and similarly around lines 224-235), the current INNER JOIN hides Capacity rows that lack a CapacityToActivity (C2A) match; change the SQL to use a LEFT JOIN and detect rows where ca.units (or the C2A key) is NULL, marking those Capacity rows as invalid/missing C2A instead of silently dropping them. Update the query to LEFT JOIN and add logic after query execution to flag/collect rows with NULL ca.* values (or include a CASE/COALESCE expression) so missing C2A entries are reported rather than excluded.

coderabbitai · 2025-10-23T16:44:03Z

temoa/temoa_model/unit_checking/screener.py

+def _write_report(report_path: Path, report_entries: list[str]):
+    """Write the report to file"""
+    if not report_path:
+        return
+    if not report_path.exists():
+        report_path.mkdir(parents=True)
+    if report_path.is_dir():
+        # augment with a default filename including timestamp
+        from datetime import datetime
+
+        timestamp = datetime.now().strftime('%Y-%m-%d_%H%M%S')
+        report_path /= f'units_check_{timestamp}.txt'
+    with open(report_path, 'w', encoding='utf-8') as report_file:
+        report_file.writelines(report_entries)


🛠️ Refactor suggestion | 🟠 Major

Harden report writer: distinguish file vs directory paths.

Avoid creating a directory when a file path is intended; ensure parents exist for file outputs.

def _write_report(report_path: Path, report_entries: list[str]): """Write the report to file""" if not report_path: return - if not report_path.exists(): - report_path.mkdir(parents=True) - if report_path.is_dir(): + # If path points to a directory (existing or suffixless), write a timestamped file inside it. + if report_path.is_dir() or report_path.suffix == "": + if not report_path.exists(): + report_path.mkdir(parents=True, exist_ok=True) # augment with a default filename including timestamp from datetime import datetime timestamp = datetime.now().strftime('%Y-%m-%d_%H%M%S') - report_path /= f'units_check_{timestamp}.txt' - with open(report_path, 'w', encoding='utf-8') as report_file: + report_path = report_path / f'units_check_{timestamp}.txt' + else: + # Treat as file path; ensure parent exists + report_path.parent.mkdir(parents=True, exist_ok=True) + with open(report_path, 'w', encoding='utf-8') as report_file: report_file.writelines(report_entries)

🤖 Prompt for AI Agents

In temoa/temoa_model/unit_checking/screener.py around lines 259-272, the current writer creates directories for the given report_path itself and can mistakenly turn an intended file path into a directory; instead, only create parent directories for a file and only append the default filename when the provided path actually refers to a directory. Change the logic to: treat a falsy/None path as no-op; if report_path.exists() and report_path.is_dir() (or report_path explicitly provided as a directory), append the timestamped filename; otherwise treat report_path as a file and ensure its parent directory exists by calling report_path.parent.mkdir(parents=True, exist_ok=True) (do not call mkdir on report_path itself); then open the final file path and write entries. Ensure you use exist_ok=True when creating dirs and preserve UTF-8 writing.

…ustments

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (3)

temoa/temoa_model/unit_checking/relation_checker.py (3)

200-207: Critical: Inconsistent dict access causes KeyError risk.

The ACTIVITY case uses direct indexing tech_lut[tech_or_comm] while the CAPACITY case (line 193) safely uses .get(). This inconsistency will raise KeyError for missing technologies.

             case RelationType.ACTIVITY:
-                io_units = tech_lut[tech_or_comm]
+                io_units = tech_lut.get(tech_or_comm)
                 if not io_units:
                     grouped_errors[
                         f'Unprocessed row (missing reference for tech "{tech_or_comm}" --see earlier tests)'
                     ].append(idx)
                     continue
                 expected_units = io_units.output_units

90-131: Critical: Type annotation and KeyError risks remain unresolved.

Despite past reviews, two critical issues persist:

Line 91: Type annotation is incorrect - comm_units should be dict[str, Unit] not dict[str, str], since make_commodity_lut() returns dict[str, Unit] and values are compared against Unit objects.
Lines 118-119, 128-130: Direct dict indexing comm_units[ic] and comm_units[oc] will raise KeyError if commodities are missing. Use defensive .get() lookups instead:

 def check_efficiency_table(
-    conn: sqlite3.Connection, comm_units: dict[str, str]
+    conn: sqlite3.Connection, comm_units: dict[str, Unit]
 ) -> tuple[dict[str, IOUnits], list[str]]:

         # check that our tech matches the units of the connected commodities
-        invalid_input_flag = input_units != comm_units[ic]
-        invalid_output_flag = output_units != comm_units[oc]
+        expected_in = comm_units.get(ic)
+        expected_out = comm_units.get(oc)
+        if expected_in is None or expected_out is None:
+            logger.warning(
+                'Missing commodity unit for Technology %s near row %d (input: %s, output: %s)',
+                tech, idx, ic if expected_in is None else '', oc if expected_out is None else ''
+            )
+            invalid_rows.append(idx)
+            continue
+        invalid_input_flag = input_units != expected_in
+        invalid_output_flag = output_units != expected_out
         if invalid_input_flag or invalid_output_flag:
             logger.warning(
                 'Efficiency units conflict with associated commodity for Technology %s near row %d',
                 tech,
                 idx,
             )
             msg = f"\n  Expected:  {f'{ic} [{input_units}]' :^25} ----> {tech :^20} ----> {f'{oc} [{output_units}]': ^25}"
             if invalid_input_flag:
-                msg += f'\n    Invalid input units: {comm_units[ic]}'
+                msg += f'\n    Invalid input units: {expected_in}'
             if invalid_output_flag:
-                msg += f'\n    Invalid output units: {comm_units[oc]}'
+                msg += f'\n    Invalid output units: {expected_out}'
             error_msgs.append(msg)

320-410: Critical: Incorrect commodity/tech identifier usage in cost table validation.

When ct.commodity_reference is set, the query selects the commodity column (e.g., 'emis_comm'), but the code incorrectly looks up commodity_lut.get(ct.commodity_reference) (the column name) instead of the actual row value. This causes false "unknown commodity" errors.

Issues:

Line 320: Variable named tech is misleading when it holds a commodity value
Line 361: Looks up ct.commodity_reference (column name) instead of row value
Lines 380, 393: Use tech variable which may actually be a commodity

-        for idx, (tech, raw_units_expression) in enumerate(rows, start=1):
+        for idx, (identifier, raw_units_expression) in enumerate(rows, start=1):
             # convert to pint expression
             cost_units, measure_units = None, None
             # screen for empty/missing raw inputs
             if not raw_units_expression:
-                label = f'{ct.table_name}:  Unprocessed row (missing units): {raw_units_expression}'
+                label = f'{ct.table_name}:  Unprocessed row (missing units)'
                 table_grouped_errors[label].append(idx)
                 continue
             valid, (elements) = validate_units_format(raw_units_expression, RATIO_ELEMENT)
             if valid:
                 cost_valid, cost_units = validate_units_expression(elements[0])
                 units_valid, measure_units = validate_units_expression(elements[1])
             else:
                 cost_valid, units_valid = False, False
             if not (cost_valid and units_valid):
                 label = f'{ct.table_name}:  Unprocessed row (invalid units--see earlier tests): {raw_units_expression}'
                 table_grouped_errors[label].append(idx)
                 continue
 
             # Test 1: Look for cost commonality
             if common_cost_unit is None:
                 # try to establish it
                 # check that what we have captured is in the currency dimension == "clean"
                 # dev note:  multiplying by 1 allows us to use the check_units_expression() function
                 if (1 * cost_units).check('[currency]'):
                     common_cost_unit = cost_units
                 else:
                     # something is wrong, hopefully it was just this entry?
                     # mark it, dump it, and try again...
                     error_msgs.append(
                         f'{ct.table_name}:  Unprocessed row (unreducible cost units): {cost_units} at row: {idx}'
                     )
                     continue
             else:  # use the common units to test
                 if cost_units != common_cost_unit:
                     label = f'{ct.table_name}:  Non-standard cost found (expected common cost units of {common_cost_unit}) got: {cost_units}'
                     table_grouped_errors[label].append(idx)
 
             # Test 2:  Check the units of measure to ensure alignment with the tech's output units
             # find the referenced commodity units from the tech or commodity depending on table structure...
             if ct.commodity_reference:
-                commodity_units = commodity_lut.get(ct.commodity_reference)
+                # identifier holds the actual commodity value from the selected column
+                commodity_units = commodity_lut.get(identifier)
                 if not commodity_units:
-                    label = f'{ct.table_name}:  Unprocessed row (unknown commodity: {ct.commodity_reference}) '
+                    label = f'{ct.table_name}:  Unprocessed row (unknown commodity: {identifier}) '
                     table_grouped_errors[label].append(idx)
                     continue
             else:
-                tech_io = tech_lut.get(tech)
+                # identifier holds the tech value
+                tech_io = tech_lut.get(identifier)
                 if tech_io:
                     commodity_units = tech_io.output_units
                 else:
-                    label = f'{ct.table_name}:  Unprocessed row (unknown tech: {tech}) '
+                    label = f'{ct.table_name}:  Unprocessed row (unknown tech: {identifier}) '
                     table_grouped_errors[label].append(idx)
                     continue
 
             # pull the C2A factor if this table is capacity-based and determine the "match units" which should
             # match the commodity units in the table, after removing the "per period" time factor, if it exists
             c2a_units = None
-            oring_measure_units = measure_units
+            original_measure_units = measure_units
             if ct.capacity_based:
-                c2a_units = c2a_lut.get(tech, '<none>')  # default is dimensionless
+                c2a_units = c2a_lut.get(identifier, ureg.dimensionless)
                 # apply to convert
-                if c2a_units != '<none>':
-                    measure_units *= c2a_units
+                measure_units *= c2a_units
                 # apply the nominal period
                 measure_units *= ureg.year
 
             if ct.period_based:
                 measure_units /= ureg.year  # remove the "per year" from this element
 
             matched = measure_units == commodity_units
 
             if not matched:
-                tech_reference = ct.commodity_reference if ct.commodity_reference else tech
-                label = f'{ct.table_name}:  Non-matching measure unit found in cost denominator for tech/comm: {tech_reference}:'
+                label = f'{ct.table_name}:  Non-matching measure unit found in cost denominator for tech/comm: {identifier}:'
                 conversions = []
                 if ct.capacity_based:
                     conversions.append(f'C2A Factor: {c2a_units}')
                     conversions.append(f'Nominal Period: {ureg.year}')
                 if ct.period_based:
                     conversions.append(f'Per-period Factor: {ureg.year}')
                 detail = _ding_label(
                     table_entry=raw_units_expression,
-                    focus=f'Converted Measure in Denominator: {oring_measure_units}',
+                    focus=f'Converted Measure in Denominator: {original_measure_units}',
                     conversions=conversions,
                     result=measure_units,
                     expectation=commodity_units,
                 )
                 label += f'{detail}\n'
 
                 table_grouped_errors[label].append(idx)

Also fixed: typo oring_measure_units → original_measure_units and simplified the c2a_units default handling.

🧹 Nitpick comments (1)

temoa/temoa_model/unit_checking/relation_checker.py (1)
169-173: Consider LEFT JOIN to detect missing CapacityToActivity entries.

The current INNER JOIN silently drops CAPACITY rows that lack a C2A match. Using LEFT JOIN would allow detection and reporting of these missing entries as validation errors rather than silently ignoring them.
             query = (
                 f'SELECT {table_name}.tech, {table_name}.units, ca.units '
-                f'FROM {table_name} JOIN CapacityToActivity ca '
+                f'FROM {table_name} LEFT JOIN CapacityToActivity ca '
                 f'ON {table_name}.tech = ca.tech AND {table_name}.region = ca.region'
             )
Then update the validation logic (around lines 233-235) to flag missing C2A as an error for CAPACITY relations:
         else:  # we are in a valid state: no C2A units provided/needed
-            c2a_valid = True
+            c2a_valid = relation_type != RelationType.CAPACITY
             valid_c2a_units = None
+            if not c2a_valid:
+                grouped_errors['Unprocessed row (missing CapacityToActivity entry)'].append(idx)

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 558dbef and 68dfae1.

📒 Files selected for processing (2)

docs/source/Documentation.rst (1 hunks)
temoa/temoa_model/unit_checking/relation_checker.py (1 hunks)

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2025-10-30T14:37:44.217Z

Learnt from: idelder
PR: TemoaProject/temoa#180
File: temoa/components/limits.py:632-647
Timestamp: 2025-10-30T14:37:44.217Z
Learning: In the `limit_tech_input_split_average_constraint` function in `temoa/components/limits.py`, there was a bug where the `total_inp` summation incorrectly used the fixed variable `i` instead of the loop variable `S_i` when indexing `processOutputsByInput` and calling `get_variable_efficiency`. This pattern should be watched for in similar aggregation constraints: when summing over a loop variable like `S_i`, all references within that loop should use `S_i`, not a fixed outer variable.

Applied to files:

temoa/temoa_model/unit_checking/relation_checker.py

🧬 Code graph analysis (1)

temoa/temoa_model/unit_checking/relation_checker.py (2)

temoa/temoa_model/unit_checking/common.py (2)

consolidate_lines (172-179)

RelationType (143-146)

temoa/temoa_model/unit_checking/entry_checker.py (2)

validate_units_format (49-61)

validate_units_expression (64-74)

🪛 Ruff (0.14.2)

temoa/temoa_model/unit_checking/relation_checker.py

170-172: Possible SQL injection vector through string-based query construction

(S608)

176-176: Possible SQL injection vector through string-based query construction

(S608)

178-178: Possible SQL injection vector through string-based query construction

(S608)

180-180: Avoid specifying long messages outside the exception class

(TRY003)

184-184: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

211-211: Avoid specifying long messages outside the exception class

(TRY003)

306-309: Avoid specifying long messages outside the exception class

(TRY003)

310-310: Possible SQL injection vector through string-based query construction

(S608)

314-316: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

🔇 Additional comments (1)

docs/source/Documentation.rst (1)

411-542: LGTM! Comprehensive documentation of the new unit checking feature.

The documentation clearly explains:

The upgrade path to database v3.1 with unit checking

How Pint is used for validation (not in model calculations)

Unit expression formats and testing procedures

The five-step validation sequence

A detailed table showing which tables have units checking

The technical content aligns well with the implementation in the codebase.

ParticularlyPythonicBS · 2025-12-18T19:03:15Z

Closing since #232 is merged now

jeff-ws and others added 30 commits November 22, 2024 10:23

Tinkering with MC worker Queue

0bdcf03

Tinkering with MC worker Queue

b8e739e

Tinkering with MC worker Queue

9f6c359

Tinkering with MC worker Queue

983b898

- adding timing to post-processing

Tinkering with MC worker Queue

912591d

- adding timing to post-processing

Tinkering with MC worker Queue

53e13b8

- adding timing to post-processing

Tinkering with MC worker Queue

4a4b48b

- adding timing to post-processing

Tinkering with MC worker Queue

e443ad7

Tinkering with MC worker Queue

69404ad

Tinkering with MC worker Queue

bd6499d

Tinkering with MC worker Queue

a9db306

- adding timing to post-processing

Tinkering with MC worker Queue

cacbfd5

- adding timing to post-processing

Tinkering with MC worker Queue

9580510

- adding timing to post-processing

Tinkering with MC worker Queue

218c615

- adding timing to post-processing

Merge remote-tracking branch 'origin/main'

ead691f

MC running with new MC Worker

1d7551c

- queues are now swapping data vs. models - table_writer.py broken up into data pulling portion/writing portion

Fixed bug in data_brick.py that was not actually returning an obj.

f31b86b

adjust workers for TRACE

8ad98b0

adding some additional logging, reverting worker settings to 6 worker…

59c1284

…s at 20 threads ea.

adding some additional logging, reverting worker settings to 6 worker…

db7eeb2

…s at 20 threads ea.

adding some additional logging, reverting worker settings to 6 worker…

1ca0677

…s at 20 threads ea.

Clean Up work on MC

dd3b60b

- commented some functions - removed some erroneous "sizeof" measures

Adding T/S log entries

45d1321

Increase size of work queue

7a80fb3

- was hanging while putting in shutdown signals

Rev 2.0 of MC complete

f59e011

- works with Utopia - logging cleaned up

Rev 2.0 of MC complete

06dd688

- works with Utopia - logging cleaned up

Added ability to output Dual Variables to Myopic Mode

eefe33a

Merge branch 'TemoaProject:main' into main

146a31b

Tuned up migration utility

1543cf8

Added additional QA steps during data transfer.

Formatting

60b404c

jeff-ws added 3 commits October 16, 2025 17:41

updates to screener to handle empty entries and multiple db files

03d6b22

incorporation into main body

a1b2378

kill verbose output

74c12a8

coderabbitai bot reviewed Oct 17, 2025

View reviewed changes

fix incorrect import packages

1d85dba

jeff-ws changed the title ~~Groundwork for DB Version 3.1~~ Groundwork for DB Version 3.1 and Unit Checking Oct 17, 2025

coderabbitai bot reviewed Oct 17, 2025

View reviewed changes

jeff-ws and others added 2 commits October 16, 2025 18:57

Update temoa/utilities/db_migration_v3_to_v3_1.py

9fbe6d7

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

Update temoa/utilities/db_migration_v3_to_v3_1.py

ec89d26

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

coderabbitai bot reviewed Oct 17, 2025

View reviewed changes

jeff-ws added 6 commits October 16, 2025 19:05

fix loop

b494104

format tweaks to screener output

e3edd95

upgrade to pyam-iamc >= 3.1. Version 3.0 of pyam importing mangled th…

7f5f79b

…e log setup when importing IamDataFrame, causing failure of Temoa logging.

Consolidate tests by row in Test 4--as in others

bc047f3

better fail handling in utility conversion. Added null handling tweak…

3853c00

… in relation checker

remove a little scaffolding

88295b6

coderabbitai bot reviewed Oct 19, 2025

View reviewed changes

temoa/temoa_model/unit_checking/relation_checker.py Show resolved Hide resolved

Added check of Demand table and a little cleanup of formatting

06d83fa

coderabbitai bot reviewed Oct 21, 2025

View reviewed changes

Documentation updates to main dox and README.md describing what the u…

6a7f347

…nit checker is and how to execute with it. Update to 2 config files to include unit checking

coderabbitai bot reviewed Oct 21, 2025

View reviewed changes

jeff-ws added 2 commits October 20, 2025 17:44

fix typo 'unit' -> 'units' in config files

0ad6378

Flip the per year expectation for FixedCost and a few other minor adj…

558dbef

…ustments

coderabbitai bot reviewed Oct 23, 2025

View reviewed changes

Flip the per year expectation for FixedCost and a few other minor adj…

68dfae1

…ustments

coderabbitai bot reviewed Oct 30, 2025

View reviewed changes

Merge branch 'TemoaProject:main' into main

bf86ab8

ParticularlyPythonicBS mentioned this pull request Dec 5, 2025

Unit checking for v4 databases #232

Merged

ParticularlyPythonicBS closed this Dec 18, 2025

	raise ValueError('Unknown units format: %s', format_type)
	raise ValueError(f'Unknown units format: {format_type}')

Groundwork for DB Version 3.1 and Unit Checking #124

Groundwork for DB Version 3.1 and Unit Checking #124

Uh oh!

Conversation

jeff-ws commented Sep 23, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 17, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot Oct 17, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 21, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 21, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 23, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 23, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 23, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

ParticularlyPythonicBS commented Dec 18, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

jeff-ws commented Sep 23, 2025 •

edited by coderabbitai bot

Loading