@LadyChristina
Member

All Submissions:

  • Have you followed the guidelines in our Contributing documentation?
  • Have you verified that there aren't any other open Pull Requests for the same update/change?
  • Does the Pull Request pass all tests?

Description

Added a new parameter to the config file that specifies where to look for the raw block data. The value is a list of (relative) paths, which allows us to keep the raw data of different ledgers in different directories if needed.
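As a rough illustration of the idea, here is a minimal sketch of turning such a list of relative paths into absolute directories. The function name `resolve_input_directories`, its arguments, and the example directory names are hypothetical and not taken from the PR; only the config key `input_directories` appears in this conversation, and the actual `get_input_directories()` helper may work differently.

```python
from pathlib import Path


def resolve_input_directories(config, base_dir):
    """Return absolute paths for the 'input_directories' entries of a config mapping.

    Hypothetical helper: `config` is assumed to be the already-parsed config file
    (a dict), and relative entries are assumed to be relative to `base_dir`.
    """
    raw = config.get("input_directories") or []
    # Tolerate a single string as well as a list of strings.
    if isinstance(raw, str):
        raw = [raw]
    # Joining with an absolute entry leaves that entry unchanged.
    return [(Path(base_dir) / entry).resolve() for entry in raw]


# Example: two directories, e.g. one per group of ledgers (names are made up).
example_config = {"input_directories": ["raw_block_data", "other/ledgers"]}
input_dirs = resolve_input_directories(example_config, "/tmp/project")
```

Resolving against an explicit base directory keeps the result independent of the current working directory, which is one plausible reason to prefer relative paths in the config.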

@LadyChristina LadyChristina requested review from ZeeshanJan and Copilot and removed request for dimkarakostas June 13, 2025 19:28

@LadyChristina LadyChristina requested a review from Copilot June 13, 2025 19:29
Copilot AI left a comment

Pull Request Overview

This PR adds a new configuration parameter for specifying input directories for raw block data and updates the parsing, mapping, data-collection, and execution modules to use a list of directories instead of a single directory.

  • Updated parser, mapping, and helper modules to accept a list of input directories.
  • Replaced direct references to RAW_DATA_DIR with get_input_directories() in tests, run.py, and data collection scripts.
  • Updated constructors and function signatures across the codebase to use “ledger” and “input_dirs” consistently.

Reviewed Changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| tests/test_parsers.py | Replaced RAW_DATA_DIR with get_input_directories() and updated the corresponding variable names and assignments. |
| tests/test_mappings.py | Adjusted parameters to use input_dirs in mapping tests. |
| run.py | Modified processing to use get_input_directories() and added error handling for missing raw data files. |
| data_collection_scripts/collect_block_data.py | Replaced RAW_DATA_DIR with a raw_data_dir parameter whose value comes from get_input_directories(). |
| consensus_decentralization/parsers/{ethereum,dummy,default}_parser.py | Updated constructor signatures and file retrieval methods to handle multiple input directories. |
| consensus_decentralization/parse.py | Updated parameter names and logging to reflect the new ledger and input_dirs usage. |
| consensus_decentralization/helper.py | Removed RAW_DATA_DIR and introduced get_input_directories() to fetch raw data paths from the config. |
| config.yaml | Added a new configuration key “input_directories” with a default path. |
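To make the multi-directory lookup and the missing-file handling summarised above more concrete, here is a minimal sketch. It is not the code from this PR: the function names `find_raw_data_file` and `ledgers_with_data`, and the file naming pattern `<ledger>_raw_data.json`, are assumptions made for illustration only.

```python
import logging
from pathlib import Path


def find_raw_data_file(ledger, input_dirs):
    """Return the first matching raw data file for `ledger`, or None if absent.

    Assumption: raw data files are named '<ledger>_raw_data.json'.
    """
    filename = f"{ledger}_raw_data.json"
    for directory in input_dirs:
        candidate = Path(directory) / filename
        if candidate.is_file():
            return candidate
    return None


def ledgers_with_data(ledgers, input_dirs):
    """Map each ledger to its raw data file, skipping ledgers with no file."""
    found = {}
    for ledger in ledgers:
        path = find_raw_data_file(ledger, input_dirs)
        if path is None:
            # Hypothetical handling: warn and continue instead of aborting the run.
            logging.warning("No raw data file found for %s in %s", ledger, list(input_dirs))
            continue
        found[ledger] = path
    return found
```

Searching the directories in order means that, in this sketch, the first directory containing a ledger's file takes precedence if the same file exists in more than one location.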

@ZeeshanJan
Contributor

LGTM.
I ran the tool with a customised input data path (for Bitcoin) and it worked.

@LadyChristina LadyChristina merged commit 605ebc9 into main Jun 15, 2025
1 check passed
@LadyChristina LadyChristina deleted the configurable_input_paths branch June 15, 2025 18:12
3 participants