Skip to content

[REQUEST] Refactor CLI and main modules #31

@tparkerd

Description

@tparkerd

Type of Request

  • Functionality
  • UX/UI
  • Documentation
  • Quality of Life
  • Other

For context, the run-time argument parsing is pretty tightly associated with the actual modules. For example, as is, it's passing around an args argument for most function calls. I picked this habit up from a project in the Baxter lab. It's not awful and it makes it a lot neater when you have a lot of shared parameters (e.g., verbose flag).

Pros:

  • Fewer parameters for function definitions and calls

Cons:

  • It's effectively a global state object without actually being one
  • It's difficult to directly call or work with the core functionality without making a dummy argument parser
  • It's difficult to use modules in a standalone way (e.g., python -m rawtools.raw2img)

For these reasons, I think the CLI module and individual modules should be refactored to pull the core of the CLI into the module themselves. It's probably easier to explain with an example.

Let's look at the current argparser in cli.py for raw2img:

description='Convert .raw 3d volume file to typical image format slices'
parser = argparse.ArgumentParser(description=description,formatter_class=argparse.ArgumentDefaultsHelpFormatter)
parser.add_argument("-v", "--verbose", action="store_true", help="Increase output verbosity")
parser.add_argument("-V", "--version", action="version", version=f'%(prog)s {__version__}')
parser.add_argument("-t", "--threads", type=int, default=cpu_count(), help=f"Maximum number of threads dedicated to processing.")
parser.add_argument("-f", '--force', action="store_true", help="Force file creation. Overwrite any existing files.")
parser.add_argument("-n", '--dry-run', dest='dryrun', action="store_true", help="Perform a trial run. Do not create image files, but logs will be updated.")
parser.add_argument("--format", default='png', help="Set image filetype. Availble options: ['png', 'tif']")
parser.add_argument("path", metavar='PATH', type=str, nargs=1, help='Input directory to process')
args = parser.parse_args()
# Make sure user does not request more CPUs can available
if args.threads > cpu_count():
args.threads = cpu_count()
# Change format to always be lowercase
args.format = args.format.lower()
args.path = list(set(args.path)) # remove any duplicates
args.module_name = 'raw2img'
log.configure(args)
raw2img.main(args)

This should be moved into the separate function within raw2img.py, probably named main.
So the entry point for raw2img should change from raw2img=rawtools.cli:raw_image to raw2img=rawtools.raw2img:main.

This raw2img.main would look something along the lines of the following

def main():
    start_time = time()
    
    description='Convert .raw 3d volume file to typical image format slices'
    parser = argparse.ArgumentParser(description=description,formatter_class=argparse.ArgumentDefaultsHelpFormatter)
    parser.add_argument("-v", "--verbose", action="store_true", help="Increase output verbosity")
    parser.add_argument("-V", "--version", action="version", version=f'%(prog)s {__version__}')
    parser.add_argument("-t", "--threads", type=int, default=cpu_count(), help=f"Maximum number of threads dedicated to processing.")
    parser.add_argument("-f", '--force', action="store_true", help="Force file creation. Overwrite any existing files.")
    parser.add_argument("-n", '--dry-run', dest='dryrun', action="store_true", help="Perform a trial run. Do not create image files, but logs will be updated.")
    parser.add_argument("--format", default='png', help="Set image filetype. Available options: ['png', 'tif']")
    parser.add_argument("path", metavar='PATH', type=str, nargs=1, help='Input directory to process')
    args = parser.parse_args()

    # Make sure user does not request more CPUs can available
    if args.threads > cpu_count():
        args.threads = cpu_count()

    # Change format to always be lowercase
    args.format = args.format.lower()
    args.path = list(set(args.path)) # remove any duplicates

    args.module_name = 'raw2img'
    log.configure(args)

    # Collect all volumes and validate their metadata
    try:
        # Gather all files
        args.files = []
        for p in args.path:
            for root, dirs, files in os.walk(p):
                for filename in files:
                    args.files.append(os.path.join(root, filename))

        # Append any loose, explicitly defined paths to .RAW files
        args.files.extend([f for f in args.path if f.endswith('.raw')])

        # Get all RAW files
        args.files = [f for f in args.files if f.endswith('.raw')]
        logging.debug(f"All files: {args.files}")
        args.files = list(set(args.files))  # remove duplicates
        logging.info(f"Found {len(args.files)} volume(s).")
        logging.debug(f"Unique files: {args.files}")

        # Validate that a DAT file exists for each volume
        for fp in args.files:
            dat_fp = f"{os.path.splitext(fp)[0]}.dat"  # .DAT filepath
            logging.debug(f"Validating DAT file: '{dat_fp}'")
            # Try to extract the dimensions to make sure that the file exists
            dat.read(dat_fp)
    except Exception as err:
        logging.error(err)
    else:
        # For each provided volume...
        pbar = tqdm(total=len(args.files), desc=f"Overall progress")
        for fp in args.files:
            logging.debug(f"Processing '{fp}'")
            # Extract slices for all volumes in provided folder
            extract_slices(args, fp)
            pbar.update()
        pbar.close()

    logging.debug(f'Total execution time: {time() - start_time} seconds')

if __name__ == "__main__":
    main()

Note
I added in an if-main-name block to call the newly defined main function.

A side benefit of this is fewer imports to run each module, so it should be much quicker to execute the command at run-time.

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions