-
Notifications
You must be signed in to change notification settings - Fork 2
Description
Type of Request
- Functionality
- UX/UI
- Documentation
- Quality of Life
- Other
For context, the run-time argument parsing is pretty tightly associated with the actual modules. For example, as is, it's passing around an args argument for most function calls. I picked this habit up from a project in the Baxter lab. It's not awful and it makes it a lot neater when you have a lot of shared parameters (e.g., verbose flag).
Pros:
- Fewer parameters for function definitions and calls
Cons:
- It's effectively a global state object without actually being one
- It's difficult to directly call or work with the core functionality without making a dummy argument parser
- It's difficult to use modules in a standalone way (e.g.,
python -m rawtools.raw2img)
For these reasons, I think the CLI module and individual modules should be refactored to pull the core of the CLI into the module themselves. It's probably easier to explain with an example.
Let's look at the current argparser in cli.py for raw2img:
python-rawtools/rawtools/cli.py
Lines 110 to 132 in 2a042b1
| description='Convert .raw 3d volume file to typical image format slices' | |
| parser = argparse.ArgumentParser(description=description,formatter_class=argparse.ArgumentDefaultsHelpFormatter) | |
| parser.add_argument("-v", "--verbose", action="store_true", help="Increase output verbosity") | |
| parser.add_argument("-V", "--version", action="version", version=f'%(prog)s {__version__}') | |
| parser.add_argument("-t", "--threads", type=int, default=cpu_count(), help=f"Maximum number of threads dedicated to processing.") | |
| parser.add_argument("-f", '--force', action="store_true", help="Force file creation. Overwrite any existing files.") | |
| parser.add_argument("-n", '--dry-run', dest='dryrun', action="store_true", help="Perform a trial run. Do not create image files, but logs will be updated.") | |
| parser.add_argument("--format", default='png', help="Set image filetype. Availble options: ['png', 'tif']") | |
| parser.add_argument("path", metavar='PATH', type=str, nargs=1, help='Input directory to process') | |
| args = parser.parse_args() | |
| # Make sure user does not request more CPUs can available | |
| if args.threads > cpu_count(): | |
| args.threads = cpu_count() | |
| # Change format to always be lowercase | |
| args.format = args.format.lower() | |
| args.path = list(set(args.path)) # remove any duplicates | |
| args.module_name = 'raw2img' | |
| log.configure(args) | |
| raw2img.main(args) |
This should be moved into the separate function within raw2img.py, probably named main.
So the entry point for raw2img should change from raw2img=rawtools.cli:raw_image to raw2img=rawtools.raw2img:main.
This raw2img.main would look something along the lines of the following
def main():
start_time = time()
description='Convert .raw 3d volume file to typical image format slices'
parser = argparse.ArgumentParser(description=description,formatter_class=argparse.ArgumentDefaultsHelpFormatter)
parser.add_argument("-v", "--verbose", action="store_true", help="Increase output verbosity")
parser.add_argument("-V", "--version", action="version", version=f'%(prog)s {__version__}')
parser.add_argument("-t", "--threads", type=int, default=cpu_count(), help=f"Maximum number of threads dedicated to processing.")
parser.add_argument("-f", '--force', action="store_true", help="Force file creation. Overwrite any existing files.")
parser.add_argument("-n", '--dry-run', dest='dryrun', action="store_true", help="Perform a trial run. Do not create image files, but logs will be updated.")
parser.add_argument("--format", default='png', help="Set image filetype. Available options: ['png', 'tif']")
parser.add_argument("path", metavar='PATH', type=str, nargs=1, help='Input directory to process')
args = parser.parse_args()
# Make sure user does not request more CPUs can available
if args.threads > cpu_count():
args.threads = cpu_count()
# Change format to always be lowercase
args.format = args.format.lower()
args.path = list(set(args.path)) # remove any duplicates
args.module_name = 'raw2img'
log.configure(args)
# Collect all volumes and validate their metadata
try:
# Gather all files
args.files = []
for p in args.path:
for root, dirs, files in os.walk(p):
for filename in files:
args.files.append(os.path.join(root, filename))
# Append any loose, explicitly defined paths to .RAW files
args.files.extend([f for f in args.path if f.endswith('.raw')])
# Get all RAW files
args.files = [f for f in args.files if f.endswith('.raw')]
logging.debug(f"All files: {args.files}")
args.files = list(set(args.files)) # remove duplicates
logging.info(f"Found {len(args.files)} volume(s).")
logging.debug(f"Unique files: {args.files}")
# Validate that a DAT file exists for each volume
for fp in args.files:
dat_fp = f"{os.path.splitext(fp)[0]}.dat" # .DAT filepath
logging.debug(f"Validating DAT file: '{dat_fp}'")
# Try to extract the dimensions to make sure that the file exists
dat.read(dat_fp)
except Exception as err:
logging.error(err)
else:
# For each provided volume...
pbar = tqdm(total=len(args.files), desc=f"Overall progress")
for fp in args.files:
logging.debug(f"Processing '{fp}'")
# Extract slices for all volumes in provided folder
extract_slices(args, fp)
pbar.update()
pbar.close()
logging.debug(f'Total execution time: {time() - start_time} seconds')
if __name__ == "__main__":
main()Note
I added in an if-main-name block to call the newly definedmainfunction.
A side benefit of this is fewer imports to run each module, so it should be much quicker to execute the command at run-time.