Commit 47a52d4

Merge pull request #43 from sam-grant/main
pycut
2 parents 3267239 + 6300a28 commit 47a52d4

4 files changed

Lines changed: 1811 additions & 7 deletions

File tree

README.md

Lines changed: 167 additions & 4 deletions
Original file line number / Diff line number / Diff line change
@@ -31,20 +31,22 @@ pyprocess # Listing and parallelisation
3131
pyimport # TTree (EventNtuple) importing interface
3232
pyplot # Plotting and visualisation
3333
pyprint # Array visualisation
34-
pyselect # Data selection and cut management
34+
pyselect # Data selection
35+
pycut # Cut management
3536
pyvector # Element-wise vector operations
3637
pymcutil # Monte Carlo utilities (coming soon)
3738
pylogger # Helper module for managing printouts
3839
```
3940

40-
### 2.1 Tutorials
41+
### 2.1 Demos and tutorials
4142

4243
To learn by example, follow the `pyutils` tutorial series.
4344

4445
1. [pyutils_basics.ipynb](examples/notebooks/pyutils_basics.ipynb) - Introduction to core functionality
4546
1. [pyutils_on_EAF.ipynb](examples/notebooks/pyutils_on_EAF.ipynb) - Reading data with `pyutils` from the Elastic Analysis Facility (EAF)
4647
1. [pyutils_multifile.ipynb](examples/notebooks/pyutils_multifile.ipynb) - Basic parallelisation with file lists and SAM definitions, as well as complex parallelised analysis tasks using the `pyprocess` `Skeleton` template class.
47-
48+
1. [pyplot_demo.ipynb](examples/notebooks/pyplot_demo.ipynb) - A comprehensive demonstration of the `pyplot.Plot` class.
49+
1. [pycut_demo.ipynb](examples/notebooks/pycut_demo.ipynb) - A comprehensive demonstration of the `pycut.CutManager` class.
4850

4951
### 2.2 Module documentation
5052

@@ -655,6 +657,166 @@ CLASSES
655657

656658
</details>
657659

660+
---
661+
#### `pycut`
662+
663+
A comprehensive framework for managing analysis cuts.
664+
665+
<details>
666+
<summary>Click for details</summary>
667+
668+
## Features
669+
670+
### Cut definition and management
671+
- **`add_cut(name, description, mask, active=True, group=None)`** - Define analysis cuts with boolean masks
672+
- **`toggle_cut(cut_dict)`** - Enable/disable individual cuts using dictionary mapping
673+
- **`toggle_group(group_dict)`** - Enable/disable entire groups of cuts
674+
675+
### Cut flow generation
676+
- **`create_cut_flow(data)`** - Generate detailed cut flow showing progressive event retention
677+
- **`format_cut_flow(cut_flow)`** - Format cut flow as `pandas` DataFrame
678+
- **`combine_cut_flows(cut_flow_list)`** - Combine multiple cut flows (useful for multiprocessing)
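
The idea behind a cut flow, and behind combining cut flows from several files, can be sketched in a few lines. This is an illustrative sketch only and not the `pycut` implementation: the function names `sketch_cut_flow` and `sketch_combine_flows`, the row layout, and the cut names are all hypothetical, and plain Python lists of booleans stand in for `awkward.Array` masks.

```python
# Illustrative sketch only: NOT the pycut implementation.
# Plain Python lists of booleans stand in for awkward.Array masks.

def sketch_cut_flow(n_events, cuts):
    """Apply cuts in order and record how many events survive each step.

    cuts: list of (name, mask) pairs, where mask is a list of bools
    with one entry per event.
    """
    surviving = [True] * n_events
    flow = [{"name": "No cuts", "events_passing": n_events}]
    for name, mask in cuts:
        # Progressive AND: an event survives only if it passed every cut so far
        surviving = [s and m for s, m in zip(surviving, mask)]
        flow.append({"name": name, "events_passing": sum(surviving)})
    return flow


def sketch_combine_flows(flows):
    """Sum per-cut counts across flows that share the same cut order."""
    combined = [dict(row) for row in flows[0]]
    for flow in flows[1:]:
        for total, row in zip(combined, flow):
            total["events_passing"] += row["events_passing"]
    return combined


# Two hypothetical "files" with four and three events respectively
file_a = sketch_cut_flow(4, [("has_track", [True, True, False, True]),
                             ("good_momentum", [True, False, True, True])])
file_b = sketch_cut_flow(3, [("has_track", [True, True, True]),
                             ("good_momentum", [False, False, True])])

combined = sketch_combine_flows([file_a, file_b])
for row in combined:
    print(f'{row["name"]:<15}{row["events_passing"]}')
```

Summing counts per cut only makes sense when every file was processed with the same cuts in the same order, which is the assumption this sketch makes.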
679+
680+
### Selection application
681+
- **`combine_cuts(cut_names=None, active_only=True)`** - Generate combined boolean mask from selected cuts
682+
- **`get_active_cuts()`** - Retrieve currently active cuts for inspection
683+
684+
### State management
685+
- **`save_state(state_name)`** - Save current cut configuration for later restoration
686+
- **`restore_state(state_name)`** - Restore previously saved cut configuration
687+
- **`restore_original_state()`** - Reset all cuts to their initial active states
688+
- **`list_saved_states()`** - Display all available saved configurations
689+
690+
### Organisation and inspection
691+
- **`get_groups()`** - Retrieve cuts organised by group membership
692+
- **`list_groups()`** - Display summary of all groups and their cut contents
693+
694+
## Typical workflow
695+
696+
1. Define cuts using `add_cut()` with appropriate groups
697+
2. Generate baseline cut flow with `create_cut_flow()`
698+
3. Save nominal configuration with `save_state()`
699+
4. Create alternative configurations using `toggle_cut()` or `toggle_group()`
700+
5. Compare efficiencies between configurations
701+
6. Apply selected cuts using `combine_cuts()` for analysis
702+
7. Restore configurations as needed with `restore_state()`
703+
704+
The module integrates with the rest of `pyutils`, working with data processed by `pyprocess` and selections created by `pyselect`.
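
The workflow above (define, save, toggle, combine, restore) can be illustrated with a small stand-in class. This is a sketch under stated assumptions and not the `pycut.CutManager` implementation: `MiniCutManager` is a hypothetical class that borrows the method names listed above, drops the `description` argument for brevity, and uses plain Python lists of booleans in place of `awkward.Array` masks.

```python
# Illustrative stand-in only: NOT the pycut.CutManager implementation.

class MiniCutManager:
    def __init__(self):
        self.cuts = {}    # cut name -> {"mask", "active", "group"}
        self.states = {}  # state name -> {cut name: active flag}

    def add_cut(self, name, mask, active=True, group=None):
        self.cuts[name] = {"mask": mask, "active": active, "group": group}

    def toggle_cut(self, cut_dict):
        # cut_dict maps cut names to their desired active state
        for name, active in cut_dict.items():
            self.cuts[name]["active"] = active

    def toggle_group(self, group_dict):
        # group_dict maps group names to their desired active state
        for cut in self.cuts.values():
            if cut["group"] in group_dict:
                cut["active"] = group_dict[cut["group"]]

    def save_state(self, state_name="default"):
        # Snapshot only the active flags, not the masks
        self.states[state_name] = {n: c["active"] for n, c in self.cuts.items()}

    def restore_state(self, state_name="default"):
        self.toggle_cut(self.states[state_name])

    def combine_cuts(self):
        # AND across all active masks, event by event
        # (returns an empty list if no cut is active)
        masks = [c["mask"] for c in self.cuts.values() if c["active"]]
        return [all(flags) for flags in zip(*masks)]


cuts = MiniCutManager()
cuts.add_cut("has_track", [True, True, False, True], group="quality_cuts")
cuts.add_cut("good_momentum", [True, False, True, True], group="momentum_cuts")

cuts.save_state("nominal")
nominal = cuts.combine_cuts()                # both cuts applied

cuts.toggle_group({"momentum_cuts": False})
loose = cuts.combine_cuts()                  # only has_track applied

cuts.restore_state("nominal")
restored = cuts.combine_cuts()               # back to the nominal selection
```

Saving only the active flags keeps a saved state cheap and independent of the event data, so the same named configuration can be reapplied after the masks have been used elsewhere.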
705+
706+
```
707+
Help on module pyutils.pycut in pyutils:
708+
709+
NAME
710+
pyutils.pycut
711+
712+
CLASSES
713+
builtins.object
714+
CutManager
715+
716+
class CutManager(builtins.object)
717+
| CutManager(verbosity=1)
718+
|
719+
| Class to manage analysis cuts
720+
|
721+
| Methods defined here:
722+
|
723+
| __init__(self, verbosity=1)
724+
| Initialise
725+
|
726+
| Args:
727+
| verbosity (int, optional): Printout level (0: minimal, 1: normal, 2: detailed)
728+
|
729+
| add_cut(self, name, description, mask, active=True, group=None)
730+
| Add a cut to the collection.
731+
|
732+
| Args:
733+
| name (str): Name of the cut
734+
| description (str): Description of what the cut does
735+
| mask (awkward.Array): Boolean mask array for the cut
736+
| active (bool, optional): Whether the cut is active by default
737+
| group (str, optional): Group name for organizing cuts
738+
|
739+
| combine_cut_flows(self, cut_flow_list, format_as_df=True)
740+
| Combine a list of cut flows after multiprocessing
741+
|
742+
| Args:
743+
| cut_flow_list (list): List of cut statistics lists from different files
744+
| format_as_df (bool, optional): Format output as a pd.DataFrame. Defaults to True.
745+
|
746+
| Returns:
747+
| list: Combined cut statistics
748+
|
749+
| combine_cuts(self, cut_names=None, active_only=True)
750+
| Return a Boolean combined mask from specified cuts. Applies an AND operation across all cuts.
751+
| Args:
752+
|
753+
| cut_names (list, optional): List of cut names to include (if None, use all cuts)
754+
| active_only (bool, optional): Whether to only include active cuts
755+
|
756+
| create_cut_flow(self, data)
757+
| Utility to calculate cut flow from array and cuts object
758+
|
759+
| Args:
760+
| data (awkward.Array): Input data
761+
|
762+
| format_cut_flow(self, cut_flow, include_group=True)
763+
| Format cut flow as a DataFrame with more readable column names
764+
|
765+
| Args:
766+
| cut_flow (dict): The cut flow to format
767+
| include_group (bool, optional): Whether to include group column
768+
| Returns:
769+
| df_cut_flow (pd.DataFrame)
770+
|
771+
| get_active_cuts(self)
772+
| Utility to get all active cuts
773+
|
774+
| get_groups(self)
775+
| Get all unique group names and their cuts
776+
|
777+
| Returns:
778+
| dict: Dictionary mapping group names to lists of cut names
779+
|
780+
| list_groups(self)
781+
| Print all groups and their cuts
782+
|
783+
| list_saved_states(self)
784+
| List all saved states
785+
|
786+
| restore_original_state(self)
787+
| Restore all cuts to their original active states (as defined when added)
788+
|
789+
| restore_state(self, state_name='default')
790+
| Restore previously saved cut states
791+
|
792+
| Args:
793+
| state_name (str): Name of the saved state to restore
794+
|
795+
| save_state(self, state_name='default')
796+
| Save current active states of all cuts
797+
|
798+
| Args:
799+
| state_name (str): Name for this saved state
800+
|
801+
| toggle_cut(self, cut_dict)
802+
| Utility to set cut(s) as inactive or active based on input dictionary
803+
|
804+
| Args:
805+
| cut_dict (dict): Dictionary mapping cut names to their desired active state
806+
| e.g., {"cut_name_1": False, "cut_name_2": True}
807+
|
808+
| toggle_group(self, group_dict)
809+
| Utility to set entire group(s) of cuts as inactive or active
810+
|
811+
| Args:
812+
| group_dict (dict): Dictionary mapping group names to their desired active state
813+
| e.g., {"quality_cuts": False, "momentum_cuts": True}
814+
|
815+
| ----------------------------------------------------------------------
816+
```
817+
818+
</details>
819+
658820
---
659821

660822
#### `pyvector`
@@ -854,7 +1016,7 @@ cd pyutils
8541016
pip install -e . --user
8551017
```
8561018

857-
To verify that Python can import the and use your local install:
1019+
To verify that Python can import and use your local install:
8581020

8591021
```bash
8601022
python -c "import pyutils;print(pyutils.__file__)"
@@ -867,6 +1029,7 @@ which should return
8671029
```
8681030

8691031
Your changes will automatically be applied to the `pyutils` installed in your environment, with no need to rerun the `pip` command, and you can import modules and classes using the same syntax as normal.
1032+
8701033
## Contact
8711034

8721035
Reach out via Slack (#analysis-tools or #analysis-tools-devel) if you need help or would like to contribute.
