
Commit f2f6918

Automatic Linting from pre-commit and add alt text to img tags.
1 parent ed50404 commit f2f6918

6 files changed: +238 additions, −237 deletions


docs/hyperloop/co2eestimates.md

Lines changed: 9 additions & 7 deletions

@@ -6,28 +6,30 @@ title: CO2 equivalent estimates

## <a name="estimates"></a>CO2 equivalent estimates

<div align="center">
-<img src="../images/co2eqtooltip.png" width="35%">
+<img src="../images/co2eqtooltip.png" width="35%" alt="Co2eq tooltip" />
</div>
<br>

* In Hyperloop, an estimate of the CO2eq produced by your trains is shown in order to give an idea of the environmental impact. We need to run analyses to achieve our scientific goals, but we can optimize the code and sometimes work efficiently even with fewer trains. The displayed value should help inform the decision of whether a train is needed. The value is shown before trains have been run (an estimate based on the wagon test) and then when a train run has finished.
* The estimate is visible in:
  * the wagon test view and the train submission view (an estimate based on the wagon test)
  * the train test view (an estimate based on the train test, so this estimate accounts for all wagons in the train)
  * the 'General' tab of the train view, when a train is in the state 'Done'. This estimate directly uses the CPU usage of the train run.

## <a name="calculations"></a> How is it calculated?

Principle:

* There are a few studies for estimating the CO2eq of computing, and there are many factors to consider, such as the efficiency of the machines used and the power grid the machines are on.
* Furthermore, grey energy (the energy needed to produce the machines) and the transfer of the data could be considered.
* It would be difficult to do justice to all these details and to check for each train where exactly the jobs were running.
* Therefore, in Hyperloop, estimates of CO2eq are derived directly from CPU usage using an average conversion factor.

Calculation input:

* Power per computing core: We use an optimistic value of 10 W/core for pure power consumption and assume a 50/50 split between the carbon from power and the carbon from embodied / embedded emissions, therefore working effectively with 20 W/core. Some literature estimates up to 53 W/core for power alone in high-performance computing (see <https://www.nature.com/articles/s41550-020-1169-1>).
* Electricity emission factors (kgCO2eq per kWh) vary depending on how electricity is produced. However, carbon is not everything, and we do not want to enter a debate here on how energy should be produced. Therefore we use an average value of 0.301 kgCO2eq / kWh from <https://arxiv.org/pdf/2011.02839.pdf>, noting that even for the same country the estimates vary significantly depending on the source (e.g. compare <https://arxiv.org/pdf/2101.02049>).
* Hyperloop estimates do not account for the power consumption of data transfer, central infrastructure, or storage. The paper 'Electricity intensity of internet data transmission', <https://onlinelibrary.wiley.com/doi/pdf/10.1111/jiec.12630>, estimates 0.06 kWh/GB for data transfer. This would mean that transferring a petabyte of data produces 18 tCO2eq. Additionally, the carbon produced by storing the data would not be negligible. For simplicity, we do not account for these aspects, so that our estimates are more directly linked to individual train runs and not to the wider Grid infrastructure.
* At 20 W per core and 0.301 kgCO2eq / kWh, this gives us: **6 t CO2eq per 1 MCPUh** or **1 CPU year = 53.3 kgCO2eq**
* In order to compare these emissions to something we know, we use the CO2eq produced by flights, based on curb6.com

docs/hyperloop/hyperlooppolicy.md

Lines changed: 14 additions & 14 deletions
@@ -7,38 +7,38 @@ title: Fair usage policy

The very large amount of data that will be collected in Run 3 represents a challenge for analysis, both for the CPU needs and for the data read from storage, and therefore a resource usage policy has been put in place to ensure proper use of computing resources. The policy has been openly discussed in multiple meetings, including ALICE weeks, and is subject to adjustments as necessary and as the collaboration gains experience with the Run 3 analysis. If you have questions or doubts, please first refer to your PWG convener, who will then bring up the case with the analysis coordinator.

The image below summarizes the policy:

<div align="center">
-<img src="../images/hyperlooppolicy.png" width="80%">
+<img src="../images/hyperlooppolicy.png" width="80%" alt="Screenshot of hyperlooppolicy">
</div>

In general, four categories of trains exist:

* Trains below 30 TB and taking more than 2.0 y of CPU time (red shaded area) are very strongly discouraged. In those cases, please resort to very small trains (where throughputs of even 100 KB/s are allowed with autosubmission).
* Trains that use less than 2 y of CPU time and loop over less than 200 TB are free to execute and can be run on Hyperloop via autosubmission. In a certain region between 30 and 200 TB, slightly more than 2 y of CPU time is allowed (see sketch).
* Trains that loop over more than 200 TB and less than 800 TB are dealt with as follows:
  * if they require less than 10 years of CPU time, they need only PWG convener approval.
  * if they require more than 10 years of CPU time but less than 200 years, they need Analysis and Physics Coordinator approval to run.
  * if they require over 200 years of CPU, they need explicit PB approval.
* Heavy trains looping over datasets bigger than 800 TB are dealt with as follows:
  * if they require less than 20 years of CPU time, they need only PWG approval.
  * if they require between 20 and 200 y of CPU, they can be approved offline by Analysis and Physics Coordination.
  * if they require over 200 years of CPU, they need explicit PB approval.
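The approval rules above can be summarized in a small decision helper. This is an illustrative sketch of the documented thresholds only (the function name and return strings are invented, and the relaxed 30–200 TB region is ignored for simplicity):

```python
def approval_needed(dataset_tb: float, cpu_years: float) -> str:
    """Map a train's dataset size and CPU cost to the documented approval level."""
    if dataset_tb < 30 and cpu_years > 2.0:
        return "strongly discouraged (use very small trains instead)"
    if dataset_tb < 200 and cpu_years <= 2.0:
        return "free to execute (autosubmission)"
    if dataset_tb <= 800:  # 200-800 TB trains
        if cpu_years < 10:
            return "PWG convener approval"
        if cpu_years < 200:
            return "Analysis and Physics Coordinator approval"
        return "explicit PB approval"
    # heavy trains, > 800 TB
    if cpu_years < 20:
        return "PWG approval"
    if cpu_years < 200:
        return "Analysis and Physics Coordination approval"
    return "explicit PB approval"

print(approval_needed(500, 5))    # PWG convener approval
print(approval_needed(1000, 50))  # Analysis and Physics Coordination approval
```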

## <a name="implementation"></a>Implementation in Hyperloop datasets

In practice, the chart above is mapped onto a number of distinct resource groups which determine the limits assigned to each dataset:

<div align="center">
-<img src="../images/resourcetable.png" width=800>
+<img src="../images/resourcetable.png" width=800 alt="Screenshot of resourcetable">
</div>

The smaller the dataset size, the more often it is automatically submitted per week and the more often you are allowed to run on it per week. Manual requests for datasets above 50 TB are only fulfilled at the defined automatic submission times. This is in order to allow grouping of wagons into large trains.

## <a name="deriveddata"></a>Derived data

Derived datasets, which are by construction much smaller than the original datasets, can be created on Hyperloop. Those are advantageous because steps which are identical in each analysis train run (e.g. event selection and centrality calculation, secondary-vertex finding) are executed only once, which saves CPU. Furthermore, as the size is smaller, such trains cause less load on the storage.

As an example, imagine that you run a derived data train on a dataset of 500 TB where you need explicit approval. Say you have a reduction factor of 100; then your output derived data is about 5 TB. You will be allowed to run on that dataset much more frequently, see the table above.
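In numbers, the example above works out as follows (a trivial check, not Hyperloop code):

```python
dataset_tb = 500        # original dataset, requiring explicit approval
reduction_factor = 100  # example derived-data reduction factor

# Size of the derived dataset, which falls into a far less restricted
# resource group (see the table above).
derived_tb = dataset_tb / reduction_factor
print(derived_tb)  # 5.0
```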

docs/hyperloop/legoexpert.md

Lines changed: 2 additions & 2 deletions
@@ -17,7 +17,7 @@ title: For the Run 2 LEGO train expert. What has changed?

* There is a history feature for wagons and datasets. You can access it by clicking the `📜` button available inside the wagon/dataset view. A detailed view of what has been created/updated/removed from the wagon/dataset is shown, as well as the username and the time when the change was made.

<div align="center">
-<img src="../images/datasetHistory.png" width="100%">
+<img src="../images/datasetHistory.png" width="100%" alt="Screenshot of dataset history">
</div>

* There are automated notifications. These notifications are created per user and display changes made to tools, like _Datasets_, that the user is using. They are displayed per _Analysis_ in the _My Analyses_ page, or globally via the `🔔` button, which can be found in the top menu.

@@ -31,5 +31,5 @@ title: For the Run 2 LEGO train expert. What has changed?

* The **Performance Graphs** page allows users to upload their own local metrics file and then generate the test graphs specific to that file. You produce a local _performanceMetrics.json_ by running the o2 workflow with the argument _--resources-monitoring 2_ which, in this example, produces monitoring information every 2 seconds. These are the same type of graphs produced in the _Test Graphs_ tab of the train run. This page can be accessed at: <https://alimonitor.cern.ch/hyperloop/performance-graphs>.

<div align="center">
-<img src="../images/performanceGraphs.png" width="100%">
+<img src="../images/performanceGraphs.png" width="100%" alt="Screenshot of performance graphs">
</div>

docs/hyperloop/notifications.md

Lines changed: 17 additions & 19 deletions
@@ -9,13 +9,13 @@ title: Notifications

* The notifications can be seen in the _My Analyses_ page and in the _Notifications_ page, by clicking `🔔` in the menu bar.

<div align="center">
-<img src="../images/notificationsMyAnalyses.png" width="90%">
+<img src="../images/notificationsMyAnalyses.png" width="90%" alt="Screenshot of notifications in my analyses">
</div>

* The user can click the `✖️` button to remove a notification. In order to remove all notifications, go to the _Notifications_ page and click the _Dismiss all_ button.

<div align="center">
-<img src="../images/allNotifications.png" width="90%">
+<img src="../images/allNotifications.png" width="90%" alt="Screenshot of all notifications">
</div>

## <a name="datasetChanged"></a>Dataset changed

@@ -27,33 +27,33 @@ title: Notifications

* The automatic composition settings have changed, e.g. the schedule

<div align="center">
-<img src="../images/datasetChanged.png" width="90%">
+<img src="../images/datasetChanged.png" width="90%" alt="Screenshot of dataset changed">
</div>

## <a name="datasetActivated"></a>Dataset activated / deactivated

* Notifies the user when a dataset included in their analyses has been successfully activated or deactivated.

<div align="center">
-<img src="../images/datasetActivation.png" width="90%">
+<img src="../images/datasetActivation.png" width="90%" alt="Screenshot of dataset activation">
</div>

## <a name="productionAdded"></a>Dataset production added or removed

* For Run 3 data and MC, the user is informed if the production has been successfully added to or removed from the dataset.

<div align="center">
-<img src="../images/productionAdded.png" width="90%">
+<img src="../images/productionAdded.png" width="90%" alt="Screenshot of production added">
</div>

* For Run 2 data, the user is notified when a conversion train run has been added to or removed from the dataset.

<div align="center">
-<img src="../images/trainrunAdded.png" width="90%">
+<img src="../images/trainrunAdded.png" width="90%" alt="Screenshot of trainrun added">
</div>

<div align="center">
-<img src="../images/trainrunRemoved.png" width="90%">
+<img src="../images/trainrunRemoved.png" width="90%" alt="Screenshot of trainrun removed">
</div>

* For derived data, a notification is sent when a Hyperloop train that produced derived data has been added or removed.

@@ -63,15 +63,15 @@ title: Notifications

* The user is informed when a run has been added to or removed from the DPG runlist. This change is usually done by the DPG experts.

<div align="center">
-<img src="../images/runlistUpdated.png" width="90%">
+<img src="../images/runlistUpdated.png" width="90%" alt="Screenshot of runlist updated">
</div>

## <a name="mergelistUpdate"></a>Mergelist updated

* The mergelist defines which runs are merged into one file at the end of the train run. The user is informed when a mergelist has been modified, added to, or removed from the dataset production.

<div align="center">
-<img src="../images/mergelistUpdate.png" width="90%">
+<img src="../images/mergelistUpdate.png" width="90%" alt="Screenshot of mergelist update">
</div>

## <a name="linkedDataset"></a>Short datasets

@@ -89,55 +89,53 @@ Informs the user when a wagon has been disabled in different circumstances:

* Local tests are cleaned if the wagons are not submitted within a period of 4 weeks. The user is notified that the respective wagons are automatically disabled.

<div align="center">
-<img src="../images/testCleaned.png" width="90%">
+<img src="../images/testCleaned.png" width="90%" alt="Screenshot of test cleaned">
</div>

* When a wagon with derived data output is enabled, the test cannot start if the wagon and its dependencies share the same workflow. As a result, the wagon is disabled and the user is notified about the wagons which share the same task.

* The notification format is: The wagon _"wagon_name"_ was disabled in _"dataset_name"_. There is derived data. The following wagons have the same workflows {_wagon1_, _wagon2_: _common_workflow_},...,{_wagonX_, _wagonY_: _common_workflow_}

<div align="center">
-<img src="../images/wagonDisabled1.png" width="90%">
+<img src="../images/wagonDisabled1.png" width="90%" alt="Screenshot of wagon disabled 1">
</div>

* If, among the wagon and its dependencies, there are identical derived data outputs, the test cannot start, and the wagon is disabled.

* The notification format is: The wagon _"wagon_name"_ was disabled in _"dataset_name"_. The following wagons have the same derived data outputs {_wagon1_, _wagon2_: _common_derived_data_},...,{_wagonX_, _wagonY_: _common_derived_data_}

<div align="center">
-<img src="../images/wagonDisabled.png" width="90%">
+<img src="../images/wagonDisabled.png" width="90%" alt="Screenshot of wagon disabled">
</div>

* The wagon is disabled if the workflow name has been changed in the meantime. This is fixed by updating the workflow name in the wagon configuration.

<div align="center">
-<img src="../images/notificationWorkflow.png" width="90%">
+<img src="../images/notificationWorkflow.png" width="90%" alt="Screenshot of notification workflow">
</div>

* The wagon is disabled if one of the user-defined dependencies of the wagon is considered identical to a service wagon. In order to make the most efficient use of the Grid and the analysis facilities, copies of core service wagons are not permitted, as this prevents combining several users into one train.

<div align="center">
-<img src="../images/notificationIdenticalWagon.png" width="90%">
+<img src="../images/notificationIdenticalWagon.png" width="90%" alt="Screenshot of notification of identical wagon">
</div>

A service wagon is considered identical to a user wagon if it shares the same activated output tables and the same workflow, and has matching configurables. To fix this error, please use the listed service wagon as a dependency instead of the copy.
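The identity criteria in the paragraph above can be sketched as a simple predicate. This is an illustration of the documented rule only; the `Wagon` structure and its field names are invented for this sketch and are not Hyperloop's actual data model:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Wagon:
    """Hypothetical minimal wagon description for this sketch."""
    workflow: str
    output_tables: frozenset  # activated output tables
    configurables: dict       # configurable name -> value

def is_copy_of_service_wagon(user: Wagon, service: Wagon) -> bool:
    """A user wagon counts as a copy of a service wagon if it has the same
    activated output tables, the same workflow, and matching configurables."""
    return (user.workflow == service.workflow
            and user.output_tables == service.output_tables
            and user.configurables == service.configurables)
```

If this predicate holds, the fix is to declare the listed service wagon as a dependency instead of keeping the copy.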

## <a name="inconsistentParameters"></a>Inconsistent parameters

* Hyperloop compares the wagon configuration with the configuration defined in O2 for the package tag selected for the wagon. If they do not coincide, the user is informed about the mismatch. The comparison is case sensitive; therefore a Configurable will not match if its name does not have the identical lowercase/uppercase combination.

* The user is notified if there is a Configurable present in the wagon configuration that is not defined in O2 for the selected package tag. Likewise, the user is informed when the wagon configuration misses one or more of the Configurables defined in O2 for the specific tag.

<div align="center">
-<img src="../images/inconsistentParameters2.png" width="90%">
+<img src="../images/inconsistentParameters2.png" width="90%" alt="Screenshot of inconsistent parameters 2">
</div>

* If the **wagon configuration is old** and the wagon is enabled with the latest package tag, the user is advised to sync the wagon in order to get the current configuration. Following this, the test starts automatically. Likewise, the test is reset whenever there is a change in the database, such as updating or syncing the wagon configuration or its dependencies.

<div align="center">
-<img src="../images/inconsistentParameters.png" width="90%">
+<img src="../images/inconsistentParameters.png" width="90%" alt="Screenshot of inconsistent parameters">
</div>

* If the **wagon is enabled with an older tag**, the configuration might not match (hence the notification). If the old tag is needed, then syncing is not an option, because this would set the package to the latest one. Therefore, the wagon configuration has to be modified as needed. The user can take as a reference _full_config.json_ in the test output, which shows the configuration the test is run with, and compare it to the wagon configuration.

0 commit comments