NOTICE.md
Lines changed: 0 additions & 1 deletion
@@ -11,4 +11,3 @@ The file packages/utils/src/pyearthtools/utils/parsing/init_parsing.py contains
 The file packages/data/src/pyearthtools/data/indexes/utilities/folder_size.py contains code from https://stackoverflow.com/questions/1392413/calculating-a-directorys-size-using-python, released under Creative Commons BY-SA 4.0 (International).

 The file packages/data/src/edit/data/indexes/extensions.py extends and is largely sourced from https://github.com/pydata/xarray/blob/main/xarray/core/extensions.py, released under the Apache 2.0 license.
README.md
Lines changed: 1 addition & 2 deletions
@@ -20,6 +20,5 @@ We have information for:
 - [Installation instructions](https://pyearthtools.readthedocs.io/en/latest/installation.html) for different usage scenarios
 - [Data catalogue setup](https://pyearthtools.readthedocs.io/en/latest/catalogue.html) for facility managers or individuals to establish their research data catalogue
 - [A shiny tutorial gallery full of neat examples](https://pyearthtools.readthedocs.io/en/latest/nbook/Gallery.html)
-- Much more, including how-to guides, project setup guide, information on accessing data, guides to evaluation, orientation for
+- Much more, including how-to guides, project setup guide, information on accessing data, guides to evaluation, orientation for
 physical scientists and data scientists at our [documentation homepage](https://pyearthtools.readthedocs.io/en/latest/) (you may be reading this now or you may be visiting the README from elsewhere)
docs/catalogue.md
Lines changed: 5 additions & 5 deletions
@@ -25,17 +25,17 @@ The good news is that even though there is no end to data volume increases, ther

 ## The role of data in science and machine learning

-Data is essential in three ways.
+Data is essential in three ways.

-Firstly, access to open, trusted data sets is essential to being able to commence an academic research project into machine learning. Most researchers will get their start on data they can access easily. There are few things which have more impact than publishing high-quality open research benchmark datasets for stimulating research into a particular field or area.
+Firstly, access to open, trusted data sets is essential to being able to commence an academic research project into machine learning. Most researchers will get their start on data they can access easily. There are few things which have more impact than publishing high-quality open research benchmark datasets for stimulating research into a particular field or area.

 Secondly, machine learning systems which are only trained on closed data sources are unlikely to be trusted by fellow researchers, leading to a lack of research interest in the field. Research thrives on re-use of prior research and prior data, inspiring people to investigate, compete and innovate. While not all data needs to be made public, providing enough to enable methodological research is important, particularly for validation and verification purposes.

-Thirdly, publishing research (or even operational/official) data from a new model allows others to evaluate, compare and assess the efficacy of models.
+Thirdly, publishing research (or even operational/official) data from a new model allows others to evaluate, compare and assess the efficacy of models.

 ## Connecting directly to cloud data storage

-It is technically possible to access all data via a network, without involving a local on-disk cache. For model inference, this is reasonably efficient. For model training, which requires repeated access to the relevant data, using cloud storage directly is very inefficient.
+It is technically possible to access all data via a network, without involving a local on-disk cache. For model inference, this is reasonably efficient. For model training, which requires repeated access to the relevant data, using cloud storage directly is very inefficient.
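
The efficiency point in the cloud storage paragraph can be illustrated with a minimal sketch of an on-disk cache. It assumes nothing about PyEarthTools itself: `cached_fetch`, `fake_remote_read` and the cache layout are hypothetical names invented for this example, and the remote read is simulated, not a real network call. The sketch only shows why repeated training epochs benefit from a local copy.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable


def cached_fetch(key: str, fetch: Callable[[str], bytes], cache_dir: Path) -> bytes:
    """Return the bytes for `key`, calling `fetch` only on the first request.

    Hypothetical helper for illustration; not part of PyEarthTools.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the key so that arbitrary URLs or names map to safe file names.
    cache_file = cache_dir / hashlib.sha256(key.encode("utf-8")).hexdigest()
    if cache_file.exists():
        return cache_file.read_bytes()  # cache hit: no remote access
    payload = fetch(key)  # cache miss: one (slow) remote read
    cache_file.write_bytes(payload)
    return payload


# Stand-in for a real network read; it counts how often the "network" is used.
remote_reads = 0


def fake_remote_read(key: str) -> bytes:
    global remote_reads
    remote_reads += 1
    return f"data for {key}".encode("utf-8")


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "cache"
    # Three training epochs over the same two samples.
    for _epoch in range(3):
        for sample in ("sample-a", "sample-b"):
            cached_fetch(sample, fake_remote_read, cache)

# Six requests were made, but only the first request per sample went "remote".
print(remote_reads)
```

Without the cache, every epoch would repeat both remote reads, which is the cost the paragraph warns about for training workloads.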

 ## Creating an on-disk catalogue

@@ -48,4 +48,4 @@ It is technically possible to access all data via a network, without involving a

 ## Separating project data from general open data

-## Creating new datasets for sharing and cataloguing
+## Creating new datasets for sharing and cataloguing
docs/devguide.md
Lines changed: 3 additions & 3 deletions
@@ -31,13 +31,13 @@ Unless you are an advanced Git user, we would recommend you follow this process:

 Prior to developing a pull request, it may be a good idea to create a GitHub issue to capture what the pull request is trying to achieve, any pertinent details, and (if applicable) how it aligns with the roadmap. Otherwise, please explain this in the pull request.

-To submit a pull request, please use the following workflow:
+To submit a pull request, please use the following workflow:

 1. Ensure you are working on a new feature branch in **your fork**.
 2. Keep your feature branch rebased and up-to-date with the develop branch of PyEarthTools. You can do this by first syncing the develop branch on your fork, and then rebasing your feature branch against the develop branch on your fork.
 3. When ready, submit a pull request to the develop branch of https://github.com/ACCESS-Community-Hub/PyEarthTools.

-To help disambiguate branches, some contributors like to prefix their branch names with a short numerical identifier. This is up to the contributor and any approach to branch naming is welcome.
+To help disambiguate branches, some contributors like to prefix their branch names with a short numerical identifier. This is up to the contributor and any approach to branch naming is welcome.

 ## Pull Request Etiquette

@@ -61,4 +61,4 @@ A code review is responsible for checking the following:
 4. Style guidelines are followed, static analysis and lint checking have been done
 5. Code is readable and well-structured
 6. Code does not do anything unexpected or beyond the scope of the function
-7. Any additional dependencies are justified and do not result in bloat
+7. Any additional dependencies are justified and do not result in bloat
docs/installation.md
Lines changed: 1 addition & 1 deletion
@@ -74,7 +74,7 @@ Developers of PyEarthTools will most likely want to check out the entire monorep
 The following instructions detail how to install PyEarthTools in [editable mode](https://setuptools.pypa.io/en/latest/userguide/development_mode.html), making it easier to implement and test changes iteratively.

 ```{tip}
-Each sub-package is versioned separately, so bugfixes or updates in a single sub-package can be performed independently without requiring a new release of the entire ecosystem.
+Each sub-package is versioned separately, so bugfixes or updates in a single sub-package can be performed independently without requiring a new release of the entire ecosystem.
docs/maintainer.md
Lines changed: 6 additions & 8 deletions
@@ -13,11 +13,11 @@ Information relevant for package maintenance
 ## This section covers asking new contributors to add their details to .zenodo.json

 ```
-Thank you very much for your contribution.
+Thank you very much for your contribution.

 When we release a new version of PyEarthTools, that version is archived on Zenodo. See: XXXXX

-As you have contributed to PyEarthTools, would you like to be listed on Zenodo as an author the next time PyEarthTools is archived?
+As you have contributed to PyEarthTools, would you like to be listed on Zenodo as an author the next time PyEarthTools is archived?

 If so, please open a new pull request. In that pull request please add your details to .zenodo.json (which can be found in the PyEarthTools root directory).

@@ -34,15 +34,15 @@ tldr; about 3 years old is OK, longer if painless

 [https://scientific-python.org/specs/spec-0000/](https://scientific-python.org/specs/spec-0000/) provides a guide for the scientific Python ecosystem - we should aspire to be at least that compatible with older versions. It describes an approach including outlining when particular packages move out of support.

-We have not tested compatibility against all possible package versions which are included in this spec. Conversely, in some cases, it has been fairly straightforward to support packages older than this.
+We have not tested compatibility against all possible package versions which are included in this spec. Conversely, in some cases, it has been fairly straightforward to support packages older than this.

 There is no formal "support" agreement for PyEarthTools. In the context of PyEarthTools package management, maintaining compatibility means being willing to make reasonable efforts to resolve any issues raised on the issue tracker. If a specific issue arises that would make it impractical to support a version within the compatibility window, then a response will be discussed and agreed on at the time on the basis of practicality.

-There is currently no specific testing for older versions of libraries, only older versions of Python (which may or may not pull in an older library version). A full matrix test of Python and package versioning would be prohibitively complex, and there would also be no guarantee that pinned older versions wouldn't result in an insecure build (even if only in a test runner).
+There is currently no specific testing for older versions of libraries, only older versions of Python (which may or may not pull in an older library version). A full matrix test of Python and package versioning would be prohibitively complex, and there would also be no guarantee that pinned older versions wouldn't result in an insecure build (even if only in a test runner).

 The development branch versioning is unpinned, and so any issues arising from newly-released packages should quickly be encountered and then resolved before the next PyEarthTools release. Releases of PyEarthTools use "~=" versioning, which gives flexibility within a range of versions (see [https://packaging.python.org/en/latest/specifications/version-specifiers/#id5](https://packaging.python.org/en/latest/specifications/version-specifiers/#id5)).
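
As a rough illustration of the range that a `~=` (compatible release) specifier allows, the sketch below computes the implied bounds for simple numeric versions. It is a simplified reading of the version specifiers specification linked in the paragraph, not the logic used by pip or by PyEarthTools: the helper names are hypothetical, and pre-releases, epochs and other version suffixes are ignored.

```python
def _parse(release: str) -> tuple[int, ...]:
    """Turn a plain numeric version such as '1.4.5' into (1, 4, 5)."""
    return tuple(int(part) for part in release.strip().split("."))


def compatible_release_bounds(spec: str) -> tuple[tuple[int, ...], tuple[int, ...]]:
    """Return (inclusive lower bound, exclusive upper bound) for a '~=' specifier.

    Hypothetical helper for illustration only.
    """
    if not spec.startswith("~="):
        raise ValueError("expected a specifier starting with '~='")
    lower = _parse(spec[2:])
    if len(lower) < 2:
        raise ValueError("'~=' needs at least two release segments, e.g. ~=2.2")
    # Drop the final segment, then bump the new final segment by one:
    # ~=2.2 allows versions below 3, and ~=1.4.5 allows versions below 1.5.
    upper = lower[:-2] + (lower[-2] + 1,)
    return lower, upper


def satisfies(version: str, spec: str) -> bool:
    """Check a plain numeric version against a '~=' specifier."""
    lower, upper = compatible_release_bounds(spec)
    candidate = _parse(version)
    # Pad with zeros so that '2.2' and '2.2.0' compare as equal.
    width = max(len(lower), len(upper), len(candidate))

    def pad(release: tuple[int, ...]) -> tuple[int, ...]:
        return release + (0,) * (width - len(release))

    return pad(lower) <= pad(candidate) < pad(upper)


print(satisfies("2.5.1", "~=2.2"))    # True
print(satisfies("3.0", "~=2.2"))      # False
print(satisfies("1.4.9", "~=1.4.5"))  # True
print(satisfies("1.5.0", "~=1.4.5"))  # False
```

The number of segments in the specifier therefore controls how much flexibility a release leaves: two segments permit minor-version updates, three segments permit only patch-level updates.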

-## This section covers how to build the documentation locally
+## This section covers how to build the documentation locally
 (Readthedocs should update automatically from a GitHub Action)

 ### 1. Summary of the tech stack
@@ -89,11 +89,9 @@ Frequent issues include:

 ### Tutorial rendering

-Things that render well in JupyterLab do not always render properly in readthedocs. Additionally, fixes that work well when built locally don't always work when merged into the codebase.
+Things that render well in JupyterLab do not always render properly in readthedocs. Additionally, fixes that work well when built locally don't always work when merged into the codebase.

 To check the rendering of tutorials in readthedocs:
 - Compare the tutorial in readthedocs against a version running in JupyterLab (as not everything renders in GitHub).
 - Check the entirety of the tutorial (sometimes things will render properly in one section, while not rendering properly in a different section of the same tutorial).
 - If you make any changes to the code cells, re-execute the Notebook in JupyterLab before committing, otherwise some things (e.g. some plots) won't render in readthedocs. Then re-check the tutorial in readthedocs to ensure the tutorial is still rendering properly.
docs/newuser.md
Lines changed: 4 additions & 4 deletions
@@ -1,6 +1,6 @@
 # New Users Guide

-Welcome new user! This document will be continually updated based on new user experiences.
+Welcome new user! This document will be continually updated based on new user experiences.

 Table of Contents:

@@ -12,7 +12,7 @@ Table of Contents:

 ## Introduction for Earth System Scientists

-PyEarthTools will greatly simplify your data access and data transformation code.
+PyEarthTools will greatly simplify your data access and data transformation code.

 Machine learning models must first be 'trained'. All machine learning models, from simple examples like linear regression to complex multidimensional neural networks (which may require huge computational resources), are based on the same principles. Model input is drawn from the sample data and presented to the model. The model then makes a prediction. This prediction may be correct or incorrect. The prediction is compared to the desired output (sometimes called the target value or truth value). That comparison is scored using a loss function. That loss function is then used to update the model based on the accuracy of the prediction. Sometimes this is done in small batches (e.g. 8 samples at once). This process is called model training.
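
The training loop described in that paragraph can be sketched in plain Python for the simplest case it mentions, linear regression. Everything here is an illustrative assumption: the synthetic data, the learning rate and the variable names are invented for this example, and the code does not use PyEarthTools or any machine learning library.

```python
import random

random.seed(0)  # make the shuffling reproducible

# Synthetic sample data drawn from the line y = 3x + 1 (invented for this example).
inputs = [i / 50 for i in range(100)]
targets = [3.0 * x + 1.0 for x in inputs]

# The "model" is y = weight * x + bias, starting from an uninformed guess.
weight, bias = 0.0, 0.0
learning_rate = 0.1
batch_size = 8  # samples presented to the model at once


def mean_squared_error(w: float, b: float) -> float:
    """Loss function: average squared gap between prediction and target."""
    return sum((w * x + b - y) ** 2 for x, y in zip(inputs, targets)) / len(inputs)


initial_loss = mean_squared_error(weight, bias)

for _epoch in range(100):
    order = list(range(len(inputs)))
    random.shuffle(order)
    for start in range(0, len(order), batch_size):
        batch = order[start:start + batch_size]
        grad_w = 0.0
        grad_b = 0.0
        for i in batch:
            prediction = weight * inputs[i] + bias  # the model makes a prediction
            error = prediction - targets[i]         # compare with the target value
            grad_w += 2 * error * inputs[i]         # gradient of the squared error
            grad_b += 2 * error
        # Update the model in the direction that reduces the loss on this batch.
        weight -= learning_rate * grad_w / len(batch)
        bias -= learning_rate * grad_b / len(batch)

final_loss = mean_squared_error(weight, bias)
print(f"weight={weight:.3f} bias={bias:.3f}")
print(f"loss before={initial_loss:.3f} after={final_loss:.6f}")
```

Because the synthetic data has no noise, the fitted weight and bias should approach 3 and 1. Real training data is noisy, so the loss would level off above zero instead.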

@@ -57,10 +57,10 @@ BRAN[doi] # Get BRAN data at doi
 > [!NOTE]
 > This section of the documentation is currently under development

-## Main Features of PyEarthTools for Model Inference
+## Main Features of PyEarthTools for Model Inference

 > [!NOTE]
 > This section of the documentation is currently under development