Merged
57 commits
1295e91
start adding functions
ehinman Aug 8, 2025
c4b0b9a
start adding documentation and going through functions
ehinman Aug 27, 2025
c32ded5
adjust date function
ehinman Aug 28, 2025
99e949c
fix dates function
ehinman Aug 29, 2025
1641e85
keep working out issues with api calls
ehinman Aug 29, 2025
7bc6c6f
add documentation
ehinman Aug 29, 2025
1b29d6a
adjust how response is handled and edit walk pages, fix API limit print
ehinman Sep 19, 2025
3289982
add documentation
ehinman Sep 19, 2025
867d728
add more documentation, correct waterdata module
ehinman Sep 19, 2025
44213b5
allow post and get calls in recursive walk pages, fix typo where firs…
ehinman Sep 19, 2025
4affa2f
add in all possible arguments
ehinman Sep 19, 2025
21691d0
trying to get cql2 query correct, will keep at it
ehinman Sep 19, 2025
4c2a3ee
correct cql2 queries
ehinman Sep 22, 2025
14f2830
simplify syntax, remove unneeded dependencies
ehinman Sep 22, 2025
d25f854
start adding function documentation
ehinman Sep 24, 2025
7fe486a
add link urls
ehinman Sep 25, 2025
fad9ce0
fix date formatting function
ehinman Sep 25, 2025
a33d201
make waterdata outputs geopandas if geometry included
ehinman Sep 25, 2025
bd82c49
make gpd an optional dependency and change returns accordingly
ehinman Sep 25, 2025
06b0e69
incorporate geopandas boolean into function arguments and ensure user…
ehinman Sep 25, 2025
253da79
clean up some documentation and comments
ehinman Sep 25, 2025
f5cca07
add optional dependency to pyproject.toml
ehinman Sep 25, 2025
5c546e7
set convertType to default or user specification
ehinman Sep 25, 2025
e9221ac
start unit tests on new functions
ehinman Sep 25, 2025
b1436db
update README and add a NEWS markdown in which to place past updates
ehinman Sep 26, 2025
dc24658
make a few small changes to names and documentation
ehinman Sep 26, 2025
89b960c
define max_results when it is an input
ehinman Sep 26, 2025
1237777
comment out code that wasn't doing the correct thing with max_results
ehinman Sep 26, 2025
e84984a
Revert waterdata to requrests
thodson-usgs Sep 29, 2025
4c84fc0
Review waterdata module
Oct 2, 2025
f4693b6
Update README.md
Oct 2, 2025
0d06672
Add deprecation warning for nwis
Oct 22, 2025
96a4356
Update dataretrieval/waterdata/api.py
ehinman Nov 21, 2025
7f7f184
Update dataretrieval/waterdata/api.py
ehinman Nov 21, 2025
c14e00b
Update dataretrieval/waterdata/api.py
ehinman Nov 21, 2025
dcc7a1a
Apply suggestions from code review
ehinman Nov 21, 2025
370f9a5
Merge pull request #5 from nodohs/waterdata
ehinman Nov 21, 2025
4482751
add back in documentation and make formatting changes
ehinman Nov 21, 2025
37063b9
add metadata to api.py and testing
ehinman Nov 21, 2025
8bb2de8
small changes to remove unnecessary imports and add more documentation
ehinman Nov 21, 2025
2f6af7d
remove some redundant testing, make next url be an info log, not debug
ehinman Nov 21, 2025
f0bef3e
same as previous commit message, was behind on what I was committing
ehinman Nov 24, 2025
8605dea
convert failures counter to a stop that shows URL that failed
ehinman Nov 24, 2025
bd3f6ad
remove max_requests as this is confusing and should be better vetted …
ehinman Nov 24, 2025
6a326ce
add new latest-daily service
ehinman Nov 24, 2025
ada9a41
correct example documentation and add info about logging
ehinman Nov 24, 2025
4f73484
correct date, add nldi as module to init.py
ehinman Nov 24, 2025
da71b90
make error messages louder, clearer
ehinman Nov 25, 2025
e614d83
re-arrange README a little
ehinman Nov 25, 2025
535b30f
try to fix ubuntu flake8 error
ehinman Nov 25, 2025
1c56573
Adjust readme styling
ehinman Nov 25, 2025
9e26096
will this appease flake8
ehinman Nov 25, 2025
c4a0591
move versioning to above imports
ehinman Nov 25, 2025
ed8fa23
add actual version to user agent
ehinman Nov 25, 2025
cb2976e
update waterdata test to skip on python 3.9 and older
ehinman Nov 25, 2025
7927f1f
try new import to avoid errors
ehinman Nov 25, 2025
c575447
remove ubuntu 3.8 from github actions
ehinman Nov 25, 2025
8 changes: 5 additions & 3 deletions .github/workflows/python-package.yml
@@ -17,6 +17,10 @@ jobs:
matrix:
  os: [ubuntu-latest, windows-latest]
  python-version: [3.8, 3.9, '3.10', 3.11, 3.12]
  exclude:
    - os: ubuntu-latest
      python-version: 3.8


steps:
- uses: actions/checkout@eef61447b9ff4aafe5dcd4e0bbf5d482be7e7871
@@ -36,7 +40,5 @@ jobs:
flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
- name: Test with pytest and report coverage
run: |
cd tests
coverage run -m pytest
coverage run -m pytest tests/
coverage report -m
cd ..
7 changes: 7 additions & 0 deletions NEWS.md
@@ -0,0 +1,7 @@
**11/24/2025:** `dataretrieval` is pleased to offer a new module, `waterdata`, which gives users access to USGS's modernized [Water Data APIs](https://api.waterdata.usgs.gov/). The Water Data API endpoints include daily values, instantaneous values, field measurements (modernized groundwater levels service), time series metadata, and discrete water quality data from the Samples database. Though there will be a period of overlap, the functions within `waterdata` will eventually replace the `nwis` module, which currently provides access to the legacy [NWIS Water Services](https://waterservices.usgs.gov/). More example workflows and functions are coming soon. Check `help(waterdata)` for more information.

**09/03/2024:** The groundwater levels service has switched endpoints, and `dataretrieval` was updated accordingly in [`v1.0.10`](https://github.com/DOI-USGS/dataretrieval-python/releases/tag/v1.0.10). Older versions using the discontinued endpoint will return 503 errors for `nwis.get_gwlevels` or the `service='gwlevels'` argument. Visit [Water Data For the Nation](https://waterdata.usgs.gov/blog/wdfn-waterservices-2024/) for more information.

**03/01/2024:** USGS data availability and format have changed on Water Quality Portal (WQP). Since March 2024, data obtained from WQP legacy profiles will not include new USGS data or recent updates to existing data. All USGS data (up to and beyond March 2024) are available using the new WQP beta services. You can access the beta services by setting `legacy=False` in the functions in the `wqp` module.

To view the status of changes in data availability and code functionality, visit: https://doi-usgs.github.io/dataRetrieval/articles/Status.html
294 changes: 217 additions & 77 deletions README.md
@@ -4,123 +4,263 @@
![Conda Version](https://img.shields.io/conda/v/conda-forge/dataretrieval)
![Downloads](https://static.pepy.tech/badge/dataretrieval)

:warning: USGS data availability and format have changed on Water Quality Portal (WQP). Since March 2024, data obtained from WQP legacy profiles will not include new USGS data or recent updates to existing data. All USGS data (up to and beyond March 2024) are available using the new WQP beta services. You can access the beta services by setting `legacy=False` in the functions in the `wqp` module.
## Latest Announcements

To view the status of changes in data availability and code functionality, visit: https://doi-usgs.github.io/dataRetrieval/articles/Status.html
:mega: **11/24/2025:** `dataretrieval` now features the new `waterdata` module,
which provides access to USGS's modernized [Water Data
APIs](https://api.waterdata.usgs.gov/). The Water Data API endpoints include
daily values, instantaneous values, field measurements, time series metadata,
and discrete water quality data from the Samples database. This new module will
eventually replace the `nwis` module, which provides access to the legacy [NWIS
Water Services](https://waterservices.usgs.gov/).

:mega: **09/03/2024:** The groundwater levels service has switched endpoints, and `dataretrieval` was updated accordingly in [`v1.0.10`](https://github.com/DOI-USGS/dataretrieval-python/releases/tag/v1.0.10). Older versions using the discontinued endpoint will return 503 errors for `nwis.get_gwlevels` or the `service='gwlevels'` argument. Visit [Water Data For the Nation](https://waterdata.usgs.gov/blog/wdfn-waterservices-2024/) for more information.
**Important:** Users of the Water Data APIs are strongly encouraged to obtain an
API key for higher rate limits and greater access to USGS data. [Register for
an API key](https://api.waterdata.usgs.gov/signup/) and set it as an
environment variable:

```python
import os
os.environ["API_USGS_PAT"] = "your_api_key_here"
```
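
If the key is set outside of Python (for example in a shell profile), it can be useful to confirm that it is visible before making requests. The following is a minimal sketch: `has_api_key` is a hypothetical helper, not part of `dataretrieval`, and it only checks that the `API_USGS_PAT` variable named above is non-empty.

```python
import os


def has_api_key(var_name: str = "API_USGS_PAT") -> bool:
    """Return True if the environment variable holds a non-empty value.

    Hypothetical helper, not part of dataretrieval.
    """
    return bool(os.environ.get(var_name, "").strip())


if not has_api_key():
    print("API_USGS_PAT is not set; requests may be subject to lower rate limits.")
```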

Check out the [NEWS](NEWS.md) file for all updates and announcements.

## What is dataretrieval?
`dataretrieval` was created to simplify the process of loading hydrologic data into the Python environment.
Like the original R version [`dataRetrieval`](https://github.com/DOI-USGS/dataRetrieval),
it is designed to retrieve the major data types of U.S. Geological Survey (USGS) hydrology
data that are available on the Web, as well as data from the Water
Quality Portal (WQP), which currently houses water quality data from the
Environmental Protection Agency (EPA), U.S. Department of Agriculture
(USDA), and USGS. Direct USGS data is obtained from a service called the
National Water Information System (NWIS).

Note that the python version is not a direct port of the original: it attempts to reproduce the functionality of the R package,
though its organization and interface often differ.
`dataretrieval` simplifies the process of loading hydrologic data into Python.
Like the original R version
[`dataRetrieval`](https://github.com/DOI-USGS/dataRetrieval), it retrieves major
U.S. Geological Survey (USGS) hydrology data types available on the Web, as well
as data from the Water Quality Portal (WQP) and Network Linked Data Index
(NLDI).

If there's a hydrologic or environmental data portal that you'd like dataretrieval to
work with, raise it as an [issue](https://github.com/USGS-python/dataretrieval/issues).
## Installation

Here's an example using `dataretrieval` to retrieve data from the National Water Information System (NWIS).
Install dataretrieval using pip:

```python
# first import the functions for downloading data from NWIS
import dataretrieval.nwis as nwis
```bash
pip install dataretrieval
```

Or using conda:

```bash
conda install -c conda-forge dataretrieval
```

## Usage Examples

### Water Data API (Recommended - Modern USGS Data)

# specify the USGS site code for which we want data.
site = '03339000'
The `waterdata` module provides access to modern USGS Water Data APIs.

# get instantaneous values (iv)
df = nwis.get_record(sites=site, service='iv', start='2017-12-31', end='2018-01-01')
The example below retrieves daily streamflow data for a specific monitoring
location for water year 2025, where a "/" between two dates in the "time"
input argument indicates a desired date range:

# get basic info about the site
df2 = nwis.get_record(sites=site, service='site')
```python
import dataretrieval.waterdata as waterdata

# Get daily streamflow data (returns DataFrame and metadata)
df, metadata = waterdata.get_daily(
    monitoring_location_id='USGS-01646500',
    parameter_code='00060',  # Discharge
    time='2024-10-01/2025-09-30'
)

print(f"Retrieved {len(df)} records")
print(f"Site: {df['monitoring_location_id'].iloc[0]}")
print(f"Mean discharge: {df['value'].mean():.2f} {df['unit_of_measure'].iloc[0]}")
```
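
The `time` string in the example above can also be built programmatically. The sketch below uses a hypothetical helper, `water_year_interval` (not part of `dataretrieval`), and assumes that water year N runs from October 1 of year N-1 through September 30 of year N, which matches the dates used in the example.

```python
from datetime import date


def water_year_interval(water_year: int) -> str:
    """Return a 'start/end' interval string covering one water year.

    Hypothetical helper, not part of dataretrieval. Assumes water year N
    runs from October 1 of year N-1 through September 30 of year N.
    """
    start = date(water_year - 1, 10, 1)
    end = date(water_year, 9, 30)
    return f"{start.isoformat()}/{end.isoformat()}"


print(water_year_interval(2025))  # prints 2024-10-01/2025-09-30
```

The returned string could then be passed as the `time` argument in place of the hand-written literal.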
Services available from NWIS include:
- instantaneous values (iv)
- daily values (dv)
- statistics (stat)
- site info (site)
- discharge peaks (peaks)
- discharge measurements (measurements)

Water quality data are available from:
- [Samples](https://waterdata.usgs.gov/download-samples/#dataProfile=site) - Discrete USGS water quality data only
- [Water Quality Portal](https://www.waterqualitydata.us/) - Discrete water quality data from USGS and EPA. Older data are available in the legacy WQX version 2 format; all data are available in the beta WQX3.0 format.

To access the full functionality available from NWIS web services, `nwis.get_record` appends any additional kwargs to the REST request. For example, this function call:
Fetch daily discharge data for multiple sites from a start date to present
using the following code:

```python
nwis.get_record(sites='03339000', service='dv', start='2017-12-31', parameterCd='00060')
df, metadata = waterdata.get_daily(
    monitoring_location_id=["USGS-13018750", "USGS-13013650"],
    parameter_code='00060',
    time='2024-10-01/..'
)

print(f"Retrieved {len(df)} records")
```
...will download daily data with the parameter code 00060 (discharge).
The following example downloads location information for all monitoring
locations that are categorized as stream sites in the state of Maryland:

## Accessing the "Internal" NWIS
If you're connected to the USGS network, dataretrieval can pull from the internal (non-public) NWIS interface.
Most dataretrieval functions pass kwargs directly to NWIS's REST API, which provides simple access to internal data; simply specify `access='3'`.
For example:
```python
nwis.get_record(sites='05404147',service='iv', start='2021-01-01', end='2021-3-01', access='3')
# Get monitoring location information
locations, metadata = waterdata.get_monitoring_locations(
    state_name='Maryland',
    site_type_code='ST'  # Stream sites
)

print(f"Found {len(locations)} stream monitoring locations in Maryland")
```
Visit the
[API Reference](https://doi-usgs.github.io/dataretrieval-python/reference/waterdata.html)
for more information and examples on available services and input parameters.

More services and documentation to come!
**NEW:** This module uses Python
[logging](https://docs.python.org/3/howto/logging.html#logging-basic-tutorial)
to report the URL of each request sent to the USGS Water Data APIs and the
number of requests remaining each hour. These messages can be helpful for
troubleshooting and support. To enable logging in your Python console or
notebook:

## Quick start
```python
import logging
logging.basicConfig(level=logging.INFO)
```
To log messages to a file, you can specify a filename in the
`basicConfig` call:

dataretrieval can be installed using pip:
$ python3 -m pip install -U dataretrieval
```python
logging.basicConfig(filename='waterdata.log', level=logging.INFO)
```
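
The two `basicConfig` calls above send messages either to the console or to a file. To get both at once, one option is to pass explicit handlers. The sketch below uses only the standard library; `configure_logging` is a hypothetical helper, not part of `dataretrieval`, and the message format is an arbitrary choice.

```python
import logging


def configure_logging(logfile: str, level: int = logging.INFO) -> None:
    """Send log records at `level` and above to the console and to `logfile`.

    Hypothetical helper, not part of dataretrieval.
    """
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(name)s %(levelname)s %(message)s",
        handlers=[logging.StreamHandler(), logging.FileHandler(logfile)],
        force=True,  # replace handlers installed by an earlier basicConfig call
    )
```

Calling `configure_logging('waterdata.log')` once at the top of a script or notebook should be enough. The `force=True` argument requires Python 3.8 or newer.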

or conda:
### NWIS Legacy Services (Deprecated but still functional)

$ conda install -c conda-forge dataretrieval
The `nwis` module accesses legacy NWIS Water Services:

More examples of use are included in [`demos`](https://github.com/USGS-python/dataretrieval/tree/main/demos).
```python
import dataretrieval.nwis as nwis

## Issue tracker
# Get site information
info, metadata = nwis.get_info(sites='01646500')

print(f"Site name: {info['station_nm'].iloc[0]}")

# Get daily values
dv, metadata = nwis.get_dv(
    sites='01646500',
    start='2024-10-01',
    end='2024-10-02',
    parameterCd='00060',
)

print(f"Retrieved {len(dv)} daily values")
```

Please report any bugs and enhancement ideas using the dataretrieval issue
tracker:
### Water Quality Portal (WQP)

https://github.com/USGS-python/dataretrieval/issues
Access water quality data from multiple agencies:

Feel free to also ask questions on the tracker.
```python
import dataretrieval.wqp as wqp

# Find water quality monitoring sites
sites = wqp.what_sites(
    statecode='US:55',  # Wisconsin
    siteType='Stream'
)

## Contributing
print(f"Found {len(sites)} stream monitoring sites in Wisconsin")

# Get water quality results
results = wqp.get_results(
    siteid='USGS-05427718',
    characteristicName='Temperature, water'
)

Any help in testing, development, documentation and other tasks is welcome.
For more details, see the file [CONTRIBUTING.md](CONTRIBUTING.md).
print(f"Retrieved {len(results)} temperature measurements")
```

### Network Linked Data Index (NLDI)

## Need help?
Discover and navigate hydrologic networks:

The Water Mission Area of the USGS supports the development and maintenance of `dataretrieval`. Any questions can be directed to the Computational Tools team at
comptools@usgs.gov.
```python
import dataretrieval.nldi as nldi

Resources are available primarily for maintenance and responding to user questions.
Priorities on the development of new features are determined by the `dataretrieval` development team.
# Get watershed basin for a stream reach
basin = nldi.get_basin(
    feature_source='comid',
    feature_id='13293474'  # NHD reach identifier
)

print(f"Basin contains {len(basin)} feature(s)")

# Find upstream flowlines
flowlines = nldi.get_flowlines(
    feature_source='comid',
    feature_id='13293474',
    navigation_mode='UT',  # Upstream tributaries
    distance=50  # km
)

print(f"Found {len(flowlines)} upstream tributaries within 50km")
```

## Available Data Services

### Modern USGS Water Data APIs (Recommended)
- **Daily values**: Daily statistical summaries (mean, min, max)
- **Field measurements**: Discrete measurements from field visits
- **Monitoring locations**: Site information and metadata
- **Time series metadata**: Information about available data parameters
- **Latest daily values**: Most recent daily statistical summary data
- **Latest instantaneous values**: Most recent high-frequency continuous data
- **Samples data**: Discrete USGS water quality data
- **Instantaneous values** (*COMING SOON*): High-frequency continuous data

### Legacy NWIS Services (Deprecated)
- **Daily values (dv)**: Legacy daily statistical data
- **Instantaneous values (iv)**: Legacy continuous data
- **Site info (site)**: Basic site information
- **Statistics (stat)**: Statistical summaries
- **Discharge peaks (peaks)**: Annual peak discharge events
- **Discharge measurements (measurements)**: Direct flow measurements

### Water Quality Portal
- **Results**: Water quality analytical results from USGS, EPA, and other agencies
- **Sites**: Monitoring location information
- **Organizations**: Data provider information
- **Projects**: Sampling project details

### Network Linked Data Index (NLDI)
- **Basin delineation**: Watershed boundaries for any point
- **Flow navigation**: Upstream/downstream network traversal
- **Feature discovery**: Find monitoring sites, dams, and other features
- **Hydrologic connectivity**: Link data across the stream network

## More Examples

Explore additional examples in the
[`demos`](https://github.com/USGS-python/dataretrieval/tree/main/demos)
directory, including Jupyter notebooks demonstrating advanced usage patterns.

## Getting Help

- **Issue tracker**: Report bugs and request features at https://github.com/USGS-python/dataretrieval/issues
- **Documentation**: Full API documentation available in the source code docstrings

## Contributing

Contributions are welcome! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for
development guidelines.

## Acknowledgments
This material is partially based upon work supported by the National Science Foundation (NSF) under award 1931297.
Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF.

This material is partially based upon work supported by the National Science
Foundation (NSF) under award 1931297. Any opinions, findings, conclusions, or
recommendations expressed in this material are those of the authors and do not
necessarily reflect the views of the NSF.

## Disclaimer

This software is preliminary or provisional and is subject to revision.
It is being provided to meet the need for timely best science.
The software has not received final approval by the U.S. Geological Survey (USGS).
No warranty, expressed or implied, is made by the USGS or the U.S. Government as to the functionality of the software and related material nor shall the fact of release constitute any such warranty.
The software is provided on the condition that neither the USGS nor the U.S. Government shall be held liable for any damages resulting from the authorized or unauthorized use of the software.
This software is preliminary or provisional and is subject to revision. It is
being provided to meet the need for timely best science. The software has not
received final approval by the U.S. Geological Survey (USGS). No warranty,
expressed or implied, is made by the USGS or the U.S. Government as to the
functionality of the software and related material nor shall the fact of release
constitute any such warranty. The software is provided on the condition that
neither the USGS nor the U.S. Government shall be held liable for any damages
resulting from the authorized or unauthorized use of the software.

## Citation

Hodson, T.O., Hariharan, J.A., Black, S., and Horsburgh, J.S., 2023, dataretrieval (Python): a Python package for discovering
and retrieving water data available from U.S. federal hydrologic web services:
U.S. Geological Survey software release,
https://doi.org/10.5066/P94I5TX3.
Hodson, T.O., Hariharan, J.A., Black, S., and Horsburgh, J.S., 2023,
dataretrieval (Python): a Python package for discovering and retrieving water
data available from U.S. federal hydrologic web services: U.S. Geological Survey
software release, https://doi.org/10.5066/P94I5TX3.
11 changes: 6 additions & 5 deletions dataretrieval/__init__.py
@@ -1,15 +1,16 @@
from importlib.metadata import PackageNotFoundError, version

try:
    __version__ = version("dataretrieval")
except PackageNotFoundError:
    __version__ = "version-unknown"

from dataretrieval.nadp import *
from dataretrieval.nldi import *
from dataretrieval.nwis import *
from dataretrieval.samples import *
from dataretrieval.streamstats import *
from dataretrieval.utils import *
from dataretrieval.waterdata import *
from dataretrieval.waterwatch import *
from dataretrieval.wqp import *

try:
    __version__ = version("dataretrieval")
except PackageNotFoundError:
    __version__ = "version-unknown"
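
The version lookup above falls back to `"version-unknown"` when package metadata is not available (for example, when running from a source checkout that was never installed). As a minimal sketch of the same pattern in reusable form, the helpers below are hypothetical and not part of `dataretrieval`; in particular, the User-Agent format shown is an assumption for illustration, not the string the package actually sends.

```python
from importlib.metadata import PackageNotFoundError, version


def package_version(name: str, default: str = "version-unknown") -> str:
    """Return the installed version of `name`, or `default` if not installed."""
    try:
        return version(name)
    except PackageNotFoundError:
        return default


def user_agent(name: str) -> str:
    """Build an illustrative User-Agent string (format is an assumption)."""
    return f"python-{name}/{package_version(name)}"


print(user_agent("dataretrieval"))
```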