As shown in the code snippet below, the ingest script currently includes only the study's ID and a URL for the study (in the snippet, an `api.microbiomedata.org` URL built from that ID), both of which it derives from information in the Biosample. Retrieving additional details about the study, such as its name and its description, will require fetching data from the `study_set` collection (via some Runtime API endpoint, such as `GET /studies`).
```python
def get_part_of_collection(self) -> list[bertron.DataCollection]:
    """Returns a list of `DataCollection` instances, each describing one of the Biosample's associated studies.

    References:
    - https://ber-data.github.io/bertron-schema/DataCollection/
    - https://microbiomedata.github.io/nmdc-schema/associated_studies/

    TODO: Retrieve the name and description of the Study from the NMDC Runtime API, then include it here.
    """
    data_collections = []
    if self.associated_studies is not None and len(self.associated_studies) > 0:
        for study_id in self.associated_studies:
            data_collection = bertron.DataCollection(
                id=study_id,
                url=f"https://api.microbiomedata.org/studies/{study_id}",
            )
            data_collections.append(data_collection)
    return data_collections
```
I think this will be a straightforward change to make, but it may require renaming some variables and wrapping the cached data within a higher-level JSON object (e.g. one that has a `biosamples` property and a `studies` property).
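To make the idea concrete, here is a rough sketch of what the wrapped cache and the enriched lookup could look like. Everything in it is an assumption for illustration: the `DataCollection` dataclass is a local stand-in (I have not checked whether `bertron.DataCollection` accepts `name` and `description`), `build_cache` and `fake_fetch_study` are hypothetical helpers, keying `studies` by study ID is just one possible layout, and the stub stands in for whichever Runtime API call ends up being used.

```python
import json
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DataCollection:
    """Hypothetical stand-in for `bertron.DataCollection` (fields are assumed)."""
    id: str
    url: str
    name: Optional[str] = None
    description: Optional[str] = None


def build_cache(biosamples: list[dict], fetch_study: Callable[[str], dict]) -> dict:
    """Wrap the biosamples and their studies in one higher-level object."""
    studies: dict[str, dict] = {}
    for biosample in biosamples:
        for study_id in biosample.get("associated_studies") or []:
            if study_id not in studies:  # fetch each study only once
                studies[study_id] = fetch_study(study_id)
    return {"biosamples": biosamples, "studies": studies}


def get_part_of_collection(biosample: dict, studies: dict[str, dict]) -> list[DataCollection]:
    """Describe each associated study, adding name/description when cached."""
    data_collections = []
    for study_id in biosample.get("associated_studies") or []:
        study = studies.get(study_id, {})  # empty dict if the study is not cached
        data_collections.append(
            DataCollection(
                id=study_id,
                url=f"https://api.microbiomedata.org/studies/{study_id}",
                name=study.get("name"),
                description=study.get("description"),
            )
        )
    return data_collections


def fake_fetch_study(study_id: str) -> dict:
    """Stub used here in place of a real HTTP request to the Runtime API."""
    return {"id": study_id, "name": f"Study {study_id}", "description": "Example description"}


# Made-up IDs, for illustration only.
example_biosamples = [{"id": "nmdc:bsm-example-1", "associated_studies": ["nmdc:sty-example-1"]}]
cache = build_cache(example_biosamples, fake_fetch_study)
print(json.dumps(cache, indent=2))
```

Passing the fetcher in as a callable keeps the sketch testable without network access; in the real script it would presumably be replaced by a request to the Runtime API, and the result written to the same cache file as the biosamples.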
The `get_part_of_collection` snippet above is from `data/contrib/nmdc/ingest.py`, lines 152 to 169 at commit 87fab60.