Instability and unpredictability of the TDR servers are hindering scripted work that uses the Dataverse API. Much of the relevant metadata can be obtained from the DataCite API instead, though with less granularity (e.g., no versions) and without certain fields (e.g., parent dataverse). Build a list of 'can' and 'cannot':
Can retrieve from DataCite:
- Author name, identifier, affiliation (and affiliation identifier)
- Dataset contact name and affiliation
- Subject (partially; subjects are mixed in with keywords but could be separated out via the CVOC)
- Keywords
- Make Data Count (MDC) metrics (not directly equivalent to Dataverse metrics)
- Individual file sizes
- Individual file MIME types
- Manual and automated timestamps (e.g., published, created, deposited)
- License
Cannot retrieve from DataCite:
- Parent institution's dataverse
- Dataset ID
- Version ID
- Dataverse information
- Dataset contact email
- Data depositor information
- Assorted restrictions fields
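
As a rough illustration of the 'can' list, the sketch below pulls one DOI record from the public DataCite REST API (`https://api.datacite.org/dois/{doi}`) and reduces it to the fields listed above. The function names `fetch_datacite_record` and `summarize` are hypothetical, and the mapping of dataset contacts to contributors with `contributorType` of `ContactPerson` is an assumption about how the repository exports its metadata; check both against real TDR records before relying on them.

```python
import json
import urllib.parse
import urllib.request

DATACITE_API = "https://api.datacite.org/dois/"


def fetch_datacite_record(doi, timeout=30):
    """Fetch one DOI record from the DataCite REST API and return its attributes."""
    # affiliation=true asks DataCite to return affiliations as objects
    # (name plus identifier, when one was registered) rather than plain strings.
    url = DATACITE_API + urllib.parse.quote(doi, safe="/") + "?affiliation=true"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        payload = json.load(response)
    return payload["data"]["attributes"]


def summarize(attributes):
    """Reduce a DataCite attributes dict to the fields in the 'can' list."""
    creators = attributes.get("creators") or []
    contributors = attributes.get("contributors") or []
    return {
        "authors": [
            {
                "name": c.get("name"),
                "identifiers": [
                    i.get("nameIdentifier") for i in c.get("nameIdentifiers") or []
                ],
                "affiliations": c.get("affiliation") or [],
            }
            for c in creators
        ],
        # Assumption: dataset contacts are exported as ContactPerson contributors.
        "contacts": [
            {"name": c.get("name"), "affiliations": c.get("affiliation") or []}
            for c in contributors
            if c.get("contributorType") == "ContactPerson"
        ],
        # Subjects and keywords arrive in the same list and are not separated here.
        "subjects_and_keywords": [
            s.get("subject") for s in attributes.get("subjects") or []
        ],
        "metrics": {
            k: attributes.get(k)
            for k in ("viewCount", "downloadCount", "citationCount")
        },
        "sizes": attributes.get("sizes") or [],
        "formats": attributes.get("formats") or [],
        # Dates supplied in the metadata, keyed by DataCite dateType.
        "dates": {
            d.get("dateType"): d.get("date") for d in attributes.get("dates") or []
        },
        # Timestamps generated by DataCite itself.
        "timestamps": {
            k: attributes.get(k) for k in ("created", "registered", "updated")
        },
        "licenses": [r.get("rights") for r in attributes.get("rightsList") or []],
    }
```

A call such as `summarize(fetch_datacite_record(doi))` with a dataset DOI would return the reduced record. None of the 'cannot' fields (dataset ID, version ID, parent dataverse, contact email, depositor, restrictions) appear in the response, so those still require the Dataverse API.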