This repository holds scripts to process codelists downloaded from DAC and turn them into IATI codelists.
Files/Directories/process:
- The data downloaded from DAC is in the files in the root directory
- The `extract_dac.py` script extracts information from the DAC downloads into a more structured form in the `Current_DAC` directory
- The `IATI_codelists` directory holds current IATI codelists. These should be copied from https://github.com/IATI/IATI-Codelists-NonEmbedded . These are used as part of the processing to try and preserve order.
- The `convert_to_iati.py` script uses information from `Current_DAC` and `IATI_codelists` and writes the new codelists to `DAC_to_IATI`
- The new codelists in `DAC_to_IATI` should be synced back to https://github.com/IATI/IATI-Codelists-NonEmbedded
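
To illustrate what "preserve order" can mean here, the following is a minimal sketch and not the code in `convert_to_iati.py`. It assumes each codelist can be reduced to a list of code strings. The function name `merge_preserving_order` and the sample codes are hypothetical. Codes already present in the existing IATI list keep their relative order, and codes that are new in the DAC data are appended at the end. How the real script treats codes that have disappeared from DAC is not shown.

```python
def merge_preserving_order(existing_codes, new_codes):
    """Order new_codes so that codes already known keep their existing order.

    existing_codes: codes in the order they appear in the current IATI codelist.
    new_codes: codes found in the freshly extracted DAC data, in any order.
    Both arguments are hypothetical simplifications of the real XML files.
    """
    wanted = set(new_codes)
    ordered = []
    seen = set()
    # First pass: keep codes that exist in both lists, in the existing order.
    for code in existing_codes:
        if code in wanted and code not in seen:
            ordered.append(code)
            seen.add(code)
    # Second pass: append codes that are new in the DAC data.
    for code in new_codes:
        if code not in seen:
            ordered.append(code)
            seen.add(code)
    return ordered


if __name__ == "__main__":
    existing = ["110", "210", "310"]
    extracted = ["310", "110", "410"]
    print(merge_preserving_order(existing, extracted))  # ['110', '310', '410']
```

Keeping the existing order first means a diff of the regenerated codelist mostly shows genuine additions rather than reshuffled lines.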
python3 -m venv venv/env
source venv/env/bin/activate
pip install -r requirements.txt
- Create a branch `updates/YYYY-MM-DD`
- Sync from the IATI-Codelists-NonEmbedded repo: `rsync -avz --existing ~/Projects/IATI-Codelists-NonEmbedded/xml/ IATI_codelists`
- Download the file and copy it to the repo: `DAC-CRS-CODES_YYYY-MM-DD.xml`
- Extract: `python extract_dac.py`
  - Make sure to update the filename in the code
- Update this line in `convert_to_iati.py` with the date DAC updated files: `element.attrib['withdrawal-date'] = "2022-01-21"`
- Convert to IATI: `python convert_to_iati.py`
- Copy into IATI-Codelists-NonEmbedded in a branch `updates/YYYY-MM-DD`
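
The `withdrawal-date` step above sets an attribute on an XML element. As a rough sketch of how that might look in isolation, the example below stamps a date on every item whose code is missing from the current DAC data. It is an assumption-laden illustration, not the logic of `convert_to_iati.py`: the function `mark_withdrawn`, the constant `WITHDRAWAL_DATE`, the sample XML, and the `codelist-item` / `code` element names are all assumed for the example.

```python
import xml.etree.ElementTree as ET

# Assumed to be the date DAC updated its files; change it for every update.
WITHDRAWAL_DATE = "2022-01-21"

# Hypothetical, heavily simplified codelist used only for this sketch.
SAMPLE = """<codelist name="Example">
  <codelist-items>
    <codelist-item><code>110</code></codelist-item>
    <codelist-item><code>210</code></codelist-item>
  </codelist-items>
</codelist>"""


def mark_withdrawn(root, current_codes, withdrawal_date=WITHDRAWAL_DATE):
    """Set withdrawal-date on items whose code is not in current_codes.

    Items that already carry a withdrawal-date are left untouched so that
    an earlier date is not overwritten. Returns the newly marked codes.
    """
    newly_withdrawn = []
    for element in root.iter("codelist-item"):
        code = element.findtext("code")
        if code not in current_codes and "withdrawal-date" not in element.attrib:
            element.attrib["withdrawal-date"] = withdrawal_date
            newly_withdrawn.append(code)
    return newly_withdrawn


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    print(mark_withdrawn(root, {"110"}))  # ['210']
```

Holding the date in one named constant, rather than inline in the assignment, would make the per-update change a single obvious edit.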