-
-
Notifications
You must be signed in to change notification settings - Fork 0
Description
We have a bunch of structs which we fill from Google Sheets, one per row, e.g.
Lines 217 to 224 in aa0d578
| #[derive(Debug, PartialEq, Eq, Serialize)] | |
| pub(crate) struct ReviewerStaffOnlyDetails { | |
| pub(crate) name: String, | |
| pub(crate) attended_training: bool, | |
| pub(crate) checked: CheckStatus, | |
| pub(crate) quality: String, | |
| pub(crate) notes: String, | |
| } |
We currently do this in an annoyingly manual and inflexible way.
For this particular struct, we populate it here:
trainee-tracker/src/reviewer_staff_info.rs
Lines 48 to 79 in aa0d578
| for (row_index, cells) in sheet.rows.iter().enumerate() { | |
| if row_index == 0 { | |
| continue; | |
| } | |
| if cells.len() < 6 { | |
| continue; | |
| } | |
| let github_login = GithubLogin::from(cell_string(&cells[0])); | |
| let notes = match cells.get(6) { | |
| Some(cell) => cell_string(cell), | |
| None => String::new(), | |
| }; | |
| let checked = match (cell_bool(&cells[3]), cell_bool(&cells[4])) { | |
| (true, false) => CheckStatus::CheckedAndOk, | |
| (true, true) => CheckStatus::CheckedAndCheckAgain, | |
| (false, _) => CheckStatus::Unchecked, | |
| }; | |
| reviewers.insert( | |
| github_login.clone(), | |
| ReviewerStaffOnlyDetails { | |
| name: cell_string(&cells[1]), | |
| attended_training: cell_bool(&cells[2]), | |
| checked, | |
| quality: cell_string(&cells[5]), | |
| notes, | |
| }, | |
| ); | |
| } |
We ignore the headings completely, so if someone changed the sheet, we would start reading incorrect values. Then we hard-code the field order and types of each cell.
In other places, we manually check headings, and just error if they're wrong:
trainee-tracker/src/register.rs
Lines 105 to 160 in aa0d578
| for (row_number, cells) in sheet_data.into_iter().enumerate() { | |
| // Some sheets have documentation or pivot table | |
| if row_number == 0 && !cells.is_empty() && cell_string(&cells[0]) != "Name" { | |
| continue; | |
| } | |
| if cells.len() < 7 { | |
| return Err(anyhow::anyhow!( | |
| "Not enough columns for row {} - expected at least 7, got {} containing: {}", | |
| row_number, | |
| cells.len(), | |
| format!("{:#?}", cells), | |
| )); | |
| } | |
| if row_number == 0 { | |
| let headings = cells.iter().take(7).map(cell_string).collect::<Vec<_>>(); | |
| if headings | |
| != [ | |
| "Name", | |
| "Email", | |
| "Timestamp", | |
| "Course", | |
| "Module", | |
| "Day", | |
| "Location", | |
| ] | |
| { | |
| return Err(anyhow::anyhow!( | |
| "Register sheet contained wrong headings: {}", | |
| headings.join(", ") | |
| )); | |
| } | |
| } else { | |
| if cells[0].effective_value.is_none() { | |
| break; | |
| } | |
| let (sprint_number, attendance) = read_row(&cells, register_url.clone()) | |
| .with_context(|| format!("Failed to read attendance from row {}", row_number))?; | |
| if attendance.timestamp.date_naive() <= start_date | |
| || attendance.timestamp.date_naive() >= end_date | |
| { | |
| continue; | |
| } | |
| let sprint_index = sprint_number - 1; | |
| while sprints.len() < sprint_number { | |
| sprints.push(IndexMap::new()); | |
| } | |
| if sprints[sprint_index].contains_key(&attendance.email) { | |
| warn!( | |
| "Register sheet contained duplicate entry for sprint {} trainee {}", | |
| sprint_number, attendance.email | |
| ); | |
| } else { | |
| sprints[sprint_index].insert(attendance.email.clone(), attendance); | |
| } | |
| } | |
| } |
We also handle partial rows poorly.
I would love instead for us to just use serde. We could derive Deserialize for each struct we populate, and use a serde adaptor to read the sheet into a Vec<T>.
There is an existing serde_sheets crate for doing this, but it has some issues. If it runs into issues with a row, it just prints an error to stdout and continues (https://docs.rs/serde_sheets/latest/src/serde_sheets/lib.rs.html#200).
Ideally we would either improve this crate, or write our own one. I'm imagining supporting a function along the lines of:
pub async fn read_all<T: DeserializeOwned>(
sheets: &mut Sheets,
document_id: &str,
tab_name: &str,
) -> SheetReadResult<T>;
enum SheetReadResult<T> {
AllOk(Vec<T>),
SomeOk {
ok: Vec<T>,
partial_records: Vec<Record>,
deserialization_error_records: Vec<(Record, serde::Error)>,
},
Err(SheetsError),
}
struct Record {
row_number: usize,
cells: Vec<CellData>,
}