Conversation
ArneTR
left a comment
There was a problem hiding this comment.
I found nothing major in the review but have annotated some remarks and questions.
I gotta say I found this PR very hard to review. It feels very much like an AI generated PR which includes so much encapsulation into mini functions and hard to read boilerplate (e.g. records_by_year: dict[int, list[dict]] = {} instead of records_by_year = {}) that I cannot comprehend the interplay of the functions as a whole.
This might be related bc I did not write the initial code. However I would argue to maybe make less boilerplated code
| zone: str, | ||
| carbon_intensity: float, | ||
| timestamp: datetime | None = None, | ||
| estimation: bool = False, |
There was a problem hiding this comment.
I would call this variable estimated. Makes it clearer to me that it is a bool and not the type of estimation
| @@ -167,7 +190,12 @@ def _to_iso(dt: datetime) -> str: | |||
| return dt.astimezone(timezone.utc).isoformat().replace("+00:00", "Z") | |||
There was a problem hiding this comment.
Is the Z really so common? Would UTC not be better?
There was a problem hiding this comment.
I do this because EM uses this format
https://app.electricitymaps.com/developer-hub/api/reference#carbon-intensity-past
and I wanted to be compatible with the Electricity maps API
| region TEXT NOT NULL, | ||
| carbon_intensity DOUBLE PRECISION NOT NULL, | ||
| provider TEXT NOT NULL, | ||
| estimation BOOLEAN NOT NULL DEFAULT TRUE, |
There was a problem hiding this comment.
How can carbon yearly ever not be an estimation?
There was a problem hiding this comment.
I copied this from the other table and always set it to true. I refactored this to now set the field when the data is read from the DB
| def fetch_latest(conn: Connection, region: str) -> dict[str, dict]: | ||
| def fetch_latest(conn: Connection, region: str) -> list[dict]: | ||
| """Return the most recent row for each provider at a region.""" | ||
| with conn.cursor(row_factory=dict_row) as cur: |
There was a problem hiding this comment.
no need to have a dict_row factory here IMHO. could be normal list lookup
There was a problem hiding this comment.
But like this I get dicts that I can use? Otherwist I would need to build the data structure in an extra step? What is the advantage of this?
|
|
||
| SELECT DISTINCT region | ||
| FROM carbon_yearly | ||
| WHERE region IS NOT NULL |
There was a problem hiding this comment.
According to the table definition the column is not null in both tables.
When is it ever null?
|
|
||
| def _align_to_quarter_hour(dt: datetime) -> datetime: | ||
| """Round a datetime up to the next 15-minute boundary.""" | ||
| if dt.second == 0 and dt.microsecond == 0 and dt.minute % 15 == 0: |
There was a problem hiding this comment.
That check feels absurdely unlikely to hit. Better not have in the first place and just do the calculation always?
There was a problem hiding this comment.
Hmmm, doesn't always happen when someone selects the full hour?
| return json.loads(match.group(1)) | ||
|
|
||
|
|
||
| def iter_yearly_dataset_records(data_dir: Path = YEARLY_DATA_DIR) -> Iterator[dict]: |
There was a problem hiding this comment.
why is this an iterator with all its embodied overhead if the only thing needed is the output and not a replacement in exisiting loop flows?
There was a problem hiding this comment.
changed to return a list
You will need to recreate the DB again as I changed quite a few things |
|
I have looked at it but cannot review it in full at the moment. Somehow I feel that the yearly integration in this form provides quite a bad user experience
One could make the argument though that the Elephant UI is more of a debug interface and for that I think it does a good job. Can only be used from someone who knows it quirks. It should hover more intuitively integrate into GMT. I wanted to try that, but the integration is still broken :) I am still experiencing the |
|
As discussed I will close this PR as I don't think we can really "save" anything here. It will always be an hack. The added gain from this is nothing compared to the complexity. |



A first idea of what we talked about today.
I am not to happy about the way the code is structured now as there are always 2 checks but I don't see any other option if we want to keep the api clean and agnostic.
We will have to update the GMT simulation frontend not to check all the yearly providers when looking for the best location.