-
Notifications
You must be signed in to change notification settings - Fork 15
Feat/forecast on date #170
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
CodexVeritas
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
General Thoughts:
- Given date questions are reusing numeric code a lot, we can reuse a lot of the numeric forecast tests. It might be too much to try to parametrize all of the different tests for numeric questions, but consider reusing/paramterizing/whether-we-should-do-anything about the below:
- Do date questions have log scale ever? We should test this works correctly if so.
- More minor edge cases (which is probably already handled by numeric distribution tests -> e.g. test_close_bound_distribution or test_forecast_log_scale_repeated_values)
- All the probability being put on a single number (does it smooth out the curve), etc?
- all probability being put way outside of bounds (it should error)
- Make sure that date questions show up in test_predicts_test_question.py (in which case you can probably remove the explicit test_date_question since these do the same thing)
- Make sure that you can properly save and load a date question report from json (test_metaculus_question_is_jsonable)
- Make sure that metac-bots now forecast on date questions by default if available.
| percentile=percentile.percentile, | ||
| value=datetime.strptime( | ||
| f"{percentile.value} UTC", "%Y-%m-%d %Z" | ||
| ).timestamp(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can just use pendulum.parse(string).timestamp(). It will handle any weird formatting quirks.
| ) | ||
| value: str = Field( | ||
| description="The number matching the percentile (e.g. '90% of people are age 60 or younger' translates to '60')", | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- You can make the typehint "datetime" and it will automatically make sure its a datetime, and structure_output should format things correctly as well
- Make sure the description matches the datetime format (the example given is as if it were regular numeric)
- If you do the above you should probably rename it to DatePercentile.
- Potentially add validation for timezone.
- Potentially move this to FallTemplateBot since this is used only for prompting and nowhere else? If not though, make sure to export this from the package so that I can use it in the template bot example for participants.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think making it a datetime directly would work, because it's not an instance of BaseClass.
| f""" | ||
| The text given to you is trying to give a forecast distribution for a date question. | ||
| - This text is trying to answer the numeric question: "{question.question_text}". | ||
| - As an example, someone else guessed that the answer will be between {question.lower_bound} and {question.upper_bound}, so the numbers parsed from and answer like this would be verbatim "{question.lower_bound}" and "{question.upper_bound}". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is a grammar error "so the numbers parsed from and answer like this..." ?
forecasting_tools/forecast_bots/official_bots/fall_template_bot.py
Outdated
Show resolved
Hide resolved
| Percentile 40: YYYY-MM-DD | ||
| Percentile 60: YYYY-MM-DD | ||
| Percentile 80: YYYY-MM-DD | ||
| Percentile 90: YYYY-MM-DD |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second thought, we should probably include the hour here. I could see some edge cases where it puts all its probability on something happening on an election day, but then moving it 7hrs back would be very painful for it (as it would be the previous day). Also tell it to mention hours in UTC.
| elif isinstance(question, DateQuestion): | ||
| upper_bound_number = question.upper_bound.date().isoformat() | ||
| lower_bound_number = question.lower_bound.date().isoformat() | ||
| unit_of_measure = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably make this an empty string. I think "None" will turn to a literal string when put into an f-string (might be worth reading through the prompt for other potentially weird things at a debug point).
452a58a to
6012ba9
Compare
No description provided.