Skip to content

Conversation

@alex-costea
Copy link
Contributor

No description provided.

Copy link
Contributor

@CodexVeritas CodexVeritas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General Thoughts:

  • Given date questions are reusing numeric code a lot, we can reuse a lot of the numeric forecast tests. It might be too much to try to parametrize all of the different tests for numeric questions, but consider reusing/paramterizing/whether-we-should-do-anything about the below:
    • Do date questions have log scale ever? We should test this works correctly if so.
    • More minor edge cases (which is probably already handled by numeric distribution tests -> e.g. test_close_bound_distribution or test_forecast_log_scale_repeated_values)
      • All the probability being put on a single number (does it smooth out the curve), etc?
      • all probability being put way outside of bounds (it should error)
    • Make sure that date questions show up in test_predicts_test_question.py (in which case you can probably remove the explicit test_date_question since these do the same thing)
  • Make sure that you can properly save and load a date question report from json (test_metaculus_question_is_jsonable)
  • Make sure that metac-bots now forecast on date questions by default if available.

percentile=percentile.percentile,
value=datetime.strptime(
f"{percentile.value} UTC", "%Y-%m-%d %Z"
).timestamp(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just use pendulum.parse(string).timestamp(). It will handle any weird formatting quirks.

)
value: str = Field(
description="The number matching the percentile (e.g. '90% of people are age 60 or younger' translates to '60')",
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. You can make the typehint "datetime" and it will automatically make sure its a datetime, and structure_output should format things correctly as well
  2. Make sure the description matches the datetime format (the example given is as if it were regular numeric)
  3. If you do the above you should probably rename it to DatePercentile.
  4. Potentially add validation for timezone.
  5. Potentially move this to FallTemplateBot since this is used only for prompting and nowhere else? If not though, make sure to export this from the package so that I can use it in the template bot example for participants.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think making it a datetime directly would work, because it's not an instance of BaseClass.

f"""
The text given to you is trying to give a forecast distribution for a date question.
- This text is trying to answer the numeric question: "{question.question_text}".
- As an example, someone else guessed that the answer will be between {question.lower_bound} and {question.upper_bound}, so the numbers parsed from and answer like this would be verbatim "{question.lower_bound}" and "{question.upper_bound}".
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is a grammar error "so the numbers parsed from and answer like this..." ?

Percentile 40: YYYY-MM-DD
Percentile 60: YYYY-MM-DD
Percentile 80: YYYY-MM-DD
Percentile 90: YYYY-MM-DD
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought, we should probably include the hour here. I could see some edge cases where it puts all its probability on something happening on an election day, but then moving it 7hrs back would be very painful for it (as it would be the previous day). Also tell it to mention hours in UTC.

elif isinstance(question, DateQuestion):
upper_bound_number = question.upper_bound.date().isoformat()
lower_bound_number = question.lower_bound.date().isoformat()
unit_of_measure = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably make this an empty string. I think "None" will turn to a literal string when put into an f-string (might be worth reading through the prompt for other potentially weird things at a debug point).

@CodexVeritas CodexVeritas merged commit ccbf89f into main Dec 17, 2025
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants