-
Notifications
You must be signed in to change notification settings - Fork 552
Auto-Create Timestamps in prettify_prediction() When test_data is None
#1508
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Auto-Create Timestamps in prettify_prediction() When test_data is None
#1508
Conversation
… None - Removed NotImplementedError and instead generate timestamps automatically - Uses training data's end_date and frequency to create prediction timestamps - Supports np.ndarray, pd.Series, and pd.DataFrame inputs
thinkall
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much for the PR, @aditisingh02 ! Could you please also add a test case for your changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR implements automatic timestamp generation in the TimeSeriesDataset.prettify_prediction() method when test_data is None, removing a NotImplementedError that was previously raised. The implementation generates future timestamps starting from the training data's end date using the dataset's inferred frequency.
Changes:
- Replaces
NotImplementedErrorwith automatic timestamp generation logic - Converts numpy arrays and pandas Series to DataFrames when needed
- Generates prediction timestamps using
pd.date_range()starting fromtrain_end_date + 1 period
flaml/automl/time_series/ts_data.py
Outdated
| start=train_end_date + pd.Timedelta(1, self.frequency), | ||
| periods=len(y_pred), | ||
| freq=self.frequency, | ||
| ) |
Copilot
AI
Jan 22, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The pd.Timedelta constructor does not accept pandas frequency strings (like 'D', 'MS', 'H', 'W', etc.) as the second parameter. The correct approach is to use pd.tseries.frequencies.to_offset(self.frequency) or simply not add a timedelta at all and let pd.date_range handle the offset.
The correct implementation should be:
pred_timestamps = pd.date_range(
start=train_end_date,
periods=len(y_pred) + 1,
freq=self.frequency,
)[1:] # Skip the first timestamp (which is train_end_date)Or alternatively:
from pandas.tseries.frequencies import to_offset
pred_timestamps = pd.date_range(
start=train_end_date + to_offset(self.frequency),
periods=len(y_pred),
freq=self.frequency,
)Note: The same bug exists in the existing create_forward_frame function at line 512, but that's outside the scope of this PR.
| start=train_end_date + pd.Timedelta(1, self.frequency), | |
| periods=len(y_pred), | |
| freq=self.frequency, | |
| ) | |
| start=train_end_date, | |
| periods=len(y_pred) + 1, | |
| freq=self.frequency, | |
| )[1:] # Skip the first timestamp (train_end_date itself) |
Updates Based on Review FeedbackBug Fix
Test Coverage Enhancements
Backward Compatibility
|
thinkall
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aditisingh02 , thank you so much for the revision. Copilot's feedbacks are not always correct. For instance, pd.Timedelta actually can accept frequency as its second parameter:
https://pandas.pydata.org/docs/reference/api/pandas.Timedelta.html . Sorry for the mistakes made by the Copilot review agent.
Do you mind make some changes to the tests you've added? I'd prefer simplify the tests. Besides, I believe an E2E tests with model training and prediction can better showcase the improvements of this PR.
The last minor change I'd suggest is that putting the new tests in test_forecast.py instead of test_model.py.
Updated Based on Review FeedbackThank you for the review, @thinkall! You're right about I've made all the requested changes: Test Simplification (Orthogonal Approach)Refactored the tests into two separate, focused test functions:
E2E TestAdded test_auto_timestamps_e2e that demonstrates the full workflow:
Relocated TestsMoved all tests from test_model.py to test_forecast.py as suggested. FormattingFixed formatting issues with black. All tests pass locally. Let me know if you'd like any further changes! |
|
@aditisingh02 , you need to run |
| pd.testing.assert_index_equal(pd.DatetimeIndex(result["date"]), expected_dates, check_names=False) | ||
|
|
||
|
|
||
| def test_auto_timestamps_e2e(budget=3): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test already works without the PR. Should have a test that won't work on current release and will be fixed with PR.
|
Thanks for the review!
Ready for another look! |
thinkall
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aditisingh02 , thank you so much for the active revision. However, the agent you're using doesn't seem to have the ability to resolve the issue in your code. It's time to go with "human-in-the-loop" :-)
Why are these changes needed?
Currently, the
TimeSeriesDataset.prettify_prediction()method inflaml/automl/time_series/ts_data.pythrows aNotImplementedErrorwhentest_dataisNone:This is frustrating for users who want to make predictions without providing explicit test data timestamps, which is a common use case in time series forecasting.
This PR implements automatic timestamp generation by:
train_data[time_col].max()) as the starting pointfrequencynp.ndarray,pd.Series, andpd.DataFrameExample behavior after this change:
Related issue number
Closes #1507
Checks