Partial Completion of Assignment 2 #2

Open
abeerkhe wants to merge 2 commits into main from assignment-2

abeerkhe commented Oct 6, 2025

What changes are you trying to make? (e.g. Adding or removing code, refactoring existing code, adding reports)

Uploading a fully completed Assignment 2.

What did you learn from the changes you have made?

I learned how to load and preprocess data, build pipelines with baseline and advanced regressors (and how to think about when to use which), and tune, test, and compare pipelines against each other to find the best-fitting parameters.

Finally, I exported the best model as a pickle file and explained one observation and global trends for the best-fitting model.
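For readers following along, the workflow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: the synthetic data, the step names `preproc` and `regressor`, the choice of `Ridge` and `RandomForestRegressor`, the parameter grid, and the file name `best_model.pkl` are all hypothetical.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (hypothetical; not the assignment dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Preprocessing: scale all four numeric columns.
preproc = ColumnTransformer([("num", StandardScaler(), [0, 1, 2, 3])])
pipe = Pipeline([("preproc", preproc), ("regressor", Ridge())])

# Search over a baseline and a more advanced regressor in one grid.
param_grid = [
    {"regressor": [Ridge()], "regressor__alpha": [0.1, 1.0, 10.0]},
    {
        "regressor": [RandomForestRegressor(random_state=0)],
        "regressor__n_estimators": [50, 100],
    },
]
search = GridSearchCV(pipe, param_grid, cv=5, scoring="neg_root_mean_squared_error")
search.fit(X_train, y_train)

# Evaluate the refit best pipeline on held-out data.
best_model = search.best_estimator_
test_r2 = best_model.score(X_test, y_test)

# Export the best pipeline as a pickle file and load it back.
model_path = os.path.join(tempfile.mkdtemp(), "best_model.pkl")
with open(model_path, "wb") as f:
    pickle.dump(best_model, f)
with open(model_path, "rb") as f:
    restored = pickle.load(f)
```

Pickling the whole pipeline, rather than only the regressor, keeps the fitted preprocessing together with the model, so the restored object can be applied to raw features directly.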

Was there another approach you were thinking about making? If so, what approach(es) were you thinking of?

I wanted to explore using more scaling methods, numerical features (for preproc2), numerical transformations, and scoring methods.

  • Scaling methods: try each scaling method and test its impact on the final model accuracy given the range of the data.

  • Numerical features: the features I expected to have a large impact on the model weren't particularly relevant, so I would have liked to see how different values would impact the model. In addition, I only used the 3 chosen non-linearly transformed features and would have liked to spend time adding the other features, linearly transformed, to see the final impact on the model.

  • Numerical transformations: similar to the above.

  • Scoring: I'm not fully happy with the output and SHAP scores of my model, so alongside the above I would like to see how other scoring methods rate different models and compare their accuracy.
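One possible way to run the scaling and scoring comparison listed above is sketched below. It is an assumption-laden illustration only: the synthetic data, the three scalers, the `Ridge` regressor, and the three metrics are hypothetical choices, not what the assignment used.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, RobustScaler, StandardScaler

# Skewed synthetic features (hypothetical; not the assignment dataset).
rng = np.random.default_rng(1)
X = rng.exponential(scale=5.0, size=(150, 3))
y = 2.0 * X[:, 0] + np.sqrt(X[:, 1]) + rng.normal(scale=0.5, size=150)

scalers = {
    "standard": StandardScaler(),
    "minmax": MinMaxScaler(),
    "robust": RobustScaler(),
}
# scikit-learn reports error metrics negated so that higher is always better.
scoring = ["r2", "neg_mean_absolute_error", "neg_root_mean_squared_error"]

results = {}
for name, scaler in scalers.items():
    # Same regressor each time, so only the scaler changes between runs.
    cv = cross_validate(make_pipeline(scaler, Ridge()), X, y, cv=5, scoring=scoring)
    results[name] = {metric: cv[f"test_{metric}"].mean() for metric in scoring}

for name, metrics in results.items():
    print(name, {metric: round(value, 3) for metric, value in metrics.items()})
```

Holding the regressor and the cross-validation folds fixed while swapping one component at a time makes it easier to attribute a change in score to the scaler or the metric rather than to chance.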

Were there any challenges? If so, what issue(s) did you face? How did you overcome it?

I faced a number of challenges with formatting and NaN values when trying to build Pipelines A and B. This is because I missed some formatting when creating preproc1 and preproc2 (specifically, listing the tuples properly) and did not ensure that zero-variance features and 0 values were handled.

I had to do a lot of Googling and documentation reading to understand what the issues were. Ultimately, I had to go back to the preceding step, rewatch lectures, and re-create everything in a similar fashion to understand what was breaking and why.

I initially ran some tests to ensure that neither X nor Y had any null values. Unfortunately, having no null values doesn't remove the need for zero-variance handling and imputer contingencies in the preprocessing, because 0 values (e.g., for rain) caused division by 0 and prevented the pipeline from operating properly.
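The kind of failure described above can be reproduced in a small sketch. The description does not say which transformation hit the division by 0, so this assumes a log transform on a rain column that contains zeros; the column layout and the step names are hypothetical. `VarianceThreshold`, `SimpleImputer`, and `np.log1p` are one possible set of guards, not necessarily the ones used in the assignment.

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler

# Hypothetical columns: rain (mostly 0), a constant column, and wind.
X = np.array(
    [
        [0.0, 1.0, 3.2],
        [0.0, 1.0, 4.1],
        [2.5, 1.0, 2.7],
        [0.0, 1.0, 5.0],
    ]
)

# A null check passes, yet the data can still break a transformation.
has_nulls = bool(np.isnan(X).any())

# log(0) is -inf; NumPy reports it as a divide-by-zero warning.
with np.errstate(divide="ignore"):
    naive_log_rain = np.log(X[:, 0])

# Guarded preprocessing: impute, drop constant columns, use log1p, then scale.
safe_preproc = Pipeline(
    [
        ("impute", SimpleImputer(strategy="median")),
        ("drop_constant", VarianceThreshold(threshold=0.0)),
        ("log1p", FunctionTransformer(np.log1p)),
        ("scale", StandardScaler()),
    ]
)
X_clean = safe_preproc.fit_transform(X)
```

Here `np.log1p` computes log(1 + x), which is defined at 0, and dropping the constant column removes a feature that carries no information for the regressor.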

How were these changes tested?

Lots and lots of re-running code blocks...

A reference to a related issue in your repository (if applicable)

N/A

Checklist

  • [x] I can confirm that my changes are working as intended

Final Comment

I'm not ultimately satisfied with the output of my model and would love pointed feedback on which scaling, transformations, and scoring (or other things, like features!) could have made the model more robust. :)

@Dmytro-Bonislavskyi Dmytro-Bonislavskyi left a comment

Hi @abeerkhe, please continue with the Tune Hyperparameters, Evaluation, Export, and Explain sections.
Right now, I can assess your assignment at about 40% (which might not be enough to pass).
If you finalize at least one more section, you can bring it up to around 60%.

abeerkhe commented Oct 7, 2025

> Hi @abeerkhe, please continue with the Tune Hyperparameters, Evaluation, Export, and Explain sections. Right now, I can assess your assignment at about 40% (which might not be enough to pass). If you finalize at least one more section, you can bring it up to around 60%.

I have uploaded a completed Assignment 2. I was given an extension on Assignments 1 and 2 by Jesus due to extenuating circumstances, but I wanted to upload a partially completed project first to show progress.

Please let me know of any additional feedback. Thank you!
