Fixed error 1 (OpenAI title sanitization) and added unit tests #466

Open
amahuli03 wants to merge 5 commits into CodeForPhilly:develop from amahuli03:390-bugfix-upload-file-endpoint

Conversation

@amahuli03
Collaborator

Description

Fixes the /v1/api/uploadFile endpoint failure that occurs when title extraction falls back to OpenAI. OpenAI sometimes returns titles wrapped in quotes (e.g., '"Updated CANMAT/ISBD Guidelines..."'). The raw response was saved without sanitization, which caused the 400 error in the reported logs. This PR strips wrapping quotes and whitespace from OpenAI-generated titles before returning them.

This PR also truncates titles to 255 characters so the title is guaranteed to fit within the CharField size limit. This is an extra safeguard in case OpenAI doesn't respect the 256-character limit specified in the prompt.
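
For illustration only, the sanitization described above could look roughly like the sketch below. The helper name `sanitize_title` and the constant `MAX_TITLE_LENGTH` are hypothetical and are not taken from this PR's diff; the actual change may be structured differently.

```python
# Hypothetical sketch; names are illustrative, not from the PR.
MAX_TITLE_LENGTH = 255  # assumed to match the CharField max_length


def sanitize_title(raw: str) -> str:
    """Strip surrounding whitespace and wrapping quotes, then truncate."""
    # Remove outer whitespace, then any leading/trailing quote characters,
    # then whitespace that was sitting inside the quotes.
    title = raw.strip().strip("\"'").strip()
    # Hard cap so the value always fits in the database column.
    return title[:MAX_TITLE_LENGTH]
```

Note that `str.strip("\"'")` removes any run of quote characters at either end, including unbalanced ones, which seems acceptable for titles but is a design choice worth confirming.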

Related Issue

Addresses part of #390

Manual Tests

(I'm not able to test manually right now; I'm looking for help getting my local dev environment and workflow set up.)

Automated Tests

Added 2 new unit tests to test_title.py:

  • test_strips_quotes_from_openai_title verifies wrapping quotes are removed from OpenAI-generated titles
  • test_truncates_long_openai_title verifies titles exceeding 255 characters are truncated

All tests passing (4 existing + 2 new)

Reviewers

@sahilds1

Notes

Also added some more logs to the error handler in views.py so errors include full traceback.
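
As a rough illustration of traceback logging (not the actual code in views.py), Python's standard `logging` module records the full traceback when `logger.exception` is called inside an `except` block. The function `handle_upload` and the returned dictionary shape are hypothetical.

```python
import logging

logger = logging.getLogger(__name__)


def handle_upload(process):
    """Run `process` and log the full traceback if it raises.

    Hypothetical stand-in for a view's error handler.
    """
    try:
        return process()
    except Exception as exc:
        # logger.exception logs at ERROR level and appends the traceback.
        logger.exception("uploadFile failed")
        # Assumed response shape, for illustration only.
        return {"status": 400, "message": str(exc)}
```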

This only addresses the first error from the original issue. The second error is caused by a separate bug, which I think is in the upload component in UploadFile.tsx, where we set Content-Type: multipart/form-data without providing a boundary parameter. It needs further investigation and manual testing.
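
To show why the boundary parameter matters, here is a minimal Python sketch that builds a multipart body by hand. It is not code from this repository, and `build_multipart` is a hypothetical helper. The server uses the boundary named in the Content-Type header to split the body into parts, so a header without it cannot be parsed.

```python
import uuid


def build_multipart(field: str, filename: str, data: bytes):
    """Return (content_type, body) for a single-file multipart upload."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    # The header must carry the same boundary string used in the body.
    content_type = f"multipart/form-data; boundary={boundary}"
    return content_type, head + data + tail
```

In browser code, the usual way to get a correct boundary is to leave Content-Type unset when sending a FormData body so the browser generates the header itself; whether that is the right fix here still needs the investigation mentioned above.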

(This is a separate issue) When I tried uploading files manually in my local dev environment, I got Unauthorized: /api/v1/api/uploadFile which is not a valid route, right? Has anyone seen this issue?

@sahilds1
Collaborator

sahilds1 commented Feb 26, 2026

> (This is a separate issue) When I tried uploading files manually in my local dev environment, I got Unauthorized: /api/v1/api/uploadFile which is not a valid route, right? Has anyone seen this issue?

@amahuli03 I merged your PR that fixed this issue (PR #468)! Do you need anything from me for further work on this PR?

@sahilds1 sahilds1 self-requested a review February 26, 2026 00:35
@amahuli03
Collaborator Author

@sahilds1 Nope, the other part of this issue is resolved now. I'll mark this as ready for review

@amahuli03 amahuli03 marked this pull request as ready for review February 26, 2026 17:02
doc = MagicMock()
doc.metadata = {"title": None}
doc.get_text.return_value = []
doc[0].get_text.return_value = []
Collaborator Author

I also found an issue with how I'm setting up the mocks in all three tests. generate_title effectively does this to the pdf: pdf[0].get_text("blocks").
The tests were passing regardless because MagicMock auto-creates a new attribute on access, so for the purposes of these tests it worked. I've fixed the setup so the mocks actually make sense.
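
The MagicMock behavior described in this comment can be reproduced in isolation. The sketch below uses only `unittest.mock`; the two factory function names are hypothetical and the real test setup in test_title.py may differ.

```python
from unittest.mock import MagicMock


def make_misconfigured_doc():
    """Configures doc.get_text, which pdf[0].get_text(...) never touches."""
    doc = MagicMock()
    doc.get_text.return_value = []
    return doc


def make_configured_doc():
    """Configures get_text on the object returned by indexing the doc."""
    doc = MagicMock()
    # doc[0] is doc.__getitem__.return_value, the same mock on every access.
    doc[0].get_text.return_value = []
    return doc
```

With the first setup, `doc[0].get_text("blocks")` silently returns an auto-created MagicMock instead of an empty list, which is why the tests could still pass without exercising the intended path.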

@amahuli03 amahuli03 requested a review from sahilds1 March 2, 2026 17:32