Make upload MIME validation robust when python-magic is unavailable #340
arpit-k16 wants to merge 4 commits into RohanExploit:main
Conversation
👷 Deploy request for fixmybharat pending review. Visit the deploys page to approve it.
🙏 Thank you for your contribution, @arpit-k16!
Note: The maintainers will monitor code quality and ensure the overall project flow isn't broken.
📝 Walkthrough
Adds an upload validation and detection pipeline in `backend/main.py`.

Sequence Diagram(s)
```mermaid
sequenceDiagram
participant Client
participant Server
participant Magic as python-magic
participant Fallback as Header/Guess
participant PIL
participant DB
participant Storage
Client->>Server: POST /upload (file)
Server->>Server: check_upload_limits(identifier)
Server->>Magic: detect MIME (optional)
alt magic available
Magic-->>Server: mime/type
else
Server->>Fallback: use Content-Type or mimetypes.guess_type
Fallback-->>Server: mime/type or None
end
Server->>Server: enforce size & ALLOWED_MIME_TYPES
alt MIME is image
Server->>PIL: open and validate image bytes
PIL-->>Server: valid + dimensions
alt dimensions > MAX_DIMENSION
Server->>PIL: resize to MAX_DIMENSION
PIL-->>Server: resized bytes
Server->>Storage: update file bytes
else
Server->>Storage: store original bytes
end
Server->>Server: process_and_detect -> detection_func
Server->>DB: (startup tables created) log/store results
else
Server-->>Client: HTTPException (invalid type/size)
end
    Server-->>Client: response (success/error)
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 3 checks passed
👋 Hello @arpit-k16! Thank you for your first Pull Request to VishwaGuru! 🎉
We appreciate your contribution to this open source platform empowering India's youth to engage with democracy.
What happens next?
- A maintainer will review your PR soon
- Please ensure all tests pass and the code follows our style guidelines
- Be ready to address any feedback or requested changes
Before your PR is merged:
- Ensure your code follows the project's coding standards
- All tests should pass
- Update documentation if needed
- Make sure your commits are clear and descriptive
Thank you for contributing! 🚀
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@backend/main.py`:
- Around line 217-224: The code sets file.size after rewinding the buffer so it
becomes 0; fix by measuring the buffer length before rewinding (or by seeking to
the end then using tell) and then assign that value to file.size — e.g., after
img.save(...) and before output.seek(0) capture size = output.tell() or
output.seek(0, io.SEEK_END); size = output.tell(); then set file.file = output,
file.size = size and finally rewind the stream (output.seek(0)); update the
block that manipulates output, img.save, and file to use this order.
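A minimal sketch of the ordering the prompt describes, using the prompt's own names (`img`, `output`, `file`; the surrounding handler is assumed, not shown in this thread):

```python
import io

output = io.BytesIO()
img.save(output, format="JPEG", quality=85)  # format choice is discussed in a later comment

size = output.tell()   # position after writing == number of bytes written
output.seek(0)         # rewind only after capturing the size

file.file = output     # swap in the resized bytes
file.size = size       # correct size, instead of 0 from a rewound stream
```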
🧹 Nitpick comments (2)
backend/main.py (2)
184-186: Document the security trade-off when using fallback MIME detection.
When `python-magic` is unavailable, the code falls back to `file.content_type` (client-provided, spoofable) and `mimetypes.guess_type` (extension-based, also spoofable). While the subsequent PIL validation (lines 201-204) provides a secondary security layer that ensures the file is actually a valid image, the MIME-type check alone becomes bypassable. Consider adding a comment documenting this security trade-off, or logging a more specific warning when fallback is used for a particular upload (not just at startup).
💡 Optional: Add per-upload warning when using fallback
```diff
 if not detected_mime:
     # Fallback: trust FastAPI's content_type header or guess from filename
     detected_mime = file.content_type or mimetypes.guess_type(file.filename or "")[0]
+    if detected_mime:
+        logger.debug(
+            f"MIME type for {file.filename} detected via fallback (content_type/extension): {detected_mime}. "
+            "Content will be validated by PIL."
+        )
```
226-231: Use `logger.exception` for automatic traceback inclusion.
`logger.exception` automatically includes the exception traceback, which is more informative than `logger.error` with manual formatting. The broad exception catch is acceptable here since PIL can raise various exception types for malformed images.
♻️ Proposed fix
```diff
 except Exception as pil_error:
-    logger.error(f"PIL validation failed for {file.filename}: {pil_error}")
+    logger.exception(f"PIL validation failed for {file.filename}: {pil_error}")
     raise HTTPException(
         status_code=400,
         detail="Invalid image file. The file appears to be corrupted or not a valid image."
     )
```
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/main.py (1)
424-428: ⚠️ Potential issue | 🔴 Critical
Add `await` to all `validate_uploaded_file()` calls; security validation is completely skipped without it.
`validate_uploaded_file` is async. Calling it without `await` in lines 427, 582, 606, 632, 658, and 684 creates a coroutine object that never executes, allowing unvalidated files through security checks entirely.
🛠️ Proposed fixes
Line 427:
```diff
 if image:
-    validate_uploaded_file(image)
+    await validate_uploaded_file(image)
```
Lines 582, 606, 632, 658, and 684 (same one-line change at each call site):
```diff
-validate_uploaded_file(image)
+await validate_uploaded_file(image)
```
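For illustration, a self-contained reproduction of the failure mode (a simplified stand-in, not the project's actual validator): calling an async function without `await` creates a coroutine object and discards it, so its body (including any exception that would reject the upload) never runs.

```python
import asyncio

async def validate_uploaded_file(image) -> None:
    raise ValueError("rejected")  # stands in for the real validation failure

async def handler() -> None:
    validate_uploaded_file(None)        # coroutine created and discarded: no error raised
    await validate_uploaded_file(None)  # actually executes and raises

asyncio.run(handler())  # only the awaited call raises (plus a "never awaited" RuntimeWarning)
```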
🤖 Fix all issues with AI agents
In `@backend/main.py`:
- Around line 181-208: The image validation/resizing block advances the file
pointer (Image.open(file.file) and img.verify()) but only rewinds when a resize
is performed; add a file.file.seek(0) after the PIL validation/resizing logic so
the pointer is always reset before returning (ensure this happens whether or not
the resize branch ran); update the code around Image.open(file.file) /
img.verify() and the replacement of file.file (and after recording resized_size)
so process_and_detect() and any subsequent Image.open(image.file) read from the
start of the stream.
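A sketch of the always-rewind shape this prompt asks for (variable names follow the prompt; `needs_resize` is a hypothetical stand-in for the real resize condition):

```python
img = Image.open(file.file)
img.verify()           # consumes the underlying stream while checking structure

if needs_resize:       # hypothetical flag; the real branch checks dimensions
    ...                # resize and replace file.file with a fresh BytesIO

file.file.seek(0)      # always rewind, whether or not the resize branch ran,
                       # so process_and_detect() reads from byte 0
```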
Actionable comments posted: 6
🤖 Fix all issues with AI agents
In `@backend/main.py`:
- Around line 1-11: The file is missing imports used by check_upload_limits and
_validate_uploaded_file_sync and also duplicates the httpx import; add "import
io" and "from datetime import datetime, timedelta" at the top of backend/main.py
so io.BytesIO(), datetime.now(), and timedelta(...) are defined, and remove the
duplicate httpx import (keep a single import). Update any import grouping to
avoid Ruff F811 redefinition errors and ensure check_upload_limits and
_validate_uploaded_file_sync can reference these symbols.
- Around line 104-123: The check_upload_limits function performs a non-atomic
get→filter→check→append→set on user_upload_cache which permits a TOCTOU race;
fix it by making the read-modify-write atomic for the given identifier—either
implement a per-identifier threading.Lock (e.g., a dict of locks keyed by
identifier) and wrap the body that reads current_uploads, filters old
timestamps, checks length, appends now, and calls user_upload_cache.set inside
that lock, or replace the list-based approach with an atomic counter in a remote
store (e.g., Redis INCR with TTL) and update check_upload_limits to use that
atomic increment/decrement TTL semantics instead of get/set on
user_upload_cache. Ensure you reference and update the logic inside
check_upload_limits and use user_upload_cache.get/set or the new atomic store
consistently.
- Around line 252-253: Remove the unguarded module-level call to
Base.metadata.create_all(bind=engine) so table creation does not run at import
time; the same operation with error handling already exists in lifespan(), so
delete the Base.metadata.create_all line (and any direct uses of engine for
schema creation at module scope) to avoid duplicate/unguarded startup DB access
and rely on the existing lifecycle-managed creation inside lifespan().
- Around line 235-242: The call to validate_image_for_processing(pil_image) in
the async block is missing an import; update backend/main.py to import
validate_image_for_processing from backend.pothole_detection (same symbol used
in backend/routers/detection.py) so the
run_in_threadpool(validate_image_for_processing, pil_image) call resolves, or
alternatively remove that call if you decide the existing synchronous PIL checks
in _validate_uploaded_file_sync are sufficient—ensure the chosen approach keeps
run_in_threadpool(Image.open, image.file) and the surrounding HTTPException
handling intact.
- Around line 57-59: Remove the duplicate import lines that re-import
EXCEPTION_HANDLERS, Base, engine, get_db, SessionLocal, and Issue; locate the
redundant imports (the second occurrences of EXCEPTION_HANDLERS and the group
importing Base, engine, get_db, SessionLocal, Issue) and delete them so each
symbol is imported only once (keep the original imports where they first appear
and remove the later duplicate lines).
- Around line 192-196: Before calling img.resize capture the original format
(e.g., original_format = img.format) and after resizing pass that
original_format into img.save instead of using img.format or blindly falling
back to 'JPEG'; if original_format is None, choose a safe fallback that
preserves alpha (use 'PNG' when img.mode indicates an alpha channel like 'RGBA'
or 'LA', otherwise 'JPEG'), and keep the existing quality parameter when saving;
update the block around img.resize, img.format, and img.save (referencing
img.resize, img.format, img.save, new_width/new_height) to implement this.
🧹 Nitpick comments (2)
backend/main.py (2)
206-211: Chain the re-raised `HTTPException` with `from` for proper tracebacks (Ruff B904).
Inside `except` blocks, raising a new exception without `from` makes the original cause invisible in tracebacks. This applies here and similarly on lines 242 and 250.
♻️ Proposed fix
```diff
 except Exception as pil_error:
     logger.error(f"PIL validation failed for {file.filename}: {pil_error}")
     raise HTTPException(
         status_code=400,
         detail="Invalid image file. The file appears to be corrupted or not a valid image."
-    )
+    ) from pil_error
```
146-176: Security note: MIME fallback path is header/extension-based and trivially spoofable.
When `python-magic` is absent, both `file.content_type` (client-supplied header) and `mimetypes.guess_type` (extension-based) can be faked by an attacker. The downstream PIL `verify()` check partially compensates, but a file with a valid image header followed by a malicious payload could still pass. Consider adding a comment documenting this accepted trade-off, so future maintainers understand the risk surface when `python-magic` is missing.
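For reference, a minimal sketch of the detection chain this note describes (the function name and 2048-byte header size are illustrative assumptions, not the PR's exact code):

```python
import mimetypes

try:
    import magic  # python-magic: detects MIME from file contents
except ImportError:
    magic = None

def detect_mime(file) -> str | None:
    if magic is not None:
        header = file.file.read(2048)
        file.file.seek(0)  # rewind for downstream readers
        return magic.from_buffer(header, mime=True)  # content-based, hard to spoof
    # Fallback: both values below are client-controlled, so PIL must re-validate
    return file.content_type or mimetypes.guess_type(file.filename or "")[0]
```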
```python
from fastapi import FastAPI, UploadFile, File, Form, HTTPException, Depends, BackgroundTasks, Request, Query
from fastapi.responses import JSONResponse
from fastapi.middleware.cors import CORSMiddleware
from fastapi.middleware.gzip import GZipMiddleware
from sqlalchemy.orm import Session
from sqlalchemy import func
from contextlib import asynccontextmanager
from functools import lru_cache
from typing import List, Optional, Any
from PIL import Image
import mimetypes
```
Missing `import io` and `from datetime import datetime, timedelta` will cause `NameError` at runtime.
`io.BytesIO()` is used on line 195, `datetime.now()` on line 109, and `timedelta(hours=1)` on line 110, but none of these are imported anywhere in the file. Both `check_upload_limits` and the image-resize path inside `_validate_uploaded_file_sync` will crash at runtime.
Also, `import httpx` on line 40 is a redefinition of the identical import on line 32 (Ruff F811).
🐛 Proposed fix — add missing imports and remove duplicate
```diff
 from fastapi import FastAPI, UploadFile, File, Form, HTTPException, Depends, BackgroundTasks, Request, Query
 from fastapi.responses import JSONResponse
 from fastapi.middleware.cors import CORSMiddleware
 from fastapi.middleware.gzip import GZipMiddleware
 from sqlalchemy.orm import Session
 from sqlalchemy import func
 from contextlib import asynccontextmanager
 from functools import lru_cache
 from typing import List, Optional, Any
 from PIL import Image
 import mimetypes
+import io
+from datetime import datetime, timedelta

 try:
     import magic
 except ImportError:
     magic = None
-import httpx
 from async_lru import alru_cache
```
Also applies to: 34-41
🤖 Prompt for AI Agents
In `@backend/main.py` around lines 1 - 11, The file is missing imports used by
check_upload_limits and _validate_uploaded_file_sync and also duplicates the
httpx import; add "import io" and "from datetime import datetime, timedelta" at
the top of backend/main.py so io.BytesIO(), datetime.now(), and timedelta(...)
are defined, and remove the duplicate httpx import (keep a single import).
Update any import grouping to avoid Ruff F811 redefinition errors and ensure
check_upload_limits and _validate_uploaded_file_sync can reference these
symbols.
```python
from backend.exceptions import EXCEPTION_HANDLERS
from backend.database import Base, engine, get_db, SessionLocal
from backend.models import Issue
```
🛠️ Refactor suggestion | 🟠 Major
Remove duplicate imports (Ruff F811).
Lines 58–59 re-import `Base`, `engine`, `get_db`, `SessionLocal`, and `Issue` that are already imported on lines 44–45. Line 57 re-imports `EXCEPTION_HANDLERS`, already imported on line 65. These redefinitions are flagged by Ruff and add confusion.
♻️ Proposed fix — remove the duplicate lines
```diff
-from backend.exceptions import EXCEPTION_HANDLERS
-from backend.database import Base, engine, get_db, SessionLocal
-from backend.models import Issue
```
🧰 Tools
🪛 Ruff (0.14.14)
[error] 58-58: Redefinition of unused `Base` from line 44 (F811)
[error] 58-58: Redefinition of unused `engine` from line 44 (F811)
[error] 58-58: Redefinition of unused `get_db` from line 44 (F811)
[error] 58-58: Redefinition of unused `SessionLocal` from line 44 (F811)
[error] 59-59: Redefinition of unused `Issue` from line 45 (F811)
🤖 Prompt for AI Agents
In `@backend/main.py` around lines 57 - 59, Remove the duplicate import lines that
re-import EXCEPTION_HANDLERS, Base, engine, get_db, SessionLocal, and Issue;
locate the redundant imports (the second occurrences of EXCEPTION_HANDLERS and
the group importing Base, engine, get_db, SessionLocal, Issue) and delete them
so each symbol is imported only once (keep the original imports where they first
appear and remove the later duplicate lines).
```python
def check_upload_limits(identifier: str, limit: int) -> None:
    """
    Check if the user/IP has exceeded upload limits using thread-safe cache.
    """
    current_uploads = user_upload_cache.get(identifier) or []
    now = datetime.now()
    one_hour_ago = now - timedelta(hours=1)

    # Filter out old timestamps (older than 1 hour)
    recent_uploads = [ts for ts in current_uploads if ts > one_hour_ago]

    if len(recent_uploads) >= limit:
        raise HTTPException(
            status_code=429,
            detail=f"Upload limit exceeded. Maximum {limit} uploads per hour allowed."
        )

    # Add current timestamp and update cache atomically
    recent_uploads.append(now)
    user_upload_cache.set(recent_uploads, identifier)
```
TOCTOU race in upload-limit enforcement.
The get → filter → check → append → set sequence on lines 108–123 is not atomic. Two concurrent requests for the same identifier can both read the same list, both pass the limit check, and both write back — allowing more uploads than `limit`. If `user_upload_cache` does not provide a compare-and-swap or locking primitive, consider wrapping this in a `threading.Lock` keyed by identifier, or using an atomic counter (e.g., Redis `INCR` with TTL) for correctness under load.
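A hedged sketch of the per-identifier locking option (the lock registry is an assumption; `user_upload_cache` and `HTTPException` are the module's existing names):

```python
import threading
from datetime import datetime, timedelta

_locks: dict[str, threading.Lock] = {}
_locks_guard = threading.Lock()

def _lock_for(identifier: str) -> threading.Lock:
    # Guarded setdefault so two threads never race on lock creation
    with _locks_guard:
        return _locks.setdefault(identifier, threading.Lock())

def check_upload_limits(identifier: str, limit: int) -> None:
    with _lock_for(identifier):  # read-modify-write is now atomic per identifier
        current = user_upload_cache.get(identifier) or []
        now = datetime.now()
        recent = [ts for ts in current if ts > now - timedelta(hours=1)]
        if len(recent) >= limit:
            raise HTTPException(
                status_code=429,
                detail=f"Upload limit exceeded. Maximum {limit} uploads per hour allowed."
            )
        recent.append(now)
        user_upload_cache.set(recent, identifier)
```

Note the locks only help within a single process; multi-worker deployments still need a shared store (see the Redis sketch further down).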
🧰 Tools
🪛 Ruff (0.14.14)
[error] 109-109: Undefined name `datetime` (F821)
[error] 110-110: Undefined name `timedelta` (F821)
🤖 Prompt for AI Agents
In `@backend/main.py` around lines 104 - 123, The check_upload_limits function
performs a non-atomic get→filter→check→append→set on user_upload_cache which
permits a TOCTOU race; fix it by making the read-modify-write atomic for the
given identifier—either implement a per-identifier threading.Lock (e.g., a dict
of locks keyed by identifier) and wrap the body that reads current_uploads,
filters old timestamps, checks length, appends now, and calls
user_upload_cache.set inside that lock, or replace the list-based approach with
an atomic counter in a remote store (e.g., Redis INCR with TTL) and update
check_upload_limits to use that atomic increment/decrement TTL semantics instead
of get/set on user_upload_cache. Ensure you reference and update the logic
inside check_upload_limits and use user_upload_cache.get/set or the new atomic
store consistently.
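The Redis alternative from the same prompt could look like this (a sketch assuming a `redis-py` client named `r`; the key naming is illustrative):

```python
def check_upload_limits(identifier: str, limit: int) -> None:
    key = f"uploads:{identifier}"
    count = r.incr(key)       # atomic increment, safe across threads and workers
    if count == 1:
        r.expire(key, 3600)   # start the 1-hour window on the first upload
    if count > limit:
        raise HTTPException(
            status_code=429,
            detail=f"Upload limit exceeded. Maximum {limit} uploads per hour allowed."
        )
```

Note this is a fixed one-hour window rather than the sliding window of the timestamp-list version; that semantic difference is worth flagging if this route is taken.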
```python
img = img.resize((new_width, new_height), Image.Resampling.LANCZOS)

# Save resized image back to file
output = io.BytesIO()
img.save(output, format=img.format or 'JPEG', quality=85)
```
`img.format` is None after `resize()` — fallback to `'JPEG'` may corrupt alpha-channel images.
`Image.resize()` returns a new Image object whose `.format` attribute is None. The `or 'JPEG'` fallback will always activate. For PNG/GIF/WebP uploads with an alpha channel, saving as JPEG will either raise an error (cannot write mode RGBA as JPEG) or silently drop transparency.
Capture the original format before resizing and use it in `img.save()`:
🐛 Proposed fix
```diff
 img = Image.open(file.file)
 if img.width > 1024 or img.height > 1024:
+    original_format = img.format or 'JPEG'
     # Calculate new size maintaining aspect ratio
     ratio = min(1024 / img.width, 1024 / img.height)
     new_width = int(img.width * ratio)
     new_height = int(img.height * ratio)
     img = img.resize((new_width, new_height), Image.Resampling.LANCZOS)

     # Save resized image back to file
     output = io.BytesIO()
-    img.save(output, format=img.format or 'JPEG', quality=85)
+    img.save(output, format=original_format, quality=85)
```
🧰 Tools
🪛 Ruff (0.14.14)
[error] 195-195: Undefined name `io` (F821)
🤖 Prompt for AI Agents
In `@backend/main.py` around lines 192 - 196, Before calling img.resize capture
the original format (e.g., original_format = img.format) and after resizing pass
that original_format into img.save instead of using img.format or blindly
falling back to 'JPEG'; if original_format is None, choose a safe fallback that
preserves alpha (use 'PNG' when img.mode indicates an alpha channel like 'RGBA'
or 'LA', otherwise 'JPEG'), and keep the existing quality parameter when saving;
update the block around img.resize, img.format, and img.save (referencing
img.resize, img.format, img.save, new_width/new_height) to implement this.
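A sketch of the alpha-aware variant the prompt describes (mode handling per the prompt; surrounding names assumed from the diff above):

```python
original_format = img.format  # capture before resize(), which returns an image with format=None
img = img.resize((new_width, new_height), Image.Resampling.LANCZOS)

if original_format is None:
    # Preserve transparency when the mode carries an alpha channel
    original_format = 'PNG' if img.mode in ('RGBA', 'LA') else 'JPEG'

output = io.BytesIO()
img.save(output, format=original_format, quality=85)  # PNG ignores the JPEG quality hint
```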
```python
pil_image = await run_in_threadpool(Image.open, image.file)
# Validate image for processing
await run_in_threadpool(validate_image_for_processing, pil_image)
except HTTPException:
    raise  # Re-raise HTTP exceptions from validation
except Exception as e:
    logger.error(f"Error during background initialization: {e}", exc_info=True)
    logger.error(f"Invalid image file during processing: {e}", exc_info=True)
    raise HTTPException(status_code=400, detail="Invalid image file")
```
🧩 Analysis chain
🏁 Script executed:
```bash
# Search for definition of validate_image_for_processing
rg -nP --type=py 'def validate_image_for_processing'
```
Length of output: 141
🏁 Script executed:
```bash
# Search for imports of validate_image_for_processing
rg -nP --type=py 'from .* import.*validate_image_for_processing|import.*validate_image_for_processing'
```
Length of output: 380
🏁 Script executed:
```bash
# Check the imports and context in backend/main.py around line 237
head -n 250 backend/main.py | tail -n +1 | head -n 50
```
Length of output: 1939
🏁 Script executed:
```bash
# Get a broader view of imports at the top of backend/main.py
head -n 100 backend/main.py
```
Length of output: 3750
🏁 Script executed:
```bash
# Check all imports in backend/main.py to see if validate_image_for_processing appears anywhere
rg -n 'from backend|import.*validate_image_for_processing' backend/main.py
```
Length of output: 987
🏁 Script executed:
```bash
# Verify the actual code around line 237 in backend/main.py
sed -n '230,245p' backend/main.py
```
Length of output: 692
Import `validate_image_for_processing` from `backend.pothole_detection`.
Line 237 calls `validate_image_for_processing(pil_image)`, but this function is not imported in `backend/main.py`. The function is defined in `backend.pothole_detection` and already imported in other files like `backend/routers/detection.py`. Add it to the imports at the top of the file, or remove the call if the PIL validation in `_validate_uploaded_file_sync` is sufficient.
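Per the verification above, the fix is a one-line import:

```python
from backend.pothole_detection import validate_image_for_processing
```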
🧰 Tools
🪛 Ruff (0.14.14)
[error] 237-237: Undefined name `validate_image_for_processing` (F821)
[warning] 242-242: Within an except clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
🤖 Prompt for AI Agents
In `@backend/main.py` around lines 235 - 242, The call to
validate_image_for_processing(pil_image) in the async block is missing an
import; update backend/main.py to import validate_image_for_processing from
backend.pothole_detection (same symbol used in backend/routers/detection.py) so
the run_in_threadpool(validate_image_for_processing, pil_image) call resolves,
or alternatively remove that call if you decide the existing synchronous PIL
checks in _validate_uploaded_file_sync are sufficient—ensure the chosen approach
keeps run_in_threadpool(Image.open, image.file) and the surrounding
HTTPException handling intact.
```python
# Create tables if they don't exist
Base.metadata.create_all(bind=engine)
```
Redundant and unguarded `Base.metadata.create_all` at module level.
This synchronous call runs at import time without error handling, blocking the module load and crashing if the database is unreachable. The same operation is already performed inside `lifespan()` (line 265) with proper error handling. Remove this line to avoid the duplicate call and the risk of an unhandled startup crash.
🐛 Proposed fix
```diff
-# Create tables if they don't exist
-Base.metadata.create_all(bind=engine)
```
🤖 Prompt for AI Agents
In `@backend/main.py` around lines 252 - 253, Remove the unguarded module-level
call to Base.metadata.create_all(bind=engine) so table creation does not run at
import time; the same operation with error handling already exists in
lifespan(), so delete the Base.metadata.create_all line (and any direct uses of
engine for schema creation at module scope) to avoid duplicate/unguarded startup
DB access and rely on the existing lifecycle-managed creation inside lifespan().
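For context, a minimal sketch of the lifecycle-managed shape the prompt refers to (the project's actual `lifespan()` reportedly already does this with error handling; the details here are assumed):

```python
from contextlib import asynccontextmanager
from fastapi import FastAPI

@asynccontextmanager
async def lifespan(app: FastAPI):
    try:
        Base.metadata.create_all(bind=engine)  # runs once at startup, not at import time
    except Exception:
        logger.exception("Table creation failed at startup")
    yield  # application serves requests; shutdown cleanup would go after this

app = FastAPI(lifespan=lifespan)
```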
Summary
This PR makes image upload validation more robust in environments where `python-magic` is missing or misconfigured, while preserving all existing security checks.
Changes
- Import `magic` defensively in `backend/main.py`, falling back to `None` if the library is not available (see the sketch after this list).
- If `python-magic` is available, continue to use it to detect MIME type from the file contents.
- If `python-magic` is not available or fails, fall back to:
  - the `UploadFile.content_type` header, or
  - `mimetypes.guess_type` based on the filename.
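A sketch of that defensive import:

```python
try:
    import magic  # python-magic: optional, content-based MIME detection
except ImportError:
    magic = None  # signals the fallback path (Content-Type header / file extension)
```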
Rationale
Previously, the backend assumed `python-magic` was always installed. In environments where it was missing or misconfigured, the app could fail at import time or when validating uploads.
This change:
- keeps uploads validated even when `python-magic` is unavailable.
- keeps content-based MIME detection when `python-magic` is installed.
Testing
- Changes are limited to `backend/main.py`.
- With `python-magic` present: MIME detection continues to use content-based detection.
- Without `python-magic`: uploads are still validated via content type / filename plus PIL checks, and invalid files are rejected with `HTTP 400`.
- Existing tests that import `magic` (e.g. `backend/tests/test_detection_bytes.py`) remain compatible because the import name and usage are unchanged.
Summary by CodeRabbit
New Features
Improvements