Skip to content

Conversation

@Integer-Ctrl
Copy link
Contributor

@Integer-Ctrl Integer-Ctrl commented Dec 4, 2025

Summary by CodeRabbit

  • New Features

    • New delete command to remove groups, artifacts, versions or files with interactive confirmation, dry‑run and force modes; CLI accepts an API key for authenticated deletes.
  • Documentation

    • README adds a dedicated Delete section with examples (Python and Docker), updated TOC/anchors, help output samples and corrected formatting/links.
  • Improvements

    • Download and retrieval flows accept API‑key authenticated fetches, have tighter timeouts/retries, and more robust JSON‑LD handling.
  • Bug Fixes

    • Minor wording and typo corrections in CLI help and docs.

✏️ Tip: You can customize this high-level summary in your review settings.

@Integer-Ctrl Integer-Ctrl self-assigned this Dec 4, 2025
@coderabbitai
Copy link

coderabbitai bot commented Dec 4, 2025

📝 Walkthrough

Walkthrough

Adds a new deletion API and CLI command for Databus resources, new URI parsing and JSON‑LD fetch utilities, propagates optional databus_key authentication through the client download flow, and updates README with detailed Delete usage and examples.

Changes

Cohort / File(s) Change Summary
Documentation
README.md
Added "Delete" section under CLI Usage, updated Table of Contents and module anchor, fixed wording/formatting, and expanded delete examples (Python and Docker) and CLI help output samples.
Deletion API
databusclient/api/delete.py
New module implementing interactive and programmatic deletion: per-resource confirmation, single- and batch-delete helpers, artifact (delete versions → artifact) and group (resolve artifacts → delete each → group) workflows; supports dry-run, force, API-key validation, logging, and HTTP error handling. Exposes delete(databusURIs, databus_key, dry_run, force).
Utilities
databusclient/api/utils.py
New helpers: get_databus_id_parts_from_uri(uri) to parse Databus URIs into parts and get_json_ld_from_databus(uri, databus_key=None) to fetch JSON‑LD with optional X-API-KEY, timeout, and raise-on-error.
CLI integration
databusclient/cli.py
Added delete CLI command wired to the deletion API with options --databus-key, --dry-run, and --force; imports deletion API and fixed a small help-text typo.
Client download flow
databusclient/client.py
Propagated databus_key through download flow: updated function signatures (__query_sparql__, __handle_databus_file_query__, __download_list__, download, __handle_databus_collection__), replaced internal URI/JSON‑LD helpers with new utils, adjusted timeouts and authentication header usage, and changed some query result handling to generators.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant CLI as CLI
  participant DeleteAPI as databusclient.api.delete
  participant Utils as databusclient.api.utils
  participant Databus as Databus API

  CLI->>DeleteAPI: delete(databusURIs, databus_key, dry_run, force)
  DeleteAPI->>Utils: get_databus_id_parts_from_uri(uri)
  alt resource requires JSON-LD (artifact or group)
    DeleteAPI->>Utils: get_json_ld_from_databus(resource_uri, databus_key)
    Utils->>Databus: HTTP GET Accept: application/ld+json\n(X-API-KEY if provided)
    Databus-->>Utils: 200 + JSON-LD
    Utils-->>DeleteAPI: JSON-LD
    DeleteAPI->>DeleteAPI: parse JSON-LD (versions / artifacts)
  end
  loop delete resources
    DeleteAPI->>Databus: HTTP DELETE (resource URL, X-API-KEY if present) or log (dry-run)
    Databus-->>DeleteAPI: 200/204 or error
  end
  DeleteAPI-->>CLI: summary / exit code
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Pay special attention to databusclient/api/delete.py (orchestration, user prompts vs forced operations, dry-run branching, HTTP error handling).
  • Verify header and timeout handling in databusclient/api/utils.py.
  • Review databusclient/client.py signature changes and generator vs list behavior for callers.
  • Confirm CLI option parsing and wiring in databusclient/cli.py.

Possibly related PRs

  • Finalize 256 #36 — Related changes around propagating databus_key through CLI and client code.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'cli: delete datasets' is concise and clearly conveys the primary change: adding delete functionality to the CLI interface.
Docstring Coverage ✅ Passed Docstring coverage is 80.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch delete-datasets

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
databusclient/client.py (1)

525-548: Add timeouts to HTTP requests to prevent indefinite hangs.

Multiple requests.get calls in the download flow lack timeouts. This can cause the application to hang indefinitely if a server is unresponsive.

Apply these changes:

     # --- 2. Try direct GET ---
-    response = requests.get(url, stream=True, allow_redirects=True)
+    response = requests.get(url, stream=True, allow_redirects=True, timeout=30)
     www = response.headers.get('WWW-Authenticate', '')
     ...
         # --- 4. Retry with token ---
-        response = requests.get(url, headers=headers, stream=True)
+        response = requests.get(url, headers=headers, stream=True, timeout=30)
     
     # Databus API key required if only 401 Unauthorized
     elif response.status_code == 401:
         ...
-        response = requests.get(url, headers=headers, stream=True)
+        response = requests.get(url, headers=headers, stream=True, timeout=30)
🧹 Nitpick comments (3)
README.md (1)

70-71: Wrap bare URLs in angle brackets for proper Markdown rendering.

Bare URLs should be wrapped in <> for consistent Markdown formatting.

-1. If you do not have a DBpedia Account yet (Forum/Databus), please register at https://account.dbpedia.org
-2. Log in at https://account.dbpedia.org and create your token.
+1. If you do not have a DBpedia Account yet (Forum/Databus), please register at <https://account.dbpedia.org>
+2. Log in at <https://account.dbpedia.org> and create your token.
databusclient/api/delete.py (1)

154-154: Address TODO comment.

There's a TODO to add documentation to README.md. The delete command appears to be documented in the README, so this TODO may be outdated.

Would you like me to verify if the README documentation is complete, or should this TODO be removed?

databusclient/client.py (1)

727-743: Prefix unused host variable with underscore.

The host variable from URI parsing is unpacked but never used. Prefix with underscore to indicate intentional non-use.

         if localDir is None:
-            host, account, group, artifact, version, file = get_databus_id_parts_from_uri(url)
+            _host, account, group, artifact, version, file = get_databus_id_parts_from_uri(url)
             fileLocalDir = os.path.join(os.getcwd(), account, group, artifact, version if version is not None else "latest")
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b70f59a and cb91775.

📒 Files selected for processing (5)
  • README.md (7 hunks)
  • databusclient/api/delete.py (1 hunks)
  • databusclient/api/utils.py (1 hunks)
  • databusclient/cli.py (2 hunks)
  • databusclient/client.py (10 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
databusclient/cli.py (2)
databusclient/api/delete.py (1)
  • delete (155-188)
databusclient/client.py (1)
  • download (747-827)
databusclient/api/delete.py (2)
databusclient/api/utils.py (2)
  • get_databus_id_parts_from_uri (4-18)
  • get_json_ld_from_databus (20-37)
databusclient/cli.py (1)
  • delete (121-134)
databusclient/client.py (2)
databusclient/api/utils.py (2)
  • get_databus_id_parts_from_uri (4-18)
  • get_json_ld_from_databus (20-37)
databusclient/cli.py (1)
  • download (102-114)
🪛 LanguageTool
README.md

[style] ~158-~158: To form a complete sentence, be sure to include a subject or ‘there’.
Context: ...LOADTARGET ``` - $DOWNLOADTARGET - Can be any Databus URI including collection...

(MISSING_IT_THERE)


[style] ~440-~440: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...re proceeding with the deletion. If you want to skip this prompt, you can use the `--fo...

(REP_WANT_TO_VB)

🪛 markdownlint-cli2 (0.18.1)
README.md

20-20: Link fragments should be valid

(MD051, link-fragments)


70-70: Bare URL used

(MD034, no-bare-urls)


71-71: Bare URL used

(MD034, no-bare-urls)


87-87: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


442-442: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


444-444: Multiple headings with the same content

(MD024, no-duplicate-heading)

🪛 Ruff (0.14.7)
databusclient/api/utils.py

20-20: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


34-34: Probable use of requests call without timeout

(S113)

databusclient/api/delete.py

54-54: Avoid specifying long messages outside the exception class

(TRY003)


57-57: Avoid specifying long messages outside the exception class

(TRY003)


68-68: Probable use of requests call without timeout

(S113)


73-73: Create your own exception

(TRY002)


73-73: Avoid specifying long messages outside the exception class

(TRY003)


88-88: Unused function argument: force

(ARG001)


112-112: Avoid specifying long messages outside the exception class

(TRY003)


170-170: Unpacked variable host is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


170-170: Unpacked variable account is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)

databusclient/client.py

525-525: Probable use of requests call without timeout

(S113)


545-545: Avoid specifying long messages outside the exception class

(TRY003)


548-548: Probable use of requests call without timeout

(S113)


719-719: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


724-724: Probable use of requests call without timeout

(S113)


729-729: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


730-730: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


731-731: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


732-732: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


736-736: Unpacked variable host is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)

🔇 Additional comments (7)
databusclient/api/utils.py (1)

4-18: LGTM - Clean URI parsing utility.

The function correctly strips the protocol prefix, splits on /, and pads to ensure 6 parts. The return type annotation properly documents the tuple structure.

databusclient/api/delete.py (1)

7-30: LGTM - Well-structured confirmation flow.

The confirmation function provides clear user prompts with appropriate options (confirm/skip/cancel) and handles invalid input gracefully.

databusclient/cli.py (2)

98-114: LGTM - Clean integration of authentication parameters.

The download command correctly passes vault_token and databus_key to the client. Note: the typo in line 99 ("donwload") was flagged in the README review.


116-134: LGTM - Well-structured delete command.

The delete command properly delegates to api_delete with all required parameters. The --dry-run flag as is_flag=True and required --databus-key are correctly configured.

databusclient/client.py (3)

11-12: LGTM - Clean import of new utilities.

The refactored utility functions are properly imported, replacing the previous internal implementations.


733-741: Potential bug: fileLocalDir reset only when localDir is None.

When localDir is provided, fileLocalDir is initialized to localDir before the loop but never updated inside. This is correct behavior. However, when localDir is None, the variable is recomputed for each URL, which correctly uses the URL's path structure.


768-811: Consistent propagation of databus_key throughout download flow.

The databus_key parameter is correctly passed to all nested functions including get_json_ld_from_databus, __handle_databus_collection__, and __download_list__. This ensures authenticated access works for all download scenarios.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
databusclient/client.py (1)

721-726: Add timeout and fix implicit Optional type hint.

The past review comment about missing timeout and implicit Optional type hint has not been addressed. The requests.get call can hang indefinitely without a timeout.

-def __handle_databus_collection__(uri: str, databus_key: str = None) -> str:
+def __handle_databus_collection__(uri: str, databus_key: str | None = None) -> str:
     headers = {"Accept": "text/sparql"}
     if databus_key is not None:
         headers["X-API-KEY"] = databus_key

-    return requests.get(uri, headers=headers).text
+    return requests.get(uri, headers=headers, timeout=30).text
🧹 Nitpick comments (4)
databusclient/api/delete.py (3)

21-30: Consider using in operator for cleaner conditionals.

The multiple or comparisons can be simplified for readability.

     while True:
         choice = input("Type 'yes'/'y' to confirm, 'skip'/'s' to skip this resource, or 'cancel'/'c' to abort: ").strip().lower()
-        if choice == "yes" or choice == "y":
+        if choice in ("yes", "y"):
             return "confirm"
-        elif choice == "skip" or choice == "s":
+        elif choice in ("skip", "s"):
             return "skip"
-        elif choice == "cancel" or choice == "c":
+        elif choice in ("cancel", "c"):
             return "cancel"
         else:
             print("Invalid input. Please type 'yes'/'y', 'skip'/'s', or 'cancel'/'c'.")

53-54: Consider using a custom exception instead of KeyboardInterrupt.

KeyboardInterrupt is conventionally used for Ctrl+C signals, not for programmatic flow control. Using it here could confuse exception handlers that specifically catch keyboard interrupts.

Consider defining a custom exception at the top of the file:

class DeletionCancelledError(Exception):
    """Raised when user cancels the deletion process."""
    pass

Then use it instead:

         if action == "cancel":
-            raise KeyboardInterrupt("Deletion cancelled by user.")
+            raise DeletionCancelledError("Deletion cancelled by user.")

The CLI layer can catch this specific exception and exit gracefully.


153-154: Remove stale TODO comment.

The README.md already includes comprehensive documentation for the delete command (lines 400-474), so this TODO is no longer needed.

-# TODO: add to README.md
 def delete(databusURIs: List[str], databus_key: str, dry_run: bool, force: bool):
databusclient/client.py (1)

737-739: Prefix unused variable with underscore.

The host variable is unpacked but never used. Prefix it with an underscore to indicate it's intentionally unused.

     for url in urls:
         if localDir is None:
-            host, account, group, artifact, version, file = get_databus_id_parts_from_uri(url)
+            _host, account, group, artifact, version, file = get_databus_id_parts_from_uri(url)
             fileLocalDir = os.path.join(os.getcwd(), account, group, artifact, version if version is not None else "latest")
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d86d06b and 9d6de50.

📒 Files selected for processing (5)
  • README.md (5 hunks)
  • databusclient/api/delete.py (1 hunks)
  • databusclient/api/utils.py (1 hunks)
  • databusclient/cli.py (3 hunks)
  • databusclient/client.py (9 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • databusclient/api/utils.py
  • databusclient/cli.py
🧰 Additional context used
🧬 Code graph analysis (1)
databusclient/client.py (1)
databusclient/api/utils.py (2)
  • get_databus_id_parts_from_uri (4-18)
  • get_json_ld_from_databus (20-37)
🪛 LanguageTool
README.md

[style] ~440-~440: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...re proceeding with the deletion. If you want to skip this prompt, you can use the `--fo...

(REP_WANT_TO_VB)

🪛 Ruff (0.14.7)
databusclient/client.py

571-571: Avoid specifying long messages outside the exception class

(TRY003)


738-738: Unpacked variable host is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)

databusclient/api/delete.py

54-54: Avoid specifying long messages outside the exception class

(TRY003)


57-57: Avoid specifying long messages outside the exception class

(TRY003)


73-73: Create your own exception

(TRY002)


73-73: Avoid specifying long messages outside the exception class

(TRY003)


112-112: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build
  • GitHub Check: build
🔇 Additional comments (5)
README.md (1)

400-442: LGTM! Delete command documentation is well-structured.

The delete section provides comprehensive documentation with clear examples for version, artifact, group, and collection deletion. The anchor cli-delete now correctly matches the TOC link, and the examples follow the established pattern from the download section.

Minor note: Line 440 says "As security measure" - consider adding the article "a" for grammatical correctness ("As a security measure"), though this is a very minor polish item.

databusclient/api/delete.py (1)

120-151: LGTM! Group deletion logic is well-structured.

The function correctly:

  • Fetches the group JSON-LD with authentication
  • Filters to artifact URIs only (excluding versions)
  • Recursively deletes all artifacts and their versions
  • Finally deletes the group itself
databusclient/client.py (3)

11-12: Good refactoring to use shared utilities.

Extracting get_databus_id_parts_from_uri and get_json_ld_from_databus to a shared utils module and using them here improves code reuse and consistency across the codebase.


624-643: LGTM! SPARQL query authentication support.

The addition of databus_key parameter with proper header injection via setCustomHttpHeaders correctly enables authenticated SPARQL queries when an API key is provided.


771-830: LGTM! Consistent databus_key propagation throughout download flow.

The databus_key parameter is correctly propagated through all download paths:

  • Collection queries (line 783-785)
  • File downloads (line 788)
  • Version JSON-LD fetches (line 791)
  • Artifact JSON-LD fetches (lines 796, 799)
  • Group JSON-LD fetches (lines 805, 809, 812)
  • SPARQL query handling (line 829)

This ensures authenticated access works consistently across all download scenarios.

Comment on lines 100 to 112
artifact_body = get_json_ld_from_databus(databusURI, databus_key)

json_dict = json.loads(artifact_body)
versions = json_dict.get("databus:hasVersion")

# Single version case {}
if isinstance(versions, dict):
versions = [versions]
# Multiple versions case [{}, {}]

version_uris = [v["@id"] for v in versions if "@id" in v]
if not version_uris:
raise ValueError("No versions found in artifact JSON-LD")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Handle missing or None versions gracefully.

If get_json_ld_from_databus returns invalid JSON or databus:hasVersion is missing/None, the code will fail. The versions variable could be None if the key doesn't exist.

Apply this diff to handle the edge case:

     artifact_body = get_json_ld_from_databus(databusURI, databus_key)

     json_dict = json.loads(artifact_body)
     versions = json_dict.get("databus:hasVersion")

+    if versions is None:
+        raise ValueError(f"No versions found in artifact JSON-LD for {databusURI}")
+
     # Single version case {}
     if isinstance(versions, dict):
         versions = [versions]
     # Multiple versions case [{}, {}]

     version_uris = [v["@id"] for v in versions if "@id" in v]
     if not version_uris:
         raise ValueError("No versions found in artifact JSON-LD")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
artifact_body = get_json_ld_from_databus(databusURI, databus_key)
json_dict = json.loads(artifact_body)
versions = json_dict.get("databus:hasVersion")
# Single version case {}
if isinstance(versions, dict):
versions = [versions]
# Multiple versions case [{}, {}]
version_uris = [v["@id"] for v in versions if "@id" in v]
if not version_uris:
raise ValueError("No versions found in artifact JSON-LD")
artifact_body = get_json_ld_from_databus(databusURI, databus_key)
json_dict = json.loads(artifact_body)
versions = json_dict.get("databus:hasVersion")
if versions is None:
raise ValueError(f"No versions found in artifact JSON-LD for {databusURI}")
# Single version case {}
if isinstance(versions, dict):
versions = [versions]
# Multiple versions case [{}, {}]
version_uris = [v["@id"] for v in versions if "@id" in v]
if not version_uris:
raise ValueError("No versions found in artifact JSON-LD")
🧰 Tools
🪛 Ruff (0.14.7)

112-112: Avoid specifying long messages outside the exception class

(TRY003)

🤖 Prompt for AI Agents
In databusclient/api/delete.py around lines 100 to 112, the code assumes
get_json_ld_from_databus returns valid JSON and that "databus:hasVersion"
exists; if the fetch returns invalid JSON or the key is missing/None the code
will error. Wrap json.loads(artifact_body) in a try/except to catch
JSONDecodeError and raise a clear ValueError, then normalize versions by
checking if versions is None and setting it to an empty list (or raising a
ValueError with a clear message), handle single-version dict -> [dict] and
ensure versions is a list before iterating, and finally raise the existing
ValueError("No versions found...") if version_uris is empty.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
databusclient/client.py (1)

646-654: Type hint mismatch: function is a generator but declares List[str] return type.

The function uses yield making it a generator, but the return type annotation says List[str]. This works at runtime since callers iterate over results, but may cause type checker warnings.

-def __handle_databus_file_query__(endpoint_url, query, databus_key=None) -> List[str]:
+def __handle_databus_file_query__(endpoint_url, query, databus_key=None) -> Generator[str, None, None]:

Also add Generator to the imports:

-from typing import List, Dict, Tuple, Optional, Union
+from typing import List, Dict, Tuple, Optional, Union, Generator
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9d6de50 and 565ad59.

📒 Files selected for processing (2)
  • databusclient/api/delete.py (1 hunks)
  • databusclient/client.py (13 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
databusclient/client.py (1)
databusclient/api/utils.py (2)
  • get_databus_id_parts_from_uri (4-18)
  • get_json_ld_from_databus (20-37)
databusclient/api/delete.py (2)
databusclient/api/utils.py (2)
  • get_databus_id_parts_from_uri (4-18)
  • get_json_ld_from_databus (20-37)
databusclient/cli.py (1)
  • delete (121-134)
🪛 Ruff (0.14.7)
databusclient/client.py

571-571: Avoid specifying long messages outside the exception class

(TRY003)

databusclient/api/delete.py

54-54: Avoid specifying long messages outside the exception class

(TRY003)


57-57: Avoid specifying long messages outside the exception class

(TRY003)


73-73: Create your own exception

(TRY002)


73-73: Avoid specifying long messages outside the exception class

(TRY003)


114-114: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build
  • GitHub Check: build
🔇 Additional comments (10)
databusclient/api/delete.py (5)

7-30: LGTM!

The confirmation flow handles all user input cases correctly with clear messaging.


33-73: LGTM!

The timeout has been added to the DELETE request (line 68), addressing the previous review concern. The confirmation, dry-run, and error handling flow is correct.


76-85: LGTM!

Simple and correct batch deletion with proper parameter propagation.


122-153: LGTM!

The group deletion flow correctly cascades through artifacts and their versions before deleting the group itself, with proper parameter propagation.


155-188: LGTM!

The orchestrator correctly routes to appropriate deletion handlers based on URI structure. Previous review issues (shell-style variable, unused variables) have been addressed.

databusclient/client.py (5)

11-12: LGTM!

Consolidating URI parsing and JSON-LD fetching into shared utilities improves maintainability.


525-548: Good addition of timeouts.

Consistent 30-second timeouts on all request paths prevent indefinite hangs on network issues.


624-643: LGTM!

The optional databus_key parameter correctly propagates authentication to SPARQL queries.


721-726: LGTM!

Previous review concerns addressed: timeout added and explicit str | None type hint replaces implicit Optional.


771-830: LGTM!

The databus_key is consistently propagated through all download paths (collections, files, versions, artifacts, groups, and SPARQL queries), enabling authenticated access to protected resources.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
databusclient/api/delete.py (2)

33-73: Tighten HTTP error handling and use a more specific exception type

The basic flow (confirm → build headers → dry‑run → DELETE) looks fine, but two aspects could be improved:

  • Wrap the requests.delete call in a try/except for requests.Timeout and requests.RequestException so network issues produce a concise, contextual error instead of a raw stack trace.
  • Avoid raising bare Exception; introduce a small, dedicated exception (e.g., DatabusDeleteError) or at least a more specific builtin like RuntimeError, and centralize the message there. This matches the Ruff TRY00x hints and makes failures easier to distinguish programmatically.

Example sketch:

+class DatabusDeleteError(RuntimeError):
+    """Deletion of a Databus resource failed."""
+
 def _delete_resource(databusURI: str, databus_key: str, dry_run: bool = False, force: bool = False):
@@
-    response = requests.delete(databusURI, headers=headers, timeout=30)
-
-    if response.status_code in (200, 204):
-        print(f"Successfully deleted: {databusURI}")
-    else:
-        raise Exception(f"Failed to delete {databusURI}: {response.status_code} - {response.text}")
+    try:
+        response = requests.delete(databusURI, headers=headers, timeout=30)
+        if response.status_code in (200, 204):
+            print(f"Successfully deleted: {databusURI}")
+        else:
+            raise DatabusDeleteError(
+                f"Failed to delete {databusURI}: {response.status_code} - {response.text}"
+            )
+    except requests.Timeout as exc:
+        raise DatabusDeleteError(f"Timeout while deleting {databusURI}") from exc
+    except requests.RequestException as exc:
+        raise DatabusDeleteError(f"HTTP error while deleting {databusURI}") from exc
Confirm the recommended pattern for handling `requests` errors (Timeout and RequestException) and whether raising a custom exception that wraps `response.status_code` and `response.text` is idiomatic for CLI/library hybrid tools.

88-123: Harden JSON‑LD parsing and version handling in _delete_artifact

The overall flow (fetch JSON‑LD → collect versions → delete versions → delete artifact) is good, but JSON handling could be more defensive:

  • json.loads(artifact_body) will currently surface a raw JSONDecodeError if Databus returns malformed JSON‑LD.
  • versions = json_dict.get("databus:hasVersion") might be None or of an unexpected type; the current code gracefully logs None, but an unexpected non‑list/dict value would lead to confusing errors when iterated.

Consider normalizing and validating the structure explicitly, and mapping decode failures to a clearer error:

-    artifact_body = get_json_ld_from_databus(databusURI, databus_key)
-
-    json_dict = json.loads(artifact_body)
-    versions = json_dict.get("databus:hasVersion")
+    artifact_body = get_json_ld_from_databus(databusURI, databus_key)
+    try:
+        json_dict = json.loads(artifact_body)
+    except json.JSONDecodeError as exc:
+        raise ValueError(f"Invalid JSON-LD for artifact {databusURI}") from exc
+
+    versions = json_dict.get("databus:hasVersion")
@@
-    # Single version case {}
-    if isinstance(versions, dict):
-        versions = [versions]
-    # Multiple versions case [{}, {}]
-
-    # If versions is None or empty skip
-    if versions is None:
-        print(f"No versions found for artifact: {databusURI}")
-    else:
-        version_uris = [v["@id"] for v in versions if "@id" in v]
+    # Normalize: single version dict → list; unexpected types → error
+    if isinstance(versions, dict):
+        versions = [versions]
+    elif versions is None:
+        print(f"No versions found for artifact: {databusURI}")
+        versions = []
+    elif not isinstance(versions, list):
+        raise ValueError(f"Unexpected 'databus:hasVersion' format for {databusURI}: {type(versions)}")
+
+    if versions:
+        version_uris = [v["@id"] for v in versions if "@id" in v]
         if not version_uris:
             print(f"No version URIs found in artifact JSON-LD for: {databusURI}")
         else:
             # Delete all versions
             _delete_list(version_uris, databus_key, dry_run=dry_run, force=force)

This keeps your current behavior (log and continue when there are simply no versions) but fails fast with clearer messages on malformed JSON or schema drift.

🧹 Nitpick comments (2)
databusclient/api/delete.py (2)

124-155: Mirror JSON‑LD robustness in _delete_group

The group deletion logic is sound—collect artifact URIs from databus:hasArtifact, filter out versioned ones, delete artifacts (and their versions), then the group. Similar to _delete_artifact, you might want to:

  • Wrap json.loads(group_body) in a try/except json.JSONDecodeError and raise a ValueError with the group URI for context.
  • Validate that artifacts is a list of dicts before iterating, and fail with a clear error if the JSON‑LD shape changes (rather than a cryptic AttributeError on .get).

This would align error handling between artifact and group deletion and give callers consistent diagnostics when the Databus response is not as expected.


157-190: Routing and force/dry‑run propagation look correct

The main delete dispatcher correctly:

  • Parses URIs via get_databus_id_parts_from_uri.
  • Routes by specificity (file → version → artifact → group), with a special case for collections.
  • Propagates dry_run and force consistently into _delete_resource, _delete_artifact, and _delete_group.
  • Clearly communicates unsupported cases with print.

If you expect this to be used as a library as well as a CLI, a future enhancement could be to return a per‑URI result object (success/skip/error) instead of only printing, but the current behavior is coherent for a CLI‑focused API.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 565ad59 and b6eb33e.

📒 Files selected for processing (1)
  • databusclient/api/delete.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
databusclient/api/delete.py (2)
databusclient/api/utils.py (2)
  • get_databus_id_parts_from_uri (4-18)
  • get_json_ld_from_databus (20-37)
databusclient/cli.py (1)
  • delete (121-134)
🪛 Ruff (0.14.7)
databusclient/api/delete.py

54-54: Avoid specifying long messages outside the exception class

(TRY003)


57-57: Avoid specifying long messages outside the exception class

(TRY003)


73-73: Create your own exception

(TRY002)


73-73: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build
  • GitHub Check: build
🔇 Additional comments (2)
databusclient/api/delete.py (2)

7-30: Interactive confirmation helper looks solid

The confirmation flow is clear, loops until valid input, and exposes a small, well-defined API ("confirm" | "skip" | "cancel"). No issues here.


76-86: Batch deletion helper is straightforward and correct

The list helper just delegates to _delete_resource while propagating dry_run and force, which keeps behavior consistent and avoids duplicated logic.

@Integer-Ctrl Integer-Ctrl merged commit ace93c5 into main Dec 4, 2025
3 checks passed
@coderabbitai coderabbitai bot mentioned this pull request Dec 9, 2025
@Integer-Ctrl Integer-Ctrl deleted the delete-datasets branch December 9, 2025 14:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants