5 changes: 5 additions & 0 deletions .coderabbit.yaml
@@ -0,0 +1,5 @@
reviews:
  review_details: true
  tools:
    presidio:
      enabled: true
46 changes: 46 additions & 0 deletions cards-test.py
@@ -0,0 +1,46 @@
# presidio_smoke.py — noisy samples to exercise Microsoft Presidio–style detectors.
⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Locate candidate fixture/module files =="
fd -i 'cards-test.py|cards_test.py|presidio_smoke.py'

echo
echo "== Check import statements referencing this fixture =="
rg -n -C2 --type=py '(from|import)\s+(cards_test|cards-test|presidio_smoke)\b' || true

echo
echo "== Check symbol usage to see how fixture values are consumed =="
rg -n -C2 --type=py '\b(visa_dashed|mc_dashed|remediation_pan_spaced|routing_doc_like)\b' || true

Repository: coderabbitai/coderabbit-pr-review

Length of output: 1159


Consider renaming to follow Python module naming conventions if this becomes a shared fixture.

The hyphenated filename `cards-test.py` cannot be imported with standard Python syntax (e.g., `import cards-test` is a syntax error, because the hyphen is parsed as a minus sign). While this file is not currently imported anywhere in the codebase, renaming it to `cards_test.py` follows Python conventions and prevents future breakage. Also update Line 1 to reflect the correct filename (it currently shows `presidio_smoke.py`).
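
To illustrate why the hyphen matters, here is a minimal sketch. It writes a throwaway `cards-test.py` into a temporary directory and loads it by path. The `load_by_path` helper and the temporary file are assumptions for illustration, not part of this PR.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_by_path(path: Path, module_name: str):
    """Load a module from an arbitrary file path (hypothetical helper)."""
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# A hyphen is parsed as the minus operator, so this is not a legal module name.
print("cards-test".isidentifier())  # False
print("cards_test".isidentifier())  # True

with tempfile.TemporaryDirectory() as tmp:
    fixture = Path(tmp) / "cards-test.py"
    fixture.write_text('phone_plain = "5550100199"\n')
    # Works, but every consumer needs this boilerplate instead of a plain import.
    mod = load_by_path(fixture, "cards_test")
    print(mod.phone_plain)
```

Loading by path works, but it is boilerplate that a plain `import cards_test` avoids after the rename.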

🧰 Tools
🪛 Ruff (0.15.12)

[warning] 1-1: Comment contains ambiguous `–` (EN DASH). Did you mean `-` (HYPHEN-MINUS)?

(RUF003)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cards-test.py` at line 1, The file is named with a hyphen and has an
incorrect header line; rename the module file from "cards-test.py" to
"cards_test.py" to follow Python import/module naming conventions and update the
top-of-file comment (currently "presidio_smoke.py") to the correct filename or
description; ensure any references in tests or fixtures are updated to use
cards_test (search for "cards-test" and the header string) so the module can be
imported with standard Python syntax.

# Default CodeRabbit Presidio pass: CREDIT_CARD, US_SSN, CRYPTO, PHONE_NUMBER only.

# --- Credit cards (CREDIT_CARD): common test PAN shapes ---
# Dashed
visa_dashed = "4111-1111-1111-1111"
mc_dashed = "5500-0000-0000-0004"
amex_dashed = "3400-000000-00009"
discover_dashed = "6011-0000-0000-0004"

# Spaced
visa_spaced = "4111 1111 1111 1111"
mc_spaced = "5500 0000 0000 0004"

# Same digits, compact (extra shape)
mc_compact = "5500000000000004"

# Remediation example PAN (dashed + spaced + compact — filter hits only exact `4242 4242 4242 4242` / `4242424242424242` at Presidio column)
remediation_pan_spaced = "4242 4242 4242 4242"
remediation_pan_compact = "4242424242424242"
Comment on lines +6 to +20
⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Make PAN test literals scanner-safe to avoid CI/compliance noise.

Lines 6-20 are intentionally test data, but they currently trigger multiple OpenGrep ERRORs. Please either (a) place these in an explicitly allowlisted test-fixture path/rule, or (b) compose them from fragments so source scanners don’t treat them as hardcoded PAN secrets.

Example pattern (apply consistently to all PAN literals)
+def _join(parts: tuple[str, ...], sep: str = "") -> str:
+  return sep.join(parts)
+
-visa_dashed = "4111-1111-1111-1111"
+visa_dashed = _join(("4111", "1111", "1111", "1111"), "-")

-mc_compact = "5500000000000004"
+mc_compact = _join(("5500", "0000", "0000", "0004"))
🧰 Tools
🪛 OpenGrep (1.20.0)

[ERROR] 6-6: Possible credit card number with dashes or spaces detected in source code. Credit card numbers should never be hardcoded or stored in source files. Use a secrets manager or tokenization service instead.

(coderabbit.pii.credit-card-number-dashed)


[ERROR] 7-7: Possible credit card number with dashes or spaces detected in source code. Credit card numbers should never be hardcoded or stored in source files. Use a secrets manager or tokenization service instead.

(coderabbit.pii.credit-card-number-dashed)


[ERROR] 8-8: Possible credit card number with dashes or spaces detected in source code. Credit card numbers should never be hardcoded or stored in source files. Use a secrets manager or tokenization service instead.

(coderabbit.pii.credit-card-number-dashed)


[ERROR] 9-9: Possible credit card number with dashes or spaces detected in source code. Credit card numbers should never be hardcoded or stored in source files. Use a secrets manager or tokenization service instead.

(coderabbit.pii.credit-card-number-dashed)


[ERROR] 12-12: Possible credit card number with dashes or spaces detected in source code. Credit card numbers should never be hardcoded or stored in source files. Use a secrets manager or tokenization service instead.

(coderabbit.pii.credit-card-number-dashed)


[ERROR] 13-13: Possible credit card number with dashes or spaces detected in source code. Credit card numbers should never be hardcoded or stored in source files. Use a secrets manager or tokenization service instead.

(coderabbit.pii.credit-card-number-dashed)


[ERROR] 16-16: Possible credit card number (PAN) detected in source code. Credit card numbers should never be hardcoded or stored in source files. Use a secrets manager or tokenization service instead.

(coderabbit.pii.credit-card-number)


[ERROR] 18-18: Possible credit card number with dashes or spaces detected in source code. Credit card numbers should never be hardcoded or stored in source files. Use a secrets manager or tokenization service instead.

(coderabbit.pii.credit-card-number-dashed)


[ERROR] 18-18: Possible credit card number (PAN) detected in source code. Credit card numbers should never be hardcoded or stored in source files. Use a secrets manager or tokenization service instead.

(coderabbit.pii.credit-card-number)


[ERROR] 19-19: Possible credit card number with dashes or spaces detected in source code. Credit card numbers should never be hardcoded or stored in source files. Use a secrets manager or tokenization service instead.

(coderabbit.pii.credit-card-number-dashed)


[ERROR] 20-20: Possible credit card number (PAN) detected in source code. Credit card numbers should never be hardcoded or stored in source files. Use a secrets manager or tokenization service instead.

(coderabbit.pii.credit-card-number)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cards-test.py` around lines 6 - 20, The test contains hardcoded PAN literals
(variables visa_dashed, mc_dashed, amex_dashed, discover_dashed, visa_spaced,
mc_spaced, mc_compact, remediation_pan_spaced, remediation_pan_compact) which
trigger secret scanners; fix by making these scanner-safe: either move this file
to an explicit allowlisted test-fixture rule/path, or change each PAN literal to
be constructed from non-contiguous fragments (e.g., join or concatenate pieces
like "4111" + "-" + "1111" + ... or format from an array of segments) so the
full PAN never appears as a single literal in source; apply the same approach
consistently to all listed variables.
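
The "array of segments" option described above could look roughly like the sketch below. `PAN_SEGMENTS`, `build_pan`, and `luhn_ok` are hypothetical names that do not appear in the PR. The Luhn check is there only to confirm that the composed values are still checksum-valid test numbers. Whether a given scanner rule stops matching depends on its pattern, so treat this as a starting point, not a guarantee.

```python
# Hypothetical fixture table: each PAN is stored as segments, never as one literal.
PAN_SEGMENTS = {
    "visa": ("4111", "1111", "1111", "1111"),
    "mc": ("5500", "0000", "0000", "0004"),
    "amex": ("3400", "000000", "00009"),
    "discover": ("6011", "0000", "0000", "0004"),
    "remediation": ("4242", "4242", "4242", "4242"),
}


def build_pan(brand: str, sep: str = "") -> str:
    """Join the stored segments with the given separator ("", " ", or "-")."""
    return sep.join(PAN_SEGMENTS[brand])


def luhn_ok(pan: str) -> bool:
    """Return True if the digits of `pan` pass the Luhn checksum."""
    digits = [int(ch) for ch in pan if ch.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return bool(digits) and total % 10 == 0


visa_dashed = build_pan("visa", "-")
mc_compact = build_pan("mc")
amex_dashed = build_pan("amex", "-")

# Every composed fixture should still be a checksum-valid test number.
assert all(luhn_ok(build_pan(brand)) for brand in PAN_SEGMENTS)
```

Keeping the segments in one table makes it easy to generate the dashed, spaced, and compact shapes from the same data.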


# --- US SSN / ITIN-shaped (US_SSN / sometimes ITIN recognizers) ---
ssn_like = "078-05-1120"
ssn_invalid_example = "000-00-0000" # exact remediation literal (skipped only if entity+col match)
itin_like = "900-70-0000" # exact remediation literal for US_ITIN when that entity is enabled
odd_dashed_15 = "856-45-6790" # short dashed number — may or may not fire; kept from your file

# --- Crypto (CRYPTO) ---
btc_testnet = "tb1qw508d6qejxtdg4y5r3zarvary0c5xw7kv8f3t4" # exact remediation literal
eth_zero = "0x0000000000000000000000000000000000000000" # exact remediation literal
eth_nonzero = "0x742d35Cc6634C0532925a3b844Bc9e7595f213b" # extra shape

# --- Phone (PHONE_NUMBER) ---
phone_fictional = "+1-555-0100" # exact remediation literal
phone_fictional_template = "555-01XX" # exact remediation literal (unusual in real code)
phone_plain = "5550100199"

# --- Email (EMAIL_ADDRESS) — only if entity enabled in Presidio config ---
email_example = "user@example.com" # exact remediation literal

# --- IBAN (IBAN_CODE) — only if entity enabled ---
iban_example = "DE89370400440532013000" # exact remediation literal

# --- US bank routing (US_BANK_NUMBER) — only if entity enabled ---
routing_nines = "000000000" # exact remediation literal
routing_doc_like = "021000021" # realistic-shaped 9 digits; may fire if recognizer exists
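
As a rough illustration of why some SSN-shaped fixtures above are labelled invalid or ITIN-like, the sketch below applies a few structural rules: areas 000 and 666 are never issued as SSNs, areas starting with 9 are not SSNs (ITINs begin with 9), and group 00 and serial 0000 are never issued. `classify_ssn_shape` is a hypothetical helper. This is not how Presidio's recognizers are implemented; they may apply different or additional checks.

```python
import re

SSN_SHAPE = re.compile(r"^(\d{3})-(\d{2})-(\d{4})$")


def classify_ssn_shape(value: str) -> str:
    """Classify a dashed 3-2-4 digit string by SSN structural rules only."""
    m = SSN_SHAPE.match(value)
    if m is None:
        return "not-ssn-shaped"
    area, group, serial = m.groups()
    if area.startswith("9"):
        return "itin-like"  # 9xx areas are never SSNs
    if area in ("000", "666") or group == "00" or serial == "0000":
        return "invalid-ssn"
    return "ssn-like"


for sample in ("078-05-1120", "000-00-0000", "900-70-0000", "856-45-6790"):
    print(sample, "->", classify_ssn_shape(sample))
```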
55 changes: 55 additions & 0 deletions presidio/customer_service.py
@@ -0,0 +1,55 @@
"""Customer-service handler — sample with PII embedded in security antipatterns.

Each PII value is placed on a line that is independently security-relevant
(hardcoded credential, logged PII, hardcoded test fixture in production code)
to maximize the chance the LLM reviewer will comment on the same line and
trigger the inline tool-tag render path despite the upstream render-suppression
bug in successful tool runs.
"""

import logging
import sqlite3

logger = logging.getLogger(__name__)

DATABASE_URL = "postgres://admin:Sup3rS3cret123@db.internal.example.com:5432/users"
ADMIN_EMAIL = "alice.thompson@demoapp.example.com"
ADMIN_PHONE = "+1-555-0100"
ADMIN_SSN = "000-00-0000"
SUPPORT_HOTLINE = "555-0100"
SERVER_IP = "192.168.1.100"
PAYMENT_TEST_CARD = "4111-1111-1111-1111"
ROUTING_NUMBER = "021000021"
WALLET_TEST = "tb1qw508d6qejxtdg4y5r3zarvary0c5xw7kv8f3t4"


def authenticate_admin():
    logger.info("Authenticating admin %s from %s", ADMIN_EMAIL, SERVER_IP)
    conn = sqlite3.connect(DATABASE_URL)
    return conn


def process_payment(card_number, customer_email, customer_phone):
    logger.info(
        "Payment attempt: card=%s email=%s phone=%s",
        card_number,
        customer_email,
        customer_phone,
    )
    if card_number == PAYMENT_TEST_CARD:
        return {"status": "test", "ssn": ADMIN_SSN, "routing": ROUTING_NUMBER}
    return {"status": "approved"}


def send_welcome_emails():
    test_customers = [
        {"name": "Bob Smith", "email": "bob.smith@example.com", "phone": "555-0100"},
        {"name": "Carol Jones", "email": "carol.jones@example.com", "phone": "555-0101"},
    ]
    for c in test_customers:
        logger.info("Welcome %s, contact %s / %s", c["name"], c["email"], c["phone"])


def emergency_wallet_payout():
    logger.warning("Emergency payout to wallet %s for admin %s", WALLET_TEST, ADMIN_EMAIL)
    return {"wallet": WALLET_TEST, "approver": ADMIN_EMAIL, "ssn": ADMIN_SSN}
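
The logged-PII antipattern in `process_payment` is commonly addressed by masking values before they reach the logger. Below is a minimal sketch, assuming that showing the last four card digits and the email domain is acceptable under your policy. `mask_card`, `mask_email`, and `log_payment_attempt` are hypothetical helpers that do not exist in this PR.

```python
import logging

logger = logging.getLogger(__name__)


def mask_card(card_number: str) -> str:
    """Keep only the last four digits; separators are dropped."""
    digits = [ch for ch in card_number if ch.isdigit()]
    if len(digits) <= 4:
        return "*" * len(digits)
    return "*" * (len(digits) - 4) + "".join(digits[-4:])


def mask_email(email: str) -> str:
    """Hide most of the local part but keep the domain for debugging."""
    local, sep, domain = email.partition("@")
    if not sep:
        return "***"
    return local[:1] + "***@" + domain


def log_payment_attempt(card_number: str, customer_email: str) -> None:
    # Only masked values are passed to the logger.
    logger.info(
        "Payment attempt: card=%s email=%s",
        mask_card(card_number),
        mask_email(customer_email),
    )
```

Masking at the call site keeps the original values out of log records entirely, which is simpler to reason about than filtering them afterwards.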