Skip to content

ADFA-3912 | Fix fuzzy attribute parsing bounds and UI element tag filtering#1275

Open
jatezzz wants to merge 1 commit intostagefrom
fix/ADFA-3912-parsing-bounds-and-tag-filter-experimental
Open

ADFA-3912 | Fix fuzzy attribute parsing bounds and UI element tag filtering#1275
jatezzz wants to merge 1 commit intostagefrom
fix/ADFA-3912-parsing-bounds-and-tag-filter-experimental

Conversation

@jatezzz
Copy link
Copy Markdown
Collaborator

@jatezzz jatezzz commented May 6, 2026

Description

Fixed an out-of-bounds string extraction error in FuzzyAttributeParser by adding a safeguard check (current.valueStart > valueEnd) before processing annotations. Additionally, updated the UI element filtering logic in YoloToXmlConverter so that the isTag exclusion only applies to elements explicitly labeled as "text". This ensures valid UI elements are not improperly excluded during conversion.

Details

document_5179472272527722523.mp4

Ticket

ADFA-3912

Observation

⚠️ Must be merged after #1274

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 6, 2026

📝 Walkthrough

Release Notes - ADFA-3912 | Fix fuzzy attribute parsing bounds and UI element tag filtering

Features & Fixes

  • FuzzyAttributeParser bounds check: Added defensive validation in parseByColonScanning at line 312 (if (current.valueStart > valueEnd) continue) to prevent StringIndexOutOfBoundsException during substring extraction. Processing is skipped when computed valueEnd precedes valueStart, gracefully handling malformed attribute annotations.

  • YoloToXmlConverter refactoring: Refactored XML generation pipeline with improved modularity by extracting inline logic into private helper functions (extractUiCandidates, scaleDetections, associateTextToWidgets, finalizeUiElements, extractCanvasTags). Each helper encapsulates a specific transformation step with clear responsibility separation.

  • UI element tag filtering correction: Updated isTag filtering logic in finalizeUiElements to apply exclusively to text elements. The filter condition (it.label != "text" || !annotationMatcher.isTag(it.text)) ensures that non-text UI elements (buttons, images, checkboxes, etc.) are no longer subject to tag-based exclusion, only actual text labels recognized as reference tags are filtered.

  • PlaceholderUtils extension function: Added buildPlaceholderOverrides extension on List<ScaledBox> to construct a mapping from sorted placeholder boxes to drawable resource references using indexed keys ("ph_0", "ph_1", etc.), replacing the previous inline approach.

Implementation Details

  • FuzzyAttributeParser.kt: +2 lines (single bounds check guard)
  • YoloToXmlConverter.kt: +74/-37 lines (pipeline refactored into 5 modular helper functions)
  • PlaceholderUtils.kt: +14 lines (new extension function with mapIndexedNotNull logic)

Dependencies & Integration

  • ⚠️ BLOCKING DEPENDENCY: Must be merged after PR #1274 to the stage branch. Ensure PR #1274 is fully integrated before deployment.

Risks & Considerations

  • ⚠️ High refactoring complexity in YoloToXmlConverter: The XML generation pipeline was significantly restructured with 5 new private helpers. While public API signatures remain unchanged, the internal data flow is distributed across function boundaries. Recommend extensive testing of:

    • UI candidate extraction (deduplication logic for switches vs. other widgets)
    • Text-to-widget association accuracy
    • Canvas tag extraction for margin annotations
    • Placeholder override mapping correctness
    • Edge cases with mixed text, tags, and metadata detections
  • ⚠️ Tag filtering logic specificity: The isTag check now only applies within finalizeUiElements to text-labeled elements. This represents a behavioral change for how mixed-type detections are filtered. Verify that UI elements previously incorrectly filtered due to tag metadata are now properly retained.

  • ⚠️ Silent skipping of malformed annotations: FuzzyAttributeParser now silently skips processing (via continue) when bounds violation is detected instead of throwing an exception. While this prevents crashes, it may mask attribute parsing edge cases. Consider monitoring logs for skipped annotations in production.

  • ⚠️ Dependency on WidgetAnnotationMatcher.isTag(): The tag filtering behavior now depends on the external isTag() method across multiple helper functions (associateTextToWidgets, finalizeUiElements, extractCanvasTags). Any changes to tag detection logic will affect all three filtering points.

  • ⚠️ New extension function API surface: buildPlaceholderOverrides is a new public extension function on List<ScaledBox>. Ensure this doesn't conflict with existing extension functions and document its expected usage pattern for other modules.

Walkthrough

The pull request refactors the XML generation pipeline in YoloToXmlConverter by extracting inline steps into private helper functions, adds a validation guard in FuzzyAttributeParser to prevent invalid substring extraction, and introduces a new extension function for building placeholder overrides.

Changes

Attribute Parsing Validation Guard

Layer / File(s) Summary
Input Validation
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/FuzzyAttributeParser.kt
Guard added in parseByColonScanning loop to skip processing when valueStart > valueEnd, preventing invalid substring range extraction attempts.

XML Generation Pipeline Refactoring

Layer / File(s) Summary
New Utility Function
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/utils/PlaceholderUtils.kt
Extension function buildPlaceholderOverrides added to map sorted placeholder boxes to drawable references using "ph_" keyed lookup.
Core Pipeline Refactoring
cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/YoloToXmlConverter.kt
generateXmlLayout refactored to delegate to private helper functions: extractUiCandidates, scaleDetections, associateTextToWidgets, finalizeUiElements, and extractCanvasTags. Placeholder handling switched from getSortedScaledPlaceholders to new buildPlaceholderOverrides extension, replacing inline buildSelectedImageOverrides.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • appdevforall/CodeOnTheGo#1185: Related — both PRs modify FuzzyAttributeParser, refactor YoloToXmlConverter by extracting helper functions, and add/update placeholder override utilities.
  • appdevforall/CodeOnTheGo#1230: Related — both PRs modify generateXmlLayout in YoloToXmlConverter, refactoring its detection and text-assignment flow.
  • appdevforall/CodeOnTheGo#1204: Related — both PRs modify YoloToXmlConverter and placeholder-override logic, with this PR extracting helpers and adding buildPlaceholderOverrides while the retrieved PR refactors into collaborators.

Suggested reviewers

  • avestaadfa
  • Daniel-ADFA

Poem

🐰 Fuzzy guards now keep the strings in place,
While pipelines split with elegant grace,
Placeholder maps dance in indexed arrays,
XML flows smooth through helper pathways!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The pull request title directly summarizes the main changes: fixing fuzzy attribute parsing bounds and updating UI element tag filtering logic.
Description check ✅ Passed The description clearly explains the two main fixes in the changeset: the out-of-bounds safeguard in FuzzyAttributeParser and the UI element filtering logic updates in YoloToXmlConverter.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/ADFA-3912-parsing-bounds-and-tag-filter-experimental

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@jatezzz jatezzz requested review from a team, Daniel-ADFA and avestaadfa May 6, 2026 17:50
@jatezzz jatezzz force-pushed the fix/ADFA-3912-parsing-bounds-and-tag-filter-experimental branch from 48bb962 to 35ac06e Compare May 6, 2026 19:56
Added bounds validation in FuzzyAttributeParser and updated UI element filter logic.
@jatezzz jatezzz force-pushed the fix/ADFA-3912-parsing-bounds-and-tag-filter-experimental branch from 35ac06e to efe479f Compare May 6, 2026 21:57
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/YoloToXmlConverter.kt`:
- Around line 87-95: In associateTextToWidgets change the parent selection so
you do not drop non-text boxes based on annotationMatcher.isTag: keep parents as
all boxes where it.label != "text" (remove the isTag check), while still
filtering initialTexts with annotationMatcher.isTag so only text boxes get
tag-filtered; then call geometryProcessor.assignTextToParents(parents,
initialTexts, scaledBoxes) and proceed with the existing remainingTexts and
geometryProcessor.assignNearbyTextToWidgets call unchanged (references:
associateTextToWidgets, annotationMatcher.isTag,
geometryProcessor.assignTextToParents,
geometryProcessor.assignNearbyTextToWidgets).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b21178be-4d63-4592-b3a7-27fcbba14f63

📥 Commits

Reviewing files that changed from the base of the PR and between 69e03bf and efe479f.

📒 Files selected for processing (3)
  • cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/FuzzyAttributeParser.kt
  • cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/YoloToXmlConverter.kt
  • cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/utils/PlaceholderUtils.kt

Comment on lines +87 to +95
private fun associateTextToWidgets(scaledBoxes: List<ScaledBox>): List<ScaledBox> {
val parents = scaledBoxes.filter { it.label != "text" && !annotationMatcher.isTag(it.text) }
val initialTexts = scaledBoxes.filter { it.label == "text" && !annotationMatcher.isTag(it.text) }

val textAssignedBoxes = geometryProcessor.assignTextToParents(parents, initialTexts, scaledBoxes)

val remainingTexts = textAssignedBoxes.filter { it.label == "text" && !annotationMatcher.isTag(it.text) }
return geometryProcessor.assignNearbyTextToWidgets(textAssignedBoxes, remainingTexts)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Restrict tag filtering to text boxes here too.

Line 88 still excludes any non-text box whose OCR text looks like a tag. That can drop valid widgets before text association, so the tag-filtering bug is still partially present despite the later fix in finalizeUiElements.

Suggested fix
 private fun associateTextToWidgets(scaledBoxes: List<ScaledBox>): List<ScaledBox> {
-    val parents = scaledBoxes.filter { it.label != "text" && !annotationMatcher.isTag(it.text) }
+    val parents = scaledBoxes.filter { it.label != "text" }
     val initialTexts = scaledBoxes.filter { it.label == "text" && !annotationMatcher.isTag(it.text) }
 
     val textAssignedBoxes = geometryProcessor.assignTextToParents(parents, initialTexts, scaledBoxes)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
private fun associateTextToWidgets(scaledBoxes: List<ScaledBox>): List<ScaledBox> {
val parents = scaledBoxes.filter { it.label != "text" && !annotationMatcher.isTag(it.text) }
val initialTexts = scaledBoxes.filter { it.label == "text" && !annotationMatcher.isTag(it.text) }
val textAssignedBoxes = geometryProcessor.assignTextToParents(parents, initialTexts, scaledBoxes)
val remainingTexts = textAssignedBoxes.filter { it.label == "text" && !annotationMatcher.isTag(it.text) }
return geometryProcessor.assignNearbyTextToWidgets(textAssignedBoxes, remainingTexts)
}
private fun associateTextToWidgets(scaledBoxes: List<ScaledBox>): List<ScaledBox> {
val parents = scaledBoxes.filter { it.label != "text" }
val initialTexts = scaledBoxes.filter { it.label == "text" && !annotationMatcher.isTag(it.text) }
val textAssignedBoxes = geometryProcessor.assignTextToParents(parents, initialTexts, scaledBoxes)
val remainingTexts = textAssignedBoxes.filter { it.label == "text" && !annotationMatcher.isTag(it.text) }
return geometryProcessor.assignNearbyTextToWidgets(textAssignedBoxes, remainingTexts)
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@cv-image-to-xml/src/main/java/org/appdevforall/codeonthego/computervision/domain/YoloToXmlConverter.kt`
around lines 87 - 95, In associateTextToWidgets change the parent selection so
you do not drop non-text boxes based on annotationMatcher.isTag: keep parents as
all boxes where it.label != "text" (remove the isTag check), while still
filtering initialTexts with annotationMatcher.isTag so only text boxes get
tag-filtered; then call geometryProcessor.assignTextToParents(parents,
initialTexts, scaledBoxes) and proceed with the existing remainingTexts and
geometryProcessor.assignNearbyTextToWidgets call unchanged (references:
associateTextToWidgets, annotationMatcher.isTag,
geometryProcessor.assignTextToParents,
geometryProcessor.assignNearbyTextToWidgets).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant