Intelligent HTML analysis component for Large Action Model (LAM) web automation systems
A powerful, production-ready DOM parser that transforms raw HTML into structured, analyzable data for AI-driven web automation. Built for seamless integration with browser controllers and LAM systems.
- 🔍 Multi-Parser HTML Processing - BeautifulSoup + lxml + html5lib with intelligent fallbacks
- 🎯 Interactive Element Detection - Automatic classification of clickable, form, and navigation elements
- 🧠 Semantic Content Analysis - Intelligent extraction of headings, paragraphs, navigation, and content blocks
- 📋 Advanced Form Analysis - Comprehensive form structure detection with field type classification
- ♿ Accessibility Analysis - ARIA roles, labels, and accessibility compliance checking
- 🏗️ Page Structure Mapping - Layout detection and hierarchical content organization
- 🔗 Robust Element Locators - CSS selectors and XPath generation for reliable element targeting
- 🤖 Browser Controller Ready - Direct integration with existing browser automation systems
- ⚡ Async/Await Support - Non-blocking operations for high-performance automation
- 💾 Smart Caching - Configurable caching to optimize repeated analysis
- 📊 Performance Insights - Page complexity analysis and optimization recommendations
- 🔄 Similarity Detection - Find similar elements for pattern-based automation
# Clone the repository
git clone https://github.com/Large-Action-Model/dom_parser.git
cd dom_parser
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install in development mode
pip install -e .
# Or install with development dependencies
pip install -e .[dev]pip install -r requirements.txt- Basic usage:
from dom_parser import DOMParser
# Initialize parser
parser = DOMParser()
# Analyze HTML content
analysis = await parser.parse_page(
html_source="<html>...</html>",
url="https://example.com"
)
# Access structured results
clickable_elements = analysis.get_clickable_elements()
forms = analysis.form_structures
semantic_blocks = analysis.semantic_blocksfrom dom_parser import DOMParser
from browser_controller import BrowserController
# Initialize components
browser = BrowserController(config)
parser = DOMParser()
async with browser:
session = await browser.create_session()
await session.navigate_to("https://example.com")
# Get page info from browser
page_info = await session.get_page_info()
# Analyze with DOM parser
analysis = await parser.parse_page_from_browser_info(page_info)
# Use results for automation
for element in analysis.get_clickable_elements():
print(f"Found clickable: {element.visible_text}")dom_parser/
├── dom_parser/ # Main package (moved from src/)
│ ├── __init__.py # Package initialization and exports
│ ├── core/ # Core parsing and analysis logic
│ │ ├── __init__.py
│ │ ├── dom_parser.py # Main orchestration class
│ │ ├── element_classifier.py # Element type detection
│ │ ├── semantic_extractor.py # Content structure analysis
│ │ └── structure_mapper.py # Page layout analysis
│ ├── analyzers/ # Specialized analysis modules
│ │ ├── __init__.py
│ │ ├── accessibility_analyzer.py # Accessibility features
│ │ ├── form_analyzer.py # Form structure analysis
│ │ └── html_analyzer.py # HTML parsing & cleaning
│ ├── types/ # Data types and models
│ │ ├── __init__.py
│ │ ├── dom_data_types.py # Core data structures
│ │ └── element_data_types.py # Element enums & types
│ └── utils/ # Utility functions
│ ├── __init__.py
│ ├── css_selector_generator.py # CSS selector creation
│ └── xpath_generator.py # XPath expression generation
├── tests/ # Test suite
├── examples/ # Usage examples
│ └── example_integration.py
├── docs/ # Documentation
├── scripts/ # Build and deployment scripts
├── setup.py # Package configuration
├── pyproject.toml # Development tools configuration
└── requirements.txt # Dependencies
The main orchestration class that coordinates all analysis components.
parser = DOMParser({
'enable_cache': True,
'include_hidden_elements': False,
'confidence_threshold': 0.7,
'max_cache_size': 100
})Comprehensive data structures for all analysis results:
DOMAnalysisResult- Complete analysis outputInteractiveElement- Clickable elements with locatorsFormStructure- Form analysis with field classificationSemanticBlock- Content blocks with semantic meaningPageStructure- Layout and hierarchical organizationAccessibilityInfo- ARIA and accessibility data
Intelligent element type detection:
# Element types
ElementType.BUTTON, ElementType.LINK, ElementType.FORM
ElementType.INPUT, ElementType.NAVIGATION, ElementType.MAIN
# Interaction types
InteractionType.CLICK, InteractionType.FORM_SUBMIT
InteractionType.TEXT_INPUT, InteractionType.SELECTIONconfig = {
'enable_cache': True, # Cache analysis results
'max_cache_size': 100, # Maximum cached analyses
'include_hidden_elements': False, # Include display:none elements
'confidence_threshold': 0.7, # Minimum confidence for classification
'max_depth': 50, # Maximum DOM traversal depth
'timeout_seconds': 30, # Analysis timeout
}
parser = DOMParser(config)- lxml (primary) - Fast C-based parser
- BeautifulSoup - Robust Python parser
- html5lib - Standards-compliant fallback
for element in analysis.interactive_elements:
print(f"Type: {element.element_type}")
print(f"Text: {element.visible_text}")
print(f"CSS: {element.locators['css']}")
print(f"XPath: {element.locators['xpath']}")
print(f"Confidence: {element.confidence_score}")for form in analysis.form_structures:
print(f"Form type: {form.form_type}")
print(f"Action: {form.action}")
print(f"Fields: {len(form.fields)}")
for field in form.fields:
print(f" {field.form_field_type}: {field.visible_text}")for block in analysis.semantic_blocks:
print(f"Type: {block.semantic_type}")
print(f"Content: {block.text_content[:100]}...")
print(f"Elements: {len(block.elements)}")Run the example integration to test functionality:
# Activate virtual environment
python -m venv venv
source venv/bin/activate # Linux/Mac
# or
venv\Scripts\activate # Windows
# Install dependencies
pip install -r requirements.txt
# Run example
python example_integration.pyclass WebAutomationAgent:
def __init__(self):
self.browser = BrowserController()
self.dom_parser = DOMParser()
async def analyze_page(self, url: str):
async with self.browser:
session = await self.browser.create_session()
await session.navigate_to(url)
page_info = await session.get_page_info()
analysis = await self.dom_parser.parse_page_from_browser_info(page_info)
return self._extract_actionable_elements(analysis)
def _extract_actionable_elements(self, analysis):
"""Extract elements suitable for AI decision-making"""
return {
'buttons': [elem for elem in analysis.get_clickable_elements()
if elem.element_type == ElementType.BUTTON],
'forms': analysis.form_structures,
'navigation': analysis.navigation_structure.primary_navigation,
'content': analysis.semantic_blocks
}# Find specific elements
search_boxes = [elem for elem in analysis.interactive_elements
if elem.form_field_type == FormFieldType.SEARCH]
submit_buttons = [elem for elem in analysis.interactive_elements
if elem.interaction_type == InteractionType.FORM_SUBMIT]
# Use similarity detection
similar_elements = await parser.find_similar_elements(target_element_id)- Fast Analysis: ~50-100ms for typical web pages
- Memory Efficient: Smart caching with configurable limits
- Scalable: Async processing for concurrent analysis
- Robust: Multiple parser fallbacks for reliability
- BeautifulSoup 4.12.0+ - HTML parsing and DOM manipulation
- lxml 4.9.0+ - Fast XML/HTML processing backend
- Pydantic 2.0.0+ - Data validation and type safety
- html5lib 1.1+ - Standards-compliant HTML parsing
- loguru 0.7.0+ - Structured logging
Comprehensive documentation is available in the docs/ directory:
- Getting Started - Installation and basic usage
- API Reference - Complete API documentation
- Configuration - Configuration options and customization
- Integration Guide - Browser controller and LAM integration
- Troubleshooting - Common issues and solutions
This DOM Parser was built as part of a Large Action Model web automation system. For questions, improvements, or integration support, please refer to the documentation or create an issue.
This project is part of a LAM automation system. Please refer to your project's license terms.
Built for intelligent web automation 🤖 | Powered by advanced DOM analysis 🔍 | Ready for LAM integration 🚀