Supermarket-Mercadona-Scraper is a Python-based web scraper that extracts and organizes product data from the Mercadona supermarket website. It navigates every product category and subcategory, collects each product's name, price, image URL, and category, and saves the results to a CSV file.
- 🔄 Automated Category Navigation: Systematically browses through all Mercadona product categories and subcategories
- 📊 Comprehensive Data Extraction: Captures product name, price, image URL, and category information
- 🛡️ Anti-Detection Technology: Uses SeleniumBase's undetectable Chrome mode to reduce the chance of bot detection
- 📍 Postal Code Support: Handles location-based pricing and availability
- 💾 CSV Export: Automatically saves data with timestamped filenames
- ⚡ Robust Error Handling: Graceful handling of connection issues and page load problems
- 🎯 Smart Element Waiting: Intelligent waiting for dynamic content to load
- 🔀 Random Delays: Implements human-like browsing patterns to avoid detection
- ✅ Updated Dependencies: Migrated to the latest SeleniumBase release for better stability
- ✅ Enhanced Anti-Detection: Improved undetectable browsing capabilities
- ✅ Better Error Handling: More robust exception handling and recovery mechanisms
- ✅ Optimized Performance: Reduced unnecessary waits and improved scraping speed
- ✅ Code Documentation: Added comprehensive comments and docstrings
- ✅ Modular Architecture: Separated auxiliary functions for better maintainability
- ✅ Cross-Platform Compatibility: Tested on Windows, macOS, and Linux
- Python 3.8 or higher
- Chrome/Chromium browser
- Active internet connection
- Clone the repository:

      git clone https://github.com/[username]/supermarket-mercadona-scraper.git
      cd supermarket-mercadona-scraper

- Create a virtual environment (recommended):

      python -m venv .venv
      source .venv/bin/activate   # On Windows: .venv\Scripts\activate

- Install dependencies:

      pip install -r requirements.txt
      pip install seleniumbase

  Note: The Chrome driver will be automatically downloaded on first run.
- Run the scraper:

      python scraper.py

- Enter your postal code when prompted. The prompt is in Spanish and reads "Enter your postal code (e.g. 28001)":

      Introduce tu código postal (ej: 28001): 28001

- Wait for the scraping to complete. The script will:
  - Open the Chrome browser
  - Navigate to the Mercadona website
  - Accept cookies and enter the postal code
  - Systematically scrape all categories
  - Save results to `mercadona_YYYY-MM-DD.csv`

The scraper generates a CSV file with the following columns:
| Column | Description | Example |
|---|---|---|
| `titulo` | Product name | "Leche Entera Hacendado" |
| `imagen` | Product image URL | "https://..." |
| `precio` | Product price (€) | "1.25" |
| `categoria` | Product category | "Lácteos y Huevos" |
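
As an illustration of the output format, here is a minimal sketch of how rows with these four columns could be written to a date-stamped file. It uses only the standard library. The helper names `output_path` and `write_products` are hypothetical and are not taken from the project; the real `mercadona_csv()` function may work differently.

```python
import csv
from datetime import date
from pathlib import Path

# Column names as documented in the table above.
FIELDS = ["titulo", "imagen", "precio", "categoria"]


def output_path(day=None, directory="."):
    """Build the mercadona_YYYY-MM-DD.csv path for a given day (default: today)."""
    day = day or date.today()
    return Path(directory) / f"mercadona_{day.isoformat()}.csv"


def write_products(rows, path):
    """Write a list of product dicts to a CSV file with a header row."""
    # newline="" lets the csv module control line endings on every platform.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return path
```

Calling `write_products(rows, output_path())` would create a file named after the current date in the working directory. UTF-8 encoding is chosen here so that accented category names such as "Lácteos y Huevos" survive the round trip.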
The scraper uses optimized browser settings for reliability:

    from seleniumbase import Driver

    driver = Driver(
        browser="chrome",         # Use the Chrome browser
        uc=True,                  # Undetectable Chrome (UC) mode
        headless2=False,          # Visible browser (set True for headless)
        incognito=False,          # Regular (non-incognito) browsing mode
        agent='Mozilla/5.0...',   # Custom user agent
        do_not_track=True,        # Ask sites not to track (Do Not Track)
        undetectable=True         # Long-form name for uc; the two options are duplicates
    )

- Headless Mode: Set `headless2=True` for background execution
- Custom Delays: Modify the `time.sleep()` values to change the timing
- Output Format: Customize the `mercadona_csv()` function for different formats

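
If you want to experiment with the delays, one option is to route them through a single helper instead of editing each `time.sleep()` call. The sketch below is an assumption about how that could look, not the project's actual code; `human_pause` and its default bounds are hypothetical.

```python
import random
import time


def human_pause(min_seconds=1.0, max_seconds=3.0, sleep=time.sleep):
    """Sleep for a random duration between the two bounds and return it."""
    if min_seconds < 0 or max_seconds < min_seconds:
        raise ValueError("expected 0 <= min_seconds <= max_seconds")
    # A uniform draw makes the gap between requests vary on every call.
    delay = random.uniform(min_seconds, max_seconds)
    sleep(delay)
    return delay
```

The `sleep` parameter exists only so the helper can be exercised without actually waiting; in normal use you would call `human_pause()` or `human_pause(2, 5)` between page loads.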
supermarket-mercadona-scraper/
├── scraper.py # Main scraping script
├── funcionesAux.py # Auxiliary functions for element handling
├── requirements.txt # Python dependencies
├── README.md # This documentation
└── mercadona_YYYY-MM-DD.csv # Generated output files
Chrome Driver Issues:
- The driver downloads automatically on first run
- Ensure Chrome browser is installed and updated
Connection Timeouts:
- Check your internet connection
- Mercadona website might be temporarily unavailable
Postal Code Errors:
- Use valid Spanish postal codes (5 digits)
- Some areas might not have Mercadona delivery
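
To catch malformed input before the browser even opens, a postal code can be checked locally. The sketch below is not part of the project; `is_valid_spanish_postal_code` is a hypothetical helper. Besides the 5-digit rule, it also checks that the first two digits fall in the 01 to 52 province range used by Spanish postal codes. It cannot tell whether Mercadona actually delivers to that area.

```python
import re

# Exactly five ASCII digits, nothing else.
_FIVE_DIGITS = re.compile(r"[0-9]{5}")


def is_valid_spanish_postal_code(text):
    """Return True if text looks like a Spanish postal code (format check only)."""
    code = text.strip()
    if not _FIVE_DIGITS.fullmatch(code):
        return False
    # The first two digits identify the province, numbered 01 through 52.
    return 1 <= int(code[:2]) <= 52
```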
Bot Detection:
- The script includes anti-detection measures
- If blocked, wait some time before retrying
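
The "wait before retrying" advice can also be automated. The following is a rough sketch that assumes the scraping step raises an exception when it is blocked or times out, which may not match how the script actually reports failures. `retry_with_backoff` and its defaults are hypothetical.

```python
import time


def retry_with_backoff(action, attempts=3, base_delay=30.0, sleep=time.sleep):
    """Call action() up to `attempts` times, doubling the wait after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return action()
        except Exception:
            # Out of attempts: let the last error reach the caller.
            if attempt == attempts - 1:
                raise
            # Wait base_delay, then 2x, then 4x, and so on.
            sleep(base_delay * 2 ** attempt)
```

With the defaults, a failing step is retried after 30 seconds and then after 60 seconds before the error is raised. Longer waits are gentler on the site's servers than immediate retries.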
Contributions are welcome! Please feel free to:
- Report bugs
- Suggest new features
- Submit pull requests
- Improve documentation
This tool is for educational and research purposes only. Please:
- Respect Mercadona's terms of service
- Use responsibly and avoid overloading their servers
- Consider the legal implications in your jurisdiction
If you encounter any issues or have questions:
- Check the troubleshooting section
- Review existing issues in the repository
- Create a new issue with detailed information
⭐ If this project helped you, please give it a star!
