StackOverflow Scraper

A StackOverflow scraper that selectively extracts verified and highly voted questions and answers from the StackOverflow website. Each result is saved to a sqlite3 database file as a title and its code, along with the source URL of the question.
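
The storage step described above could look roughly like the sketch below. It is only an illustration: the table name `snippets`, its columns, and the helper `save_snippet` are assumptions made for this example, not the schema actually used by `scraper.py`.

```python
import sqlite3


def save_snippet(conn, title, code, url):
    """Store one scraped entry; return True if a new row was inserted.

    The table layout here is hypothetical. The url column is UNIQUE so
    that saving the same question twice leaves a single row.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS snippets ("
        "id INTEGER PRIMARY KEY, "
        "title TEXT NOT NULL, "
        "code TEXT NOT NULL, "
        "url TEXT NOT NULL UNIQUE)"
    )
    # Parameterized query: scraped text is never spliced into the SQL string.
    cur = conn.execute(
        "INSERT OR IGNORE INTO snippets (title, code, url) VALUES (?, ?, ?)",
        (title, code, url),
    )
    conn.commit()
    return cur.rowcount == 1
```

With a connection such as `sqlite3.connect("stackoverflow.db")` (the file name is also an assumption), calling `save_snippet` once per scraped question keeps the title, code, and source URL together in one row.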

License

This project is licensed under the MIT License.

Features

  • Selectively scrapes verified and highly voted questions and answers from StackOverflow.
  • Maintains records of processed questions to avoid duplication.
  • Keeps track of scraped pages for efficient resumption of scraping.
  • Utilizes JSON files for storing records and data persistence.
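
The features above can be sketched as a small JSON-backed state file. This is a minimal sketch under stated assumptions, not the project's actual code: the helper names, the question fields (`id`, `score`, `has_accepted_answer`), the `MIN_SCORE` threshold, and the reading of "verified" as "has an accepted answer" are all hypothetical.

```python
import json
from pathlib import Path

MIN_SCORE = 10  # hypothetical cutoff for "highly voted"


def load_state(path):
    """Load the record of processed questions and the last scraped page."""
    p = Path(path)
    if p.exists():
        data = json.loads(p.read_text(encoding="utf-8"))
        return {
            "seen_ids": set(data.get("seen_ids", [])),
            "last_page": data.get("last_page", 0),
        }
    # No state file yet: start from scratch.
    return {"seen_ids": set(), "last_page": 0}


def save_state(path, state):
    """Persist the state; sets are stored as sorted lists for JSON."""
    payload = {
        "seen_ids": sorted(state["seen_ids"]),
        "last_page": state["last_page"],
    }
    Path(path).write_text(json.dumps(payload, indent=2), encoding="utf-8")


def should_scrape(question, state, min_score=MIN_SCORE):
    """Keep only accepted, highly voted questions not processed before."""
    return (
        question["has_accepted_answer"]
        and question["score"] >= min_score
        and question["id"] not in state["seen_ids"]
    )


def mark_done(question, page, state):
    """Record a processed question and the furthest page reached."""
    state["seen_ids"].add(question["id"])
    state["last_page"] = max(state["last_page"], page)
```

On restart, a scraper built this way would call `load_state`, continue from `state["last_page"] + 1`, and skip any question for which `should_scrape` returns `False`.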

Usage

  1. Clone the repository:
git clone https://github.com/Hammad389/stackoverflow-scraper.git
  2. Install the necessary dependencies:
pip install -r requirements.txt
  3. Run the scraper:
python scraper.py