
Logs: incorrect number of pages to crawl #1682

@andrebastosdias

Description

Logs from crawlee 1.0.0, where the reported page total tracks the 70 requests that are actually processed:

[BeautifulSoupCrawler] INFO  Crawled 0/1 pages, 0 failed requests, desired concurrency 10.
[BeautifulSoupCrawler] INFO  Current request statistics:
┌───────────────────────────────┬────────┐
│ requests_finished             │ 0      │
│ requests_failed               │ 0      │
│ retry_histogram               │ [0]    │
│ request_avg_failed_duration   │ None   │
│ request_avg_finished_duration │ None   │
│ requests_finished_per_minute  │ 0      │
│ requests_failed_per_minute    │ 0      │
│ request_total_duration        │ 0s     │
│ requests_total                │ 0      │
│ crawler_runtime               │ 42.7ms │
└───────────────────────────────┴────────┘
[crawlee._autoscaling.autoscaled_pool] INFO  current_concurrency = 0; desired_concurrency = 10; cpu = 0; mem = 0; event_loop = 0.0; client_info = 0.0
[BeautifulSoupCrawler] INFO  Crawled 22/70 pages, 0 failed requests, desired concurrency 10.
[BeautifulSoupCrawler] INFO  Crawled 61/70 pages, 0 failed requests, desired concurrency 11.
[crawlee._autoscaling.autoscaled_pool] INFO  Waiting for remaining tasks to finish
[BeautifulSoupCrawler] INFO  Final request statistics:
┌───────────────────────────────┬────────────┐
│ requests_finished             │ 70         │
│ requests_failed               │ 0          │
│ retry_histogram               │ [70]       │
│ request_avg_failed_duration   │ None       │
│ request_avg_finished_duration │ 2.82s      │
│ requests_finished_per_minute  │ 196        │
│ requests_failed_per_minute    │ 0          │
│ request_total_duration        │ 3min 17.4s │
│ requests_total                │ 70         │
│ crawler_runtime               │ 21.38s     │
└───────────────────────────────┴────────────┘

The same code in crawlee 1.2.0 and 1.3.0: the "Crawled X/Y pages" total now jumps to 318 and then 387, even though only 70 requests are ever finished:

[BeautifulSoupCrawler] INFO  Current request statistics:
┌───────────────────────────────┬──────┐
│ requests_finished             │ 0    │
│ requests_failed               │ 0    │
│ retry_histogram               │ [0]  │
│ request_avg_failed_duration   │ None │
│ request_avg_finished_duration │ None │
│ requests_finished_per_minute  │ 0    │
│ requests_failed_per_minute    │ 0    │
│ request_total_duration        │ 0s   │
│ requests_total                │ 0    │
│ crawler_runtime               │ 0s   │
└───────────────────────────────┴──────┘
[BeautifulSoupCrawler] INFO  Crawled 0/318 pages, 0 failed requests, desired concurrency 10.
[crawlee._autoscaling.autoscaled_pool] INFO  current_concurrency = 0; desired_concurrency = 10; cpu = 0; mem = 0; event_loop = 0.0; client_info = 0.0
[BeautifulSoupCrawler] INFO  Crawled 25/387 pages, 0 failed requests, desired concurrency 10.
[BeautifulSoupCrawler] INFO  Crawled 55/387 pages, 0 failed requests, desired concurrency 11.
[crawlee._autoscaling.autoscaled_pool] INFO  Waiting for remaining tasks to finish
[BeautifulSoupCrawler] INFO  Final request statistics:
┌───────────────────────────────┬────────────┐
│ requests_finished             │ 70         │
│ requests_failed               │ 0          │
│ retry_histogram               │ [70]       │
│ request_avg_failed_duration   │ None       │
│ request_avg_finished_duration │ 3.37s      │
│ requests_finished_per_minute  │ 178        │
│ requests_failed_per_minute    │ 0          │
│ request_total_duration        │ 3min 55.7s │
│ requests_total                │ 70         │
│ crawler_runtime               │ 23.55s     │
└───────────────────────────────┴────────────┘
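
For reference, a minimal sketch of the kind of BeautifulSoupCrawler setup that produces this style of log. The actual reproduction code is not included in the issue, so the start URL, handler, and request limit below are assumptions, not the reporter's code:

import asyncio

from crawlee.crawlers import BeautifulSoupCrawler, BeautifulSoupCrawlingContext


async def main() -> None:
    # Cap the run at roughly the 70 requests seen in the logs above (assumed limit).
    crawler = BeautifulSoupCrawler(max_requests_per_crawl=70)

    @crawler.router.default_handler
    async def handler(context: BeautifulSoupCrawlingContext) -> None:
        context.log.info(f'Processing {context.request.url}')
        # Enqueueing discovered links is what grows the "Crawled X/Y pages" total.
        await context.enqueue_links()

    # Hypothetical start URL; the real one is not shown in the issue.
    await crawler.run(['https://crawlee.dev'])


if __name__ == '__main__':
    asyncio.run(main())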

Metadata

Labels

bug (Something isn't working.), t-tooling (Issues with this label are in the ownership of the tooling team.)
