Don't index sitemap in opensearch #1803
Answered by rzo1
JaspreetSinghChahal asked this question in Q&A
In StormCrawler, I want to fetch and parse sitemaps to discover new URLs, but I don't want to index the sitemaps themselves in OpenSearch. I can't exclude them with a regex URL filter, because then the new URLs would not be discovered.
Answered by rzo1 on Feb 9, 2026
The sitemap parser bolt will set a metadata key
Yes, you are right. The current implementation is a bit weird in that sense.
From my POV, filtering is exclusion-based, not inclusion-based, i.e. “exclude documents matching these conditions”.
That means:
For each filter entry:
In that sense, I think we should fix it (or introduce a new config option to stay backward compatible).
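
To make the exclusion-based reading above concrete, here is a minimal sketch in plain Java. It is not StormCrawler's actual implementation: the class `ExclusionFilterSketch`, the method `shouldIndex`, and the metadata key `sitemapFlag` are all hypothetical placeholders, and treating a match on any single filter entry as enough to exclude the document is an assumption, since the per-entry rules are not spelled out in this thread.

```java
import java.util.List;
import java.util.Map;

public class ExclusionFilterSketch {

    /**
     * Exclusion-based check: the document is indexed unless at least one
     * filter entry (key=value) matches its metadata.
     * Assumption: a single matching entry is enough to exclude the document.
     */
    public static boolean shouldIndex(
            Map<String, List<String>> metadata, Map<String, String> excludeFilters) {
        for (Map.Entry<String, String> filter : excludeFilters.entrySet()) {
            List<String> values = metadata.get(filter.getKey());
            if (values != null && values.contains(filter.getValue())) {
                return false; // the document matches an exclusion condition
            }
        }
        return true; // no exclusion condition matched
    }

    public static void main(String[] args) {
        // "sitemapFlag" is a made-up key used only for illustration.
        Map<String, String> filters = Map.of("sitemapFlag", "true");

        Map<String, List<String>> sitemapDoc = Map.of("sitemapFlag", List.of("true"));
        Map<String, List<String>> regularDoc = Map.of("title", List.of("Some page"));

        System.out.println("sitemap indexed: " + shouldIndex(sitemapDoc, filters));
        System.out.println("regular page indexed: " + shouldIndex(regularDoc, filters));
    }
}
```

With these semantics a document that lacks the key entirely is still indexed, which is the behaviour the question asks for: sitemaps are parsed for outlinks but skipped at indexing time, while ordinary pages pass through.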