Dissuade crawlers from downloading super.wasm #6664
Merged
mattnibs approved these changes on Feb 25, 2026
What's Changing
This adds a https://superdb.org/robots.txt to dissuade crawlers from downloading the /super.wasm file.

Why
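The deployed rules aren't shown in this view, but a minimal robots.txt that steers crawlers away from the wasm file while leaving the rest of the site crawlable might look like this (an illustrative sketch, not necessarily the exact file that shipped):

```
User-agent: *
Disallow: /super.wasm
```

Well-behaved crawlers fetch /robots.txt first and skip any path matching a Disallow rule for their user agent; `*` applies the rule to all of them.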
We just received a bill from Netlify for a bandwidth overage. I've dug into it, and it looks like around January 5th a crawler in Singapore started aggressively hitting superdb.org (see this HackerNews link for examples of others dealing with the same). Since this
/super.wasm file is so much larger than the other content on the site, nearly all of the overage bandwidth went toward serving it repeatedly, so dissuading crawlers from downloading it will hopefully squash most of the problem.

There's no guarantee crawlers will obey this, though. I've already tactically deployed the robots.txt, and unfortunately the crawling from Singapore has not yet ceased. However, guidance from Claude AI has been to wait at least 24 hours before drawing any conclusions, since crawlers will often finish flushing an existing queue of work before they re-run from the top, at which point they'll hopefully pick up the robots.txt and do the right thing. If they still don't stop after 24 hours, I've got other, more aggressive ideas for getting them to back off.
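While waiting out that 24-hour window, the rules themselves can be sanity-checked offline with Python's standard-library `urllib.robotparser`, which applies the same matching logic compliant crawlers use. The rule lines and the `ExampleBot/1.0` user agent below are assumptions for illustration, not the deployed configuration:

```python
from urllib import robotparser

# Assumed robots.txt content; swap in the real file's lines to test it.
rules = [
    "User-agent: *",
    "Disallow: /super.wasm",
]

rp = robotparser.RobotFileParser()
rp.parse(rules)

# A compliant crawler must skip the wasm file...
print(rp.can_fetch("ExampleBot/1.0", "https://superdb.org/super.wasm"))  # False
# ...but may still fetch the rest of the site.
print(rp.can_fetch("ExampleBot/1.0", "https://superdb.org/"))  # True
```

This only confirms the rules say what we intend; it can't prove the Singapore crawler will honor them.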