
URL validator: accept user/password, use urllib.urlparse #847

Open
ianw wants to merge 1 commit into pallets-eco:main from ianw:url-auth

ianw commented Jun 24, 2024

I was hitting an issue in a build tool that would not let me specify a URL for cloning a git tree with a personal access token (e.g. [1]) in a WTForms URL field.

I started looking at expanding the original regex, but there are tricky cases, like multiple "@"s in passwords, that are hard to get right. I think that for this purpose, urllib.parse.urlparse (whether it is urlparse or urlsplit doesn't seem to matter here) will just "do the right thing".
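As a rough illustration of the multiple-"@" case, here is a minimal sketch using `urllib.parse.urlsplit` from the standard library. The URL, user name, and password are made up for the example and do not come from the PR's test cases. Strictly speaking an "@" inside a password should be percent-encoded, but the parser splits on the last "@" in the network location, so the unencoded form still comes apart as expected.

```python
from urllib.parse import urlsplit

# Hypothetical clone URL: the password itself contains an "@".
parts = urlsplit("https://ianw:p@ss@gitlab.example.com/group/repo.git")

# The host is whatever follows the *last* "@" in the netloc;
# everything before it is treated as user:password.
print(parts.username)  # ianw
print(parts.password)  # p@ss
print(parts.hostname)  # gitlab.example.com
```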

The test-cases are expanded with some coverage of username/passwords.

[1] https://docs.gitlab.com/ee/user/profile/personal_access_tokens.html#clone-repository-using-personal-access-token

@ThiefMaster

This absolutely needs to be opt-in. In most cases people do NOT want to accept such URLs.

ianw commented Jun 24, 2024

> This absolutely needs to be opt-in. In most cases people do NOT want to accept such URLs.

My concern with that is the alternative case: when you have opted out, it implies that the URL will be sanitised in some way. There have certainly been CVE-level issues in things like urlparse, such as newlines being passed through from incoming URLs.

So my counter-argument would be to not play that game at all: make this an RFC-level validator for URLs, and leave security up to the app once it has what it knows is a valid URL?
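A minimal sketch of what a syntax-only check along these lines could look like, assuming the standard library's `urlsplit`. The function name `is_wellformed_url` is hypothetical and is not the validator in this PR; it makes no claim about whether a URL is safe, only whether it parses into a scheme and a host.

```python
from urllib.parse import urlsplit

def is_wellformed_url(url: str) -> bool:
    """Hypothetical syntax-only check; applies no security policy."""
    try:
        parts = urlsplit(url)
        # Reading .port raises ValueError for a non-numeric or
        # out-of-range port, so touch it to surface that error here.
        parts.port
    except ValueError:
        return False
    # Require both a scheme and a host; userinfo is neither
    # required nor rejected at this level.
    return bool(parts.scheme) and bool(parts.hostname)
```

Under this split, a URL with credentials passes the syntax check, and any decision about rejecting it is left to the application.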

azmeuk (Member) commented Apr 21, 2026

I think I like having urllib doing the work instead of custom regexes. @ThiefMaster, what issues do you anticipate here?

@ThiefMaster

If you let (untrusted) users specify URLs containing @, they can easily be written to look like something else. Think of something like https://accounts.google.com@secure-password-validation.xyz/ where the end user may very well think the link goes to https://accounts.google.com even though it actually doesn't.

Depending on the context and where the URL ends up, it could be used to trick users into clicking a nasty link, thinking it points them to a trustworthy site.
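To make the lookalike concrete, the sketch below parses that URL with `urllib.parse.urlsplit` and shows which part is the real host. The `check_url` function and its `allow_userinfo` flag are hypothetical names used only to illustrate the opt-in idea; they are not the WTForms API or the code in this PR.

```python
from urllib.parse import urlsplit

def check_url(url: str, allow_userinfo: bool = False) -> bool:
    """Hypothetical opt-in policy check, not the WTForms validator."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.hostname:
        return False
    # An "@" in the netloc means a userinfo component is present.
    if "@" in parts.netloc and not allow_userinfo:
        return False
    return True

lookalike = urlsplit("https://accounts.google.com@secure-password-validation.xyz/")
print(lookalike.username)  # accounts.google.com
print(lookalike.hostname)  # secure-password-validation.xyz
```

With the default `allow_userinfo=False`, `check_url` rejects the lookalike URL; a caller that needs token-bearing clone URLs would have to pass `allow_userinfo=True` explicitly.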
