[http] sanitise URL options before use them#22186
[http] sanitise URL options before use them#22186linev wants to merge 4 commits intoroot-project:masterfrom
Conversation
Test Results 22 files 22 suites 3d 12h 42m 0s ⏱️ Results for commit cc20b42. ♻️ This comment has been updated with latest results. |
| val = sval.Data(); | ||
| } | ||
|
|
||
| Bool_t sanitize = kFALSE; |
There was a problem hiding this comment.
I think it would be useful to have a comment here explaining when/why we do not sanitize.
There was a problem hiding this comment.
When processing POST data special value created directly in the code - including pointer casting.
Such values should not be touched.
Value also not sanitized when default value from method arguments list is extracted.
In other cases value will be checked
There was a problem hiding this comment.
Thanks! This as a code comment would be helpful.
There was a problem hiding this comment.
In general I don't think we can really trust all this code to produce properly "sanitized" input. Sanitizers are notoriously hard to get right and I see many suspicious places in the Sniffer code.
So if these checks are in place as a measure against accidental errors, they may be useful, but I wouldn't consider them a measure against actual attacks.
Regardless, I put a couple of more specific comments.
Also, we definitely need tests for this.
| if (!value || !*value) | ||
| return ""; | ||
|
|
||
| TString res = value; |
There was a problem hiding this comment.
How long can value be? Since we're trying to sanitize the input, shouldn't we also put a reasonable max length?
There was a problem hiding this comment.
There is no such limit and I cannot make here any assumption.
civetweb cuts URL length by 16K.
There was a problem hiding this comment.
The backend is not guaranteed to be civetweb in principle, is it?
We should put an artificial limit here so we don't have to worry about huge allocations/reallocations/copies/iterations and we could in principle even preallocate the output string once and do a single pass over the string to validate it.
| @@ -1303,13 +1303,40 @@ TString TRootSniffer::DecodeUrlOptionValue(const char *value, Bool_t remove_quot | |||
| res.ReplaceAll("%5D", "]"); | |||
| res.ReplaceAll("%3D", "="); | |||
There was a problem hiding this comment.
This goes over the string 8 times, whereas we could do all of this in 1 pass...(this is in addition to the TString constructor, which also does a full scan of the string to determine the length).
Depending on the max length we impose for the string and whether denial of service is a concern, we might want to fix this.
There was a problem hiding this comment.
Deny of access or deny of service is already problematic with pure civetweb.
It is not designed to protect from such attacks - it should be done in front of it.
There was a problem hiding this comment.
It's still a very suboptimal way to do this validation that scales superlinearly with the arguments' length. But if this is not a concern for this class, fine for me.
| clean.Append('\\'); | ||
| clean.Append(c); | ||
| } else if (c > 31) | ||
| // ignore all special symbols |
There was a problem hiding this comment.
I think c == 127 is a "special" symbol (DEL)
There was a problem hiding this comment.
Probably I just will use !std::iscntrl(c)
There was a problem hiding this comment.
I'm not sure what this excludes exactly (does it exclude all ASCII non-printable characters? Does it depend on the locale? The man page makes it look like the case), so I was actually more comfortable with the explicit check.
|
|
||
| void CreateMemFile(); | ||
|
|
||
| TString SanitzeArgument(Bool_t isstr, const char *value, Bool_t onlyquotes = kFALSE); |
There was a problem hiding this comment.
Also this should probably be static
|
|
||
| TString TRootSnifferFull::SanitzeArgument(Bool_t isstr, const char *value, Bool_t onlyquotes) | ||
| { | ||
| Int_t beg = 0, end = strlen(value); |
There was a problem hiding this comment.
Is value guaranteed to have a reasonable max length here? Otherwise strlen is potentially dangerous.
There was a problem hiding this comment.
Value created from URL and will not be longer than 16K
There was a problem hiding this comment.
Same comment as above: we should put our own limit as well
| if (!isstr && !onlyquotes) { | ||
| // for numeric types make very strict check | ||
| if (std::isalnum(c) || std::strchr(".:+-", c)) | ||
| sanitized.Append(c); |
There was a problem hiding this comment.
This will not necessarily produce a number, so I'm not sure what's the goal of this sanitization (what if the string is -4+3.2-245+++++67?)
There was a problem hiding this comment.
Goal to avoid symbols sequence which can be interpreted as C/C++ code invocation like gSystem->DoSomething()
There was a problem hiding this comment.
You're checking for isalnum though: did you mean to check for isdigit? Otherwise I wouldn't rule out "interesting" code invocations going through custom arithmetic operators...
There was a problem hiding this comment.
Even if every non-operator is a digit this will still cause N>=1 expressions to be evaluated, which I guess is safe, but it will still go through the interpreter to be executed before being passed to TMethodCall I think.
| if (std::isalnum(c) || std::strchr(".:+-", c)) | ||
| sanitized.Append(c); | ||
| } else if ((c == '\"') || (c == '\\')) { | ||
| // escape quotes inside string |
There was a problem hiding this comment.
This is quite bizarre if I understand correctly: if an argument is "ab"cdef" we will parse it as the string "ab\"cdef"?
There was a problem hiding this comment.
It will be delivered to C++ as
new TMethodCall(obj_cl, method_name, "ab\"cdef" );
There was a problem hiding this comment.
Shouldn't we rather require all internal quotes to be already escaped in the argument? If the incoming argument is supposed to be a valid string literal it would not be valid if it contains unescaped quotes, which may indicate some other kind of error (on the user side or otherwise)
There was a problem hiding this comment.
Shouldn't we rather require all internal quotes to be already escaped in the argument?
On each layer escape character can disappear.
Quotes are very special - because when they goes into C++ interpreter they can break argument list and start new command. By such transformation we want guarantee that in any case interpreter will take provided string as single argument. Actually exactly such exploit is shown in [THttpServer-4] issue.
So attempt here to produce string which not closed in between.
There was a problem hiding this comment.
On each layer escape character can disappear.
That sounds like a bug from the previous layers...
Regardless, it would be nice to have some tests on this and comments to make it clear why we're doing this and exactly what form of strings do we accept from the user side
There was a problem hiding this comment.
There are many differences how quotes escaped in URL and in C++.
Main problem - string is used again in C++ script. So one need to provide/create escape characters.
Remove any special symbols Add escape characters for quote and escape itself Try to avoid manipulation of arguments for method execution
Avoid special characters as draw arguments
Remove all special characters, check quotes, escape quotes inside If expecting numeric value - just alphanumeric, comma, +, -, allowed
URL syntax allow to provide different special symbols using
%12%20symbols.Also some situations un-escaped quotes can make a problems.
Therefore cleanup string from such cases before use it.