-
Notifications
You must be signed in to change notification settings - Fork 21
feat: add Tornado checker #92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Reviewer's GuideAdds a new Tornado-specific Python taint checker and utilities, wires it into the checker configuration, and implements route/handler discovery plus taint source propagation for Tornado request handlers. Sequence diagram for Tornado route discovery and entrypoint registrationsequenceDiagram
participant Analyzer
participant TornadoTaintChecker
participant AstUtil
Analyzer->>TornadoTaintChecker: triggerAtCompileUnit(analyzer, scope, node, state, info)
TornadoTaintChecker->>AstUtil: visit(node, visitors)
AstUtil-->>TornadoTaintChecker: AssignmentExpression / VariableDeclaration / ClassDefinition
TornadoTaintChecker->>TornadoTaintChecker: update fileCache for sourcefile
Analyzer-->>TornadoTaintChecker: return
Analyzer->>TornadoTaintChecker: triggerAtFuncCallSyntax(analyzer, scope, node, state, info)
TornadoTaintChecker->>tornado_util: isTornadoCall(node, 'Application' or 'add_handlers')
tornado_util-->>TornadoTaintChecker: boolean
alt Tornado Application or add_handlers
TornadoTaintChecker->>TornadoTaintChecker: collectTornadoEntrypointAndSource(analyzer, scope, state, routeList, fileName)
TornadoTaintChecker->>TornadoTaintChecker: normalizeRoutes(routeList, fileName)
loop each route
TornadoTaintChecker->>tornado_util: parseRoutePair(routeNode)
tornado_util-->>TornadoTaintChecker: RoutePair
TornadoTaintChecker->>TornadoTaintChecker: resolveSymbol(handlerName, file)
alt handler class found
TornadoTaintChecker->>Analyzer: processInstruction(scope, classAst, state)
Analyzer-->>TornadoTaintChecker: handlerSymVal (class)
TornadoTaintChecker->>TornadoTaintChecker: emitHandlerEntrypoints(analyzer, handlerSymVal, urlPattern, classAst, scope, state)
loop each http method (get, post, ...)
TornadoTaintChecker->>Analyzer: processInstruction(scope, fdef, state)
Analyzer-->>TornadoTaintChecker: fclos
TornadoTaintChecker->>tornado_util: extractParamsFromAst(fdef)
tornado_util-->>TornadoTaintChecker: ParamMeta[]
TornadoTaintChecker->>TornadoTaintChecker: register ParamMeta as PYTHON_INPUT in sourceScope
TornadoTaintChecker->>completeEntryPoint: completeEntryPoint(fclos)
completeEntryPoint-->>TornadoTaintChecker: entryPoint
TornadoTaintChecker->>Analyzer: push entryPoint to entryPoints
end
end
end
end
Class diagram for TornadoTaintChecker and Tornado utility typesclassDiagram
class PythonTaintAbstractChecker {
<<abstract>>
+checkerRuleConfigContent any
+sourceScope any
+getCheckerId() string
+addSourceTagForSourceScope(kind string, sourceScope any) void
+addSourceTagForcheckerRuleConfigContent(kind string, checkerRuleConfigContent any) void
+triggerAtIdentifier(analyzer any, scope any, node any, state any, info any) void
+triggerAtFunctionCallAfter(analyzer any, scope any, node any, state any, info any) void
+checkByNameMatch(node any, fclos any, argvalues any) void
}
class TornadoTaintChecker {
-fileCache Map~string, FileCache~
+TornadoTaintChecker(resultManager any)
-markAsTainted(value any) void
+triggerAtStartOfAnalyze(analyzer any, scope any, node any, state any, info any) void
+triggerAtCompileUnit(analyzer any, scope any, node any, state any, info any) bool
+triggerAtFuncCallSyntax(analyzer any, scope any, node any, state any, info any) bool
+triggerAtIdentifier(analyzer any, scope any, node any, state any, info any) void
+checkByNameMatch(node any, fclos any, argvalues any) void
+triggerAtFunctionCallAfter(analyzer any, scope any, node any, state any, info any) void
+triggerAtMemberAccess(analyzer any, scope any, node any, state any, info any) void
-resolveSymbol(name string, currentFile string) any
-normalizeRoutes(node any, currentFile string) RoutePair[]
-collectTornadoEntrypointAndSource(analyzer any, scope any, state any, routeList any, currentFile string) void
-emitHandlerEntrypoints(analyzer any, handlerSymVal any, urlPattern string, classAst any, scope any, state any) void
-buildClassSymbol(classNode any) any
}
class ImportSymbol {
+file string
+originalName string
}
class RoutePair {
+path string
+handlerName string
+file string
}
class FileCache {
+vars Map~string, any~
+classes Map~string, any~
+importedSymbols Map~string, ImportSymbol~
}
class ParamMeta {
+name string
+locStart number or 'all'
+locEnd number or 'all'
}
class tornado_util {
+tornadoSourceAPIs Set~string~
+passthroughFuncs Set~string~
+isRequestAttributeAccess(node any) boolean
+isRequestAttributeExpression(expr any) boolean
+isTornadoCall(node any, targetName string) boolean
+parseRoutePair(route any) RoutePair
+resolveImportPath(modulePath string, currentFile string) string
+extractImportEntries(stmt any) Array~any~
+extractParamsFromAst(funcNode any) ParamMeta[]
}
TornadoTaintChecker --|> PythonTaintAbstractChecker
TornadoTaintChecker ..> FileCache : uses
TornadoTaintChecker ..> RoutePair : uses
TornadoTaintChecker ..> tornado_util : uses helpers
TornadoTaintChecker ..> ParamMeta : uses
FileCache ..> ImportSymbol : contains
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
Summary of ChangesHello @Ris-1kd, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a new taint analysis checker specifically designed for Python applications built with the Tornado web framework. The primary goal is to enhance the security analysis capabilities for Tornado projects by accurately identifying application entry points and marking various forms of user-controlled input as potential sources of taint. It also incorporates advanced features like cross-file symbol resolution and intelligent taint propagation through common utility functions, leading to more precise and comprehensive security vulnerability detection. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes - here's some feedback:
- The logic that detects
self.request.<attr>as a source is duplicated betweentriggerAtMemberAccessandisRequestAttributeAccess/isRequestAttributeExpressionintornado-util.ts; consider reusing the utility helpers in the checker to keep behavior consistent and reduce maintenance overhead. - In
triggerAtStartOfAnalyzeyou re-parseprocess.argvand reload the rule config on every analyze-start; if this hook is called frequently, it may be worth caching the resolvedruleConfigFileand loaded rule data to avoid repeated file I/O and argument parsing.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The logic that detects `self.request.<attr>` as a source is duplicated between `triggerAtMemberAccess` and `isRequestAttributeAccess`/`isRequestAttributeExpression` in `tornado-util.ts`; consider reusing the utility helpers in the checker to keep behavior consistent and reduce maintenance overhead.
- In `triggerAtStartOfAnalyze` you re-parse `process.argv` and reload the rule config on every analyze-start; if this hook is called frequently, it may be worth caching the resolved `ruleConfigFile` and loaded rule data to avoid repeated file I/O and argument parsing.
## Individual Comments
### Comment 1
<location> `src/checker/taint/python/tornado-taint-checker.ts:360-361` </location>
<code_context>
+ }
+
+ // 检查是否是 tornado source API 调用(如 get_argument)
+ if (funcName && tornadoSourceAPIs.has(funcName)) {
+ this.markAsTainted(ret)
+ }
+
</code_context>
<issue_to_address>
**issue (bug_risk):** Treating any get_argument-like call as a source ignores receiver context and may create false positives.
The check currently only matches on `funcName`, so calls like `some_other_obj.get_argument()` (or any monkey‑patched function named `get_argument`) will be treated as taint sources even when unrelated to Tornado handlers. To avoid these false positives, also validate the receiver (e.g., ensure it’s `self`, a known handler type, or otherwise consistent with Tornado request handlers) before marking the return value as tainted.
</issue_to_address>
### Comment 2
<location> `src/checker/taint/python/tornado-taint-checker.ts:371-361` </location>
<code_context>
+ // node.callee.object.type = 'MemberAccess' (body)
+ // node.callee.object.object.type = 'MemberAccess' (request)
+ // node.callee.object.object.object.name = 'self'
+ if (node.callee?.type === 'MemberAccess' && node.callee.object) {
+ const bodyNode = node.callee.object
+ if (
+ bodyNode.type === 'MemberAccess' &&
+ bodyNode.property?.name === 'body' &&
+ bodyNode.object?.type === 'MemberAccess' &&
+ bodyNode.object.property?.name === 'request' &&
+ bodyNode.object.object?.name === 'self'
+ ) {
+ // 直接标记返回值为 source(因为 self.request.body 是 source)
+ this.markAsTainted(ret)
+ return // 已经标记,不需要再检查 receiver
+ }
</code_context>
<issue_to_address>
**suggestion:** The special-case for self.request.body duplicates the logic in isRequestAttributeExpression and is brittle.
Since `tornado-util.ts` already provides `isRequestAttributeExpression` for recognizing Tornado request attributes (including call-expression forms), prefer using that helper here instead of duplicating the AST pattern for `self.request.body`. Centralizing this logic will keep it consistent and easier to evolve.
</issue_to_address>
### Comment 3
<location> `src/checker/taint/python/tornado-taint-checker.ts:268-269` </location>
<code_context>
+ valEnd === 'all' &&
+ val.scopeFile !== 'all' &&
+ val.scopeFunc === 'all' &&
+ typeof node.loc?.sourcefile === 'string' &&
+ node.loc.sourcefile.includes(val.scopeFile)
+ ) {
+ shouldMark = true
</code_context>
<issue_to_address>
**issue (bug_risk):** Using substring matching for scopeFile may accidentally match unrelated files.
Because `node.loc.sourcefile.includes(val.scopeFile)` does a raw substring match, a rule scoped to `user.py` will also trigger for `old_user.py`, `user.py.bak`, or any path containing that text. If `scopeFile` represents a file path, consider normalizing both paths and comparing full path segments (or using a suffix match that respects path separators) to avoid unintended matches.
</issue_to_address>
### Comment 4
<location> `src/checker/taint/python/tornado-util.ts:120-121` </location>
<code_context>
+ const { callee } = route
+ const isUrlHelper =
+ (callee.type === 'Identifier' && callee.name === 'url') ||
+ (callee.type === 'MemberAccess' &&
+ AstUtil.prettyPrint(callee).includes('url'))
+ if (isUrlHelper && Array.isArray(route.arguments)) {
+ const [first, second] = route.arguments
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Detecting url-helper calls via prettyPrint string search is fragile and potentially slow.
Using `AstUtil.prettyPrint(callee).includes('url')` risks false positives (any member chain whose printed form contains "url") and adds unnecessary work in large, route-heavy files. Prefer matching specific AST node shapes and names (e.g., `Identifier`/`MemberAccess` paths for known helpers like `some_module.url` or `tornado.web.url`) instead of searching the pretty-printed text.
Suggested implementation:
```typescript
} else if (route.type === 'CallExpression' && route.callee) {
const { callee } = route
const isIdentifierUrlHelper =
callee.type === 'Identifier' && callee.name === 'url'
const isMemberAccessUrlHelper =
callee.type === 'MemberAccess' &&
(
// e.g. something.url(...)
(callee.member &&
callee.member.type === 'Identifier' &&
callee.member.name === 'url') ||
// or if the AST uses a `property` field instead of `member`
(callee.property &&
callee.property.type === 'Identifier' &&
callee.property.name === 'url')
)
const isUrlHelper = isIdentifierUrlHelper || isMemberAccessUrlHelper
if (isUrlHelper && Array.isArray(route.arguments)) {
const [first, second] = route.arguments
pathExpr = first
handlerNode = second
}
}
```
If your AST representation for member access uses different field names or nesting (for example, `callee.attr` or nested `MemberAccess` chains for `tornado.web.url`), you should adjust the `isMemberAccessUrlHelper` condition accordingly to:
1. Refer to the correct field(s) holding the final attribute/identifier node.
2. Optionally check the full member chain (e.g., that the object path matches known helpers such as `tornado.web`), if you want stricter matching.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
|
||
| // 检查是否是 tornado source API 调用(如 get_argument) | ||
| if (funcName && tornadoSourceAPIs.has(funcName)) { | ||
| this.markAsTainted(ret) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: The special-case for self.request.body duplicates the logic in isRequestAttributeExpression and is brittle.
Since tornado-util.ts already provides isRequestAttributeExpression for recognizing Tornado request attributes (including call-expression forms), prefer using that helper here instead of duplicating the AST pattern for self.request.body. Centralizing this logic will keep it consistent and easier to evolve.
| typeof node.loc?.sourcefile === 'string' && | ||
| node.loc.sourcefile.includes(val.scopeFile) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (bug_risk): Using substring matching for scopeFile may accidentally match unrelated files.
Because node.loc.sourcefile.includes(val.scopeFile) does a raw substring match, a rule scoped to user.py will also trigger for old_user.py, user.py.bak, or any path containing that text. If scopeFile represents a file path, consider normalizing both paths and comparing full path segments (or using a suffix match that respects path separators) to avoid unintended matches.
| (callee.type === 'MemberAccess' && | ||
| AstUtil.prettyPrint(callee).includes('url')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (bug_risk): Detecting url-helper calls via prettyPrint string search is fragile and potentially slow.
Using AstUtil.prettyPrint(callee).includes('url') risks false positives (any member chain whose printed form contains "url") and adds unnecessary work in large, route-heavy files. Prefer matching specific AST node shapes and names (e.g., Identifier/MemberAccess paths for known helpers like some_module.url or tornado.web.url) instead of searching the pretty-printed text.
Suggested implementation:
} else if (route.type === 'CallExpression' && route.callee) {
const { callee } = route
const isIdentifierUrlHelper =
callee.type === 'Identifier' && callee.name === 'url'
const isMemberAccessUrlHelper =
callee.type === 'MemberAccess' &&
(
// e.g. something.url(...)
(callee.member &&
callee.member.type === 'Identifier' &&
callee.member.name === 'url') ||
// or if the AST uses a `property` field instead of `member`
(callee.property &&
callee.property.type === 'Identifier' &&
callee.property.name === 'url')
)
const isUrlHelper = isIdentifierUrlHelper || isMemberAccessUrlHelper
if (isUrlHelper && Array.isArray(route.arguments)) {
const [first, second] = route.arguments
pathExpr = first
handlerNode = second
}
}If your AST representation for member access uses different field names or nesting (for example, callee.attr or nested MemberAccess chains for tornado.web.url), you should adjust the isMemberAccessUrlHelper condition accordingly to:
- Refer to the correct field(s) holding the final attribute/identifier node.
- Optionally check the full member chain (e.g., that the object path matches known helpers such as
tornado.web), if you want stricter matching.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a new taint analysis checker for the Python Tornado framework. The implementation is comprehensive, covering route discovery, entry point registration, and taint source tracking from request data. My review focuses on improving cross-file symbol resolution, code structure, and configuration handling. I've identified a high-severity issue with how absolute imports are resolved, which could impact the checker's effectiveness. I've also included a couple of medium-severity suggestions to improve maintainability and performance.
| export function resolveImportPath( | ||
| modulePath: string, | ||
| currentFile: string, | ||
| ): string | null { | ||
| if (!modulePath) return null | ||
| const currentDir = path.dirname(currentFile) | ||
| const leadingDots = modulePath.match(/^\.+/)?.[0] ?? '' | ||
| let baseDir = currentDir | ||
| if (leadingDots.length > 0) { | ||
| baseDir = path.resolve(currentDir, '../'.repeat(leadingDots.length - 1)) | ||
| } | ||
| const remainder = modulePath.slice(leadingDots.length) | ||
| const normalized = remainder ? remainder.split('.').join(path.sep) : '' | ||
| const resolved = normalized ? path.resolve(baseDir, normalized) : baseDir | ||
| return `${resolved}.py` | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current implementation of resolveImportPath does not correctly handle absolute Python imports. It treats them as relative to the current file's directory, which is incorrect. Absolute imports should be resolved from the project's source root(s). This can lead to failures in resolving symbols and incomplete taint analysis.
I suggest modifying the function to accept an optional mainDir (project root) and use it as the base for absolute imports.
Additionally, the current implementation doesn't handle package imports (directories with __init__.py). While the suggested change doesn't fully solve that, it's a step in the right direction. You might consider enhancing it further to check for __init__.py files.
You will also need to update the call site in tornado-taint-checker.ts to pass Config.maindir.
For example, at src/checker/taint/python/tornado-taint-checker.ts:148:
const resolved = resolveImportPath(modulePath, fileName, Config.maindir);export function resolveImportPath(
modulePath: string,
currentFile: string,
mainDir?: string,
): string | null {
if (!modulePath) return null
const currentDir = path.dirname(currentFile)
const leadingDots = modulePath.match(/^\.+/)?.[0] ?? ''
let baseDir: string
if (leadingDots.length > 0) {
baseDir = path.resolve(currentDir, '../'.repeat(leadingDots.length - 1))
} else if (mainDir) {
baseDir = mainDir
} else {
// Fallback for absolute imports when mainDir is not provided.
// This is the original behavior and is likely incorrect.
baseDir = currentDir
}
const remainder = modulePath.slice(leadingDots.length)
const normalized = remainder ? remainder.split('.').join(path.sep) : ''
const resolved = normalized ? path.resolve(baseDir, normalized) : baseDir
return `${resolved}.py`
}| if (!ruleConfigFile || ruleConfigFile === '') { | ||
| const args = process.argv | ||
| const ruleConfigIndex = args.indexOf('--ruleConfigFile') | ||
| if (ruleConfigIndex >= 0 && ruleConfigIndex < args.length - 1) { | ||
| ruleConfigFile = args[ruleConfigIndex + 1] | ||
| const path = require('path') | ||
| ruleConfigFile = path.isAbsolute(ruleConfigFile) | ||
| ? ruleConfigFile | ||
| : path.resolve(process.cwd(), ruleConfigFile) | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic to load the ruleConfigFile by parsing process.argv is fragile and considered a code smell. It indicates a potential issue with configuration management or initialization order, as the comment on line 62 suggests. This logic appears to be duplicated across different checkers and makes them less reliable and harder to test.
It would be better to refactor the configuration handling to ensure that Config.ruleConfigFile is fully resolved and available before the checker is instantiated, removing the need for this manual argument parsing.
| } | ||
| // 确保 finalEp 有 filePath | ||
| if (!finalEp.filePath && finalEp.fdef?.loc?.sourcefile) { | ||
| const FileUtil = require('../../../util/file-util') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are several require calls inside methods (e.g., triggerAtStartOfAnalyze, emitHandlerEntrypoints). This is inefficient as the module will be re-evaluated on every call (or at least checked in the cache), and it's generally better for readability and maintainability to have all imports at the top of the file.
Please move all require calls to the top of the file. This applies to:
require('../../common/rules-basic-handler')on line 63require('path')on line 71require('../../../util/file-util')on line 80 and 683require('../../../util/common-util')on line 94
| if (!value) return | ||
| if (!value._tags) { | ||
| value._tags = new Set() | ||
| } | ||
| value._tags.add('PYTHON_INPUT') | ||
| value.hasTagRec = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| if (!value) return | |
| if (!value._tags) { | |
| value._tags = new Set() | |
| } | |
| value._tags.add('PYTHON_INPUT') | |
| value.hasTagRec = true | |
| IntroduceTaint.setTaint(value, 'PYTHON_INPUT') |
This logic already exists in source-util.ts
| * @param info | ||
| */ | ||
| triggerAtStartOfAnalyze(analyzer: any, scope: any, node: any, state: any, info: any): void { | ||
| // 重新加载规则配置(因为可能在构造函数时还没有设置 ruleConfigFile) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this assert is not true.
Since TornadoTaintChecker extends Checker, it will load rule config to this.checkerRuleConfigContent in constructor
YASA-Engine/src/checker/common/checker.ts
Lines 22 to 28 in 48be5b2
| constructor(resultManager: any, checkerId: any) { | |
| this.checkerId = checkerId | |
| this.resultManager = resultManager | |
| this.checkerRuleConfigContent = {} | |
| initRules() | |
| this.loadRuleConfig(this) | |
| } |
At the point where
triggerAtStartOfAnalyze is called, the constructor must have been called, so IMO the code loading ruleConfig again in this function is redundant
| routeList: any, | ||
| currentFile: string | ||
| ) { | ||
| const processed = new Set<string>() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be a member of class instead of a local variable of a function i think
| * @param state | ||
| * @param _info | ||
| */ | ||
| triggerAtFuncCallSyntax(analyzer: any, scope: any, node: any, state: any, _info: any): boolean | undefined { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would favor a Symbol-Interpret approach instead of AST-based approach. Here we can use triggerAtFunctionCallBefore and get the symbol value of arguments (especially the routeList) without any performance tradeoff, and with the symbol value of routeList, further processing will be more accurate. For example, when the routeList itself is actually retrieved from certain function call, current normalizeRoutes implementation cannot handle this situation properly as it doesn't take CallExpression into consideration, but with symbol value of routeList, we can get the "real" route object regardless of how it is retrieved from whatever expression type.
| * @param _info | ||
| */ | ||
| triggerAtCompileUnit(analyzer: any, scope: any, node: any, _state: any, _info: any): boolean | undefined { | ||
| const fileName = node.loc?.sourcefile |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| const fileName = node.loc?.sourcefile | |
| if (config.entryPointMode === 'ONLY_CUSTOM') return | |
| const fileName = node.loc?.sourcefile |
| if (!res._tags) { | ||
| res._tags = new Set() | ||
| } | ||
| res._tags.add('PYTHON_INPUT') | ||
| res.hasTagRec = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| if (!res._tags) { | |
| res._tags = new Set() | |
| } | |
| res._tags.add('PYTHON_INPUT') | |
| res.hasTagRec = true | |
| this.markAsTainted(res) |
| * @param state | ||
| * @param info | ||
| */ | ||
| triggerAtIdentifier(analyzer: any, scope: any, node: any, state: any, info: any): void { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you clarify what does this hook do?
| */ | ||
| triggerAtFunctionCallAfter(analyzer: any, scope: any, node: any, state: any, info: any): void { | ||
| // 先调用基类方法处理规则配置中的 source | ||
| super.triggerAtFunctionCallAfter(analyzer, scope, node, state, info) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| super.triggerAtFunctionCallAfter(analyzer, scope, node, state, info) | |
| super.triggerAtFunctionCallAfter(analyzer, scope, node, state, info) | |
| if (config.entryPointMode === 'ONLY_CUSTOM') return |
| * @param info | ||
| */ | ||
| triggerAtMemberAccess(analyzer: any, scope: any, node: any, state: any, info: any): void { | ||
| const { res } = info |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| const { res } = info | |
| if (config.entryPointMode === 'ONLY_CUSTOM') return | |
| const { res } = info |
| 'get_argument', | ||
| 'get_query_argument', | ||
| 'get_body_argument', | ||
| 'get_query_arguments', | ||
| 'get_body_arguments', | ||
| 'get_cookie', | ||
| 'get_secure_cookie', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| 'get_argument', | |
| 'get_query_argument', | |
| 'get_body_argument', | |
| 'get_query_arguments', | |
| 'get_body_arguments', | |
| 'get_cookie', | |
| 'get_secure_cookie', | |
| 'get_argument', | |
| 'get_query_argument', | |
| 'get_body_argument', | |
| 'get_query_arguments', | |
| 'get_body_arguments', | |
| 'get_cookie', | |
| 'get_secure_cookie', | |
| 'get_arguments', | |
| 'get_json_body' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not resolved yet
| 'get_argument', | ||
| 'get_query_argument', | ||
| 'get_body_argument', | ||
| 'get_query_arguments', | ||
| 'get_body_arguments', | ||
| 'get_cookie', | ||
| 'get_secure_cookie', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not resolved yet
| entryPoint.entryPointSymVal?.parent | ||
| ) | ||
| } catch (e) { | ||
| console.error(`[DEBUG] Error executing entrypoint [${i}]: ${e}`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should inspect the error log for error details instead of printing the error directly onto the console
| * @param _info | ||
| */ | ||
| triggerAtFuncCallSyntax(analyzer: any, scope: any, node: any, state: any, _info: any): boolean | undefined { | ||
| const fileName = node.loc?.sourcefile |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not resolved yet
|
|
||
| if (isApp) { | ||
| // Application(...) -> first arg is routes | ||
| ;[routeListArgValue] = argvalues |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
redundant ;
| ;[routeListArgValue] = argvalues | ||
| } else if (isAddHandlers) { | ||
| // add_handlers(host, routes) -> second arg is routes | ||
| ;[, routeListArgValue] = argvalues |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
redundant ;
| * Flatten route lists (handles BinaryExpression +) | ||
| * @param node | ||
| * @param currentFile | ||
| * Extract route pairs from resolved symbol values (from argvalues) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the logic is too complex, I think several member access should be enough...
| if (receiver && (receiver.taint || receiver.hasTagRec || receiver._tags?.has('PYTHON_INPUT'))) { | ||
| this.markAsTainted(ret) | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing ONLY_CUSTOM check in triggerAtFunctionCallAfter
The triggerAtFunctionCallAfter method is missing the Config.entryPointMode === 'ONLY_CUSTOM' check that exists in both triggerAtCompileUnit (line 164) and triggerAtFunctionCallBefore (line 236). This inconsistency means that when entryPointMode is set to ONLY_CUSTOM, the Tornado-specific source API detection (like get_argument, get_cookie) and passthrough function handling will still execute, which is likely unintended behavior. The check should be added after the super call but before processing the fclos and ret values.
| const httpMethods = new Set(['get', 'post', 'put', 'delete', 'patch', 'head', 'options']) | ||
| const entrypoints = Object.entries(handlerSymVal.value) | ||
| .filter(([key, value]: [string, any]) => httpMethods.has(key) && value.vtype === 'fclos') | ||
| .map(([, value]: [string, any]) => value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential crash when handlerSymVal.value is undefined
The emitHandlerEntrypoints method calls Object.entries(handlerSymVal.value) after only checking that handlerSymVal.vtype === 'class'. When pair.handlerSymVal with vtype === 'class' is used directly (lines 315-317) without going through processHandlerClass or buildClassSymbol, the value property may be undefined, causing Object.entries(undefined) to throw a TypeError. The code ensures handlerSymVal.field exists (line 341-343) but not handlerSymVal.value.
Additional Locations (1)
| // 如果有匹配的规则,调用基类方法处理 | ||
| if (matchedRule) { | ||
| super.checkByNameMatch(node, fclos, argvalues) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Partial matching in checkByNameMatch is ineffective
The checkByNameMatch override is documented to "support partial matching (e.g., os.system matches syslib_from.os.system)" but this feature doesn't work. The method uses string-based partial matching (callFull.endsWith('.os.system')) to find a matching rule, then delegates to super.checkByNameMatch which uses AST-based matchField that requires exact path matching. When a partial match is found, super's AST matching will fail because it doesn't support partial matching - for a call like mymodule.os.system with rule fsig: "os.system", the AST matcher walks the full path and fails when encountering the extra prefix. This results in partial matches never producing findings, defeating the stated purpose of the override.
| } | ||
| } | ||
| // Include parent class name in key to distinguish handlers with same method name | ||
| const parentName = entryPoint?.entryPointSymVal?.parent?.id || entryPoint?.entryPointSymVal?.parent?.name || '' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parentName may serialize to [object Object] in key
The parentName extraction uses parent?.id || parent?.name directly, but in the AST these properties are often objects with a name sub-property (e.g., {type: 'Identifier', name: 'MyClass'}), not strings. When parent.id is an object, it's truthy so used as-is, and when concatenated into entryKey it becomes [object Object]. This causes different handler classes to produce identical keys, making the deduplication logic skip valid entrypoints. The params serialization nearby correctly uses p?.id?.name || p?.id pattern, but parentName doesn't follow this pattern.
| this.markAsTainted(ret) | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing ONLY_CUSTOM mode check in triggerAtFunctionCallAfter
The triggerAtFunctionCallAfter method lacks the if (Config.entryPointMode === 'ONLY_CUSTOM') return check that exists in triggerAtCompileUnit, triggerAtFunctionCallBefore, and triggerAtMemberAccess. This inconsistency means tornado-specific taint marking (for APIs like get_argument and passthrough functions) will still run in ONLY_CUSTOM mode, while all other tornado-specific behaviors are disabled. The PR discussion explicitly flagged this as "not resolved yet".
| const { fclos, ret } = info | ||
| if (!fclos || !ret) { | ||
| return | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing ONLY_CUSTOM mode check in triggerAtFunctionCallAfter
The triggerAtFunctionCallAfter method is missing the Config.entryPointMode === 'ONLY_CUSTOM' guard that exists in triggerAtCompileUnit, triggerAtFunctionCallBefore, and triggerAtMemberAccess. This inconsistency means when using ONLY_CUSTOM mode, the custom taint source marking logic (for tornadoSourceAPIs and passthroughFuncs) will still execute, which may cause unintended behavior. The PR reviewer noted this suggestion was "not resolved yet".
| } | ||
| } | ||
| // Include parent class name in key to distinguish handlers with same method name | ||
| const parentName = entryPoint?.entryPointSymVal?.parent?.id || entryPoint?.entryPointSymVal?.parent?.name || '' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Parent name extraction may produce [object Object] in key
The parentName extraction accesses parent.id or parent.name directly, but these are typically AST node objects (with nested .name properties), not strings. If parent.id is an object, it will be truthy and used as parentName, becoming [object Object] in the template string at line 149. This defeats the purpose of distinguishing handlers by class name in the deduplication key, potentially causing different handlers from different classes to be incorrectly deduplicated. The code pattern at line 215 shows the correct approach: ast?.id?.name. This should use parent?._sid or parent?._qid or drill down to parent?.ast?.id?.name.
| const pathNode = elements[0] | ||
| const handlerNode = elements[1] | ||
|
|
||
| const pathValue = pathNode?.type === 'Literal' ? pathNode.value : null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing StringLiteral handling in parseRouteTuple method
The parseRouteTuple method only checks for pathNode?.type === 'Literal' when extracting route paths, but the utility function parseRoutePair in tornado-util.ts handles both 'StringLiteral' and 'Literal' types (lines 125-126). Similarly, extractStringFromSymbolValue also handles both types (line 881). This inconsistency means routes inside ObjectExpression structures (which use parseRouteTuple) will fail to be extracted if the AST parser produces StringLiteral nodes instead of Literal nodes, while the same routes in ArrayExpression format would work correctly.
| if (routeListSymVal.value.length === 2) { | ||
| const [pathSymVal, handlerSymVal] = routeListSymVal.value | ||
| const pathValue = this.extractStringFromSymbolValue(pathSymVal) | ||
| const handlerName = 'Handler' // Placeholder name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hardcoded placeholder replaces actual handler name extraction
Medium Severity
When handling union types that represent a tuple, the code extracts handlerSymVal from the union but then sets handlerName = 'Handler' as a hardcoded placeholder instead of calling this.extractHandlerNameFromSymbolValue(handlerSymVal). This is inconsistent with how tuple types and object types are handled (lines 484 and 505), where the actual handler name is extracted. The hardcoded value causes incorrect route deduplication via dedupKey and may prevent proper handler identification.
| return | ||
| } | ||
| const httpMethods = new Set(['get', 'post', 'put', 'delete', 'patch', 'head', 'options']) | ||
| const entrypoints = Object.entries(handlerSymVal.value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing null check before Object.entries call
Medium Severity
The Object.entries(handlerSymVal.value) call can throw a TypeError if value is undefined. The preceding check only verifies handlerSymVal.vtype === 'class' but not the existence of handlerSymVal.value. When handlerSymVal is obtained directly from pair.handlerSymVal (line 223) without going through buildClassSymbol, the value property may not exist, causing a crash at runtime.
| const httpMethods = new Set(['get', 'post', 'put', 'delete', 'patch', 'head', 'options']) | ||
| const handlers = Object.entries(handlerSymVal.value).filter( | ||
| ([key, value]: [string, any]) => httpMethods.has(key) && value.vtype === 'fclos' | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing null check for handlerSymVal.value causes TypeError
Medium Severity
Object.entries(handlerSymVal.value) at line 473 will throw a TypeError if value is undefined. When pair.handlerSymVal is a class symbol (lines 166-168), it's assigned directly to handlerSymVal without verifying that value exists. The defensive check at lines 175-178 only initializes field, not value. If a class symbol from external sources lacks a value property, this crashes the checker.
Additional Locations (2)
| if (!expr) return false | ||
| if (expr.type === 'MemberAccess') return isRequestAttributeAccess(expr) | ||
| if (expr.type === 'CallExpression' && expr.callee?.type === 'MemberAccess') { | ||
| return isRequestAttributeAccess(expr.callee.object) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non-recursive call loses taint in chained function calls
Medium Severity
The isRequestAttributeExpression function calls isRequestAttributeAccess(expr.callee.object) instead of recursively calling itself. Since isRequestAttributeAccess only accepts MemberAccess nodes and returns false for CallExpression nodes, chained passthrough function calls like self.request.body.decode().strip() will not be recognized as tainted request data. The inner call expression self.request.body.decode() is a CallExpression, not a MemberAccess, so the check fails. This causes taint to be lost when two or more passthrough functions (decode, strip, replace, etc.) are chained, leading to false negatives in vulnerability detection.
| export function extractTornadoParams(pattern: string): { named: string[]; positionalCount: number } { | ||
| if (!pattern) return { named: [], positionalCount: 0 } | ||
| const named = Array.from(pattern.matchAll(/\(\?P<(\w+)>/g)).map((m) => m[1]) | ||
| if (named.length > 0) return { named, positionalCount: 0 } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Early return ignores positional groups when named groups exist
Low Severity
The extractTornadoParams function returns early with positionalCount: 0 when any named groups are found. This ignores positional capturing groups in URL patterns that mix both styles, like /user/(?P<user_id>\w+)/items/(\d+). The positional group (\d+) would not be counted, so the corresponding handler parameter (e.g., item_id) would not be marked as a tainted source. While mixed URL patterns are uncommon, they are valid in Tornado and this causes false negatives in taint tracking for such handlers.
| if (!h) return | ||
|
|
||
| if (h.vtype === 'union' && Array.isArray(h.value)) h = h.value[0] | ||
| if (h.vtype !== 'class' && h.ast?.type === 'ClassDefinition') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing null check after union extraction causes crash
Low Severity
In finishRoute, line 145 assigns h = h.value[0] when h is a union. If h.value is an empty array, h becomes undefined. Line 146 then accesses h.vtype without optional chaining, which throws a TypeError. The inconsistency is evident because line 154 uses h?.vtype with optional chaining, indicating awareness that h could be undefined, but line 146 lacks this protection. An empty union passed as a handler class would crash the analysis.
| if (Config.entryPointMode !== 'ONLY_CUSTOM' && isRequestAttributeAccess(node)) { | ||
| markTaintSource(info.res, { path: node, kind: 'PYTHON_INPUT' }) | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Member access hook not invoked for Python analyzer
High Severity
The triggerAtMemberAccess method is dead code for Python analysis. The base analyzer.ts calls checkAtMemberAccess in its processMemberAccess implementation (at line 1382-1384), but python-analyzer.ts completely overrides processMemberAccess without calling the checker hook. This means patterns like self.request.body, self.request.headers, and other request attribute accesses that isRequestAttributeAccess is designed to detect will never trigger taint marking. The tornado checker's intent to mark these as taint sources won't work, potentially causing security vulnerabilities to go undetected.
| if (!h) return | ||
|
|
||
| if (h.vtype === 'union' && Array.isArray(h.value)) h = h.value[0] | ||
| if (h.vtype !== 'class' && h.ast?.type === 'ClassDefinition') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing null check after union value extraction causes crash
Medium Severity
In finishRoute, after extracting the first value from a union with h = h.value[0] on line 144, the code immediately accesses h.vtype on line 145 without checking if h is still defined. If the union's value array is empty, h.value[0] returns undefined, and the subsequent h.vtype !== 'class' access throws a TypeError. The initial null check on line 142 doesn't protect against this since h is reassigned after that check.
| * @param path | ||
| */ | ||
| private registerEntryPoints(analyzer: any, cls: any, path: string) { | ||
| const methods = ['get', 'post', 'put', 'delete', 'patch'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing HTTP methods compared to Django checker
Low Severity
The methods array is missing head and options HTTP methods, which are included in the analogous Django checker (['get', 'post', 'put', 'delete', 'patch', 'head', 'options']). Tornado's RequestHandler supports these methods, and if a handler implements a custom head or options method that processes user input, it won't be registered as an entry point. This could result in missed taint flows for handlers using these less common but valid HTTP methods.
| if (!h) return | ||
|
|
||
| if (h.vtype === 'union' && Array.isArray(h.value)) h = h.value[0] | ||
| if (h.vtype !== 'class' && h.ast?.type === 'ClassDefinition') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Crash when union handler has empty value array
Medium Severity
In finishRoute, after the assignment h = h.value[0], if the union's value array is empty, h becomes undefined. The immediately following line accesses h.vtype without re-checking if h is defined, which would throw a TypeError: Cannot read property 'vtype' of undefined. The early return check if (!h) return only happens before the reassignment, not after.
| private finishRoute(analyzer: any, scope: any, state: any, h: any, path: string) { | ||
| if (!h) return | ||
|
|
||
| if (h.vtype === 'union' && Array.isArray(h.value)) h = h.value[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Union handler only processes first element, missing others
Medium Severity
When a handler is a union type (e.g., from conditional assignment like handler = A if cond else B), finishRoute only processes the first element via h = h.value[0], discarding all other potential handlers. This causes two problems: entry points from other handlers in the union are never registered, and if union element ordering varies between analysis runs, different handlers get processed, leading to non-deterministic results where different bugs are discovered on subsequent runs.
| if (Config.entryPointMode !== 'ONLY_CUSTOM' && isRequestAttributeAccess(node)) { | ||
| markTaintSource(info.res, { path: node, kind: 'PYTHON_INPUT' }) | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing null check causes crash on unresolved member access
Medium Severity
The triggerAtMemberAccess method calls markTaintSource(info.res, ...) without checking if info.res is defined. When member access resolution fails (common in static analysis for unresolved references), info.res is undefined. The markTaintSource function then calls setTaint which attempts res._tags = res._tags || new Set(), throwing a TypeError. Unlike triggerAtFunctionCallAfter which guards with if (!ret) return, this method lacks the equivalent if (!info.res) return check.
| if (Config.entryPointMode !== 'ONLY_CUSTOM' && isRequestAttributeAccess(node)) { | ||
| markTaintSource(info.res, { path: node, kind: 'PYTHON_INPUT' }) | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing null check before marking taint source
Medium Severity
The triggerAtMemberAccess method calls markTaintSource(info.res, ...) without checking if info.res is defined. Looking at markTaintSource in source-util.ts, it calls setTaint(unit, kind) which attempts to set res._tags = res._tags || new Set(). If info.res is undefined (when getMemberValue returns undefined), this will throw a TypeError. The pattern in triggerAtFunctionCallAfter correctly guards against this with if (!ret) return, but this guard is missing here.
| if (rules) { | ||
| this.processRoutes(analyzer, scope, state, rules) | ||
| return | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rule objects incorrectly processed as route containers
Medium Severity
The code treats Rule and RuleRouter objects identically when extracting routes at lines 92-98. For RuleRouter(rules), extracting val.value['0'] correctly gets the rules list. However, for Rule(pattern, handler), val.value['0'] is the URL pattern, not a list of routes. When a Rule object is processed, the pattern is incorrectly passed to processRoutes recursively, and the function returns early without ever extracting the handler from val.value['1']. This causes routes defined via Rule objects to fail registration because the handler is never processed.
| if (!h) return | ||
|
|
||
| if (h.vtype === 'union' && Array.isArray(h.value)) h = h.value[0] | ||
| if (h.vtype !== 'class' && h.ast?.type === 'ClassDefinition') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing null check after extracting from union array
Medium Severity
After line 193 extracts h.value[0] from a union with an array value, h becomes undefined if the array is empty. The subsequent line 194 immediately accesses h.vtype without re-checking if h is defined, which would throw a TypeError. The guard at line 202 (if (path && h)) comes too late - the crash occurs before reaching it. This would cause analysis to fail when encountering a union type with an empty value array.
| const names = [targetName] | ||
| if (targetName === 'Application') { | ||
| names.push('RuleRouter', 'Rule') | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isTornadoCall incorrectly matches Rule when checking for Application
Medium Severity
The isTornadoCall function adds 'Rule' to the checked names when targetName is 'Application'. This causes isTornadoCall(node, 'Application') to return true for standalone Rule(pattern, handler) calls. In triggerAtFunctionCallBefore, when isApp is true, the code sets routes = argvalues[0], expecting a list of routes. However, for Rule calls, argvalues[0] is the pattern (not a routes list) and argvalues[1] is the handler. This mismatch causes routes defined via standalone Rule calls to fail registration, potentially missing entry points for taint analysis.
Additional Locations (1)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
| private finishRoute(analyzer: any, scope: any, state: any, h: any, path: string) { | ||
| if (!h) return | ||
| if (h.vtype === 'union' && Array.isArray(h.value)) h = h.value[0] | ||
| if (h.vtype !== 'class' && h.ast?.type === 'ClassDefinition') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing null check after union value extraction
Medium Severity
In finishRoute, when h is a union type with an empty value array, the expression h = h.value[0] sets h to undefined. The next line then accesses h.vtype, which throws a TypeError. A null check is needed after extracting from the union array, or the condition needs to verify the array has elements before accessing h.value[0].
| * @param path | ||
| */ | ||
| private registerEntryPoints(analyzer: any, cls: any, path: string) { | ||
| const methods = ['get', 'post', 'put', 'delete', 'patch'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing HTTP methods in Tornado entry point detection
Medium Severity
The methods array only includes ['get', 'post', 'put', 'delete', 'patch'], but is missing head and options which are valid HTTP methods that Tornado RequestHandler supports. The Django checker correctly includes all seven methods. Handlers implementing head() or options() won't be registered as entry points, potentially causing missed taint flows and security vulnerabilities to go undetected.
Summary by Sourcery
Introduce a Python taint analysis checker for Tornado web handlers and supporting utilities to detect and register request-derived data as taint sources and HTTP handler methods as entrypoints.
New Features:
Note
Introduces Tornado support in Python taint analysis and wires it into configs.
checker/taint/python/tornado-taint-checker.tsto discover Tornado routes, register handler HTTP methods as entrypoints, and mark request-derived data (argument getters andself.request.*attributes) asPYTHON_INPUTsourceschecker/taint/python/tornado-util.tswith helpers (isTornadoCall,extractTornadoParams,isRequestAttributeAccess,tornadoSourceAPIs)checker-config.json, includes it in Python checker packs inchecker-pack-config.json, and adds it toresource/example-rule-config/rule_config_python.jsonpython-analyzer.ts) to callcheckAtMemberAccessduringprocessMemberAccess, enabling source marking on attribute accessWritten by Cursor Bugbot for commit 0d63baf. This will update automatically on new commits. Configure here.