Escape file_identifier in code generators to prevent injection via .bfbs #9083

Open
kagancapar wants to merge 1 commit into google:master from kagancapar:fix/escape-file-identifier-codegen
@kagancapar

Summary

The file_identifier field is embedded unescaped into generated string literals across 10 of 11 code generators. When flatc processes a crafted .bfbs schema (which has no length limit on file_ident, unlike .fbs which enforces exactly 4 bytes), an attacker-controlled file_identifier can break out of the string literal and inject arbitrary code into the generated source.

This is the same class of codegen injection that was fixed for string defaults in #8964 — but file_identifier was not covered by that patch.

Attack chain

  1. Attacker crafts a .bfbs file with a malicious file_ident field
  2. Victim runs flatc with any language flag on the malicious .bfbs
  3. Generated source contains injected code inside the identifier return value
  4. When compiled, the injected code executes
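The breakout in step 3 can be illustrated with a minimal sketch. `EmitIdentifierNaive` is a hypothetical stand-in for a generator that concatenates the identifier into a string literal without escaping; it is not the actual flatc code, and the payload is an assumed example.

```cpp
#include <string>

// Hypothetical stand-in for a code generator that embeds file_ident
// into a C++ string literal without escaping it first.
std::string EmitIdentifierNaive(const std::string& file_ident) {
  return "const char *Identifier() { return \"" + file_ident + "\"; }";
}

// Assumed example payload: closes the literal and the function, adds a new
// function, then opens a dummy function so the trailing `"; }` still parses.
const std::string kExamplePayload =
    "X\"; } void Evil() { /* attacker code */ } "
    "const char *Rest() { return \"";
```

For a benign identifier such as `MONS` the output is the expected one-line accessor. For `kExamplePayload` the output is syntactically valid C++ that now contains an extra `Evil()` function, which is the injection the attack chain describes.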

Fix

  • idl_parser.cpp: Reject .bfbs schemas where file_ident is not exactly kFileIdentifierLength (4) bytes — matching the validation already enforced for .fbs schemas
  • 10 generators (C++, Rust, Go, Java, C#, Kotlin, Kotlin KMP, PHP, TypeScript, Swift): Apply flatbuffers::EscapeString() to all file_identifier embedding sites (20+ locations), so that special characters (", \, control chars) are properly escaped before being placed inside string literals
  • Python generator: Already immune — it hex-escapes all characters
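The two layers of the fix can be sketched as follows. This is an illustration under stated assumptions, not the patch itself: `IsValidFileIdentifier` and `EscapeForStringLiteral` are hypothetical helpers, and the escaper is a simplified stand-in for `flatbuffers::EscapeString`, whose actual signature and output format differ. How the real parser treats an absent identifier is not stated in this PR, so the sketch only models the "exactly 4 bytes" rule.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Mirrors the constant named in the PR description (4 bytes).
constexpr std::size_t kFileIdentifierLength = 4;

// Hypothetical check: accept only identifiers of exactly 4 bytes.
bool IsValidFileIdentifier(const std::string& ident) {
  return ident.size() == kFileIdentifierLength;
}

// Simplified stand-in for an escaper: makes a byte string safe to place
// between double quotes in a C-like string literal. Bytes >= 0x80 are
// passed through unchanged in this sketch.
std::string EscapeForStringLiteral(const std::string& in) {
  std::string out;
  for (unsigned char c : in) {
    switch (c) {
      case '"':  out += "\\\""; break;
      case '\\': out += "\\\\"; break;
      case '\n': out += "\\n";  break;
      case '\r': out += "\\r";  break;
      case '\t': out += "\\t";  break;
      default:
        if (c < 0x20 || c == 0x7f) {
          // Three-digit octal escapes cannot absorb a following character.
          char buf[5];
          std::snprintf(buf, sizeof(buf), "\\%03o", static_cast<unsigned>(c));
          out += buf;
        } else {
          out += static_cast<char>(c);
        }
    }
  }
  return out;
}
```

The length check alone shrinks the attack surface to 4 bytes, and the escaping ensures that even a 4-byte identifier containing `"` or `\` stays inside the literal. Octal escapes are used here only because they are unambiguous in C and C++; other target languages may need a different escape syntax.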

This is consistent with the approach used for string defaults in #8964.

Affected generators & sites patched

| Generator | Sites | Key pattern |
|---|---|---|
| idl_gen_cpp.cpp | 1 | return "IDENT" |
| idl_gen_rust.cpp | 1 | = "IDENT" |
| idl_gen_go.cpp | 1 | const Identifier = "IDENT" |
| idl_gen_java.cpp | 2 | __has_identifier, finish |
| idl_gen_csharp.cpp | 3 | __has_identifier, VerifyBuffer, Finish |
| idl_gen_kotlin.cpp | 3 | __has_identifier, finish, finishSizePrefixed |
| idl_gen_kotlin_kmp.cpp | 3 | hasIdentifier, finish, finishSizePrefixed |
| idl_gen_php.cpp | 2 | return "IDENT", finish |
| idl_gen_ts.cpp | 2 | finish, __has_identifier |
| idl_gen_swift.cpp | 1 | static var id template |

Related PRs

The `file_identifier` field was embedded raw into generated string
literals across 10 of 11 code generators. A crafted .bfbs schema with a
malicious file_ident value could inject arbitrary code into generated
source files (C++, Rust, Go, Java, C#, Kotlin, Kotlin KMP, PHP,
TypeScript, Swift).

This commit applies `flatbuffers::EscapeString` to all file_identifier
embedding sites (20+ locations across 10 generators), consistent with
the approach used for string defaults in PR google#8964.

Additionally, BFBS deserialization in idl_parser.cpp now rejects
file_identifier values that are not exactly 4 bytes, matching the
validation already enforced for .fbs schemas.

Python generator was already immune (hex-escapes all characters).
@google-cla

google-cla Bot commented May 8, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@kagancapar
Author

This is a variant of the codegen injection class fixed in #8964 (string defaults). That PR applied EscapeString to default string values but did not cover file_identifier, which follows the same raw embedding pattern across the affected generators.

Verified locally: built flatc with the patch, confirmed that a crafted .bfbs with an oversized file_ident payload is now rejected, and that a .fbs with special characters in a 4-byte file_identifier produces properly escaped output.
