Skip to content

Set TextFormat default recursion limit to 100#26491

Closed
gladiator9797 wants to merge 1 commit intoprotocolbuffers:mainfrom
gladiator9797:fix-textformat-default-recursion-limit
Closed

Set TextFormat default recursion limit to 100#26491
gladiator9797 wants to merge 1 commit intoprotocolbuffers:mainfrom
gladiator9797:fix-textformat-default-recursion-limit

Conversation

@gladiator9797
Copy link
Copy Markdown

Summary

The TextFormat::Parser constructor initializes recursion_limit_ to std::numeric_limits<int>::max() (2,147,483,647) at text_format.cc:1803. This is inconsistent with:

  • Binary format parser: default 100 (CodedInputStream::default_recursion_limit_)
  • JSON parser: default 64-100
  • UPB binary decoder: default 100 (kUpb_WireFormat_DefaultDepthLimit)

A textproto with ~5,000-10,000 nested messages overflows the C stack before the recursion check ever triggers, causing SIGSEGV.

Fix

Change the default from std::numeric_limits<int>::max() to 100, matching CodedInputStream::default_recursion_limit_. Users who need a different limit can still call SetRecursionLimit().

Reproducer

// Parse deeply nested textproto
std::string textproto;
for (int i = 0; i < 10000; i++) textproto += "nested_type { ";
textproto += "name: \"x\"";
for (int i = 0; i < 10000; i++) textproto += " }";

DescriptorProto msg;
TextFormat::ParseFromString(textproto, &msg);  // SIGSEGV
$ ulimit -s 8192 && ./test
EXIT=139  (SIGSEGV - stack overflow)

Test plan

  • Verified crash with default limit (exit 139)
  • SetRecursionLimit() API still works for custom limits
  • Existing protobuf test suite passes

@zhangskz zhangskz added the 🅰️ safe for tests Mark a commit as safe to run presubmits over label Mar 20, 2026
@github-actions github-actions Bot removed the 🅰️ safe for tests Mark a commit as safe to run presubmits over label Mar 20, 2026
The TextFormat::Parser constructor initializes recursion_limit_ to
std::numeric_limits<int>::max() (2,147,483,647). This is inconsistent
with the binary format parser (default 100) and the JSON parser (default
64-100). A textproto with ~5,000-10,000 nested messages overflows the C
stack before the recursion check triggers, causing SIGSEGV.

Change the default to 100, matching CodedInputStream::default_recursion_limit_.
Users who need a different limit can still call SetRecursionLimit().

Reproducer:
  Parse textproto with 10000 levels of nested_type { ... }
  -> SIGSEGV (exit 139) due to stack overflow
@gladiator9797 gladiator9797 force-pushed the fix-textformat-default-recursion-limit branch from 9cd1ebe to 8991e76 Compare March 21, 2026 04:31
@esrauchg
Copy link
Copy Markdown
Contributor

Sorry, it's not clear that if we can accept this change or not, as it would be a breaking change for a large number of people. As a breaking change, it would need to be done with one of our breaking change releases which we only do in Q1 (we already did the one for 2026), if we want to move forward with this we likely couldn't release it until Q1-2027.

Security issues take precedence over breaking change cadence, but we don't view this specific issue as a security topic.

The reason why is that TextProto is intended to be a debugging and configuration format rather than a wire format. It generally should not used in adversarial/untrusted inputs, and the lack of depth enforcement is working as intended.

The opt-in depth limit on the C++ TextFormat parser is primarily available as a convenience knob to avoid confusion/ergonomic problems when you do parseText > serializeBinary > parseBinary since the first moment of the depth limit enforcement is that third step, which can be confusing.

In general because of the second property, we actually probably will update our guidance that new TextProto parsers should enforce depth by default with an easily available disable, but likely we would view 'edge case depth bypass' issues as a vanilla minor bug rather than as a security-sensitive topic from there.

In general, because AI tools are suddenly finding a lot of depth bypass concerns we are likely to clarify and document our stance on this, so please bear with us. Thanks!

@esrauchg
Copy link
Copy Markdown
Contributor

We are going to have more followups to clarify the security posture regarding TextProto (namely: that by default it is presumed to not be adversarial input, unlike Binary and JSON formats).

Even on trusted inputs, aligned depth has good ergonomic benefits because it is quite unfortunate to parse a config and then not be able to round trip it through binary wire format. But, after evaluating the fallout of how many preexisting cases would suddenly fail with this change, we definitely cannot take this change today unfortunately, as a significant breaking change that we can't justify doing under our intended security posture.

Thanks again for the candidate change.

@esrauchg esrauchg closed this Mar 25, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants