
Commit 1d7646f

Add a documentation page for proto edition 2024 symbol visibility and lay the
ground work for future expansion of functionality. PiperOrigin-RevId: 888711536
1 parent fc0b32d commit 1d7646f

1 file changed

+303
-0
lines changed

@@ -0,0 +1,303 @@
+++
title = "Symbol Visibility"
weight = 85
linkTitle = "Symbol Visibility"
description = "Explains the terminology and functionality of visibility, introduced in edition 2024."
type = "docs"
+++

This document describes the terminology and functionality of the symbol
visibility system introduced in proto `edition = "2024"`.

## Glossary {#glossary}

* **Symbol**: Any of `message`, `enum`, `service`, or `extend <type>`, the
  allowed **Top-Level** types in a `.proto` file.
* **Top-Level**: A **Symbol** defined at the root of a `.proto` file. This
  includes all `service` definitions and any `message`, `enum`, or `extend`
  block not nested in a `message`.
* **Visibility**: Property of a **Symbol** that controls whether it can be
  imported into another `.proto` file. Either set explicitly or derived from
  file defaults.
* **Entry-Point**: A **Symbol** in a `.proto` file that acts as the root of
  a local sub-graph of **Symbol**s in that file, through which either
  generated code or other `.proto` files "enter" the file.
* **Symbol Waste**: For a given **Symbol** 'X', the waste **Symbol**s are
  those unreachable from 'X' but defined in files that are transitively
  imported to satisfy the processing of the file in which 'X' is defined.

## Introduction {#introduction}

Visibility in Edition 2024 and later provides a way for authors to use access
modifiers for `message` and `enum` declarations. The visibility of a symbol
controls the ability to reference that symbol from another `.proto` file via
`import`. When a symbol is not visible outside of a given file, references to
it from another `.proto` file fail with a compiler error.

This feature allows authors to control access to their symbols from external
users and encourages smaller files, which may lead to a smaller code footprint
when only a subset of the defined symbols is required.

Visibility applies to any `message` or `enum` referenced via:

* `message` and `extend` field definitions.
* `extend <symbol>`
* `service` method request and response types.

Symbol visibility applies **only** to the proto language, controlling the proto
compiler's ability to reference that symbol from another proto file. Visibility
must not be reflected into any language-specific generated code.

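As a rough illustration of these reference checks, the following sketch uses
two hypothetical files, `shapes.proto` and `canvas.proto`. All package,
message, and field names are assumptions made for this example, and the exact
compiler diagnostic is not shown because this page does not specify it.

```proto
// shapes.proto (hypothetical file)
edition = "2024";

package demo.shapes;

// Exported: other files may reference this message.
export message Circle {
  double radius = 1;
  // A local symbol can still be used freely inside its own file.
  Units units = 2;
}

// Local: can be referenced only from within shapes.proto.
local enum Units {
  UNITS_UNSPECIFIED = 0;
  UNITS_METERS = 1;
}

// canvas.proto (hypothetical file)
edition = "2024";

package demo.canvas;

import "shapes.proto";

message Canvas {
  // Accepted: Circle is exported from shapes.proto.
  repeated demo.shapes.Circle circles = 1;

  // Would be rejected: Units is local to shapes.proto.
  // demo.shapes.Units default_units = 2;
}

service CanvasService {
  // Request and response types are subject to the same visibility check.
  rpc AddCircle(demo.shapes.Circle) returns (Canvas);
}
```
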
## Detailed Usage {#syntax}

Visibility introduces a set of new file-level options and two new Protobuf
keywords, `local` and `export`, which can be prefixed to `message` and `enum`
types.

```proto
local message MyLocal {...}

export enum MyExported {...}
```

Each `.proto` file also has a default visibility controlled by the edition's
defaults or the file-level `option features.default_symbol_visibility`. This
visibility applies to all `message` and `enum` definitions in the file that do
not carry an explicit keyword.

**Values available:**

* `EXPORT_ALL`: This is the default prior to Edition 2024. All messages and
  enums are exported by default.
* `EXPORT_TOP_LEVEL`: All top-level symbols default to `export`; nested
  symbols default to `local`.
* `LOCAL_ALL`: All symbols default to `local`.
* `STRICT`: All symbols default to `local` and visibility keywords are only
  valid on top-level `message` and `enum` types. Nested types can no longer
  use those keywords and are always treated as `local`.

**Default behavior per syntax/edition:**

Syntax/edition | Default
-------------- | ------------------
2024           | `EXPORT_TOP_LEVEL`
2023           | `EXPORT_ALL`
proto3         | `EXPORT_ALL`
proto2         | `EXPORT_ALL`

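A file can opt out of its edition's default by setting the feature explicitly.
The sketch below is a hypothetical Edition 2024 file that keeps the pre-2024
`EXPORT_ALL` behavior, for instance during a migration. The file, package, and
symbol names are invented for illustration.

```proto
// legacy_api.proto (hypothetical file)
edition = "2024";

package demo.legacy;

// Keep the pre-2024 behavior: every message and enum, including nested
// ones, is exported unless it carries an explicit keyword.
option features.default_symbol_visibility = EXPORT_ALL;

message Order {
  // Exported through the EXPORT_ALL file default, even though it is nested.
  enum Status {
    STATUS_UNSPECIFIED = 0;
    STATUS_OPEN = 1;
  }

  Status status = 1;
}

// An explicit keyword still overrides the file default.
local message OrderInternals {
  string debug_note = 1;
}
```
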
Any top-level `message` or `enum` definition can be annotated with an explicit
`local` or `export` keyword to indicate its intended use. Symbols nested in a
`message` can also use those keywords, except in `STRICT` mode.

Example:

```proto
// foo.proto
edition = "2024";

// Symbol visibility defaults to EXPORT_TOP_LEVEL in Edition 2024. Setting
// default_symbol_visibility overrides that default for this file.
option features.default_symbol_visibility = LOCAL_ALL;

// With LOCAL_ALL, top-level symbols are local by default; applying the export
// keyword overrides this.
export message ExportedMessage {
  int32 baz = 1;
  // Nested symbols without a keyword take the file default, which is local
  // under LOCAL_ALL.
  enum LocalNestedEnum {
    UNKNOWN_LOCAL_NESTED_ENUM_VALUE = 0;
  }
}

// bar.proto
edition = "2024";

import "foo.proto";

message ImportedMessage {
  // The following is valid because the imported message explicitly overrides
  // the visibility setting in foo.proto.
  ExportedMessage bar = 1;

  // The following is not valid because default_symbol_visibility is set to
  // `LOCAL_ALL` in foo.proto, which makes the nested enum local.
  // ExportedMessage.LocalNestedEnum qux = 2;
}
```

### STRICT default_symbol_visibility {#strict}

When `default_symbol_visibility` is set to `STRICT`, more restrictive visibility
rules are applied to the file. This mode is intended to produce better-scoped
files at the cost of being more invasive: nested types cannot be used outside
their own file. In `STRICT` mode:

1. All symbols default to `local`.
2. `local` and `export` may be used only on top-level `message` and `enum`
   declarations.
3. `local` and `export` keywords on nested symbols result in a syntax error.

A single exception to the ban on nested visibility keywords is made for
wrapper `message` types used to address C++ namespace pollution. In this case
an `export enum` is supported if and only if:

1. The top-level `message` is `local`.
2. All field numbers are reserved with `reserved 1 to max;`, which prevents
   any field definitions.

Example:

```proto
local message MyNamespaceMessage {
  export enum Enum {
    // Open enums, the default in editions, need a zero first value.
    MY_UNSPECIFIED = 0;
    MY_VAL = 1;
  }

  // Ensure no fields are added to the message.
  reserved 1 to max;
}
```

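For a broader picture, the following is a minimal sketch of a complete file in
`STRICT` mode, based only on the rules listed above. The file, package, and
symbol names are hypothetical.

```proto
// invoice.proto (hypothetical file)
edition = "2024";

package demo.billing;

option features.default_symbol_visibility = STRICT;

// The only symbol that other files may reference.
export message Invoice {
  string id = 1;
  repeated LineItem items = 2;

  // Nested types cannot carry a visibility keyword in STRICT mode and are
  // always local.
  message Totals {
    int64 cents = 1;
  }

  Totals totals = 3;
}

// No keyword: local by default, so it is usable only inside this file.
message LineItem {
  string sku = 1;
  int32 quantity = 2;
}
```
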
## Purpose of Visibility {#purpose}

The visibility feature has two purposes. The first is to introduce access
modifiers, a common feature of many popular programming languages, used to
communicate and enforce the intent of authors about the usage of a given API.
Visibility gives protos a limited form of proto-specific access control, giving
proto authors additional control over which parts of their API can be reused
and defining the API's entry-points.

The second purpose is to encourage limiting the scope of a `.proto` file to a
narrow set of definitions to reduce the need to process unnecessary definitions.
The protobuf language definition requires that `Descriptor` type data be bundled
at the `FileDescriptorSet` and `FileDescriptor` level. This means that all
definitions in a single file, and its transitive dependencies via imports, are
unconditionally processed whenever that `.proto` file is processed. In large
definition sets, this can become a significant source of unused definitions
that both slow down processing and produce a large amount of unused generated
code. The best way to combat this anti-pattern is to keep `.proto` files
narrowly scoped, containing only the symbols needed for a given sub-graph
located in the same file. With visibility added to `message` and `enum` types,
we can enumerate the set of entry-points in a given file and determine where
unrelated definitions can be split into different files to provide fine-grained
dependencies.

## Entry-Points {#entry-point}

A `.proto` file entry-point is any symbol that acts as a starting point for a
sub-graph of the full transitive closure of proto symbols processed for a given
file. That sub-graph represents the full transitive closure of types required
for the definition of that entry-point symbol, which is a subset of the symbols
required to process the file in which the entry-point is defined.

There are generally three types of entry-points.

### Simple-type Entry-Point {#simple-entry-point}

`message` and `enum` types that have a visibility of `export` can both act as
entry-points in a symbol graph. In the case of `enum`, that sub-graph is the
empty set, which means any other type definitions in the same file as that
`export enum Type {}` can be considered 'waste' in reference to the symbol graph
required for importing the enum.

For `message` types, the message acts as an entry-point for the full transitive
closure of types for the `message`'s field definitions and the recursive set of
field definitions of any sub-messages referenced by the entry-point.

Importantly, for both `enum` and `message`, any `service` or `extend`
definitions in the same file are unreferenceable and can be considered 'waste'
for users of those `message` and `enum` definitions.

When a `.proto` file contains multiple `local` messages with no shared
dependencies, they can all act as independent entry-points when used from
generated code. The Protobuf compiler and static analysis system have no way to
determine the optimal entry-point layout of such a file. Without that context,
we assume that `local` messages in the same file are optimal for the intended
generated code use-cases.

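One way to picture the `enum` case is to give the enum a file of its own, so
that importing it brings in nothing else. The sketch below uses hypothetical
file, package, and symbol names.

```proto
// color.proto (hypothetical file): a single entry-point and nothing else.
edition = "2024";

package demo.color;

export enum Color {
  COLOR_UNSPECIFIED = 0;
  COLOR_RED = 1;
  COLOR_GREEN = 2;
}

// palette.proto (hypothetical file)
edition = "2024";

package demo.palette;

// Importing color.proto pulls in only the Color enum, so there is no
// symbol waste for this reference.
import "color.proto";

export message Palette {
  repeated demo.color.Color colors = 1;
}
```
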
### Service Entry-Points {#service-entry-point}

`service` definitions act as an entry-point for all symbols referenced
transitively from that `service`, which includes the full transitive closure of
`message` and `enum` types for all `rpc` method definitions.

Service definitions cannot be referenced in any meaningful way from another
proto, which means that when an imported `.proto` file contains a `service`
definition, that service and the transitive closure of its method types can be
considered waste. This is true even if all `rpc` input and output types are
referenced from the importing file.

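A possible way to avoid this kind of waste is to keep the `service` in a file
of its own and import the request and response types it needs. The sketch
below assumes hypothetical file names and a hypothetical `Greeter` API; the
two messages share a file here only to keep the example short.

```proto
// greeter_messages.proto (hypothetical file)
edition = "2024";

package demo.greeter;

export message HelloRequest {
  string name = 1;
}

export message HelloReply {
  string text = 1;
}

// greeter_service.proto (hypothetical file)
edition = "2024";

package demo.greeter;

import "greeter_messages.proto";

// Files that only need HelloRequest or HelloReply import
// greeter_messages.proto and never process this service.
service Greeter {
  rpc SayHello(HelloRequest) returns (HelloReply);
}
```
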
### Extension Entry-Points {#extend-entry-point}

`extend <symbol>` is an Extend type entry-point. Similar to `service`, there is
no way to reference an extension field from a `message` or `enum`. An extension
requires `import`-ing both the Extendee type for an `extend ExtendeeType` and
the Extender type for `ExtenderType my_ext = ...`, which requires depending on
the full transitive closure of symbols required for both `ExtendeeType` and
`ExtenderType`. The symbols for `ExtendeeType` can be considered waste to any
other entry-points in the same file as the `extend`.

Extensions exist to decouple the Extender type from the Extendee type. However,
when extensions are defined in the same file as other entry-points, this
creates a dependency inversion, forcing the other entry-points in the same file
to depend on the Extendee type.

This dependency inversion can be harmless when the Extendee type is trivial,
containing no imports or non-primitive types, but can also be a source of
tremendous waste if the Extendee has a large transitive dependency set.

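The dependency inversion can be sketched with a hypothetical file that mixes an
extension with an unrelated entry-point. The file and symbol names are
invented, and the extension number assumes that `expensive.ExtendeeType`
declares an extension range covering it.

```proto
// mixed.proto (hypothetical file showing the problem)
edition = "2024";

package demo.mixed;

import "expensive/extendee_type.proto";

// Entry-point 1: an extension of the imported type.
extend expensive.ExtendeeType {
  string note = 50001;
}

// Entry-point 2: unrelated to the extension. Anyone who imports mixed.proto
// for Settings must still process expensive/extendee_type.proto and its
// transitive imports.
export message Settings {
  bool dark_mode = 1;
}
```
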
In some generated code, nesting extensions inside of a `message` provides a
useful namespace. This can be accomplished safely with the use of visibility
keywords in one of two ways. The first is a `local` message that both holds the
extension and serves as its type:

```proto
import "expensive/extendee_type.proto";

// You can define a local message for your extension.
local message ExtenderType {
  extend expensive.ExtendeeType {
    ExtenderType ext = 12345;
  }

  string one = 1;
  int32 two = 2;
}
```

Alternatively, if you have a symbol that you want to use both as an extension
field and as a normal field in other messages, then that message should be
defined in its own file and the `extend` defined in its own isolated file. You
can still provide a friendly namespace for the extension, which is especially
useful for languages like C++, with a `local` wrapper message that has no
fields:

```proto
package my.pkg;

import "expensive/extendee_type.proto";
import "other/foo_type.proto";

// Exclusively used to bind other.FooMsg as an extension to
// expensive.ExtendeeType, giving it a useful namespaced name
// for `ext` as `my.pkg.WrapperForFooMsg.ext`.
local message WrapperForFooMsg {
  extend expensive.ExtendeeType {
    other.FooMsg ext = 45678;
  }

  // Reserve all fields to ensure nothing ever uses this except for wrapping the
  // extension.
  reserved 1 to max;
}
```

### Best-Practice: Maintain 1 Entry-Point per File

In general, to avoid proto symbol waste, strive for a 1:1 ratio between `.proto`
files and entry-points. This helps ensure that no excess proto compiler
processing or code generation is performed for any given symbol. The same idea
extends to build systems: ensure that a build step processes only a single
`.proto` file at a time. In Bazel, this corresponds to having only a single
`.proto` file per `proto_library` rule.
