Commit e114777

docs: add type inference blog post and news article
Add technical blog post explaining TypeScript-style type inference implementation for T-Ruby, along with a news announcement. Includes translations for Korean (ko) and Japanese (ja).
1 parent e31a0a4 commit e114777

6 files changed

+1287
-0
lines changed

Lines changed: 312 additions & 0 deletions
@@ -0,0 +1,312 @@
---
slug: typescript-style-type-inference
title: "Building TypeScript-Style Type Inference for T-Ruby"
authors: [yhk1038]
tags: [technical, type-inference, compiler]
---
7+
8+
How we implemented TypeScript-inspired static type inference for T-Ruby, enabling automatic type detection without explicit annotations.
9+
10+
<!-- truncate -->
11+
12+
## The Problem
13+
14+
When writing T-Ruby code, developers had to explicitly annotate every return type:
15+
16+
```ruby
def greet(name: String): String
  "Hello, #{name}!"
end
```
21+
22+
Without the `: String` return type, the generated RBS would show `untyped`:
23+
24+
```rbs
def greet: (name: String) -> untyped
```
27+
28+
This was frustrating. The return type is obviously `String` - why can't T-Ruby figure it out?
29+
30+
## Inspiration: TypeScript's Approach
31+
32+
TypeScript handles this elegantly. You can write:
33+
34+
```typescript
function greet(name: string) {
  return `Hello, ${name}!`;
}
```
39+
40+
And TypeScript infers the return type as `string`. We wanted the same experience for T-Ruby.
41+
42+
### How TypeScript Does It
43+
44+
TypeScript's type inference is built on two key components:
45+
46+
1. **Binder**: Walks the parsed AST to create symbols and build the Control Flow Graph (CFG) of flow nodes
2. **Checker**: Lazily evaluates types when needed, using flow analysis
48+
49+
The magic happens in `getFlowTypeOfReference` - a 1200+ line function that determines a symbol's type at any point in the code by walking backwards through flow nodes.
50+
51+
### Our Simplified Approach
52+
53+
Ruby's control flow is simpler than JavaScript's. We don't need the full complexity of TypeScript's flow graph. Instead, we implemented:
54+
55+
- **Linear data flow analysis** - Ruby's straightforward execution model
- **Separation of concerns** - IR Builder (Binder role) + ASTTypeInferrer (Checker role)
- **Lazy evaluation** - Types computed only when generating RBS
58+
59+
## Architecture
60+
61+
```
[Binder Stage - IR Builder]
Source (.trb) → Parser → IR Tree (with method bodies)

[Checker Stage - Type Inferrer]
IR Node traversal → Type determination → Caching

[Output Stage]
Inferred types → RBS generation
```
71+
72+
### The Core Components
73+
74+
#### 1. BodyParser - Parsing Method Bodies
75+
76+
The first challenge was that our parser didn't analyze method bodies - it only extracted signatures. We built `BodyParser` to convert T-Ruby method bodies into IR nodes:
77+
78+
```ruby
class BodyParser
  def parse(lines, start_line, end_line)
    statements = []
    # Parse each line into IR nodes
    # Handle: literals, variables, operators, method calls, conditionals
    IR::Block.new(statements: statements)
  end
end
```
88+
89+
Supported constructs:
- Literals: `"hello"`, `42`, `true`, `:symbol`
- Variables: `name`, `@instance_var`, `@@class_var`
- Operators: `a + b`, `x == y`, `!flag`
- Method calls: `str.upcase`, `array.map { |x| x * 2 }`
- Conditionals: `if`/`unless`/`elsif`/`else`
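
As a rough illustration of the literal case only, a parser could classify a source token with a few regular expressions. The helper name `literal_type_of` and the patterns below are assumptions made for this sketch, not T-Ruby's actual `BodyParser` code.

```ruby
# Hypothetical helper: classify a source token as a literal kind.
# Returns a Symbol such as :string or :integer, or nil for non-literals.
def literal_type_of(source)
  case source.strip
  when /\A"(?:[^"\\]|\\.)*"\z/, /\A'(?:[^'\\]|\\.)*'\z/ then :string
  when /\A-?\d+\.\d+\z/ then :float
  when /\A-?\d[\d_]*\z/ then :integer
  when "true", "false" then :boolean
  when "nil" then :nil
  when /\A:[A-Za-z_]\w*[?!]?\z/ then :symbol
  end
end

p literal_type_of('"hello"')
p literal_type_of("42")
p literal_type_of("name") # not a literal, so nil
```

Anything that is not recognized as a literal falls through to `nil`, which a real parser would then try as a variable reference, operator, or method call.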
95+
96+
#### 2. TypeEnv - Scope Chain Management
97+
98+
```ruby
class TypeEnv
  def initialize(parent = nil)
    @parent = parent
    @bindings = {}      # Local variables
    @instance_vars = {} # Instance variables
  end

  def lookup(name)
    @bindings[name] || @instance_vars[name] || @parent&.lookup(name)
  end

  def child_scope
    TypeEnv.new(self)
  end
end
```
115+
116+
This enables proper scoping: each method is analyzed in its own child scope, so its local variables don't leak into other methods, while instance variables recorded in the shared parent scope stay visible across the class.
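
To make the scoping behavior concrete, here is a small self-contained sketch. The `define` and `define_instance_var` methods are assumed helper names added for illustration; the snippet above does not show how bindings are actually registered.

```ruby
# Sketch only: `define` and `define_instance_var` are hypothetical helpers.
class TypeEnv
  def initialize(parent = nil)
    @parent = parent
    @bindings = {}      # local variables in this scope
    @instance_vars = {} # instance variables registered in this scope
  end

  def define(name, type)
    @bindings[name] = type
  end

  def define_instance_var(name, type)
    @instance_vars[name] = type
  end

  def lookup(name)
    @bindings[name] || @instance_vars[name] || @parent&.lookup(name)
  end

  def child_scope
    TypeEnv.new(self)
  end
end

class_env = TypeEnv.new
class_env.define_instance_var("@name", "String")

greet_env = class_env.child_scope # scope for one method
greet_env.define("greeting", "String")
shout_env = class_env.child_scope # scope for a sibling method

p greet_env.lookup("greeting") # visible inside its own method
p shout_env.lookup("greeting") # nil: locals do not leak between methods
p shout_env.lookup("@name")    # found through the shared parent scope
```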
117+
118+
#### 3. ASTTypeInferrer - The Type Inference Engine
119+
120+
The heart of the system:
121+
122+
```ruby
class ASTTypeInferrer
  LITERAL_TYPE_MAP = {
    string: "String",
    integer: "Integer",
    float: "Float",
    boolean: "bool",
    symbol: "Symbol",
    nil: "nil"
  }.freeze

  def initialize
    # Maps node.object_id => inferred type; must exist before the first lookup
    @type_cache = {}
  end

  def infer_expression(node, env)
    # Check cache first (lazy evaluation)
    return @type_cache[node.object_id] if @type_cache[node.object_id]

    type = case node
           when IR::Literal
             LITERAL_TYPE_MAP[node.literal_type]
           when IR::VariableRef
             env.lookup(node.name)
           when IR::BinaryOp
             infer_binary_op(node, env)
           when IR::MethodCall
             infer_method_call(node, env)
           # ... more cases
           end

    @type_cache[node.object_id] = type
  end
end
```
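
The body of `infer_binary_op` is not shown here. As a rough sketch of the kind of rule it could apply, the function below works on already-inferred operand type names. The name `infer_binary_op_type`, the operator lists, and the specific rules are all assumptions for illustration, not the actual T-Ruby logic.

```ruby
# Hypothetical rule table for binary operators (illustrative assumptions).
COMPARISON_OPS = %w[== != < > <= >=].freeze
ARITHMETIC_OPS = %w[+ - * /].freeze

def infer_binary_op_type(operator, left_type, right_type)
  # Comparisons are assumed to yield a boolean regardless of operand types.
  return "bool" if COMPARISON_OPS.include?(operator)
  return "untyped" unless ARITHMETIC_OPS.include?(operator)

  operands = [left_type, right_type]
  if operands.all? { |t| t == "Integer" }
    "Integer" # Integer arithmetic stays Integer, including division
  elsif operands.all? { |t| %w[Integer Float].include?(t) }
    "Float"   # any Float operand widens the result
  elsif operator == "+" && operands.all? { |t| t == "String" }
    "String"  # string concatenation
  else
    "untyped" # unknown combination: give up rather than guess
  end
end

p infer_binary_op_type("+", "Integer", "Float")
p infer_binary_op_type("==", "String", "Integer")
```

Falling back to `untyped` for unknown combinations keeps the generated RBS conservative instead of wrong.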
153+
154+
### Handling Ruby's Implicit Returns
155+
156+
Ruby's last expression is the implicit return value. This is crucial for type inference:
157+
158+
```ruby
def status
  if active?
    "running"
  else
    "stopped"
  end
end
# Implicit return: String (from both branches)
```
168+
169+
We handle this by:
1. Collecting all explicit `return` types
2. Finding the last expression (implicit return)
3. Unifying all return types
173+
174+
```ruby
def infer_method_return_type(method_node, env)
  # Collect explicit returns
  return_types, terminated = collect_return_types(method_node.body, env)

  # Add implicit return (unless method always returns explicitly)
  unless terminated
    implicit_return = infer_implicit_return(method_node.body, env)
    return_types << implicit_return if implicit_return
  end

  unify_types(return_types)
end
```
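
The final `unify_types` step is not shown above. One plausible shape for it, assuming types are plain RBS type-name strings, is sketched below. The deduplication, the `T?` shorthand for `T | nil`, and the `untyped` fallback for an empty list are assumptions of this sketch rather than confirmed behavior.

```ruby
# Hypothetical unification over simple type-name strings.
def unify_types(types)
  unique = types.compact.uniq
  return "untyped" if unique.empty?          # nothing known about the result
  return unique.first if unique.size == 1    # every path agrees

  # RBS writes `T | nil` as the optional type `T?`.
  if unique.size == 2 && unique.include?("nil")
    other = (unique - ["nil"]).first
    return "#{other}?"
  end

  unique.join(" | ")                         # general case: a union type
end

p unify_types(%w[String String])
p unify_types(%w[String Integer])
p unify_types(%w[String nil])
```

With this rule, the `status` example above unifies two `String` branches into a plain `String`.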
188+
189+
### Special Case: `initialize` Method
190+
191+
Ruby's `initialize` is a constructor. Its return value is ignored - `Class.new` returns the instance. Following RBS conventions, we always infer `void`:
192+
193+
```ruby
class User
  def initialize(name: String)
    @name = name
  end
end
```
200+
201+
Generates:
202+
203+
```rbs
class User
  def initialize: (name: String) -> void
end
```
208+
209+
### Built-in Method Type Knowledge
210+
211+
We maintain a table of common Ruby method return types:
212+
213+
```ruby
BUILTIN_METHOD_TYPES = {
  %w[String upcase] => "String",
  %w[String downcase] => "String",
  %w[String length] => "Integer",
  %w[String to_i] => "Integer",
  %w[Array first] => "untyped", # Element type
  %w[Array length] => "Integer",
  %w[Integer to_s] => "String",
  # ... 200+ methods
}.freeze
```
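
A table keyed by `[receiver_type, method_name]` pairs can be queried directly, because Ruby arrays hash by value. The sketch below uses a trimmed copy of the table; the helper name `builtin_return_type` and the `untyped` fallback are assumptions for illustration.

```ruby
# Trimmed copy of the table, for a self-contained example.
BUILTIN_METHOD_TYPES = {
  %w[String upcase] => "String",
  %w[String length] => "Integer",
  %w[Integer to_s] => "String"
}.freeze

# Hypothetical lookup helper: unknown receiver/method pairs yield "untyped".
def builtin_return_type(receiver_type, method_name)
  BUILTIN_METHOD_TYPES.fetch([receiver_type, method_name.to_s], "untyped")
end

p builtin_return_type("String", :upcase)
p builtin_return_type("String", "reverse") # not in the trimmed table
```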
225+
226+
## Results
227+
228+
Now this T-Ruby code:
229+
230+
```ruby
class Greeter
  def initialize(name: String)
    @name = name
  end

  def greet
    "Hello, #{@name}!"
  end

  def shout
    @name.upcase
  end
end
```
245+
246+
Automatically generates correct RBS:
247+
248+
```rbs
class Greeter
  @name: String

  def initialize: (name: String) -> void
  def greet: () -> String
  def shout: () -> String
end
```
257+
258+
No explicit return types needed!
259+
260+
## Testing
261+
262+
We built comprehensive tests:
263+
264+
- **Unit tests**: Literal inference, operator types, method call types
- **E2E tests**: Full compilation with RBS validation
266+
267+
```ruby
it "infers String from string literal" do
  create_trb_file("src/test.trb", <<~TRB)
    class Test
      def message
        "hello world"
      end
    end
  TRB

  rbs_content = compile_and_get_rbs("src/test.trb")
  expect(rbs_content).to include("def message: () -> String")
end
```
281+
282+
## Challenges & Solutions
283+
284+
| Challenge | Solution |
|-----------|----------|
| Method bodies not parsed | Built custom BodyParser for T-Ruby syntax |
| Implicit returns | Analyze last expression in blocks |
| Recursive methods | 2-pass analysis (signatures first, then bodies) |
| Complex expressions | Gradual expansion: literals → variables → operators → method calls |
| Union types | Collect all return paths and unify |
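
The two-pass idea for recursive methods can be sketched on a toy model. Everything here is a simplifying assumption: methods are plain hashes, `:returns` lists either a type name or a `[:call, name]` pair, and `infer_all` is a hypothetical name. The real analysis works on IR nodes.

```ruby
# Toy two-pass inference over hypothetical method descriptions.
def infer_all(method_defs)
  # Pass 1: register every signature, using the declared type or "untyped".
  signatures = method_defs.to_h { |m| [m[:name], m[:declared] || "untyped"] }

  # Pass 2: infer bodies. Calls are resolved from the signature table, so a
  # recursive call never re-enters inference for the method being analyzed.
  method_defs.each do |m|
    next if m[:declared]

    types = m[:returns].filter_map do |expr|
      next expr unless expr.is_a?(Array) # a literal type name
      callee = expr[1]
      next if callee == m[:name]         # self-recursion adds no new type
      signatures.fetch(callee, "untyped")
    end
    signatures[m[:name]] = types.empty? ? "untyped" : types.uniq.join(" | ")
  end

  signatures
end

result = infer_all([
  { name: :countdown, returns: ["String", [:call, :countdown]] },
  { name: :label, declared: "String", returns: [] },
  { name: :show, returns: [[:call, :label]] }
])
p result
```

In this simplified version a call to a method defined later in the list still resolves to `untyped`, since that method's body has not been analyzed yet when the caller is processed.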
291+
292+
## Future Work
293+
294+
- **Generic inference**: `[1, 2, 3]` → `Array[Integer]`
- **Block/lambda types**: Infer block parameter and return types
- **Type narrowing**: Smarter types after `if x.is_a?(String)`
- **Cross-method inference**: Use inferred types from other methods
298+
299+
## Conclusion
300+
301+
By studying TypeScript's approach and adapting it for Ruby's simpler semantics, we built a practical type inference system. The key insights:
302+
303+
1. **Parse method bodies** - You can't infer types without seeing the code
2. **Lazy evaluation with caching** - Don't compute until needed
3. **Handle Ruby idioms** - Implicit returns, `initialize`, etc.
4. **Start simple** - Literals first, then build up complexity
307+
308+
Type inference makes T-Ruby feel more natural. Write Ruby code, get type safety - no annotations required.
309+
310+
---
311+
312+
*The type inference system is available in T-Ruby. Try it out and let us know what you think!*
