Commit be7b2eb

committed
Add documentation for nested subfunctions feature
- Provide implementation details and use cases for Oracle-compatible nested subfunctions
1 parent 6347a79 commit be7b2eb

6 files changed, +397 -1 lines changed

CN/modules/ROOT/nav.adoc

Lines changed: 2 additions & 0 deletions
@@ -26,6 +26,7 @@
 **** xref:master/6.3.2.adoc[OUT Parameters]
 **** xref:master/6.3.4.adoc[%TYPE、%ROWTYPE]
 **** xref:master/6.3.5.adoc[NLS Parameters]
+**** xref:master/6.3.7.adoc[Nested Subfunctions]
 *** xref:master/6.4.adoc[National Standard GB18030]
 ** List of Oracle-Compatible Features
 *** xref:master/7.1.adoc[1、Framework Design]
@@ -45,6 +46,7 @@
 *** xref:master/7.15.adoc[15、OUT Parameters]
 *** xref:master/7.16.adoc[16、%TYPE、%ROWTYPE]
 *** xref:master/7.17.adoc[17、NLS Parameters]
+*** xref:master/7.19.adoc[19、Nested Subfunctions]
 ** IvorySQL Contribution Guide
 *** xref:master/8.1.adoc[Community Contribution Guide]
 *** xref:master/8.2.adoc[AsciiDoc Syntax Quick Reference]
Lines changed: 113 additions & 0 deletions
@@ -0,0 +1,113 @@
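:sectnums:
:sectnumlevels: 5

:imagesdir: ./_images

= Nested Subfunctions

== Purpose

- Nested subfunction: a function or stored procedure defined inside a function, stored procedure, or anonymous block; also called a subproc or inner function.
- Parent function: the outer function, stored procedure, or anonymous block that hosts the nested function and actually triggers the subfunction call during execution.
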
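== Implementation Notes

=== 1. Syntax Recognition for Nested Subfunctions

==== 1. Recognizing the Nested Form

When a `function ... is/as begin ... end` construct appears in a `DECLARE` block, `pl_gram.y` calls `plisql_build_subproc_function()`. For a regular function, the corresponding stage is registering the function in the `pg_proc` catalog:

. Create a `PLiSQL_subproc_function` structure in the `subprocfuncs[]` array of the `PLiSQL_function`, recording the name, arguments, return type, and attributes, and obtain an index `fno` that identifies the subfunction.
. Call `plisql_check_subprocfunc_properties()` to validate that the combination of declaration and definition is legal.

==== 2. Storing Data Items

Data items are stored in the parent function's datum table: at compile time, `PLiSQL_function->datums` describes the variables and record fields used in the subfunction, while `PLiSQL_execstate->datums` holds the variables during execution.

==== 3. Saving Polymorphic Function Templates

If the subfunction uses polymorphic parameters, its source is saved to `subprocfunc->src` during the syntax phase and `has_poly_argument` is set to `true`; at execution time it is recompiled for each distinct actual argument type.
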
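=== 2. Recompiling the Parent Function

. The parent's `PLiSQL_function` structure gains a `subprocfuncs` array, each element of which is a `PLiSQL_subproc_function` created in the previous step.
. The `PLiSQL_subproc_function` structure has a hash table pointer `HTAB *poly_tab`, which is empty by default. When the subfunction uses polymorphic arguments, `has_poly_argument` is `true` and `poly_tab` is initialized on the first compilation. The key of `poly_tab` is a `PLiSQL_func_hashkey`, which records the subfunction's `fno`, input argument types, and so on; the value is the compiled `PLiSQL_function *` (the execution context of the `plisql` function).

=== 3. Resolution at Call Time

. During compilation, the PostgreSQL parser builds a `ParseState` structure. `plisql_subprocfunc_ref()` finds the parent's `PLiSQL_function` through `ParseState->p_subprocfunc_hook()`, calls `plisql_ns_lookup()` to find the `fno` of every subfunction with the same name, and selects the best-matching overload by argument count and types.
. When the `FuncExpr` structure is built, the subfunction call is tagged so that the execution phase can recognize it: `function_from = FUNC_FROM_SUBPROCFUNC`, `parent_func` points to the parent `PLiSQL_function`, and `funcid = fno`.
. In `plisql_call_handler()`, when `function_from == FUNC_FROM_SUBPROCFUNC`, the handler uses `parent_func` and `fno` to find the corresponding `PLiSQL_subproc_function`:
.. If it is not polymorphic, the action tree in `subprocfunc->function` is reused directly.
.. If it is polymorphic, `poly_tab` is checked for a compiled result first; if there is none, `plisql_dynamic_compile_subproc()` compiles one and stores it in the `poly_tab` cache.
. Before the subfunction starts executing, `plisql_init_subprocfunc_globalvar()` forks the relevant variables from the parent's datum table, so the subfunction can read the parent's variable values without polluting the parent's variable space. After execution, `plisql_assign_out_subprocfunc_globalvar()` updates the variables that need to be written back in the parent's datum table.
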
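== Module Design

=== PL/iSQL Grammar Extensions

- `pl_gram.y` adds productions for subprocedure declarations and nested definitions, and records metadata such as `lastoutvardno` and subprocedure information while they are created.
- Subfunctions can reference parent variables, other subprocedures, and user-defined types.

When a `function ... is/as begin ... end` construct appears inside a `DECLARE` block, `pl_gram.y` calls `plisql_build_subproc_function()` to compile it:

. Create a `PLiSQL_subproc_function` structure in `subprocfuncs[]` of the `PLiSQL_function`, recording the name, arguments, return type, and attributes, and assign an index `fno` as the subfunction identifier.
. Call `plisql_check_subprocfunc_properties()` to check that the combination of declaration and definition attributes is legal, preventing semantic errors caused by duplicate or missing declarations.
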
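=== Data Item Storage

The parent function's datum table caches the variables accessible to the subfunction, separately for compile time and execution time:

. `PLiSQL_function->datums` stores the variable and record-field information visible to the subfunction at compile time.
. `PLiSQL_execstate->datums` holds the live variable values during execution, enabling runtime access.

=== Polymorphic Function Templates

If the subfunction has polymorphic parameters, the syntax phase will:

. Copy the subfunction's source text to `subprocfunc->src`.
. Set `has_poly_argument = true`, preparing for later dynamic compilation by actual argument type.

=== Parent Function Recompilation

- The parent's `PLiSQL_function` structure gains a `subprocfuncs` array; each element corresponds to one `PLiSQL_subproc_function`.
- `PLiSQL_subproc_function` holds an `HTAB *poly_tab` pointer. When `has_poly_argument` is `true`, this cache is initialized on the first compilation; the key is a `PLiSQL_func_hashkey` (subfunction `fno` plus actual argument types) and the value is the compiled `PLiSQL_function`.
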
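=== Parser Hook

During compilation the PostgreSQL parser constructs a `ParseState`. `plisql_subprocfunc_ref()` reaches the parent function through `ParseState->p_subprocfunc_hook()`, calls `plisql_ns_lookup()` to find every `fno` of the same-named subfunctions, and picks the best candidate by argument count and types, which implements overload dispatch.

=== FuncExpr Tagging

When a `FuncExpr` is constructed, nested-call information is attached so that the execution phase can recognize it:

- `function_from = FUNC_FROM_SUBPROCFUNC`.
- `parent_func` points to the parent `PLiSQL_function`.
- `funcid = fno`, used to locate the subfunction definition quickly.

=== Nested Function Lookup

- `plisql_subprocfunc_ref()` is the implementation entry point for `ParseState->p_subprocfunc_hook` and reuses the namespace lookup logic.
- `plisql_get_subprocfunc_detail()` picks the best candidate according to argument count, argument types, and named-argument matching rules; it is the key to nested function overloading.
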
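=== Execution Path

. After checking `function_from`, `plisql_call_handler()` finds the target `PLiSQL_subproc_function` through `parent_func` and `fno`.
. For a non-polymorphic subfunction, the cached `subprocfunc->function` is reused directly.
. For a polymorphic subfunction, `poly_tab` is queried first; on a miss, `plisql_dynamic_compile_subproc()` compiles the subfunction dynamically and writes the result to the cache.

=== Variable Synchronization

- `plisql_init_subprocfunc_globalvar()` copies the relevant variables from the parent's datum table before the subfunction runs, so the subfunction reads the latest outer state.
- `plisql_assign_out_subprocfunc_globalvar()` writes back OUT/INOUT variables before returning, keeping parent and subfunction data consistent while neither pollutes the other's variable space.

=== Statement Sending in psql

- `psqlscan.l` adjusts the push/pop logic for `proc_func_define_level` and `begin_depth` so that the whole nested function body is sent to the SQL side together.
- A statement is sent only when the nesting level has returned to 0 and a semicolon is encountered, so subfunction blocks are not split.

=== Retrieving Return Values in the SQL Layer

- Regular functions access `pg_proc` through `funcid`; nested functions rely on the `PLiSQL_function` carried by `FuncExpr.parent_func`.
- For this purpose, a set of function pointers (registered with `plisql_register_internal_func()`) is provided for the SQL layer to call back, fetching the nested function's name, return type, and OUT parameter information on demand.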
Lines changed: 83 additions & 0 deletions
@@ -0,0 +1,83 @@
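:sectnums:
:sectnumlevels: 5

:imagesdir: ./_images

= Oracle-Compatible Nested Subfunctions

== 1. Purpose

- Use Oracle-style nested subfunctions in IvorySQL.

== 2. Feature Description

- Subfunctions and subprocedures can be declared and called inside anonymous blocks, functions, or procedures; their scope is limited to the parent block.
- A subfunction can read and update the parent's variables and can define its own local variables; the parent cannot directly access the subfunction's internal state.
- Overload resolution is supported: same-named subprograms are distinguished by argument count, argument types, or parameter names.

== 3. Test Cases
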
[source,sql]
----
-- Nested function declared and called inside an anonymous block
DO $$
DECLARE
    v_result integer;
    FUNCTION inner_square(p_value integer) RETURN integer IS
    BEGIN
        RAISE NOTICE 'inner_square called';
        RETURN p_value * p_value;
    END;
BEGIN
    v_result := inner_square(10);
    RAISE NOTICE 'result=%', v_result;
END;
$$ LANGUAGE plisql;
----

[source,sql]
----
-- Nested function reading and updating variables of the parent block
DO $$
DECLARE
    v_base_multiplier integer := 20;
    v_audit_counter integer := 0;
    v_result integer;
    FUNCTION inner_square(p_value integer) RETURN integer IS
    BEGIN
        RAISE NOTICE 'inner_square called';
        v_audit_counter := v_audit_counter + 1;
        RETURN v_base_multiplier * p_value;
    END;
BEGIN
    v_result := inner_square(10);
    RAISE NOTICE 'result=%', v_result;
    RAISE NOTICE 'v_audit_counter=%', v_audit_counter;
END;
$$ LANGUAGE plisql;
----

[source,sql]
----
-- Polymorphic nested function specializing on argument type
DO $$
DECLARE
    v_last_notice text := 'none';

    FUNCTION describe_value(p_input anyelement) RETURN text IS
    BEGIN
        v_last_notice := format('polymorphic dispatch with %s', pg_typeof(p_input));
        RETURN v_last_notice;
    END;

    FUNCTION describe_value(p_input anyarray, p_element anyelement) RETURN text IS
    BEGIN
        v_last_notice := format('array dispatch with %s', pg_typeof(p_input)::text);
        RETURN v_last_notice;
    END;
BEGIN
    RAISE NOTICE '%', describe_value(100);
    RAISE NOTICE '%', describe_value('IvorySQL'::text); -- explicit cast avoids ambiguous literal
    RAISE NOTICE '%', describe_value(ARRAY[1,2,3], NULL::int); -- extra arg guides anyarray resolution
    RAISE NOTICE 'last notice=%', v_last_notice;
END;
$$ LANGUAGE plisql;
----

EN/modules/ROOT/nav.adoc

Lines changed: 3 additions & 1 deletion
@@ -11,7 +11,7 @@
 ** xref:master/4.3.adoc[Developer]
 ** xref:master/4.4.adoc[Operation Management]
 ** xref:master/4.5.adoc[Migration]
-* IvorySQL Ecosystem
+* IvorySQL Ecosystem
 ** xref:master/5.1.adoc[PostGIS]
 ** xref:master/5.2.adoc[pgvector]
 * IvorySQL Architecture Design
@@ -25,6 +25,7 @@
 *** xref:master/6.3.2.adoc[OUT Parameter]
 *** xref:master/6.3.4.adoc[%Type & %Rowtype]
 *** xref:master/6.3.5.adoc[NLS Parameters]
+*** xref:master/6.3.7.adoc[Nested Subfunctions]
 ** xref:master/6.4.adoc[GB18030 Character Set]
 * List of Oracle compatible features
 ** xref:master/7.1.adoc[1、Ivorysql frame design]
@@ -44,6 +45,7 @@
 ** xref:master/7.15.adoc[15、OUT Parameter]
 ** xref:master/7.16.adoc[16、%Type & %Rowtype]
 ** xref:master/7.17.adoc[17、NLS Parameters]
+** xref:master/7.19.adoc[19、Nested Subfunctions]
 * xref:master/8.adoc[Community contribution]
 * xref:master/9.adoc[Tool Reference]
 * xref:master/10.adoc[FAQ]
Lines changed: 113 additions & 0 deletions
@@ -0,0 +1,113 @@
:sectnums:
:sectnumlevels: 5

:imagesdir: ./_images

= Nested Subfunctions

== Objective

- Nested subfunctions are functions or procedures defined inside another function, stored procedure, or anonymous block; they are also called subprocs or inner functions.
- Parent functions are the outer functions, stored procedures, or anonymous blocks that host nested subfunctions and invoke them during execution.

== Implementation Notes

=== 1. Syntax Recognition for Nested Subfunctions

==== 1. Detecting Nested Definitions

When a `DECLARE` block contains a `function ... is/as begin ... end` construct, `pl_gram.y` calls `plisql_build_subproc_function()`. This step plays the role that registering an entry in the `pg_proc` catalog plays for a regular function:

. Create a `PLiSQL_subproc_function` entry in the parent `PLiSQL_function`'s `subprocfuncs[]` array to store the name, arguments, return type, and other attributes, and record the index `fno` as the identifier of this subfunction.
. Call `plisql_check_subprocfunc_properties()` to validate the combination of declaration and definition attributes.

==== 2. Storing Datum Entries

Nested subfunctions share the parent's datum table. At compile time, `PLiSQL_function->datums` describes the variables and record fields used inside the subfunction, while `PLiSQL_execstate->datums` keeps the runtime values.

==== 3. Preserving Polymorphic Templates

If the subfunction uses polymorphic parameters, the parser stores its source code in `subprocfunc->src` and sets `has_poly_argument` to `true` so that the subfunction can be recompiled at execution time for each distinct set of actual argument types.

=== 2. Recompiling the Parent Program

. The parent `PLiSQL_function` gains a `subprocfuncs` array, each element being a `PLiSQL_subproc_function` created earlier.
. Each `PLiSQL_subproc_function` has an `HTAB *poly_tab` pointer that is empty by default and is initialized on the first compilation when `has_poly_argument` is `true`. The hash key is a `PLiSQL_func_hashkey`, which records the subfunction's `fno` and input argument types; the value is the compiled `PLiSQL_function *` execution context.

=== 3. Name Resolution During Invocation

. PostgreSQL builds a `ParseState` structure during compilation. `plisql_subprocfunc_ref()` locates the parent `PLiSQL_function` through `ParseState->p_subprocfunc_hook()` and calls `plisql_ns_lookup()` to gather the `fno` values of all subfunctions sharing the same name, then selects the best match based on argument count and types.
. When `FuncExpr` nodes are created, the subfunction call is tagged so that it can be recognized at execution time: `function_from = FUNC_FROM_SUBPROCFUNC`, `parent_func` points to the parent `PLiSQL_function`, and `funcid = fno`.
. In `plisql_call_handler()`, when `function_from == FUNC_FROM_SUBPROCFUNC`, the runtime fetches the appropriate `PLiSQL_subproc_function` via the pair `(parent_func, fno)`:
.. For non-polymorphic subfunctions, reuse the precompiled action tree stored in `subprocfunc->function`.
.. For polymorphic subfunctions, probe `poly_tab`; if there is no cached compilation result, call `plisql_dynamic_compile_subproc()` to compile one and store it in the cache.
. Before execution, `plisql_init_subprocfunc_globalvar()` forks the relevant entries from the parent's datum table so the subfunction can read the parent's current variable values without polluting the parent scope. After execution, `plisql_assign_out_subprocfunc_globalvar()` writes the variables that need to be propagated back into the parent's datum table.

== Module Design

=== PL/iSQL Grammar Extensions

- `pl_gram.y` adds productions for subprocedure declarations and nested definitions, and records metadata such as `lastoutvardno` and subprocedure descriptors.
- Nested subfunctions can reference variables from the parent scope, other subprocedures, and user-defined types.

Whenever a `function ... is/as begin ... end` construct is seen inside a `DECLARE` block, `pl_gram.y` invokes `plisql_build_subproc_function()`:

. Insert a `PLiSQL_subproc_function` entry into the parent `PLiSQL_function->subprocfuncs[]`, storing the name, arguments, return type, and other attributes, and assign an index `fno`.
. Call `plisql_check_subprocfunc_properties()` to verify that declarations and definitions are consistent and to prevent duplicate or missing declarations from introducing semantic errors.

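The two steps above can be pictured with a small, self-contained C model. This is only a sketch under stated assumptions: every `Toy*` type and `toy_*` function is hypothetical, types are compared as plain strings, and the validity rules (one declaration and one definition per signature, every declaration eventually defined) are an illustrative guess at what a declaration/definition check might enforce, not the rules of `plisql_check_subprocfunc_properties()`.

```c
#include <stdbool.h>
#include <string.h>

#define TOY_MAX_SUBPROCS 16
#define TOY_MAX_ARGS 4

/* Hypothetical stand-in for PLiSQL_subproc_function. */
typedef struct {
    const char *name;
    int nargs;
    const char *argtypes[TOY_MAX_ARGS];
    bool has_declaration;   /* a forward declaration was seen */
    bool has_definition;    /* a body was seen */
} ToySubproc;

/* Hypothetical stand-in for the parent's subprocfuncs[] array. */
typedef struct {
    ToySubproc subprocs[TOY_MAX_SUBPROCS];
    int nsubprocs;
} ToyParent;

static bool toy_same_signature(const ToySubproc *sp, const char *name,
                               int nargs, const char *const *argtypes)
{
    if (strcmp(sp->name, name) != 0 || sp->nargs != nargs)
        return false;
    for (int i = 0; i < nargs; i++)
        if (strcmp(sp->argtypes[i], argtypes[i]) != 0)
            return false;
    return true;
}

/*
 * Register a declaration (is_definition == false) or a definition.
 * Returns the index fno of the entry, or -1 for a duplicate declaration,
 * a duplicate definition, too many arguments, or a full table.
 */
int toy_build_subproc(ToyParent *parent, const char *name, int nargs,
                      const char *const *argtypes, bool is_definition)
{
    if (nargs < 0 || nargs > TOY_MAX_ARGS)
        return -1;
    for (int fno = 0; fno < parent->nsubprocs; fno++) {
        ToySubproc *sp = &parent->subprocs[fno];
        if (!toy_same_signature(sp, name, nargs, argtypes))
            continue;
        if (is_definition) {
            if (sp->has_definition)
                return -1;              /* defined twice */
            sp->has_definition = true;
        } else {
            return -1;                  /* already declared or defined */
        }
        return fno;
    }
    if (parent->nsubprocs == TOY_MAX_SUBPROCS)
        return -1;
    int fno = parent->nsubprocs++;
    ToySubproc *sp = &parent->subprocs[fno];
    sp->name = name;
    sp->nargs = nargs;
    for (int i = 0; i < nargs; i++)
        sp->argtypes[i] = argtypes[i];
    sp->has_declaration = !is_definition;
    sp->has_definition = is_definition;
    return fno;
}

/* True when every registered subprogram has a body. */
bool toy_all_defined(const ToyParent *parent)
{
    for (int fno = 0; fno < parent->nsubprocs; fno++)
        if (!parent->subprocs[fno].has_definition)
            return false;
    return true;
}
```

In this model two overloads of the same name receive different `fno` values because their argument types differ, which is what later lets a call site refer to one specific overload by index.
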
=== Datum Storage

The parent program's datum tables hold the variables accessible to nested subfunctions during compilation and execution:

. `PLiSQL_function->datums` preserves the variable and record metadata visible during compilation.
. `PLiSQL_execstate->datums` carries the live values at runtime.

=== Polymorphic Templates

When a subfunction contains polymorphic arguments, the parser will:

. Copy the subfunction source text into `subprocfunc->src`.
. Set `has_poly_argument = true` to prepare for dynamic recompilation based on actual argument types.

=== Parent Recompilation

- The parent `PLiSQL_function` includes a `subprocfuncs` array, with each element corresponding to a `PLiSQL_subproc_function`.
- Each `PLiSQL_subproc_function` holds an `HTAB *poly_tab` pointer; when `has_poly_argument` is `true`, this cache is initialized on the first compile. Keys are `PLiSQL_func_hashkey` values (subfunction `fno` plus actual argument types), and values are the compiled `PLiSQL_function` structures.

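The caching idea behind `poly_tab` can be sketched in plain C as follows. This is a simplified illustration, not the IvorySQL implementation: the `Toy*` names are hypothetical, a small linear array stands in for the `HTAB` hash table, and a counter stands in for real compilation. The key mirrors what the text says a `PLiSQL_func_hashkey` records, namely `fno` plus the actual argument types.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_ARGS 4
#define TOY_CACHE_SLOTS 8

typedef unsigned int ToyOid;

/* Hypothetical key: subfunction index plus actual argument type OIDs. */
typedef struct {
    int fno;
    int nargs;                      /* must not exceed TOY_MAX_ARGS */
    ToyOid argtypes[TOY_MAX_ARGS];
} ToyKey;

/* Hypothetical cache entry: one compiled specialization per key. */
typedef struct {
    ToyKey key;
    int plan_id;                    /* stands in for a compiled function */
} ToyCompiled;

typedef struct {
    ToyCompiled cache[TOY_CACHE_SLOTS];
    int ncached;
    int compile_count;              /* how many "compilations" happened */
} ToySubprocCache;

static bool toy_key_equal(const ToyKey *a, const ToyKey *b)
{
    if (a->fno != b->fno || a->nargs != b->nargs)
        return false;
    return memcmp(a->argtypes, b->argtypes,
                  (size_t) a->nargs * sizeof(ToyOid)) == 0;
}

/*
 * Return the specialization for this key, "compiling" it on a miss.
 * Returns NULL only when the toy cache is full.
 */
const ToyCompiled *toy_get_specialization(ToySubprocCache *sp,
                                          const ToyKey *key)
{
    for (int i = 0; i < sp->ncached; i++)
        if (toy_key_equal(&sp->cache[i].key, key))
            return &sp->cache[i];       /* cache hit: reuse */
    if (sp->ncached == TOY_CACHE_SLOTS)
        return NULL;
    ToyCompiled *entry = &sp->cache[sp->ncached++];
    entry->key = *key;
    entry->plan_id = ++sp->compile_count;   /* cache miss: compile once */
    return entry;
}
```

Calling the same polymorphic subfunction first with an `integer` argument and then with a `text` argument would create two entries in this model, and a third call with `integer` would reuse the first one.
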
=== Parser Hooks

During compilation, PostgreSQL creates a `ParseState`. `plisql_subprocfunc_ref()` plugs into `ParseState->p_subprocfunc_hook`, reusing the namespace lookup logic to gather candidates. `plisql_get_subprocfunc_detail()` then chooses the best match based on argument count, types, and named parameters, enabling overloaded dispatch.

=== FuncExpr Annotation

When constructing `FuncExpr` nodes, the compiler attaches metadata so the executor can recognize nested calls:

- `function_from = FUNC_FROM_SUBPROCFUNC`.
- `parent_func` references the owning `PLiSQL_function`.
- `funcid = fno`, enabling direct lookup of the subfunction definition.

=== Nested Subfunction Lookup

- `plisql_subprocfunc_ref()` implements `ParseState->p_subprocfunc_hook` and reuses the namespace search to find nested subfunctions.
- `plisql_get_subprocfunc_detail()` applies matching rules for argument count, argument types, and named arguments to pick the best-matching overload.

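To make the overload selection concrete, here is a minimal sketch of choosing among same-named candidates by argument count and type. It is an illustration under simplifying assumptions, not the logic of `plisql_get_subprocfunc_detail()`: the `Toy*` names are hypothetical, types are plain strings, only `anyelement` is treated as polymorphic, and named arguments and implicit casts are ignored.

```c
#include <string.h>

#define TOY_MAX_ARGS 4

/* Hypothetical candidate; its array index plays the role of fno. */
typedef struct {
    const char *name;
    int nargs;
    const char *argtypes[TOY_MAX_ARGS];  /* "anyelement" accepts any type */
} ToyCandidate;

/* Score a candidate: -1 if not applicable, else count of exact matches. */
static int toy_score(const ToyCandidate *c, const char *name, int nargs,
                     const char *const *actual)
{
    if (strcmp(c->name, name) != 0 || c->nargs != nargs)
        return -1;
    int score = 0;
    for (int i = 0; i < nargs; i++) {
        if (strcmp(c->argtypes[i], actual[i]) == 0)
            score++;                     /* exact type match */
        else if (strcmp(c->argtypes[i], "anyelement") != 0)
            return -1;                   /* incompatible type */
    }
    return score;
}

/*
 * Return the fno of the unique best candidate, or -1 when nothing
 * matches or the best score is shared by several candidates.
 */
int toy_resolve(const ToyCandidate *cands, int ncands, const char *name,
                int nargs, const char *const *actual)
{
    int best = -1, best_score = -1, ties = 0;
    for (int fno = 0; fno < ncands; fno++) {
        int s = toy_score(&cands[fno], name, nargs, actual);
        if (s < 0)
            continue;
        if (s > best_score) {
            best = fno;
            best_score = s;
            ties = 0;
        } else if (s == best_score) {
            ties++;
        }
    }
    return ties == 0 ? best : -1;
}
```

Preferring exact matches over polymorphic ones is a design choice of this sketch; it means a `text` argument picks a `text` overload even when an `anyelement` overload with the same arity also applies.
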
=== Execution Path

. `plisql_call_handler()` checks `function_from`; if the call targets a nested subfunction, the handler locates the `PLiSQL_subproc_function` via `(parent_func, fno)`.
. For non-polymorphic subfunctions, reuse the compiled function cached in `subprocfunc->function`.
. For polymorphic subfunctions, consult `poly_tab`; on a miss, call `plisql_dynamic_compile_subproc()` to build and cache a specialized compilation.

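The branching described above can be sketched as a small dispatch function. All `Toy*` names are hypothetical, and the real `FuncExpr` and `plisql_call_handler()` carry far more state; the sketch only shows how the three fields `function_from`, `parent_func`, and `funcid` are enough to pick a path.

```c
#include <stddef.h>

typedef enum { TOY_FROM_PG_PROC, TOY_FROM_SUBPROCFUNC } ToyFuncOrigin;

/* Hypothetical stand-in for PLiSQL_subproc_function. */
typedef struct {
    const char *name;
    int has_poly_argument;
} ToySubproc;

/* Hypothetical stand-in for the parent PLiSQL_function. */
typedef struct {
    const char *name;
    ToySubproc *subprocfuncs;
    int nsubprocfuncs;
} ToyFunction;

/* Hypothetical stand-in for the tagged FuncExpr node. */
typedef struct {
    ToyFuncOrigin function_from;
    ToyFunction *parent_func;   /* set only for nested calls */
    unsigned int funcid;        /* catalog OID, or fno for nested calls */
} ToyFuncExpr;

typedef enum {
    TOY_PATH_CATALOG,       /* regular function: look up the catalog */
    TOY_PATH_REUSE,         /* nested, non-polymorphic: reuse compiled body */
    TOY_PATH_POLY_CACHE,    /* nested, polymorphic: consult the cache */
    TOY_PATH_ERROR          /* fno out of range or missing parent */
} ToyPath;

/* Decide how a call is executed and report the target subfunction. */
ToyPath toy_call_handler(const ToyFuncExpr *expr, const ToySubproc **target)
{
    *target = NULL;
    if (expr->function_from != TOY_FROM_SUBPROCFUNC)
        return TOY_PATH_CATALOG;
    if (expr->parent_func == NULL ||
        expr->funcid >= (unsigned int) expr->parent_func->nsubprocfuncs)
        return TOY_PATH_ERROR;
    *target = &expr->parent_func->subprocfuncs[expr->funcid];
    return (*target)->has_poly_argument ? TOY_PATH_POLY_CACHE
                                        : TOY_PATH_REUSE;
}
```

Because `funcid` is reused to carry `fno`, the `function_from` tag is what tells the two interpretations apart in this model.
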
=== Variable Synchronization

- `plisql_init_subprocfunc_globalvar()` copies the relevant entries from the parent datum table before the subfunction runs, so the subfunction sees the parent's latest state.
- `plisql_assign_out_subprocfunc_globalvar()` writes back OUT/INOUT variables after execution, keeping parent and child data consistent while neither scope pollutes the other.

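The copy-in and write-back pattern can be illustrated with a toy datum table of integers. This is a sketch only: the `Toy*` names, the fixed-size table, and the explicit `writeback` flag array are assumptions made for the example, and the slot layout loosely follows the `inner_square` test case (slot 0 as `v_base_multiplier`, slot 1 as `v_audit_counter`).

```c
#include <stdbool.h>

#define TOY_NDATUMS 4

/* Hypothetical datum table holding only integer variables. */
typedef struct {
    int values[TOY_NDATUMS];
} ToyDatums;

/* Copy-in: give the subfunction its own snapshot of the parent's values. */
void toy_init_globalvar(const ToyDatums *parent, ToyDatums *child)
{
    *child = *parent;
}

/* Copy-out: propagate only the slots flagged for write-back. */
void toy_assign_out_globalvar(ToyDatums *parent, const ToyDatums *child,
                              const bool writeback[TOY_NDATUMS])
{
    for (int i = 0; i < TOY_NDATUMS; i++)
        if (writeback[i])
            parent->values[i] = child->values[i];
}

/* Toy subfunction body working purely on its private copy. */
int toy_inner_square(ToyDatums *local, int p_value)
{
    local->values[1] += 1;      /* v_audit_counter := v_audit_counter + 1 */
    local->values[2] = -1;      /* scratch change that is never written back */
    return local->values[0] * p_value;   /* v_base_multiplier * p_value */
}
```

Until the write-back step runs, the parent table is untouched, so a subfunction that modifies a slot not flagged for write-back leaves the parent's value as it was.
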
=== Statement Dispatch in psql

- `psqlscan.l` adjusts the push/pop logic of `proc_func_define_level` and `begin_depth` so that the nested subfunction body is sent to the server as a whole.
- A statement is sent only when the nesting depth has returned to zero and a semicolon is reached, which prevents subfunction blocks from being split.

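The following toy scanner shows why two counters are enough to find the right semicolon. It is a deliberately reduced model and not the `psqlscan.l` rules: input is already split into tokens, `toy_find_send_point` is a hypothetical name, and the sketch assumes there are no forward declarations, no `END IF` or `END LOOP`, and that the `END` which brings `begin_depth` back to zero closes the innermost open definition.

```c
#include <string.h>

/*
 * Return the index of the semicolon token at which the statement would
 * be sent, or -1 if the tokens do not yet form a complete statement.
 */
int toy_find_send_point(const char *const *tokens, int ntokens)
{
    int proc_func_define_level = 0;  /* open FUNCTION/PROCEDURE definitions */
    int begin_depth = 0;             /* open BEGIN ... END blocks */

    for (int i = 0; i < ntokens; i++) {
        const char *t = tokens[i];
        if (strcmp(t, "FUNCTION") == 0 || strcmp(t, "PROCEDURE") == 0) {
            proc_func_define_level++;
        } else if (strcmp(t, "BEGIN") == 0) {
            begin_depth++;
        } else if (strcmp(t, "END") == 0) {
            /* Leaving the outermost open block closes one definition. */
            if (begin_depth > 0 && --begin_depth == 0 &&
                proc_func_define_level > 0)
                proc_func_define_level--;
        } else if (strcmp(t, ";") == 0) {
            if (proc_func_define_level == 0 && begin_depth == 0)
                return i;            /* complete statement: send it */
        }
    }
    return -1;
}
```

In this model the semicolons after a variable declaration, inside a body, and after the nested function's `END` are all skipped, because at least one counter is still non-zero at those points.
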
=== Retrieving Return Information on the SQL Side

- Regular functions obtain metadata from `pg_proc` via `funcid`; nested subfunctions have no catalog entry and instead rely on `FuncExpr.parent_func`, which holds the parent `PLiSQL_function`.
- A set of callback pointers (registered through `plisql_register_internal_func()`) allows the SQL layer to fetch nested subfunction names, return types, and OUT parameter information on demand.
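
A callback table of this kind can be sketched as follows. The sketch is hypothetical throughout: the `Toy*` types, the two callbacks shown, and the signature of `toy_register_internal_func()` are assumptions for illustration and do not reflect the actual signature of `plisql_register_internal_func()`.

```c
#include <stddef.h>

/* Hypothetical PL-side description of one nested subfunction. */
typedef struct {
    const char *name;
    const char *rettype;
} ToySubproc;

/* Hypothetical PL-side parent function. */
typedef struct {
    ToySubproc *subprocfuncs;
    int nsubprocfuncs;
} ToyParentFunc;

/* Callback table held by the SQL layer; the parent is an opaque pointer. */
typedef struct {
    const char *(*get_name)(const void *parent_func, int fno);
    const char *(*get_rettype)(const void *parent_func, int fno);
} ToyInternalFuncs;

static ToyInternalFuncs toy_registered = { NULL, NULL };

/* Called once by the PL side to publish its callbacks. */
void toy_register_internal_func(const ToyInternalFuncs *funcs)
{
    toy_registered = *funcs;
}

/* PL-side helper: bounds-checked access to one subfunction. */
static const ToySubproc *toy_lookup(const void *parent_func, int fno)
{
    const ToyParentFunc *parent = parent_func;
    if (parent == NULL || fno < 0 || fno >= parent->nsubprocfuncs)
        return NULL;
    return &parent->subprocfuncs[fno];
}

const char *toy_pl_get_name(const void *parent_func, int fno)
{
    const ToySubproc *sp = toy_lookup(parent_func, fno);
    return sp ? sp->name : NULL;
}

const char *toy_pl_get_rettype(const void *parent_func, int fno)
{
    const ToySubproc *sp = toy_lookup(parent_func, fno);
    return sp ? sp->rettype : NULL;
}

/* SQL-layer helpers: no catalog access, only the registered callbacks. */
const char *toy_sql_name(const void *parent_func, int fno)
{
    return toy_registered.get_name
        ? toy_registered.get_name(parent_func, fno) : NULL;
}

const char *toy_sql_rettype(const void *parent_func, int fno)
{
    return toy_registered.get_rettype
        ? toy_registered.get_rettype(parent_func, fno) : NULL;
}
```

Keeping the parent as an opaque pointer on the SQL side means that layer does not need to know the PL structure layout; it only passes `parent_func` and `fno` back through the callbacks.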
