(fix + refac): Method-aware output-eltype via kernel-shape op trait#146
Conversation
Treat Constant as `y[idx] + 0·(xq - x[idx])` so `Tq`'s carrier (Dual, Measurement, SVector, …) propagates naturally — the same rule Linear / Cubic / Quadratic / Hermite already follow. Drop the scalar `convert(_output_eltype, val)` wrapper and replace the persistent allocating batch + 1D Constant oneshot allocators with sample-first so the container fits whatever the kernel actually produces. Constant `_output_eltype` overrides and inline `T_out` expressions removed; ND Constant scalar oneshot drops its `::Tv` typeassert. Every method now uses the generic `_output_eltype(Tv, Tg, Tq)`. Constant kernels multiply the selection by `one(dL)` (carrier propagation); derivative branches use `0 * y_left * one(dL)` to preserve NaN/Inf propagation and keep the minimal `*(::Int, ::Tv)` + `*(::Tv, ::Real)` duck-type contract. Behavior change: `constant_interp` on `Int y` with a `Float xq` now returns `Float` (carrier propagates through zero-slope arithmetic). The fully-Int chain `(Int x, Int y, Int xq)` still returns `Int`. The 52a1f91 raw-Tv-for-promotable-Tq contract from #138 is intentionally relaxed to keep Constant on the same machinery as every other method — no Constant-specific helpers. Series / ND-oneshot-3-arg / hetero allocators remain on the old path and will migrate in a follow-up. New `test_duck_tv_dual_tq.jl` pins the Tv × Tq cross product across methods. Existing Constant / Complex / precision / promotion tests updated to the natural-promote contract. Closes #144.
Follow-up to #144. Close the deferred one-shot allocator gap (1D batch, ND batch, Hetero ND, Series) without touching the 12 individual call sites — and patch the Constant scalar/anchor short-circuits that broke `@inferred itp(1.0)` with Int y. * `_output_eltype` (src/core/utils.jl): the duck-type fallback now queries `Base.promote_op` on the universal kernel-shape op `yv * q + yv`, matching the Julia idiom used by `LinearAlgebra.mul` (`promote_op(matprod, eltype(A), eltype(B))`) and `Base.map` (`_return_type(f, eltype(A))`). Inference resolves the call to a constant at compile time (zero runtime cost — verified via LLVM IR: the function body is a single `ret`). For `SVector{Float64} × Dual` this returns `SVector{Dual}` where `promote_type` returned the non-concrete `Any` and the old fallback silently returned `Tv`. When the op is undefined (e.g., custom duck-type with no `*`), the legacy `Tv` fallback is preserved. * Constant short-circuits (constant_oneshot.jl, constant_anchor.jl): at `xi == last(x)` the scalar/anchor paths used to `return last(y)` raw — bypassing the kernel's `selected * one(dL)` carrier propagation. With `Int y` and `Float xq` that disagrees with the kernel's `Float` return, so `@inferred itp(1.0)` saw `Union{Float64, Int64}`. Short-circuits now multiply by `one(xi)` to match the kernel, restoring full type stability across both the point-based scalar path and the four anchor-dispatch branches. * linear_oneshot.jl: `_output_eltype(eltype(y), Tg)` was missing `Tq` entirely, so the trait couldn't see the carrier chain even after the duck fallback was correct. Now passes `eltype(x_targets)`. * hermite_oneshot.jl: dropping `eltype(dy)` from the trait call — including it collapses `promote_type(...)` to `Any` on duck inputs, which prevents the new `promote_op` fallback from resolving the carrier. `eltype(dy)` is typically the same shape as `Tv` so it contributes no extra carrier information. Side-effect fix: `_output_eltype(SVector{Int}, Float64, Float64)` now returns `SVector{Float64}` (was `SVector{Int}`). The linear/cubic/etc. kernels promote Int→Float internally via interval-slope division, so the old prediction caused `convert(::SVector{Int}, ::SVector{Float64})` errors that no test exercised — now correct. Tests: test_duck_tv_dual_tq.jl reorganized into three logical test items (Persistent path, Oneshot path, Type stability + raw-eltype contracts) covering scalar/batch/ND × duck-Tv × duck-Tq, the ForwardDiff.derivative MWE, `@inferred` Union-free assertions for the Constant short-circuit, the `_output_eltype` type table, and the Int→Int raw-eltype non-regression. Verified: smoke 42/42 PASS (master: 38/42, v0.4.9: 26/42); 35,587 tests pass on latest Julia. LTS run blocked by a pre-existing test/Manifest.toml resolution issue unrelated to this patch.
Surfaced by extended smoke covering Series / ND / Adjoint paths against
duck-Tv (SVector) × duck-Tq (Dual) combinations. Three regressions
fixed:
* Constant ND oneshot batch (src/constant/nd/constant_nd_oneshot.jl):
the allocator was `Vector{Tv}(undef, …)` and the outer wrapper
carried a `::Vector{Tv}` typeassert — both bypass kernel carrier
propagation, so `constant_interp((xg,yg), data_sv, queries_dual)`
raised `Float64(::Dual)`. Replaced with sample-first allocation
matching the 1D Constant pattern: pull `first_q` via
`_extract_query_point`, run the scalar oneshot once, size from
`typeof(first_val)`. The deriv-zero short-circuit now sizes from
`0 * first(data) * one(eltype(first_q))` for the same reason.
* 1D adjoint allocators (src/core/adjoint_protocol.jl:115, 125):
`Tv = promote_type(eltype(y_bar), Tg)` returned `Any` for
`SVector{3,Float64} × Float64` (no `promote_rule` between Base and
StaticArrays), so `zeros(Any, n)` raised `zero(::Type{Any})`.
Routing through `_output_eltype` reuses the `Base.promote_op` duck
fallback from b675e17 and resolves to `SVector{3,Float64}`. Linear,
Constant, Cubic, Quadratic, Hermite, Cardinal, PCHIP, Akima
adjoints now all accept SVector y_bar.
* ND adjoint allocators (src/core/adjoint_protocol.jl:353, 368):
same root cause, same `promote_type → _output_eltype` fix. Linear,
Constant, Cubic, Quadratic, Hetero ND adjoints all accept SVector
y_bar (`Matrix{SVector{3,Float64}}` return).
Tests:
- test_duck_tv_dual_tq.jl gains an Adjoint-path testitem covering all
7 1D adjoints + 4 ND adjoints × {SVector, Dual} y_bar.
- test_constant_eltype.jl updates two assertions that previously
pinned the broken ND oneshot return as a known limitation; they now
pin the carrier-propagated Float64 return.
Verified: extended smoke 116/116 PASS (Series × 4 methods + ND forward +
ND/1D adjoints × duck-Tv + slope-family OnTheFly); all affected adjoint
test files still pass.
Out of scope (separate follow-up): the ND scalar oneshot path
(`constant_interp((xg,yg), data, query_tuple)`) returns `Float64` for
all-Int inputs while the persistent path returns `Int` — pre-existing
asymmetry unrelated to this patch.
Stage 3 of the Issue #144 cleanup. Series one-shot and persistent callable allocators previously composed two traits as `_series_output_type(_output_eltype(Tv, Tg), Tq)`. For duck-Tv × duck-Tq combinations (e.g., `Series(SVector{3,Float64}, …)` queried with a ForwardDiff.Dual) the outer step computes `promote_type(SVector{Float}, Dual) = Any`, falls through to its fallback (`Tv`), and the kernel's actual `SVector{Dual}` return then either errors on `setindex!(::Vector{SVector{Float}}, ::SVector{Dual})` or — for Constant's selection kernel — silently produces `Vector{Vector{Any}}`. Switching to a single `_output_eltype(Tv, Tg, Tq)` call uses the b675e17 `Base.promote_op` fallback, which resolves `SVector{Float} × Dual → SVector{Dual}` at compile time. Six sites across linear/cubic/quadratic series (scalar and batch, persistent and one-shot) migrate the chain to the unified form. Constant series keeps its raw-Tv contract for promotable Tq (Int y + Float xq → Vector{Int}, preserved) but routes duck Tq through `_output_eltype(Tv, Tq)` so SVector × Dual no longer collapses to `Vector{Any}`. Four allocator sites updated. Regression coverage added to test_duck_tv_dual_tq.jl: a new `Series path — duck-Tv × duck-Tq carrier propagates` testitem pins the four methods × {scalar, batch} × {SVector y, Dual xq} grid, and the existing Oneshot testitem now also asserts the Constant ND oneshot batch path (closed in acfc95a). Verified: extended smoke 132/132 PASS (was 126/132 with 6 series duck-Tv × duck-Tq fails before this commit); all touched test files green on latest Julia.
Follow-up to the original Issue #144 PR, addressing five user-visible paths where carrier propagation still collapsed despite the unified `_output_eltype` trait. All share the same root-cause family but live in branches the prior commits did not reach. Sites addressed: - `cubic_series_interp.jl` — `CubicSeriesInterpolant` build-then-call path still chained `_series_output_type(_output_eltype(Tv, Tg), Tq)`, collapsing `promote_type(SVector{F}, Dual)` to `Any`. Migrated to the unified `_output_eltype(Tv, Tg, Tq)` (mirrors the prior 6-site migration). - `interpolant_protocol.jl::_eval_nd_at_point` deriv-zero short-circuit and `constant_nd_eval.jl::_eval_at_cell` deriv branch and `constant_nd_oneshot.jl` scalar + batch deriv branches — all returned `0 * <data>` without query-carrier propagation. Now multiply by `prod(one, query)` so per-axis carriers (homogeneous Dual, heterogeneous `(Float, Dual)`) survive — symmetric with the forward kernel's per-axis `* one(dL_d)`. Affects every ND method at 2nd+ derivative paths. - `hermite_oneshot.jl` — original PR dropped `eltype(dy)` to avoid `promote_type` collapse on SVector + Dual inputs, which re-introduced the MethodError commit b3a3379 fixed for `Float64 y + Vector{Dual} dy`. Replaced with a two-call `_output_eltype` pattern keeping `Tv` and `eltype(dy)` in disjoint promote chains — handles both cases. Coverage: - `CubicSeriesInterpolant` persistent path × SVector × Dual (scalar + batch) - Constant ND deriv-zero × persistent + one-shot scalar - Heterogeneous-axis (Float, Dual) × mixed-deriv: Constant ND persistent scalar + one-shot scalar + one-shot batch, Linear ND persistent 2nd-deriv - Hermite Float32 y + Float64 dy precision pin - Hermite Float64 y + Vector{Dual} dy + ForwardDiff.derivative on slopes
…tput_type
Constant and Quadratic 1D `itp(aq_vec)` (batch eval with pre-built anchors)
allocated `Vector{Tg}` — only the grid type, ignoring `Tq` in the anchor
parameterization. The Constant kernel now multiplies by `* one(dL::Tq)` and
the Quadratic kernel mixes anchored deltas with `Tq`, so a duck-typed `Tq`
(Dual, …) raised `MethodError` on `setindex!`. Linear's path already used
`_output_eltype(Tv, Tg, Tq)`; mirrored here.
`_series_output_type` is no longer called from any src file after the Cubic
Series persistent path migrated to the unified `_output_eltype(Tv, Tg, Tq)`
in commit 440be0b. Removed; its narrowed `promote_type → Tv` fallback is
superseded by the duck-aware `Base.promote_op` fallback in `_output_eltype`.
Coverage added to test/test_duck_tv_dual_tq.jl exercises both methods'
anchored-vector callable with `_ConstantAnchoredQuery{Tg, Dual}` and
`_QuadraticAnchoredQuery{Tg, Dual}` via `_fill_anchors!`.
…tion
The PR series that restored duck-Tv × duck-Tq carrier propagation changed
Constant's behavior from "selection kernel returns raw `Tv`" to "kernel
multiplies by `* one(dL::Tq)` so the return widens via promote_type".
Storage stays raw `{Tg, Tv}` (no `_promote_grid_float` indirection) — only
the return type widens.
Comments referencing the old "raw-Tv contract" / "Int in → Int out" return
behavior, or the deleted `_output_eltype(itp, Tq)` override in
`constant_nd_types.jl`, are updated to reflect the post-restoration model:
storage is raw, the kernel handles return-type widening per-callable.
No behavior change. Pure documentation alignment.
…fied) The protocol's 1D + ND batch allocating callables, and Constant's 1D + ND one-shot batch wrappers, called the scalar/persistent kernel for the first query just to read `typeof(first_val)` for the output buffer. That extra evaluation is per-batch overhead on a hot path — measurable at small N and strictly redundant once the trait predicts correctly. `_output_eltype(itp, Tq)` (and the 3-arg `_output_eltype(Tv, Tg, Tq)` used by one-shot wrappers) now drives all allocators directly: empty-batch and populated paths share one expression. The kernel still produces the same runtime values; only the container eltype is sized from types, not from sampling. Constant follows the same natural-promotion rule as every other method (no Constant-specific helper or override per the original Issue #144 PR's direction). The kernel's `* one(dL)` still preserves scalar-path semantics (Int×Int×Int scalar callable returns Int via kernel direct); batch and one-shot wrappers widen via the trait's Int→Float upgrade. Tests pinning the old sample-first `Vector{Int}` for fully-Int Constant batch are updated to `Vector{Float64}`. Locations: - src/core/interpolant_protocol.jl:60-79 (1D batch) - src/core/interpolant_protocol.jl:255-272 (ND batch) - src/constant/constant_oneshot.jl:277-298 (1D one-shot batch) - src/constant/nd/constant_nd_oneshot.jl:98-115 (ND one-shot batch)
`_output_eltype` previously enumerated standard numeric types via the
`_PromotableValue` union and hand-applied an `Int→Float` upgrade — a manual
proxy for "the arithmetic kernels divide". Selection kernels (Constant) don't
divide, so the upgrade was wrong for them and was masked by sample-first
allocators (one extra scalar eval per batch).
Add a method-shape overload that lets the kernel itself drive inference:
@inline _output_eltype(kernel_op::F, ::Type{Tv}, types::Type...) where {F, Tv} =
... Base.promote_op(kernel_op, Tv, types...) ...
Constant declares its kernel shape and routes its trait through the overload:
@inline _constant_kernel_shape(yv, q) = yv * one(q)
@inline _output_eltype(::ConstantInterpolant{Tg, Tv}, ::Type{Tq}) where {Tg, Tv, Tq} =
_output_eltype(_constant_kernel_shape, Tv, Tq)
Julia infers the exact kernel return type — no `_PromotableValue` enumeration,
no Float upgrade. Int×Int×Int matches the scalar callable (Int), SVector ×
Dual still resolves to SVector{Dual}. Constant ND mirrors the override. The
`_PromotableValue` definition stays (other paths — grid promotion, autocache
gating, FillExtrap auto-convert — use it independently).
Other methods (Linear/Cubic/Quadratic/Hermite/etc.) keep the existing trait —
their kernels divide, so the Float upgrade matches reality. Migrating them to
their own kernel shape op is a follow-up.
Tests:
- Constant Int contract reverted: `itp([2,5,8]) isa Vector{Int}` again,
matching `itp(3) === 30`. No scalar/batch asymmetry.
- New `@testitem "Type stability — duck-Tv × duck-Tq inference"`: `@inferred`
coverage for 1D persistent/oneshot × 5 methods (scalar + batch), ND
persistent/oneshot × 3 methods, ND deriv-zero per-axis carrier, Hermite
mixed eltype, CubicSeriesInterpolant persistent, anchored-vector callable.
Catches silent Union returns and `Any` fallbacks `isa` checks would miss.
Linear/Cubic/Quadratic/Hermite all share the same kernel-shape pattern at the
type level — every arithmetic kernel divides by `h`, which Float-upgrades Int
chains and propagates duck carriers through `(dL / h)`. Declare one shared
`_arithmetic_kernel_shape(yv, dL, h) = yv + yv * (dL / h)` and let Linear
opt into it, validating the pattern at a second method beyond Constant.
Migrated:
- `LinearInterpolant` + `LinearInterpolantND` `_output_eltype(itp, Tq)` →
`_output_eltype(_arithmetic_kernel_shape, Tv, Tq, Tg)`
- 1D + ND one-shot wrappers (`linear_interp(x, y, x_targets)` and the ND
scalar/batch variants) — buffer eltype from kernel-shape probe
- Anchored-vector callable (`itp(aq_vec)`)
Behavior parity for standard numerics (Int chain → Float, Float×Float → Float,
Complex{Int} → ComplexF64, mixed Int×Float → Float — all match the previous
`_PromotableValue`-driven Float upgrade). The trait's prediction now exactly
matches the kernel reality, so empirical kernel output and trait-allocated
buffer always agree.
A side benefit: `Rational × Rational × Rational` chain now predicts
`Rational{Int}` rather than `Float64` — the old trait over-predicted Float
because `_PromotableValue` enumeration included `Rational`, but Julia's
`Rational/Rational = Rational` (no Float upgrade) means the kernel actually
preserves Rational. No existing test pinned the old over-prediction.
Cubic/Quadratic/Hermite/PCHIP/Cardinal/Akima migration deferred to a
follow-up — same shape op, mechanical override addition per method type.
… Rational
`_arithmetic_kernel_shape` arguments reordered from `(yv, dL, h)` to
`(h, yv, dL)` so callsites read `_output_eltype(shape, Tg, Tv, Tq)` —
matching the codebase's standard `{Tg, Tv, Tq}` type-parameter order
(`Interpolant{Tg, Tv, ...}`, where-clauses, etc.). Five Linear callsites
updated correspondingly. Body unchanged — purely cosmetic.
Tests pin the trait's Rational behavior at two layers:
- Trait-level (`@inferred`): `_output_eltype(_arithmetic_kernel_shape,
Rational{Int}, Rational{Int}, Rational{Int}) === Rational{Int}`, plus
Int/Float/Complex baselines.
- Constant end-to-end: storage is raw `{Tg, Tv}` so the trait's Rational
prediction flows through to the user-visible Vector eltype.
- Linear end-to-end: `@test_broken` — `_promote_grid_float` in the
constructor lifts grid/values to Float64 at build time, shadowing the
trait's Rational prediction. Marked broken so a future relaxation
(storage following kernel reality) auto-surfaces as `now passing`.
The `@test_broken` pin documents the design gap explicitly: trait-level
correctness diverges from user-visible behavior at Linear/Cubic/etc.
because the constructor's Float-forcing predates the kernel-shape trait.
Resolving this is deferred to a follow-up.
Extend the kernel-shape op pattern (validated at Constant + Linear) to every
remaining arithmetic method (Cubic, Quadratic, Hermite, PCHIP, Cardinal,
Akima). One shared shape captures all of them — the division `y + y*(dL/h)`
that Float-upgrades Int chains and propagates duck carriers.
Trait wiring:
- `interpolant_protocol.jl`: default `_output_eltype(::AbstractInterpolant1D,
Tq)` (and ND mirror) now routes through `_arithmetic_kernel_shape`. Every
method's persistent-batch path is migrated by inheritance.
- `hermite_interpolant.jl`: `AbstractHermiteInterpolant1D` override applies
the two-call pattern over `Tv` and `eltype(itp.dy)`, keeping disjoint
promote chains so a duck-typed `dy` widens the result without poisoning
the `y` chain.
- `linear_interpolant.jl` + `linear_nd_types.jl`: explicit Linear overrides
removed — now redundant with the default.
Oneshot/series wrapper migrations (13 callsites): swap legacy
`_output_eltype(Tv, Tg, Tq)` → `_output_eltype(_arithmetic_kernel_shape,
Tg, Tv, Tq)`. Hermite oneshot's two-call already had the right shape; just
swap to the shared shape op.
Net effect: `_PromotableValue` enumeration + hand-coded Float upgrade are
no longer reached from any output-buffer allocation path. Internal storage
sites (coefficient/cache matrices, `Tz`/`Tc`) keep the legacy 2-arg trait —
their role is different (intermediate compute, not kernel output).
A latent consequence the pattern surfaces: Rational chains predict
`Rational{Int}` rather than `Float64` at the trait level (kernel reality —
Julia's `Rational/Rational = Rational` doesn't Float-upgrade). User-visible
preservation is still blocked by `_promote_grid_float` in non-Constant
constructors (pinned as `@test_broken` in test_duck_tv_dual_tq.jl) — that
gap is the deferred follow-up.
- utils.jl 2-arg _output_eltype docstring: replace the inaccurate "arithmetic-kernel allocators (Linear/Cubic/Quadratic/Hermite)" audience claim with the actual current users (internal coefficient eltype, adjoint allocators, hetero ND legacy paths). Direct readers to the method-aware kernel-op overload for output-buffer sizing. - constant_nd_eval.jl: drop reference to the deleted "sample-first allocator"; the protocol now uses trait-only sizing via _output_eltype. - constant_series_interp.jl: ConstantSeriesInterpolant scalar callable docstring claimed `promote_type` widening for ducks; the code now uses `_output_eltype` (Base.promote_op on the kernel shape) so SVector × Dual resolves correctly. Doc-only — no runtime behavior change.
Two missed migration sites prevented correct output-eltype propagation
under the kernel-shape op trait introduced earlier in this branch.
- constant_anchor.jl: the `ConstantInterpolant(aq_vec)` anchored-vector
callable was routing through the legacy 2-arg `_output_eltype(Tv, Tg, Tq)`
which Float-upgrades any `Tr <: _PromotableValue && !<:AbstractFloat`.
For a fully-Int chain (`Tv = Int, Tq = Int`) this allocated
`Vector{Float64}` while the kernel actually returns `Int` — silent
conversion and a contract violation. Every other Constant allocator
uses `_output_eltype(_constant_kernel_shape, Tv, Tq)`. Aligned.
- constant_series_interp.jl: `ConstantSeriesInterpolant`'s batch in-place
dispatch had `outputs::AbstractVector{<:AbstractVector{Tv}}`. The
parallel out-of-place form correctly widens `T_out` for duck `Tq`
(e.g., `Vector{Vector{SVector{3, Dual}}}` when `Tv = SVector{3,Float64}`,
`Tq = Dual`), but the widened buffer can't dispatch back into the
in-place form. Linear's and Quadratic's parallel forms already used
the relaxed `<:AbstractVector` constraint. Aligned.
Regression coverage added:
- "Constant Int-chain preservation" testset pins the anchored callable
contract (`Vector{Int}` output for fully-Int chain).
- "Linear/Quadratic/Constant SeriesInterpolant persistent — SVector ×
Dual carrier" testitem covers all 3 series persistent paths;
the Constant block is the dispatch regression.
Additive coverage extending the duck-Tv × duck-Tq test surface to
methods the PR description named but the test file missed.
- "Hermite-family one-shot — duck-Tq carrier through trait": PCHIP/
Cardinal/Akima 3-arg oneshot allocators were rewired to
`_output_eltype(_arithmetic_kernel_shape, ...)` in this branch but
weren't exercised by any test. PCHIP/Akima slope formulas use scalar
`sign`/`abs` on slope differences, so they can't accept `SVector y`
(StaticArrays doesn't define `sign(::SVector)`); their `Tq` duck path
is exercised via Float y + Dual xq. Cardinal's slopes are pure
averages — kept the stronger SVector × Dual case there.
- "PCHIP/Akima — custom scalar duck-Tv interface": minimal `DuckFloat`
wrapper around Float64 implementing the required slope-formula
interface (`sign`, `abs`, `zero`, `+`, `-`, `*`, `/`, ordering).
Demonstrates that PCHIP/Akima impose NO `<: AbstractFloat` constraint
on `Tv` — any duck type satisfying the interface flows through and
the trait predicts the duck wrapper type as the kernel return.
- "Cubic/Quadratic/Hermite-family — Rational user-visible blocked
(BROKEN)" testsets mirror the Linear @test_broken pins. Cubic's
3-arg oneshot line is omitted with a note: its output buffer is
correctly sized `Vector{Rational{Int}}` by the trait, but the
kernel's internal z coefficients are still Float64; the result only
round-trips through Rational for inputs whose Float values happen
to be exactly representable. Asserting `isa Vector{Rational{Int}}`
would pass for the wrong reason — pin removed until the
coefficient-eltype follow-up lands.
The previous comment in the Cubic @test_broken block was imprecise: it
suggested the 3-arg oneshot's `isa Vector{Rational{Int}}` would pass
"by accident" for exactly-representable Floats. Empirical trace shows
the actual mechanism is sharper and uniform — but the contract is
still broken.
Mechanism (verified via @inferred + runtime trace):
- The kernel-shape trait `_output_eltype(_arithmetic_kernel_shape, Tg,
Tv, Tq)` predicts `Rational{Int}` purely from input types at compile
time. Type-stable, @inferred passes.
- The buffer is allocated as `Vector{Rational{Int}}`.
- BUT the in-place `cubic_interp!` builds an internal cache via
`_promote_grid_float` which lifts grid + values to Float64. The
kernel computes in Float64 throughout (z coefficients are Float64).
- `output[i] = kernel_float_result` triggers
`convert(Rational{Int}, ::Float64)`, which uses the bit-exact Float64
representation. For dyadic results (k/2 inputs on x²/2 data) the
Float is exactly representable as a small-denominator Rational and
the result LOOKS clean. For non-dyadic queries the result is a
~2^52-denominator binary fraction approximation, NOT the true
Rational arithmetic value.
The trait says one thing, storage lift says another — they don't
coordinate. The next PR will decide between honoring the trait (relax
_promote_grid_float for arithmetic methods) or honoring storage (make
the trait aware of the lift). Either resolution is captured by the
semantic pin below.
Semantic pin: a cubic spline of `y = x` should reproduce `y = x`
exactly. The new `@test_broken cubic_interp(x, x, [3//7]) == [3//7]`
currently fails (Float64 path produces a 2^52-denom approximation)
and auto-surfaces when the storage/trait mismatch is resolved.
…c contract
Item A — canonical migration: route all surviving output-buffer sites through
the method-aware kernel-shape form. After this commit no output-allocation
path remains on the legacy 2/3-arg `_output_eltype(Tv, Tg, ...)` fallback.
Migrated callsites:
- linear_series_interp.jl: 2 sites (scalar + batch out-of-place callable)
- linear_oneshot_series.jl: 2 sites (3-arg scalar + batch oneshot)
- cubic_oneshot.jl:286: cache-variant 3-arg batch oneshot
- cubic_nd_oneshot.jl:42: ND scalar oneshot (Tv_p binding retained for the
separate `_resolve_extrap` use; trait uses raw Tv to match batch peer)
- quadratic_nd_oneshot.jl:163: ND scalar oneshot
- core/adjoint_protocol.jl: all 6 1D + ND adjoint allocator callables
(using `Tg` as both grid and dL slot — Tq is anchor-baked at this layer)
Item C — allocation contract pin: the persistent batch allocating callable
`itp(xq)` must allocate exactly one `Vector{T_out}` and nothing else. New
testitem in test_promotion_alloc.jl iterates over all 7 methods (linear/
cubic/quadratic/pchip/cardinal/akima/constant), comparing the call's
`@allocated` against the bare-Vector baseline + ALLOC_THRESHOLD slack. A
re-introduced sample-first eval (or any leaked scratch buffer) would
exceed this by at least one kernel-result-sized allocation per call.
Replace the baseline-Vector subtraction approach with a direct in-place
zero-alloc check. The persistent in-place callable `itp(out, xq)` must be
zero-alloc across all methods — that single contract covers both paths,
since the allocating form `itp(xq)` is just `Vector{T_out}(undef, n)` +
this in-place call. Matches the existing `_bench_linear!` pattern in the
same file (in-place + ALLOC_THRESHOLD), avoids baseline-subtraction noise,
and pins a stronger invariant.
FastInterpolations.jl BenchmarksAll benchmarks (50 total, click to expand)
|
| Benchmark | Current | Previous | Imm. Ratio | Grad. Ratio | Tier |
|---|---|---|---|---|---|
1_cubic_oneshot/q00001 |
565.3 ns |
567.0 ns |
0.997 |
1.18 |
gradual |
8_cubic_multi/construct_s001_q100 |
679.4 ns |
680.8 ns |
0.998 |
1.173 |
gradual |
8_cubic_multi/eval_s001_q100 |
815.6 ns |
856.3 ns |
0.953 |
1.119 |
gradual |
9_nd_oneshot/bicubic_2d |
44711.5 ns |
46253.3 ns |
0.967 |
1.108 |
gradual |
Thresholds: immediate > 1.1x (vs latest master), gradual > 1.1x (vs sliding window)
This comment was automatically generated by Benchmark workflow.
There was a problem hiding this comment.
Pull request overview
This PR fixes incorrect output-element-type prediction for “duck” value/query type combinations (notably SVector × ForwardDiff.Dual), by replacing the legacy promote_type/_PromotableValue-based trait with a method-aware kernel-shape approach using Base.promote_op. It also refactors allocators to size output buffers from the trait (removing “sample-first” evaluation) and expands tests to pin carrier propagation, inference, and allocation guarantees.
Changes:
- Introduces kernel-shape-based output eltype inference (
_arithmetic_kernel_shape,_constant_kernel_shape) and routes allocator sizing throughBase.promote_op. - Updates Constant interpolation to propagate query carriers via
* one(dL)(and adjusts boundary/derivative short-circuits accordingly), plus ND deriv-zero carrier folding. - Adds extensive new test coverage for duck
Tv × Tq(StaticArrays + ForwardDiff), allocation contracts, and Constant eltype behavior changes.
Reviewed changes
Copilot reviewed 44 out of 44 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| test/test_promotion_alloc.jl | Updates Constant batch eltype expectations; adds zero-allocation contract test for persistent in-place batch calls. |
| test/test_precision_vector_queries.jl | Updates Constant vector-call eltype expectation to track query carrier. |
| test/test_duck_tv_dual_tq.jl | New comprehensive duck-typing + AD + inference regression suite (SVector/Dual, Series, adjoints, ND carrier). |
| test/test_constant.jl | Updates Constant scalar behavior expectations for Int data + Float query; adds fully-Int-chain preservation checks. |
| test/test_constant_eltype.jl | Reworks Constant eltype contracts across 1D/ND, periodic modes, FillExtrap, and oneshot vs persistent behavior. |
| test/test_complex_constant.jl | Updates Constant Complex{Int} + Float query expectation (carrier propagation → ComplexF64). |
| test/test_bc_complex_int.jl | Updates Constant Complex{Int} + Float query expectation (carrier propagation → ComplexF64). |
| src/core/utils.jl | Refactors output-eltype trait logic and adds kernel-shape helpers for method-aware inference. |
| src/core/interpolant_protocol.jl | Routes default output trait through _arithmetic_kernel_shape; removes scalar-path convert; adjusts ND deriv-zero fill carrier folding. |
| src/core/adjoint_protocol.jl | Switches adjoint allocation eltype inference to kernel-shape-based _output_eltype. |
| src/linear/linear_interpolant.jl | Notes Linear uses default arithmetic kernel shape route. |
| src/linear/linear_anchor.jl | Anchored-vector allocator now uses kernel-shape output-eltype inference. |
| src/linear/linear_oneshot.jl | One-shot linear batch allocator uses kernel-shape output-eltype inference. |
| src/linear/linear_series_interp.jl | Linear Series allocators use kernel-shape output-eltype inference. |
| src/linear/linear_oneshot_series.jl | Linear Series oneshot allocators use kernel-shape output-eltype inference. |
| src/linear/nd/linear_nd_types.jl | Adds note that Linear ND inherits default arithmetic kernel shape route. |
| src/linear/nd/linear_nd_oneshot.jl | ND oneshot allocators use kernel-shape output-eltype inference. |
| src/quadratic/quadratic_anchor.jl | Quadratic anchored-vector callable now sizes output via kernel-shape trait. |
| src/quadratic/quadratic_oneshot.jl | Quadratic oneshot batch allocator uses kernel-shape output-eltype inference. |
| src/quadratic/quadratic_series_interp.jl | Quadratic Series allocators use kernel-shape output-eltype inference. |
| src/quadratic/quadratic_oneshot_series.jl | Quadratic Series oneshot allocators use kernel-shape output-eltype inference. |
| src/quadratic/nd/quadratic_nd_oneshot.jl | Quadratic ND oneshot allocators use kernel-shape output-eltype inference. |
| src/cubic/cubic_oneshot.jl | Cubic oneshot allocators use kernel-shape output-eltype inference. |
| src/cubic/cubic_series_interp.jl | Cubic Series allocators use kernel-shape output-eltype inference. |
| src/cubic/cubic_oneshot_series.jl | Cubic Series oneshot allocators use kernel-shape output-eltype inference. |
| src/cubic/nd/cubic_nd_oneshot.jl | Cubic ND oneshot allocators use kernel-shape output-eltype inference. |
| src/hermite/hermite_interpolant.jl | Adds Hermite-family _output_eltype override using a two-chain kernel-shape pattern over y and dy. |
| src/hermite/hermite_oneshot.jl | Mirrors Hermite-family two-chain output-eltype sizing in oneshot path. |
| src/pchip/pchip_oneshot.jl | PCHIP oneshot allocator uses kernel-shape output-eltype inference. |
| src/cardinal/cardinal_oneshot.jl | Cardinal oneshot allocator uses kernel-shape output-eltype inference. |
| src/akima/akima_oneshot.jl | Akima oneshot allocator uses kernel-shape output-eltype inference. |
| src/constant/constant_types.jl | Updates Constant storage commentary (raw storage; widening handled at evaluation). |
| src/constant/constant_interpolant.jl | Defines _constant_kernel_shape and routes Constant output-eltype trait through kernel shape. |
| src/constant/constant_kernels.jl | Changes Constant kernels to multiply selected value by one(dL) to propagate query carrier. |
| src/constant/constant_oneshot.jl | Updates right-edge short-circuit to preserve carrier; allocator sizes output via kernel-shape trait; updates docs. |
| src/constant/constant_anchor.jl | Ensures anchor short-circuits preserve carrier; anchored-vector allocator sizes output via kernel-shape trait. |
| src/constant/constant_series_interp.jl | Improves duck-query sizing for Series constant path; loosens in-place output container typing. |
| src/constant/constant_oneshot_series.jl | Uses _output_eltype fallback for duck queries in Constant Series oneshot. |
| src/constant/constant_adjoint.jl | Comment updates noting buffer eltype comes from adjoint protocol. |
| src/constant/nd/constant_nd_interpolant.jl | Comment updates on raw storage and kernel-based widening. |
| src/constant/nd/constant_nd_types.jl | Constant ND _output_eltype trait now routes through _constant_kernel_shape. |
| src/constant/nd/constant_nd_eval.jl | Propagates per-axis carrier in selection result; derivative-zero uses prod(one, q_eval). |
| src/constant/nd/constant_nd_oneshot.jl | Derivative branches fold per-axis carrier; batch allocator eltype uses kernel-shape trait; removes overly-narrow type assertions. |
| src/constant/nd/constant_nd_adjoint.jl | Comment update to align with protocol-sized adjoint buffers. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #146 +/- ##
==========================================
- Coverage 96.45% 96.43% -0.02%
==========================================
Files 143 143
Lines 11957 11978 +21
==========================================
+ Hits 11533 11551 +18
- Misses 424 427 +3
🚀 New features to boost your workflow:
|
…rait form
`_constant_kernel_shape(yv, q) = yv * one(q)` only saw the query type, so
`Dual` grid + `Float` xq paths under-predicted the kernel's true return
type — `_constant_vector_loop!` then tried to store a `Dual` kernel result
into a `Vector{Float64}` buffer (`MethodError` on Julia LTS Extensions CI).
Mirror the actual kernel arithmetic: `(xL, yv, xq) = yv * one(xq - xL)`.
`Base.promote_op` now sees `xq - xL` and infers the real `dL` carrier, so
duck grids (Dual, Measurement, …) flow through `promote_type(Tg, Tq)`
naturally. All callsites collapse to the canonical 4-arg `(kernel_op, Tg,
Tv, Tq)` form that Linear's `_arithmetic_kernel_shape` already used —
Constant was the lone outlier on the 2-arg `(Tv, Tq)` shape.
Int×Int×Int still infers `Int`; SVector × Dual still infers SVector{Dual};
the Rational chain pin and the Int batch-vs-scalar contract are unchanged
(only the trait's arg count changes, not its predictions).
…-arg trait
`test/ext/test_series_dual_grid.jl` on Julia LTS still tripped
`MethodError: Float64(::Dual)` after the kernel-shape rename — Constant
series oneshot + persistent paths were the last holdouts of the
`Tq <: _PromotableValue ? Tv : _output_eltype(Tv, Tq)` guard. That gate
ignores `Tg`, so `Dual` grid + `Float` xq short-circuited to `Vector{Tv}`
(Float-only) while the kernel returned `Dual` via `xq - xL`.
Migrate all 4 series sites to the canonical 4-arg trait
`_output_eltype(_constant_kernel_shape, Tg, Tv, Tq)`:
- `constant_oneshot_series.jl:83` — scalar oneshot
- `constant_oneshot_series.jl:288` — vector oneshot
- `constant_series_interp.jl:409` — persistent scalar callable
- `constant_series_interp.jl:461` — persistent vector callable
Linear/Cubic/Quadratic series already use the canonical form; Hermite
adds a disjoint Tdy promote chain on top. Constant was the only family
still on the legacy gate.
Test pins updated to match the new Series contract (1D plain path's
`Int y + Float xq → Float` already established in PR #144; Series now
agrees instead of pinning a stale `Vector{Int}` raw-Tv hold-out):
- `Vector{Int}` → `Vector{Float64}` for `Float xq` cases (lines 496, 622-624)
- Added explicit pin: fully-Int chain (`Int grid + Int y + Int xq`)
stays `Vector{Int}` — Series ↔ plain agree on both regimes.
PR description's "Out of scope" item that called out a Series storage
Float-forcing follow-up is now closed for Constant. Cubic/Quadratic Tz/Tc
coefficient widening is still on the legacy enumeration as documented.
PR #146 reviews (Codex P2 #1/#2 + Copilot) flagged that scalar OOB extrap paths returned raw `Tv` for non-Number value types — `SVector` y with `ClampExtrap` / `FillExtrap` + `Dual` xq returned `SVector{Float64}` scalar but `SVector{Dual}` from the trait-sized batch buffer, breaking the scalar/batch consistency the PR is meant to restore. Root cause: `_promote_extrap_val(val::Number, xq::Number)` carrier- propagates via `val + zero(xq) * zero(val)`, but the non-Number fallback `_promote_extrap_val(val, xq) = val` returns raw — the actual `SVector` boundary value passes through `_eval_extrapolation` → `_promote_extrap_val` → raw return, with no carrier added. ND scalar takes the same path via `_try_fill_oob` → `_fill_extrap_result` → `_promote_extrap_val`. Add an `AbstractArray` overload that broadcasts the same carrier-prop pattern (`val .+ zero(xq) .* zero(eltype(val))`); same idea for the derivative branch's `_promote_extrap_zero`. `SVector × Dual` now resolves to `SVector{Dual}` on scalar OOB, matching in-domain kernel output and the trait-sized batch buffer. Constant ND oneshot derivative empty-fast-path also missed the query carrier — `nq == 0 && return Vector{Tv}(undef, 0)` ignored Tq. The non-empty branch already synthesises `0 * first(data) * prod(one, first_q)`. Route the empty path through the same kernel-shape trait so eltypes agree. RED-then-GREEN pins added under "OOB carrier — scalar/batch consistency for non-Number Tv" + "ND oneshot derivative — empty vs non-empty query eltype" in `test/test_duck_tv_dual_tq.jl`. 36 OOB combinations (Linear/Cubic/Quadratic/Constant × Clamp/Fill × 1D scalar/batch + ND analogues) all green. Follow-up: the `AbstractArray` overload here is a stop-gap. The discussion under design item 1 in `claudedocs/TODO/output_eltype_carrier_followups.md` moves to a `Tr`-driven `_eval_extrapolation` that uses `convert(Tr, val)` instead of arithmetic carrier inference — eliminates broadcasting overhead, removes the `zero(Tr)` requirement, preserves NaN propagation via the Constant-kernel pattern `0 * convert(Tr, val)`. Out of scope here.
…pe contract Three pins in `test_complex_constant_series.jl` predated the Series canonical 4-arg trait migration (commit d28d7b0) and asserted the legacy raw-Tv contract: - "Integer grid + Complex values" (line 65, 69): `sitp(5.5)` with `Int` grid + `Complex{Int}` y → kernel-shape `yv * one(xq - xL)` promotes via `Float - Int = Float` → `ComplexF64`. Was pinned as `Vector{Complex{Int}}` / `5 + 10im`. Updated to `Vector{ComplexF64}` / `5.0 + 10.0im`. Added explicit fully-Int chain pin (`sitp(5)` → stays `Vector{Complex{Int}}`) mirroring the 1D plain-path pin in `test_constant_eltype.jl`. - "Eltype contract: selection kernel preserves Tv" (line 284): Float32 grid + Float32 y + Float64 xq → trait promotes to Float64. Was pinned as Float32 (legacy raw-Tv claim). Renamed to "kernel-shape trait promotes via xq - xL" and split into two pins: Float64 xq → Float64, Float32 xq → Float32 (fully-Float32 chain stays Float32). No source change — the new Series promote behavior was already established in commit d28d7b0; these tests just hadn't been refreshed. Local `test_complex_constant_series.jl` now passes on both Julia 1.x and the LTS-failure scenario reported on CI.
…contract Two more pins predated the Series canonical 4-arg trait migration (commit d28d7b0) and asserted the legacy raw-Tv contract under a Float-grid + Int-y + Float-xq chain — that chain now (correctly) promotes to Float via `xq - xL`. - "Eltype contract: Integer series stays Int" (line 144): `Float grid + Int y + Float xq` was pinned `Vector{Int}`. Renamed to "fully-Int chain stays Int" and migrated setup to `Int grid + Int y + Int xq` — matches the intent ("Int chain preserved") and the 1D plain-path pin in test_constant_eltype.jl's "Fully-Int chain — Series stays Int" testset. - "FillExtrap fill_value must be compatible with y eltype" (line 155): the InexactError contract requires an `Int` output buffer to catch the Float fill_value mismatch. Under the new promote rule, the original `Float grid + Int y + Float xq` setup gives a Float buffer that silently absorbs the 0.5 fill — exactly what this contract was trying to catch. Migrated to fully-Int chain so the Int buffer reinstates the strict-on-assign behavior. No source change — same theme as `test_complex_constant_series.jl` commit 82085fc.
Closes #144.
Summary
linear_interp(x, y::Vector{SVector{3,Float64}})(xq_dual_vec)raisedMethodError: Float64(::Dual). Root cause: the output-eltype trait predicted buffer types via apromote_type+_PromotableValueenumeration that couldn't see throughSVector × Dualand similar non-promote_rulecarrier chains. Allocated buffers under-predicted the kernel's actual return type.This PR closes the carrier-propagation contract end-to-end and replaces the hand-coded enumeration with a method-aware kernel-shape op trait: each method declares its kernel's representative shape, and Julia inference (
Base.promote_op) predicts the exact return type.What this PR introduces
Kernel-shape op trait
Two shapes cover every method:
_arithmetic_kernel_shape(h, yv, dL) = yv + yv*(dL/h)— Linear/Cubic/Quadratic/Hermite-family. Division byhFloat-upgrades Int chains naturally._constant_kernel_shape(yv, q) = yv * one(q)— Constant (selection kernel; Int chains stay Int).Wiring:
_output_eltype(::AbstractInterpolant1D, ::Type{Tq})(and ND mirror) routes through_arithmetic_kernel_shape.ConstantInterpolant1D + ND override route through_constant_kernel_shape.AbstractHermiteInterpolant1Doverride applies a two-call pattern overTvandeltype(itp.dy)— disjoint promote chains so a duck-typeddy(e.g., Float y + Dual dy from AD on slopes) doesn't poison the y-chain._series_output_typehelper deleted (orphaned).Net result:
_PromotableValueenumeration is no longer reached from any output-buffer allocation path. The trait's prediction is kernel-faithful by construction.Duck-Tv × duck-Tq end-to-end
Carrier propagation closed across every allocator that previously collapsed for non-
promote_ruleducks:Float y + Vector{Dual} dy(AD on slopes)prod(one, query), so heterogeneous queries like(Float, Dual)+ mixed-deriv propagate the non-axis-1 carrierConstantInterpolant+QuadraticInterpolantanchored-vector callableSample-first allocator removed
Protocol's persistent batch path no longer evaluates the kernel for the first query just to read
typeof(first_val)— trait-only sizing (n+1→nevaluations per batch).Behavior changes
constant_interpwithInt y + Float xqreturnsFloat(wasInt); fully-Int chain staysInt. Scalar and batch paths agree.hermite_interp(x, y::Float64, dy::Vector{Dual}, xq)no longer raisesMethodErroronsetindex!.CubicSeriesInterpolantpersistent callable handlesSeries(Vector{SVector{Float}}) × Dual.Rational/Rational/Rationalarithmetic chain predictsRational{Int}at the trait level (kernel-faithful —Rational/Rational = Rational). User-visibleVector{Rational}is still blocked by storage Float-forcing — see Out of scope. Cubic 3-arg batch oneshot is a partial outlier: the output buffer IS sized asVector{Rational{Int}}(trait correct), but values are bit-exactconvert(Rational, ::Float64)of the kernel's Float64 results — pinned by a semantic@test_broken(value equality, not justisa).Out of scope (explicit follow-ups)
The kernel-shape trait now predicts kernel-faithful return types, but two storage-side mechanisms still apply manual Float-forcing — surfacing as a trait-vs-storage gap:
_promote_grid_float: arithmetic-method constructors (Linear/Cubic/Quadratic/Hermite-family) liftRational/Intgrids to Float64 at construction time. Relaxing this requires search/cache pathways (_CachedRange.inv_hetc.) to handle non-Float types.Tzfor cubic z,Tcfor quadratic a/d, hetero coefficients): still use the legacy_output_eltype(Tv, Tg)enumeration. Same theme as storage Float-forcing — natural promote via method-specific coefficient-shape op is achievable.Both are pinned via
@test_broken; they auto-surface asnow passingonce the follow-up lands. Performance benchmarking for non-Float storage paths is part of that follow-up's scope.