Skip to content

(fix + refac): Method-aware output-eltype via kernel-shape op trait#146

Merged
mgyoo86 merged 25 commits into
masterfrom
fix/output-eltype-duck-restoration
May 20, 2026
Merged

(fix + refac): Method-aware output-eltype via kernel-shape op trait#146
mgyoo86 merged 25 commits into
masterfrom
fix/output-eltype-duck-restoration

Conversation

@mgyoo86
Copy link
Copy Markdown
Member

@mgyoo86 mgyoo86 commented May 20, 2026

Closes #144.

Summary

linear_interp(x, y::Vector{SVector{3,Float64}})(xq_dual_vec) raised MethodError: Float64(::Dual). Root cause: the output-eltype trait predicted buffer types via a promote_type + _PromotableValue enumeration that couldn't see through SVector × Dual and similar non-promote_rule carrier chains. Allocated buffers under-predicted the kernel's actual return type.

This PR closes the carrier-propagation contract end-to-end and replaces the hand-coded enumeration with a method-aware kernel-shape op trait: each method declares its kernel's representative shape, and Julia inference (Base.promote_op) predicts the exact return type.

What this PR introduces

Kernel-shape op trait

@inline function _output_eltype(kernel_op::F, ::Type{Tv}, types::Type...) where {F, Tv}
    Top = Base.promote_op(kernel_op, Tv, types...)
    (Top === Union{} || Top === Any) && return Tv
    return Top
end

Two shapes cover every method:

  • _arithmetic_kernel_shape(h, yv, dL) = yv + yv*(dL/h) — Linear/Cubic/Quadratic/Hermite-family. Division by h Float-upgrades Int chains naturally.
  • _constant_kernel_shape(yv, q) = yv * one(q) — Constant (selection kernel; Int chains stay Int).

Wiring:

  • Default _output_eltype(::AbstractInterpolant1D, ::Type{Tq}) (and ND mirror) routes through _arithmetic_kernel_shape.
  • ConstantInterpolant 1D + ND override route through _constant_kernel_shape.
  • AbstractHermiteInterpolant1D override applies a two-call pattern over Tv and eltype(itp.dy) — disjoint promote chains so a duck-typed dy (e.g., Float y + Dual dy from AD on slopes) doesn't poison the y-chain.
  • All output-buffer sites canonically routed through the method-aware kernel-shape form. After the follow-up cleanup in this PR (linear series ×4, cubic cache oneshot ×1, cubic/quadratic ND scalar ×2, 1D + ND adjoint allocators ×6), no output-buffer allocation remains on the legacy 2-arg fallback. _series_output_type helper deleted (orphaned).

Net result: _PromotableValue enumeration is no longer reached from any output-buffer allocation path. The trait's prediction is kernel-faithful by construction.

Duck-Tv × duck-Tq end-to-end

Carrier propagation closed across every allocator that previously collapsed for non-promote_rule ducks:

  • 1D + ND persistent batch (protocol-level)
  • 1D + ND one-shot batch (3-arg)
  • Cubic/Linear/Quadratic/Constant series persistent + one-shot
  • Hermite one-shot for Float y + Vector{Dual} dy (AD on slopes)
  • ND deriv-zero short-circuits — per-axis carrier via prod(one, query), so heterogeneous queries like (Float, Dual) + mixed-deriv propagate the non-axis-1 carrier
  • ConstantInterpolant + QuadraticInterpolant anchored-vector callable

Sample-first allocator removed

Protocol's persistent batch path no longer evaluates the kernel for the first query just to read typeof(first_val) — trait-only sizing (n+1n evaluations per batch).

Behavior changes

  • constant_interp with Int y + Float xq returns Float (was Int); fully-Int chain stays Int. Scalar and batch paths agree.
  • hermite_interp(x, y::Float64, dy::Vector{Dual}, xq) no longer raises MethodError on setindex!.
  • CubicSeriesInterpolant persistent callable handles Series(Vector{SVector{Float}}) × Dual.
  • Heterogeneous-axis ND deriv-zero queries propagate the Dual carrier per-axis.
  • Rational/Rational/Rational arithmetic chain predicts Rational{Int} at the trait level (kernel-faithful — Rational/Rational = Rational). User-visible Vector{Rational} is still blocked by storage Float-forcing — see Out of scope. Cubic 3-arg batch oneshot is a partial outlier: the output buffer IS sized as Vector{Rational{Int}} (trait correct), but values are bit-exact convert(Rational, ::Float64) of the kernel's Float64 results — pinned by a semantic @test_broken (value equality, not just isa).

Out of scope (explicit follow-ups)

The kernel-shape trait now predicts kernel-faithful return types, but two storage-side mechanisms still apply manual Float-forcing — surfacing as a trait-vs-storage gap:

  • _promote_grid_float: arithmetic-method constructors (Linear/Cubic/Quadratic/Hermite-family) lift Rational/Int grids to Float64 at construction time. Relaxing this requires search/cache pathways (_CachedRange.inv_h etc.) to handle non-Float types.
  • Internal coefficient eltype (Tz for cubic z, Tc for quadratic a/d, hetero coefficients): still use the legacy _output_eltype(Tv, Tg) enumeration. Same theme as storage Float-forcing — natural promote via method-specific coefficient-shape op is achievable.

Both are pinned via @test_broken; they auto-surface as now passing once the follow-up lands. Performance benchmarking for non-Float storage paths is part of that follow-up's scope.

mgyoo86 added 19 commits May 19, 2026 15:57
Treat Constant as `y[idx] + 0·(xq - x[idx])` so `Tq`'s carrier (Dual,
Measurement, SVector, …) propagates naturally — the same rule Linear /
Cubic / Quadratic / Hermite already follow. Drop the scalar
`convert(_output_eltype, val)` wrapper and replace the persistent
allocating batch + 1D Constant oneshot allocators with sample-first so
the container fits whatever the kernel actually produces. Constant
`_output_eltype` overrides and inline `T_out` expressions removed; ND
Constant scalar oneshot drops its `::Tv` typeassert. Every method now
uses the generic `_output_eltype(Tv, Tg, Tq)`.

Constant kernels multiply the selection by `one(dL)` (carrier
propagation); derivative branches use `0 * y_left * one(dL)` to
preserve NaN/Inf propagation and keep the minimal `*(::Int, ::Tv)` +
`*(::Tv, ::Real)` duck-type contract.

Behavior change: `constant_interp` on `Int y` with a `Float xq` now
returns `Float` (carrier propagates through zero-slope arithmetic).
The fully-Int chain `(Int x, Int y, Int xq)` still returns `Int`. The
52a1f91 raw-Tv-for-promotable-Tq contract from #138 is intentionally
relaxed to keep Constant on the same machinery as every other method —
no Constant-specific helpers. Series / ND-oneshot-3-arg / hetero
allocators remain on the old path and will migrate in a follow-up.

New `test_duck_tv_dual_tq.jl` pins the Tv × Tq cross product across
methods. Existing Constant / Complex / precision / promotion tests
updated to the natural-promote contract.

Closes #144.
Follow-up to #144. Close the deferred one-shot allocator gap (1D batch,
ND batch, Hetero ND, Series) without touching the 12 individual call
sites — and patch the Constant scalar/anchor short-circuits that broke
`@inferred itp(1.0)` with Int y.

* `_output_eltype` (src/core/utils.jl): the duck-type fallback now
  queries `Base.promote_op` on the universal kernel-shape op
  `yv * q + yv`, matching the Julia idiom used by `LinearAlgebra.mul`
  (`promote_op(matprod, eltype(A), eltype(B))`) and `Base.map`
  (`_return_type(f, eltype(A))`). Inference resolves the call to a
  constant at compile time (zero runtime cost — verified via LLVM IR:
  the function body is a single `ret`). For `SVector{Float64} × Dual`
  this returns `SVector{Dual}` where `promote_type` returned the
  non-concrete `Any` and the old fallback silently returned `Tv`. When
  the op is undefined (e.g., custom duck-type with no `*`), the legacy
  `Tv` fallback is preserved.

* Constant short-circuits (constant_oneshot.jl, constant_anchor.jl):
  at `xi == last(x)` the scalar/anchor paths used to `return last(y)`
  raw — bypassing the kernel's `selected * one(dL)` carrier
  propagation. With `Int y` and `Float xq` that disagrees with the
  kernel's `Float` return, so `@inferred itp(1.0)` saw
  `Union{Float64, Int64}`. Short-circuits now multiply by `one(xi)` to
  match the kernel, restoring full type stability across both the
  point-based scalar path and the four anchor-dispatch branches.

* linear_oneshot.jl: `_output_eltype(eltype(y), Tg)` was missing `Tq`
  entirely, so the trait couldn't see the carrier chain even after
  the duck fallback was correct. Now passes `eltype(x_targets)`.

* hermite_oneshot.jl: dropping `eltype(dy)` from the trait call —
  including it collapses `promote_type(...)` to `Any` on duck inputs,
  which prevents the new `promote_op` fallback from resolving the
  carrier. `eltype(dy)` is typically the same shape as `Tv` so it
  contributes no extra carrier information.

Side-effect fix: `_output_eltype(SVector{Int}, Float64, Float64)` now
returns `SVector{Float64}` (was `SVector{Int}`). The linear/cubic/etc.
kernels promote Int→Float internally via interval-slope division, so
the old prediction caused `convert(::SVector{Int}, ::SVector{Float64})`
errors that no test exercised — now correct.

Tests: test_duck_tv_dual_tq.jl reorganized into three logical test
items (Persistent path, Oneshot path, Type stability + raw-eltype
contracts) covering scalar/batch/ND × duck-Tv × duck-Tq, the
ForwardDiff.derivative MWE, `@inferred` Union-free assertions for the
Constant short-circuit, the `_output_eltype` type table, and the
Int→Int raw-eltype non-regression.

Verified: smoke 42/42 PASS (master: 38/42, v0.4.9: 26/42); 35,587
tests pass on latest Julia. LTS run blocked by a pre-existing
test/Manifest.toml resolution issue unrelated to this patch.
Surfaced by extended smoke covering Series / ND / Adjoint paths against
duck-Tv (SVector) × duck-Tq (Dual) combinations. Three regressions
fixed:

* Constant ND oneshot batch (src/constant/nd/constant_nd_oneshot.jl):
  the allocator was `Vector{Tv}(undef, …)` and the outer wrapper
  carried a `::Vector{Tv}` typeassert — both bypass kernel carrier
  propagation, so `constant_interp((xg,yg), data_sv, queries_dual)`
  raised `Float64(::Dual)`. Replaced with sample-first allocation
  matching the 1D Constant pattern: pull `first_q` via
  `_extract_query_point`, run the scalar oneshot once, size from
  `typeof(first_val)`. The deriv-zero short-circuit now sizes from
  `0 * first(data) * one(eltype(first_q))` for the same reason.

* 1D adjoint allocators (src/core/adjoint_protocol.jl:115, 125):
  `Tv = promote_type(eltype(y_bar), Tg)` returned `Any` for
  `SVector{3,Float64} × Float64` (no `promote_rule` between Base and
  StaticArrays), so `zeros(Any, n)` raised `zero(::Type{Any})`.
  Routing through `_output_eltype` reuses the `Base.promote_op` duck
  fallback from b675e17 and resolves to `SVector{3,Float64}`. Linear,
  Constant, Cubic, Quadratic, Hermite, Cardinal, PCHIP, Akima
  adjoints now all accept SVector y_bar.

* ND adjoint allocators (src/core/adjoint_protocol.jl:353, 368):
  same root cause, same `promote_type → _output_eltype` fix. Linear,
  Constant, Cubic, Quadratic, Hetero ND adjoints all accept SVector
  y_bar (`Matrix{SVector{3,Float64}}` return).

Tests:
- test_duck_tv_dual_tq.jl gains an Adjoint-path testitem covering all
  7 1D adjoints + 4 ND adjoints × {SVector, Dual} y_bar.
- test_constant_eltype.jl updates two assertions that previously
  pinned the broken ND oneshot return as a known limitation; they now
  pin the carrier-propagated Float64 return.

Verified: extended smoke 116/116 PASS (Series × 4 methods + ND forward +
ND/1D adjoints × duck-Tv + slope-family OnTheFly); all affected adjoint
test files still pass.

Out of scope (separate follow-up): the ND scalar oneshot path
(`constant_interp((xg,yg), data, query_tuple)`) returns `Float64` for
all-Int inputs while the persistent path returns `Int` — pre-existing
asymmetry unrelated to this patch.
Stage 3 of the Issue #144 cleanup. Series one-shot and persistent
callable allocators previously composed two traits as
`_series_output_type(_output_eltype(Tv, Tg), Tq)`. For duck-Tv × duck-Tq
combinations (e.g., `Series(SVector{3,Float64}, …)` queried with a
ForwardDiff.Dual) the outer step computes
`promote_type(SVector{Float}, Dual) = Any`, falls through to its
fallback (`Tv`), and the kernel's actual `SVector{Dual}` return then
either errors on `setindex!(::Vector{SVector{Float}}, ::SVector{Dual})`
or — for Constant's selection kernel — silently produces
`Vector{Vector{Any}}`.

Switching to a single `_output_eltype(Tv, Tg, Tq)` call uses the
b675e17 `Base.promote_op` fallback, which resolves
`SVector{Float} × Dual → SVector{Dual}` at compile time. Six sites
across linear/cubic/quadratic series (scalar and batch, persistent and
one-shot) migrate the chain to the unified form.

Constant series keeps its raw-Tv contract for promotable Tq (Int y +
Float xq → Vector{Int}, preserved) but routes duck Tq through
`_output_eltype(Tv, Tq)` so SVector × Dual no longer collapses to
`Vector{Any}`. Four allocator sites updated.

Regression coverage added to test_duck_tv_dual_tq.jl: a new
`Series path — duck-Tv × duck-Tq carrier propagates` testitem pins
the four methods × {scalar, batch} × {SVector y, Dual xq} grid, and
the existing Oneshot testitem now also asserts the Constant ND
oneshot batch path (closed in acfc95a).

Verified: extended smoke 132/132 PASS (was 126/132 with 6 series
duck-Tv × duck-Tq fails before this commit); all touched test files
green on latest Julia.
Follow-up to the original Issue #144 PR, addressing five user-visible paths
where carrier propagation still collapsed despite the unified `_output_eltype`
trait. All share the same root-cause family but live in branches the prior
commits did not reach.

Sites addressed:

- `cubic_series_interp.jl` — `CubicSeriesInterpolant` build-then-call path
  still chained `_series_output_type(_output_eltype(Tv, Tg), Tq)`, collapsing
  `promote_type(SVector{F}, Dual)` to `Any`. Migrated to the unified
  `_output_eltype(Tv, Tg, Tq)` (mirrors the prior 6-site migration).

- `interpolant_protocol.jl::_eval_nd_at_point` deriv-zero short-circuit and
  `constant_nd_eval.jl::_eval_at_cell` deriv branch and
  `constant_nd_oneshot.jl` scalar + batch deriv branches — all returned
  `0 * <data>` without query-carrier propagation. Now multiply by
  `prod(one, query)` so per-axis carriers (homogeneous Dual, heterogeneous
  `(Float, Dual)`) survive — symmetric with the forward kernel's per-axis
  `* one(dL_d)`. Affects every ND method at 2nd+ derivative paths.

- `hermite_oneshot.jl` — original PR dropped `eltype(dy)` to avoid
  `promote_type` collapse on SVector + Dual inputs, which re-introduced the
  MethodError commit b3a3379 fixed for `Float64 y + Vector{Dual} dy`.
  Replaced with a two-call `_output_eltype` pattern keeping `Tv` and
  `eltype(dy)` in disjoint promote chains — handles both cases.

Coverage:
- `CubicSeriesInterpolant` persistent path × SVector × Dual (scalar + batch)
- Constant ND deriv-zero × persistent + one-shot scalar
- Heterogeneous-axis (Float, Dual) × mixed-deriv: Constant ND persistent
  scalar + one-shot scalar + one-shot batch, Linear ND persistent 2nd-deriv
- Hermite Float32 y + Float64 dy precision pin
- Hermite Float64 y + Vector{Dual} dy + ForwardDiff.derivative on slopes
…tput_type

Constant and Quadratic 1D `itp(aq_vec)` (batch eval with pre-built anchors)
allocated `Vector{Tg}` — only the grid type, ignoring `Tq` in the anchor
parameterization. The Constant kernel now multiplies by `* one(dL::Tq)` and
the Quadratic kernel mixes anchored deltas with `Tq`, so a duck-typed `Tq`
(Dual, …) raised `MethodError` on `setindex!`. Linear's path already used
`_output_eltype(Tv, Tg, Tq)`; mirrored here.

`_series_output_type` is no longer called from any src file after the Cubic
Series persistent path migrated to the unified `_output_eltype(Tv, Tg, Tq)`
in commit 440be0b. Removed; its narrowed `promote_type → Tv` fallback is
superseded by the duck-aware `Base.promote_op` fallback in `_output_eltype`.

Coverage added to test/test_duck_tv_dual_tq.jl exercises both methods'
anchored-vector callable with `_ConstantAnchoredQuery{Tg, Dual}` and
`_QuadraticAnchoredQuery{Tg, Dual}` via `_fill_anchors!`.
…tion

The PR series that restored duck-Tv × duck-Tq carrier propagation changed
Constant's behavior from "selection kernel returns raw `Tv`" to "kernel
multiplies by `* one(dL::Tq)` so the return widens via promote_type".
Storage stays raw `{Tg, Tv}` (no `_promote_grid_float` indirection) — only
the return type widens.

Comments referencing the old "raw-Tv contract" / "Int in → Int out" return
behavior, or the deleted `_output_eltype(itp, Tq)` override in
`constant_nd_types.jl`, are updated to reflect the post-restoration model:
storage is raw, the kernel handles return-type widening per-callable.

No behavior change. Pure documentation alignment.
…fied)

The protocol's 1D + ND batch allocating callables, and Constant's 1D + ND
one-shot batch wrappers, called the scalar/persistent kernel for the first
query just to read `typeof(first_val)` for the output buffer. That extra
evaluation is per-batch overhead on a hot path — measurable at small N and
strictly redundant once the trait predicts correctly.

`_output_eltype(itp, Tq)` (and the 3-arg `_output_eltype(Tv, Tg, Tq)` used
by one-shot wrappers) now drives all allocators directly: empty-batch and
populated paths share one expression. The kernel still produces the same
runtime values; only the container eltype is sized from types, not from
sampling.

Constant follows the same natural-promotion rule as every other method (no
Constant-specific helper or override per the original Issue #144 PR's
direction). The kernel's `* one(dL)` still preserves scalar-path semantics
(Int×Int×Int scalar callable returns Int via kernel direct); batch and
one-shot wrappers widen via the trait's Int→Float upgrade. Tests pinning
the old sample-first `Vector{Int}` for fully-Int Constant batch are
updated to `Vector{Float64}`.

Locations:
- src/core/interpolant_protocol.jl:60-79 (1D batch)
- src/core/interpolant_protocol.jl:255-272 (ND batch)
- src/constant/constant_oneshot.jl:277-298 (1D one-shot batch)
- src/constant/nd/constant_nd_oneshot.jl:98-115 (ND one-shot batch)
`_output_eltype` previously enumerated standard numeric types via the
`_PromotableValue` union and hand-applied an `Int→Float` upgrade — a manual
proxy for "the arithmetic kernels divide". Selection kernels (Constant) don't
divide, so the upgrade was wrong for them and was masked by sample-first
allocators (one extra scalar eval per batch).

Add a method-shape overload that lets the kernel itself drive inference:

    @inline _output_eltype(kernel_op::F, ::Type{Tv}, types::Type...) where {F, Tv} =
        ... Base.promote_op(kernel_op, Tv, types...) ...

Constant declares its kernel shape and routes its trait through the overload:

    @inline _constant_kernel_shape(yv, q) = yv * one(q)
    @inline _output_eltype(::ConstantInterpolant{Tg, Tv}, ::Type{Tq}) where {Tg, Tv, Tq} =
        _output_eltype(_constant_kernel_shape, Tv, Tq)

Julia infers the exact kernel return type — no `_PromotableValue` enumeration,
no Float upgrade. Int×Int×Int matches the scalar callable (Int), SVector ×
Dual still resolves to SVector{Dual}. Constant ND mirrors the override. The
`_PromotableValue` definition stays (other paths — grid promotion, autocache
gating, FillExtrap auto-convert — use it independently).

Other methods (Linear/Cubic/Quadratic/Hermite/etc.) keep the existing trait —
their kernels divide, so the Float upgrade matches reality. Migrating them to
their own kernel shape op is a follow-up.

Tests:
- Constant Int contract reverted: `itp([2,5,8]) isa Vector{Int}` again,
  matching `itp(3) === 30`. No scalar/batch asymmetry.
- New `@testitem "Type stability — duck-Tv × duck-Tq inference"`: `@inferred`
  coverage for 1D persistent/oneshot × 5 methods (scalar + batch), ND
  persistent/oneshot × 3 methods, ND deriv-zero per-axis carrier, Hermite
  mixed eltype, CubicSeriesInterpolant persistent, anchored-vector callable.
  Catches silent Union returns and `Any` fallbacks `isa` checks would miss.
Linear/Cubic/Quadratic/Hermite all share the same kernel-shape pattern at the
type level — every arithmetic kernel divides by `h`, which Float-upgrades Int
chains and propagates duck carriers through `(dL / h)`. Declare one shared
`_arithmetic_kernel_shape(yv, dL, h) = yv + yv * (dL / h)` and let Linear
opt into it, validating the pattern at a second method beyond Constant.

Migrated:
- `LinearInterpolant` + `LinearInterpolantND` `_output_eltype(itp, Tq)` →
  `_output_eltype(_arithmetic_kernel_shape, Tv, Tq, Tg)`
- 1D + ND one-shot wrappers (`linear_interp(x, y, x_targets)` and the ND
  scalar/batch variants) — buffer eltype from kernel-shape probe
- Anchored-vector callable (`itp(aq_vec)`)

Behavior parity for standard numerics (Int chain → Float, Float×Float → Float,
Complex{Int} → ComplexF64, mixed Int×Float → Float — all match the previous
`_PromotableValue`-driven Float upgrade). The trait's prediction now exactly
matches the kernel reality, so empirical kernel output and trait-allocated
buffer always agree.

A side benefit: `Rational × Rational × Rational` chain now predicts
`Rational{Int}` rather than `Float64` — the old trait over-predicted Float
because `_PromotableValue` enumeration included `Rational`, but Julia's
`Rational/Rational = Rational` (no Float upgrade) means the kernel actually
preserves Rational. No existing test pinned the old over-prediction.

Cubic/Quadratic/Hermite/PCHIP/Cardinal/Akima migration deferred to a
follow-up — same shape op, mechanical override addition per method type.
… Rational

`_arithmetic_kernel_shape` arguments reordered from `(yv, dL, h)` to
`(h, yv, dL)` so callsites read `_output_eltype(shape, Tg, Tv, Tq)` —
matching the codebase's standard `{Tg, Tv, Tq}` type-parameter order
(`Interpolant{Tg, Tv, ...}`, where-clauses, etc.). Five Linear callsites
updated correspondingly. Body unchanged — purely cosmetic.

Tests pin the trait's Rational behavior at two layers:
- Trait-level (`@inferred`): `_output_eltype(_arithmetic_kernel_shape,
  Rational{Int}, Rational{Int}, Rational{Int}) === Rational{Int}`, plus
  Int/Float/Complex baselines.
- Constant end-to-end: storage is raw `{Tg, Tv}` so the trait's Rational
  prediction flows through to the user-visible Vector eltype.
- Linear end-to-end: `@test_broken` — `_promote_grid_float` in the
  constructor lifts grid/values to Float64 at build time, shadowing the
  trait's Rational prediction. Marked broken so a future relaxation
  (storage following kernel reality) auto-surfaces as `now passing`.

The `@test_broken` pin documents the design gap explicitly: trait-level
correctness diverges from user-visible behavior at Linear/Cubic/etc.
because the constructor's Float-forcing predates the kernel-shape trait.
Resolving this is deferred to a follow-up.
Extend the kernel-shape op pattern (validated at Constant + Linear) to every
remaining arithmetic method (Cubic, Quadratic, Hermite, PCHIP, Cardinal,
Akima). One shared shape captures all of them — the division `y + y*(dL/h)`
that Float-upgrades Int chains and propagates duck carriers.

Trait wiring:
- `interpolant_protocol.jl`: default `_output_eltype(::AbstractInterpolant1D,
  Tq)` (and ND mirror) now routes through `_arithmetic_kernel_shape`. Every
  method's persistent-batch path is migrated by inheritance.
- `hermite_interpolant.jl`: `AbstractHermiteInterpolant1D` override applies
  the two-call pattern over `Tv` and `eltype(itp.dy)`, keeping disjoint
  promote chains so a duck-typed `dy` widens the result without poisoning
  the `y` chain.
- `linear_interpolant.jl` + `linear_nd_types.jl`: explicit Linear overrides
  removed — now redundant with the default.

Oneshot/series wrapper migrations (13 callsites): swap legacy
`_output_eltype(Tv, Tg, Tq)` → `_output_eltype(_arithmetic_kernel_shape,
Tg, Tv, Tq)`. Hermite oneshot's two-call already had the right shape; just
swap to the shared shape op.

Net effect: `_PromotableValue` enumeration + hand-coded Float upgrade are
no longer reached from any output-buffer allocation path. Internal storage
sites (coefficient/cache matrices, `Tz`/`Tc`) keep the legacy 2-arg trait —
their role is different (intermediate compute, not kernel output).

A latent consequence the pattern surfaces: Rational chains predict
`Rational{Int}` rather than `Float64` at the trait level (kernel reality —
Julia's `Rational/Rational = Rational` doesn't Float-upgrade). User-visible
preservation is still blocked by `_promote_grid_float` in non-Constant
constructors (pinned as `@test_broken` in test_duck_tv_dual_tq.jl) — that
gap is the deferred follow-up.
- utils.jl 2-arg _output_eltype docstring: replace the inaccurate
  "arithmetic-kernel allocators (Linear/Cubic/Quadratic/Hermite)"
  audience claim with the actual current users (internal coefficient
  eltype, adjoint allocators, hetero ND legacy paths). Direct readers
  to the method-aware kernel-op overload for output-buffer sizing.
- constant_nd_eval.jl: drop reference to the deleted "sample-first
  allocator"; the protocol now uses trait-only sizing via _output_eltype.
- constant_series_interp.jl: ConstantSeriesInterpolant scalar callable
  docstring claimed `promote_type` widening for ducks; the code now
  uses `_output_eltype` (Base.promote_op on the kernel shape) so
  SVector × Dual resolves correctly.

Doc-only — no runtime behavior change.
Two missed migration sites prevented correct output-eltype propagation
under the kernel-shape op trait introduced earlier in this branch.

- constant_anchor.jl: the `ConstantInterpolant(aq_vec)` anchored-vector
  callable was routing through the legacy 2-arg `_output_eltype(Tv, Tg, Tq)`
  which Float-upgrades any `Tr <: _PromotableValue && !<:AbstractFloat`.
  For a fully-Int chain (`Tv = Int, Tq = Int`) this allocated
  `Vector{Float64}` while the kernel actually returns `Int` — silent
  conversion and a contract violation. Every other Constant allocator
  uses `_output_eltype(_constant_kernel_shape, Tv, Tq)`. Aligned.

- constant_series_interp.jl: `ConstantSeriesInterpolant`'s batch in-place
  dispatch had `outputs::AbstractVector{<:AbstractVector{Tv}}`. The
  parallel out-of-place form correctly widens `T_out` for duck `Tq`
  (e.g., `Vector{Vector{SVector{3, Dual}}}` when `Tv = SVector{3,Float64}`,
  `Tq = Dual`), but the widened buffer can't dispatch back into the
  in-place form. Linear's and Quadratic's parallel forms already used
  the relaxed `<:AbstractVector` constraint. Aligned.

Regression coverage added:
- "Constant Int-chain preservation" testset pins the anchored callable
  contract (`Vector{Int}` output for fully-Int chain).
- "Linear/Quadratic/Constant SeriesInterpolant persistent — SVector ×
  Dual carrier" testitem covers all 3 series persistent paths;
  the Constant block is the dispatch regression.
Additive coverage extending the duck-Tv × duck-Tq test surface to
methods the PR description named but the test file missed.

- "Hermite-family one-shot — duck-Tq carrier through trait": PCHIP/
  Cardinal/Akima 3-arg oneshot allocators were rewired to
  `_output_eltype(_arithmetic_kernel_shape, ...)` in this branch but
  weren't exercised by any test. PCHIP/Akima slope formulas use scalar
  `sign`/`abs` on slope differences, so they can't accept `SVector y`
  (StaticArrays doesn't define `sign(::SVector)`); their `Tq` duck path
  is exercised via Float y + Dual xq. Cardinal's slopes are pure
  averages — kept the stronger SVector × Dual case there.

- "PCHIP/Akima — custom scalar duck-Tv interface": minimal `DuckFloat`
  wrapper around Float64 implementing the required slope-formula
  interface (`sign`, `abs`, `zero`, `+`, `-`, `*`, `/`, ordering).
  Demonstrates that PCHIP/Akima impose NO `<: AbstractFloat` constraint
  on `Tv` — any duck type satisfying the interface flows through and
  the trait predicts the duck wrapper type as the kernel return.

- "Cubic/Quadratic/Hermite-family — Rational user-visible blocked
  (BROKEN)" testsets mirror the Linear @test_broken pins. Cubic's
  3-arg oneshot line is omitted with a note: its output buffer is
  correctly sized `Vector{Rational{Int}}` by the trait, but the
  kernel's internal z coefficients are still Float64; the result only
  round-trips through Rational for inputs whose Float values happen
  to be exactly representable. Asserting `isa Vector{Rational{Int}}`
  would pass for the wrong reason — pin removed until the
  coefficient-eltype follow-up lands.
The previous comment in the Cubic @test_broken block was imprecise: it
suggested the 3-arg oneshot's `isa Vector{Rational{Int}}` would pass
"by accident" for exactly-representable Floats. Empirical trace shows
the actual mechanism is sharper and uniform — but the contract is
still broken.

Mechanism (verified via @inferred + runtime trace):
- The kernel-shape trait `_output_eltype(_arithmetic_kernel_shape, Tg,
  Tv, Tq)` predicts `Rational{Int}` purely from input types at compile
  time. Type-stable, @inferred passes.
- The buffer is allocated as `Vector{Rational{Int}}`.
- BUT the in-place `cubic_interp!` builds an internal cache via
  `_promote_grid_float` which lifts grid + values to Float64. The
  kernel computes in Float64 throughout (z coefficients are Float64).
- `output[i] = kernel_float_result` triggers
  `convert(Rational{Int}, ::Float64)`, which uses the bit-exact Float64
  representation. For dyadic results (k/2 inputs on x²/2 data) the
  Float is exactly representable as a small-denominator Rational and
  the result LOOKS clean. For non-dyadic queries the result is a
  ~2^52-denominator binary fraction approximation, NOT the true
  Rational arithmetic value.

The trait says one thing, storage lift says another — they don't
coordinate. The next PR will decide between honoring the trait (relax
_promote_grid_float for arithmetic methods) or honoring storage (make
the trait aware of the lift). Either resolution is captured by the
semantic pin below.

Semantic pin: a cubic spline of `y = x` should reproduce `y = x`
exactly. The new `@test_broken cubic_interp(x, x, [3//7]) == [3//7]`
currently fails (Float64 path produces a 2^52-denom approximation)
and auto-surfaces when the storage/trait mismatch is resolved.
…c contract

Item A — canonical migration: route all surviving output-buffer sites through
the method-aware kernel-shape form. After this commit no output-allocation
path remains on the legacy 2/3-arg `_output_eltype(Tv, Tg, ...)` fallback.

Migrated callsites:
- linear_series_interp.jl: 2 sites (scalar + batch out-of-place callable)
- linear_oneshot_series.jl: 2 sites (3-arg scalar + batch oneshot)
- cubic_oneshot.jl:286: cache-variant 3-arg batch oneshot
- cubic_nd_oneshot.jl:42: ND scalar oneshot (Tv_p binding retained for the
  separate `_resolve_extrap` use; trait uses raw Tv to match batch peer)
- quadratic_nd_oneshot.jl:163: ND scalar oneshot
- core/adjoint_protocol.jl: all 6 1D + ND adjoint allocator callables
  (using `Tg` as both grid and dL slot — Tq is anchor-baked at this layer)

Item C — allocation contract pin: the persistent batch allocating callable
`itp(xq)` must allocate exactly one `Vector{T_out}` and nothing else. New
testitem in test_promotion_alloc.jl iterates over all 7 methods (linear/
cubic/quadratic/pchip/cardinal/akima/constant), comparing the call's
`@allocated` against the bare-Vector baseline + ALLOC_THRESHOLD slack. A
re-introduced sample-first eval (or any leaked scratch buffer) would
exceed this by at least one kernel-result-sized allocation per call.
Replace the baseline-Vector subtraction approach with a direct in-place
zero-alloc check. The persistent in-place callable `itp(out, xq)` must be
zero-alloc across all methods — that single contract covers both paths,
since the allocating form `itp(xq)` is just `Vector{T_out}(undef, n)` +
this in-place call. Matches the existing `_bench_linear!` pattern in the
same file (in-place + ALLOC_THRESHOLD), avoids baseline-subtraction noise,
and pins a stronger invariant.
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 20, 2026

FastInterpolations.jl Benchmarks

All benchmarks (50 total, click to expand)
Benchmark Current: c86ea3c Previous Imm. Ratio Grad. Ratio
10_nd_construct/bicubic_2d 38749.0 ns 37996.0 ns 1.02 1.0
10_nd_construct/bilinear_2d 678.4 ns 751.5 ns 0.903 1.018
10_nd_construct/tricubic_3d 332771.0 ns 337158.0 ns 0.987 0.974
10_nd_construct/trilinear_3d 1945.9 ns 1999.8 ns 0.973 1.032
11_nd_eval/bicubic_2d_batch 1522.3 ns 1529.2 ns 0.995 0.924
11_nd_eval/bicubic_2d_scalar 15.9 ns 15.9 ns 1.0 0.985
11_nd_eval/bilinear_2d_scalar 7.0 ns 7.0 ns 1.0 0.972
11_nd_eval/tricubic_3d_batch 3452.2 ns 3457.1 ns 0.999 1.009
11_nd_eval/tricubic_3d_scalar 35.8 ns 35.8 ns 1.0 1.025
11_nd_eval/trilinear_3d_scalar 13.0 ns 13.0 ns 1.0 0.981
12_cubic_eval_gridquery/range_random 4650.8 ns 4648.9 ns 1.0 1.037
12_cubic_eval_gridquery/range_sorted 4647.8 ns 4645.5 ns 1.001 1.039
12_cubic_eval_gridquery/vec_random 10516.4 ns 10054.4 ns 1.046 1.094
12_cubic_eval_gridquery/vec_sorted 3186.0 ns 3200.1 ns 0.996 0.992
13_nd_oneshot_gridquery/bicubic_2d_rand_rand 65106.2 ns 65912.3 ns 0.988 -
13_nd_oneshot_gridquery/bicubic_2d_sort_rand 61168.1 ns 61742.1 ns 0.991 -
13_nd_oneshot_gridquery/bicubic_2d_sort_sort 58328.9 ns 58935.1 ns 0.99 -
13_nd_oneshot_gridquery/bilinear_2d_rand_rand 18019.8 ns 16653.4 ns 1.082 -
13_nd_oneshot_gridquery/bilinear_2d_sort_rand 9320.6 ns 9097.8 ns 1.024 -
13_nd_oneshot_gridquery/bilinear_2d_sort_sort 4962.9 ns 4859.0 ns 1.021 -
14_series_oneshot_batch/constant_inplace_vec_k8_q1000_rand 19184.9 ns 20604.4 ns 0.931 -
14_series_oneshot_batch/linear_inplace_vec_k8_q1000_rand 20042.2 ns 18261.0 ns 1.098 -
1_cubic_oneshot/q00001 565.3 ns 567.0 ns 0.997 1.18
1_cubic_oneshot/q10000 47895.2 ns 47867.7 ns 1.001 0.767
2_cubic_construct/g0100 1539.3 ns 1535.9 ns 1.002 1.092
2_cubic_construct/g1000 14128.3 ns 14363.2 ns 0.984 1.045
3_cubic_eval/q00001 23.9 ns 23.3 ns 1.026 1.064
3_cubic_eval/q00100 484.7 ns 483.7 ns 1.002 1.038
3_cubic_eval/q10000 47046.9 ns 47029.5 ns 1.0 1.039
4_linear_oneshot/q00001 27.1 ns 27.1 ns 1.0 0.998
4_linear_oneshot/q10000 19356.2 ns 19406.7 ns 0.997 1.011
5_linear_construct/g0100 34.4 ns 40.0 ns 0.862 0.897
5_linear_construct/g1000 251.3 ns 270.2 ns 0.93 0.934
6_linear_eval/q00001 10.8 ns 10.3 ns 1.048 1.046
6_linear_eval/q00100 208.1 ns 199.7 ns 1.042 1.046
6_linear_eval/q10000 19833.9 ns 19164.3 ns 1.035 1.05
7_cubic_range/scalar_query 9.2 ns 9.2 ns 1.0 1.041
7_cubic_vec/scalar_query 10.3 ns 10.2 ns 1.01 0.988
8_cubic_multi/construct_s001_q100 679.4 ns 680.8 ns 0.998 1.173
8_cubic_multi/construct_s010_q100 4879.4 ns 4983.4 ns 0.979 1.059
8_cubic_multi/construct_s100_q100 44345.8 ns 44607.8 ns 0.994 1.044
8_cubic_multi/eval_s001_q100 815.6 ns 856.3 ns 0.953 1.119
8_cubic_multi/eval_s010_q100 1850.6 ns 1832.7 ns 1.01 1.05
8_cubic_multi/eval_s010_q100_scalar_loop 2481.3 ns 2474.3 ns 1.003 1.031
8_cubic_multi/eval_s100_q100 11810.8 ns 11814.5 ns 1.0 1.024
8_cubic_multi/eval_s100_q100_scalar_loop 3508.2 ns 3508.1 ns 1.0 0.991
9_nd_oneshot/bicubic_2d 44711.5 ns 46253.3 ns 0.967 1.108
9_nd_oneshot/bilinear_2d 588.5 ns 587.1 ns 1.002 1.005
9_nd_oneshot/tricubic_3d 384598.0 ns 386974.7 ns 0.994 1.053
9_nd_oneshot/trilinear_3d 1066.6 ns 1073.5 ns 0.994 0.985

⚠️ Performance Regression Confirmed ⚠️

After re-running 5 flagged benchmark(s) 10 time(s), 4 regression(s) confirmed.

Benchmark Current Previous Imm. Ratio Grad. Ratio Tier
1_cubic_oneshot/q00001 565.3 ns 567.0 ns 0.997 1.18 gradual
8_cubic_multi/construct_s001_q100 679.4 ns 680.8 ns 0.998 1.173 gradual
8_cubic_multi/eval_s001_q100 815.6 ns 856.3 ns 0.953 1.119 gradual
9_nd_oneshot/bicubic_2d 44711.5 ns 46253.3 ns 0.967 1.108 gradual

Thresholds: immediate > 1.1x (vs latest master), gradual > 1.1x (vs sliding window)

This comment was automatically generated by Benchmark workflow.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes incorrect output-element-type prediction for “duck” value/query type combinations (notably SVector × ForwardDiff.Dual), by replacing the legacy promote_type/_PromotableValue-based trait with a method-aware kernel-shape approach using Base.promote_op. It also refactors allocators to size output buffers from the trait (removing “sample-first” evaluation) and expands tests to pin carrier propagation, inference, and allocation guarantees.

Changes:

  • Introduces kernel-shape-based output eltype inference (_arithmetic_kernel_shape, _constant_kernel_shape) and routes allocator sizing through Base.promote_op.
  • Updates Constant interpolation to propagate query carriers via * one(dL) (and adjusts boundary/derivative short-circuits accordingly), plus ND deriv-zero carrier folding.
  • Adds extensive new test coverage for duck Tv × Tq (StaticArrays + ForwardDiff), allocation contracts, and Constant eltype behavior changes.

Reviewed changes

Copilot reviewed 44 out of 44 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
test/test_promotion_alloc.jl Updates Constant batch eltype expectations; adds zero-allocation contract test for persistent in-place batch calls.
test/test_precision_vector_queries.jl Updates Constant vector-call eltype expectation to track query carrier.
test/test_duck_tv_dual_tq.jl New comprehensive duck-typing + AD + inference regression suite (SVector/Dual, Series, adjoints, ND carrier).
test/test_constant.jl Updates Constant scalar behavior expectations for Int data + Float query; adds fully-Int-chain preservation checks.
test/test_constant_eltype.jl Reworks Constant eltype contracts across 1D/ND, periodic modes, FillExtrap, and oneshot vs persistent behavior.
test/test_complex_constant.jl Updates Constant Complex{Int} + Float query expectation (carrier propagation → ComplexF64).
test/test_bc_complex_int.jl Updates Constant Complex{Int} + Float query expectation (carrier propagation → ComplexF64).
src/core/utils.jl Refactors output-eltype trait logic and adds kernel-shape helpers for method-aware inference.
src/core/interpolant_protocol.jl Routes default output trait through _arithmetic_kernel_shape; removes scalar-path convert; adjusts ND deriv-zero fill carrier folding.
src/core/adjoint_protocol.jl Switches adjoint allocation eltype inference to kernel-shape-based _output_eltype.
src/linear/linear_interpolant.jl Notes Linear uses default arithmetic kernel shape route.
src/linear/linear_anchor.jl Anchored-vector allocator now uses kernel-shape output-eltype inference.
src/linear/linear_oneshot.jl One-shot linear batch allocator uses kernel-shape output-eltype inference.
src/linear/linear_series_interp.jl Linear Series allocators use kernel-shape output-eltype inference.
src/linear/linear_oneshot_series.jl Linear Series oneshot allocators use kernel-shape output-eltype inference.
src/linear/nd/linear_nd_types.jl Adds note that Linear ND inherits default arithmetic kernel shape route.
src/linear/nd/linear_nd_oneshot.jl ND oneshot allocators use kernel-shape output-eltype inference.
src/quadratic/quadratic_anchor.jl Quadratic anchored-vector callable now sizes output via kernel-shape trait.
src/quadratic/quadratic_oneshot.jl Quadratic oneshot batch allocator uses kernel-shape output-eltype inference.
src/quadratic/quadratic_series_interp.jl Quadratic Series allocators use kernel-shape output-eltype inference.
src/quadratic/quadratic_oneshot_series.jl Quadratic Series oneshot allocators use kernel-shape output-eltype inference.
src/quadratic/nd/quadratic_nd_oneshot.jl Quadratic ND oneshot allocators use kernel-shape output-eltype inference.
src/cubic/cubic_oneshot.jl Cubic oneshot allocators use kernel-shape output-eltype inference.
src/cubic/cubic_series_interp.jl Cubic Series allocators use kernel-shape output-eltype inference.
src/cubic/cubic_oneshot_series.jl Cubic Series oneshot allocators use kernel-shape output-eltype inference.
src/cubic/nd/cubic_nd_oneshot.jl Cubic ND oneshot allocators use kernel-shape output-eltype inference.
src/hermite/hermite_interpolant.jl Adds Hermite-family _output_eltype override using a two-chain kernel-shape pattern over y and dy.
src/hermite/hermite_oneshot.jl Mirrors Hermite-family two-chain output-eltype sizing in oneshot path.
src/pchip/pchip_oneshot.jl PCHIP oneshot allocator uses kernel-shape output-eltype inference.
src/cardinal/cardinal_oneshot.jl Cardinal oneshot allocator uses kernel-shape output-eltype inference.
src/akima/akima_oneshot.jl Akima oneshot allocator uses kernel-shape output-eltype inference.
src/constant/constant_types.jl Updates Constant storage commentary (raw storage; widening handled at evaluation).
src/constant/constant_interpolant.jl Defines _constant_kernel_shape and routes Constant output-eltype trait through kernel shape.
src/constant/constant_kernels.jl Changes Constant kernels to multiply selected value by one(dL) to propagate query carrier.
src/constant/constant_oneshot.jl Updates right-edge short-circuit to preserve carrier; allocator sizes output via kernel-shape trait; updates docs.
src/constant/constant_anchor.jl Ensures anchor short-circuits preserve carrier; anchored-vector allocator sizes output via kernel-shape trait.
src/constant/constant_series_interp.jl Improves duck-query sizing for Series constant path; loosens in-place output container typing.
src/constant/constant_oneshot_series.jl Uses _output_eltype fallback for duck queries in Constant Series oneshot.
src/constant/constant_adjoint.jl Comment updates noting buffer eltype comes from adjoint protocol.
src/constant/nd/constant_nd_interpolant.jl Comment updates on raw storage and kernel-based widening.
src/constant/nd/constant_nd_types.jl Constant ND _output_eltype trait now routes through _constant_kernel_shape.
src/constant/nd/constant_nd_eval.jl Propagates per-axis carrier in selection result; derivative-zero uses prod(one, q_eval).
src/constant/nd/constant_nd_oneshot.jl Derivative branches fold per-axis carrier; batch allocator eltype uses kernel-shape trait; removes overly-narrow type assertions.
src/constant/nd/constant_nd_adjoint.jl Comment update to align with protocol-sized adjoint buffers.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/core/adjoint_protocol.jl
Comment thread src/core/adjoint_protocol.jl
Comment thread src/core/interpolant_protocol.jl
Comment thread src/constant/constant_kernels.jl
Comment thread src/constant/nd/constant_nd_oneshot.jl Outdated
@codecov
Copy link
Copy Markdown

codecov Bot commented May 20, 2026

Codecov Report

❌ Patch coverage is 96.03960% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 96.43%. Comparing base (51d3afb) to head (c86ea3c).

Files with missing lines Patch % Lines
src/core/utils.jl 81.25% 3 Missing ⚠️
src/constant/constant_interpolant.jl 0.00% 1 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #146      +/-   ##
==========================================
- Coverage   96.45%   96.43%   -0.02%     
==========================================
  Files         143      143              
  Lines       11957    11978      +21     
==========================================
+ Hits        11533    11551      +18     
- Misses        424      427       +3     
Files with missing lines Coverage Δ
src/akima/akima_oneshot.jl 100.00% <100.00%> (ø)
src/cardinal/cardinal_oneshot.jl 100.00% <100.00%> (ø)
src/constant/constant_adjoint.jl 100.00% <ø> (ø)
src/constant/constant_anchor.jl 98.59% <100.00%> (+0.02%) ⬆️
src/constant/constant_kernels.jl 100.00% <100.00%> (ø)
src/constant/constant_oneshot.jl 100.00% <100.00%> (ø)
src/constant/constant_oneshot_series.jl 100.00% <100.00%> (ø)
src/constant/constant_series_interp.jl 97.57% <100.00%> (+0.01%) ⬆️
src/constant/constant_types.jl 100.00% <ø> (ø)
src/constant/nd/constant_nd_adjoint.jl 100.00% <ø> (ø)
... and 27 more
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

mgyoo86 added 6 commits May 20, 2026 12:20
…rait form

`_constant_kernel_shape(yv, q) = yv * one(q)` only saw the query type, so
`Dual` grid + `Float` xq paths under-predicted the kernel's true return
type — `_constant_vector_loop!` then tried to store a `Dual` kernel result
into a `Vector{Float64}` buffer (`MethodError` on Julia LTS Extensions CI).

Mirror the actual kernel arithmetic: `(xL, yv, xq) = yv * one(xq - xL)`.
`Base.promote_op` now sees `xq - xL` and infers the real `dL` carrier, so
duck grids (Dual, Measurement, …) flow through `promote_type(Tg, Tq)`
naturally. All callsites collapse to the canonical 4-arg `(kernel_op, Tg,
Tv, Tq)` form that Linear's `_arithmetic_kernel_shape` already used —
Constant was the lone outlier on the 2-arg `(Tv, Tq)` shape.

Int×Int×Int still infers `Int`; SVector × Dual still infers SVector{Dual};
the Rational chain pin and the Int batch-vs-scalar contract are unchanged
(only the trait's arg count changes, not its predictions).
…-arg trait

`test/ext/test_series_dual_grid.jl` on Julia LTS still tripped
`MethodError: Float64(::Dual)` after the kernel-shape rename — Constant
series oneshot + persistent paths were the last holdouts of the
`Tq <: _PromotableValue ? Tv : _output_eltype(Tv, Tq)` guard. That gate
ignores `Tg`, so `Dual` grid + `Float` xq short-circuited to `Vector{Tv}`
(Float-only) while the kernel returned `Dual` via `xq - xL`.

Migrate all 4 series sites to the canonical 4-arg trait
`_output_eltype(_constant_kernel_shape, Tg, Tv, Tq)`:
- `constant_oneshot_series.jl:83`  — scalar oneshot
- `constant_oneshot_series.jl:288` — vector oneshot
- `constant_series_interp.jl:409`  — persistent scalar callable
- `constant_series_interp.jl:461`  — persistent vector callable

Linear/Cubic/Quadratic series already use the canonical form; Hermite
adds a disjoint Tdy promote chain on top. Constant was the only family
still on the legacy gate.

Test pins updated to match the new Series contract (1D plain path's
`Int y + Float xq → Float` already established in PR #144; Series now
agrees instead of pinning a stale `Vector{Int}` raw-Tv hold-out):
- `Vector{Int}` → `Vector{Float64}` for `Float xq` cases (lines 496, 622-624)
- Added explicit pin: fully-Int chain (`Int grid + Int y + Int xq`)
  stays `Vector{Int}` — Series ↔ plain agree on both regimes.

PR description's "Out of scope" item that called out a Series storage
Float-forcing follow-up is now closed for Constant. Cubic/Quadratic Tz/Tc
coefficient widening is still on the legacy enumeration as documented.
PR #146 reviews (Codex P2 #1/#2 + Copilot) flagged that scalar OOB
extrap paths returned raw `Tv` for non-Number value types — `SVector` y
with `ClampExtrap` / `FillExtrap` + `Dual` xq returned `SVector{Float64}`
scalar but `SVector{Dual}` from the trait-sized batch buffer, breaking
the scalar/batch consistency the PR is meant to restore.

Root cause: `_promote_extrap_val(val::Number, xq::Number)` carrier-
propagates via `val + zero(xq) * zero(val)`, but the non-Number fallback
`_promote_extrap_val(val, xq) = val` returns raw — the actual `SVector`
boundary value passes through `_eval_extrapolation` → `_promote_extrap_val`
→ raw return, with no carrier added. ND scalar takes the same path via
`_try_fill_oob` → `_fill_extrap_result` → `_promote_extrap_val`.

Add an `AbstractArray` overload that broadcasts the same carrier-prop
pattern (`val .+ zero(xq) .* zero(eltype(val))`); same idea for the
derivative branch's `_promote_extrap_zero`. `SVector × Dual` now resolves
to `SVector{Dual}` on scalar OOB, matching in-domain kernel output and
the trait-sized batch buffer.

Constant ND oneshot derivative empty-fast-path also missed the query
carrier — `nq == 0 && return Vector{Tv}(undef, 0)` ignored Tq. The
non-empty branch already synthesises `0 * first(data) * prod(one, first_q)`.
Route the empty path through the same kernel-shape trait so eltypes agree.

RED-then-GREEN pins added under "OOB carrier — scalar/batch consistency
for non-Number Tv" + "ND oneshot derivative — empty vs non-empty query
eltype" in `test/test_duck_tv_dual_tq.jl`. 36 OOB combinations
(Linear/Cubic/Quadratic/Constant × Clamp/Fill × 1D scalar/batch + ND
analogues) all green.

Follow-up: the `AbstractArray` overload here is a stop-gap. The
discussion under design item 1 in `claudedocs/TODO/output_eltype_carrier_followups.md`
moves to a `Tr`-driven `_eval_extrapolation` that uses `convert(Tr, val)`
instead of arithmetic carrier inference — eliminates broadcasting
overhead, removes the `zero(Tr)` requirement, preserves NaN propagation
via the Constant-kernel pattern `0 * convert(Tr, val)`. Out of scope
here.
…pe contract

Three pins in `test_complex_constant_series.jl` predated the Series
canonical 4-arg trait migration (commit d28d7b0) and asserted the
legacy raw-Tv contract:

- "Integer grid + Complex values" (line 65, 69): `sitp(5.5)` with
  `Int` grid + `Complex{Int}` y → kernel-shape `yv * one(xq - xL)`
  promotes via `Float - Int = Float` → `ComplexF64`. Was pinned as
  `Vector{Complex{Int}}` / `5 + 10im`. Updated to `Vector{ComplexF64}` /
  `5.0 + 10.0im`. Added explicit fully-Int chain pin (`sitp(5)` → stays
  `Vector{Complex{Int}}`) mirroring the 1D plain-path pin in
  `test_constant_eltype.jl`.

- "Eltype contract: selection kernel preserves Tv" (line 284): Float32
  grid + Float32 y + Float64 xq → trait promotes to Float64. Was pinned
  as Float32 (legacy raw-Tv claim). Renamed to "kernel-shape trait
  promotes via xq - xL" and split into two pins: Float64 xq → Float64,
  Float32 xq → Float32 (fully-Float32 chain stays Float32).

No source change — the new Series promote behavior was already
established in commit d28d7b0; these tests just hadn't been refreshed.
Local `test_complex_constant_series.jl` now passes on both Julia 1.x
and the LTS-failure scenario reported on CI.
…contract

Two more pins predated the Series canonical 4-arg trait migration
(commit d28d7b0) and asserted the legacy raw-Tv contract under a
Float-grid + Int-y + Float-xq chain — that chain now (correctly)
promotes to Float via `xq - xL`.

- "Eltype contract: Integer series stays Int" (line 144): `Float grid +
  Int y + Float xq` was pinned `Vector{Int}`. Renamed to "fully-Int
  chain stays Int" and migrated setup to `Int grid + Int y + Int xq` —
  matches the intent ("Int chain preserved") and the 1D plain-path pin
  in test_constant_eltype.jl's "Fully-Int chain — Series stays Int"
  testset.

- "FillExtrap fill_value must be compatible with y eltype" (line 155):
  the InexactError contract requires an `Int` output buffer to catch
  the Float fill_value mismatch. Under the new promote rule, the
  original `Float grid + Int y + Float xq` setup gives a Float buffer
  that silently absorbs the 0.5 fill — exactly what this contract
  was trying to catch. Migrated to fully-Int chain so the Int buffer
  reinstates the strict-on-assign behavior.

No source change — same theme as `test_complex_constant_series.jl`
commit 82085fc.
@mgyoo86 mgyoo86 merged commit c47519b into master May 20, 2026
14 checks passed
@mgyoo86 mgyoo86 deleted the fix/output-eltype-duck-restoration branch May 20, 2026 21:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

MethodError when evaluating vector-valued interpolants (SVector) during ForwardDiff automatic differentiation (v0.4.10)

2 participants