-
Notifications
You must be signed in to change notification settings - Fork 27
Description
This package uses KnownNat whenever it is necessary to pass length information via types. A KnownNat is backed by a Natural, an algebraic datatype with two constructors, which in general cannot be unboxed (unless the simplifier is certain which constructor it is, which is rare). On the other hand Vector stores its length in an Int, and so at all interfaces we have to convert between Natural and Int. Both passing Natural to functions, and having them convert to/from Int incurs a performance penalty.
Worse yet, the conversions are lossy. fromIntegral :: Natural -> Int will silently truncate the higher bits, leaving you with a sized vector whose underlying unsized vector doesn't match its type:
> Data.Vector.Sized.replicate @0x10000000000000000 ()
Vector []Needless to say this breaks invariants of many functions. Worse yet, the package exports a way to access the reverse conversion from Int to Natural to KnownNat:
> Data.Vector.Sized.knownLength' (Data.Vector.Sized.replicate @0x10000000000000000 ()) GHC.TypeNats.natVal
0knownLength' uses unsafeCoerce under the hood, which in this case is actually unsafe. It generates an instance dictionary for KnownNat 0x10000000000000000, but populates it with the natural 0 instead. This incoherent dictionary breaks type safety, as can be observed via Typeable or simply sameNat:
> Just r = Data.Vector.Sized.knownLength' (Data.Vector.Sized.replicate @0x10000000000000000 ()) (GHC.TypeNats.sameNat @0 Proxy)
> :t r
r :: 0 :~: 18446744073709551616
> r
ReflWith a little type family you can construct an unsafeCoerce.
As a quick and dirty workaround, consider using fromEnum :: Natural -> Int which will at least error if the argument is out of bounds (this behavior doesn't seem consistent with e.g. fromEnum @Integer). You will have to migrate from the legacy GHC.TypeLits api that uses Integer (btw incurring additional Natural<->Integer conversions), to the new GHC.TypeNats (GHC 8.2+) that uses Natural.
As a more long term solution, over at https://github.com/mniip/finite-typelits/tree/integral I've been experimenting with an extension to Finite that would allow the underlying type to be a different type other than Integer, possibly Int. I've quickly run into essentially all the same issues, and the way the API is currently shaping to be is that an instance dictionary for a "known Int n" can only be solved if n <= maxBound @Int. This would ensure invariants of things like Vector or an Int-based Finite are satisfied, and reifying an Int back into a dictionary remains type-safe.
A challenge there is that much like having to prove KnownNat (n + m) one would have to prove n + m <= maxBound, except this time the constraint might not actually hold. However with the way Vector is using KnownNat this seems like a non-issue: if all places where the vector's underlying length is taken from KnownNat are annotated with the inequality, it should be impossible to construct a vector whose underlying length has overflowed, because the components from which the vector is constructed will need to have size just under maxBound, which should be impossible due to address space limitations. concatMap would be at risk if it weren't using size-unaware concat under the hood.
I don't know how the use-cases of vector-sized would fare with such inequalities sprinkled about, and would love to hear alternative suggestions.