Views: Pattern Matching Arrays as Lists
For performance reasons, we sometimes find ourselves implementing the interface of a more basic structure in terms of a much more complicated one. In functional programming languages, this often comes up in the context of needing to build a structure through mutation and then "seal" it for further processing as an immutable object. While this often comes with great performance gains, the encapsulation needed to hide the internal mutability or unsaftey, or even the nature of the interior mutability itself, can deprive us of the ability to meaningfully pattern match on the results. Views and dependent pattern matching can overcome this limitation by allowing us to redefine the notion of pattern matching for a type.
Motivating Structure - Wrapped Arrays
Whether or not the efficiency difference actually matters for a given usecase, working with arrays is almost always more efficient than working with the FP-favorite linked lists. This is unfortunate for us functional programmers, arrays don't play nearly as well with the rest of our toys as lists do. When we encounter a usecase where the efficiency gains can not be ignored 1, be it dealing with very large quantities of data, or the mutating algorithm for the task just having a way better time signature, it's common to write at least some part of the codebase in terms of mutable arrays.
Once we have our mutable array in its final state, we are faced with a choice. We can either convert it to a list, wasting a potentially very large number of precious cycles, or continue passing it around as an array2, losing pattern matching and all the other things that make linked lists nice to deal with, at least from the functional programmers perspective. Idris, by way of Epigram, gives us a new tool, views and dependent pattern matching3, that allow us to have our cake and eat it too, lettings us keep passing our structure around as an array, but interacting with it as if it was a list.
The Structure
To keep things simple, and only reference types that are in the somewhat sparse Idris standard library, we'll implement our own wrapper around a mutable IO array that "seals" it into an immutable collection. To keep ourselves honest, we'll open up a new namespace for this part of the article so we can encapsulate our structure:
Note
The namespace will encapsulate everything up to, but not including, the first non-intended block of code as if it were in its own module.
namespace IArray
The structure of this wrapper is pretty simple, containing only the
IOArray
being wrapped. To keep things interesting, and flex
the type system a bit more, we will index the type of our wrapper with
the length of the array. Worry not, however, we will still commit all
manner of horrors and unsafePeformIO
4
under the hood as we forge ahead:
export
record IArray (n : Nat) (t : Type) where
constructor MkIArray
arr : IOArray t
The functions for working with the built in IOArray
type
in Idris return Maybe
s, encoding the result of the bounds
checks performed by those operations 5. Since we are indexing
the type of our wrapper by the length of the contained array, we can
rely on type checking to ensure all indexing is within bounds and
soundly assume that none of our calls to IOArray
functions
will return Nothing
. We'll make a little unsafe function
for unconditionally converting a Maybe a
to an
a
:
Tip
The %unsafe
pragma is purely cosmetic, it just instructs
the compiler to highlight calls to this function with a different color
to draw the programmers attention.
%unsafe
unwrap : Maybe a -> a
unwrap Nothing = assert_total $ idris_crash "Oops all nothings"
unwrap (Just x) = x
What is an array without the usual constant time indexing function?
As indexing doesn't mutate the array, we can safely hide the IO under
the hood. We will also accept an erased auto
implicit proof
that the provided index is in bounds to exclude the possibility of an
out-of-bounds access:
export
index : (i : Nat) -> (arr : IArray n a) -> {auto 0 _ : i `LT` n} -> a
index i arr = unwrap . unsafePerformIO $ readArray arr.arr (cast i)
With careful use of linear types, we could provide a rich, safe
interface for building these arrays without leaking the fact that they
use IO
under the hood6, but that is beyond the
scope of this article. Instead, for the sake of having examples to work
with, we'll provide a few functions for generating sample arrays:
||| Convert a `Vect` into an IArray
export
fromVect : {n : Nat} -> (xs : Vect n a) -> IArray n a
fromVect xs = unsafePerformIO fromVectIO
where
loadVect : (arr : IOArray a) -> (idx : Nat) -> (ys : Vect m a) -> IO ()
loadVect arr idx [] = pure ()
loadVect arr idx (y :: ys) = do
_ <- writeArray arr (cast idx) y
loadVect arr (S idx) ys
fromVectIO : IO (IArray n a)
fromVectIO = do
arr <- newArray (cast n)
loadVect arr 0 xs
pure $ MkIArray arr
||| Create an `IArray` containing all the `Nat`s less than or equal to `n`
export
range : (n : Nat) -> IArray (S n) Nat
range n = unsafePerformIO rangeIO
where
buildRange : (arr : IOArray Nat) -> (idx : Nat) -> IO ()
buildRange arr 0 = do
_ <- writeArray arr 0 0
pure ()
buildRange arr (S k) = do
_ <- writeArray arr (cast (S k)) (S k)
buildRange arr k
rangeIO : IO (IArray (S n) Nat)
rangeIO = do
arr <- newArray (cast (S n))
buildRange arr (S n)
pure $ MkIArray arr
We will also need a few functions for handling arrays in a list-like
manner to write our view later. We'll start with a head
and
a tail
function with the same general semantics as their
list counterparts, an empty
function for constructing the
empty array, and an arrayCons
function that adds a new
element to the front of an array much like ::
on lists:
Note
We won't actually be calling empty
or
arrayCons
once we get into the later parts of this article,
we only actually need their type signatures for later, but their
implementations are provided for the sake of completeness.
export
head : IArray (S n) a -> a
head x = unsafePerformIO headIO
where
headIO : IO a
headIO = do
val <- readArray x.arr 0
pure $ unwrap val
export
tail : {n : Nat} -> IArray (S n) a -> IArray n a
tail {n} x = unsafePerformIO tailIO
where
copyContents : (newArr : IOArray a) -> (idx : Nat) -> IO ()
copyContents newArr 0 = pure ()
copyContents newArr (S k) = do
val <- map unwrap $ readArray x.arr (cast (S k))
_ <- writeArray newArr (cast k) val
copyContents newArr k
tailIO : IO (IArray n a)
tailIO = do
newArr <- newArray (cast n)
copyContents newArr n
pure $ MkIArray newArr
export
empty : IArray 0 a
empty = unsafePerformIO $ do
arr <- newArray 0
pure $ MkIArray arr
export
arrayCons : {n : Nat} -> (head : a) -> (tail : IArray n a) -> IArray (S n) a
arrayCons head tail = unsafePerformIO $ do
newArr <- newArray (cast (S n))
copyContents newArr n
pure $ MkIArray newArr
where
copyContents : (newArr : IOArray a) -> (idx : Nat) -> IO ()
copyContents newArr 0 = do
val <- map unwrap $ readArray tail.arr 0
_ <- writeArray newArr 1 val
pure ()
copyContents newArr (S k) = do
val <- map unwrap $ readArray tail.arr (cast (S k))
_ <- writeArray newArr (cast (S (S k))) val
pure ()
Its also common to need to know the lengths of arrays, and while the
IArray
type is indexed over the length of the contained
array, the length may be erased, and we may need to compute it at
runtime. We'll provide not just the length
function, but a
companion, unsafely implemented, proof that the result of the
length
function corresponds to the length argument to the
type constructor:
export
length : IArray n a -> Nat
length x = cast (max x.arr)
export
lengthCorrect : (0 xs : IArray n a) -> length xs = n
lengthCorrect _ = believe_me (the (1 = 1) Refl)
We can now exit our namespace and demonstrate that our immutable encapsulated array works as expected:
index 5 (range 10) == 5
head (fromVect ['a', 'b', 'c']) == 'a'
head (tail (fromVect ['a', 'b', 'c'])) == 'b'
length (range 10) == 11
Taking in the View
A View is essentially a data structure that provides us with
an alternate notion of pattern matching for another data structure. For
a given type t
, views over that type generally have a type
constructor of form t -> Type
, a type indexed over a
specific value of t
, and the data type itself essentially
provides proof that the given value of type t
can be broken
apart in a specific way, allowing you to use the with
rule7 to pattern match based on arbitrary,
user defined information about the data instead of just its nominal
structure.
One commonly used view form the standard library is AsList
from Data.String
. In Idris, String
is an
implementation defined bag of bytes with array-like internal structure,
much like our example here. This has numerous benefits over the
traditional functional programming representations based around linked
lists of characters, but makes traditional pattern matching based
algorithms annoying to implement. AsList
bridges this gap
by allowing us to process strings as if they were lazily linked
lists8.
Consider the following example of AsList
which, someone
redundantly, converts a String
to a
List Char
:
Tip
Notice how in the recursive call, instead of simply calling
stringToList str
, we instead write
stringToList str | xs
. The vertical pipe in the expression
side of the match indicates that the item on the right hand side of the
pipe, in this case xs
, should be used for value of the
right hand side of the with
statement in the recursive call
instead of recomputing it from scratch.
stringToList : String -> List Char
stringToList str with (asList str)
stringToList "" | [] = []
stringToList (strCons c str) | (c :: xs) = c :: stringToList str | xs
The function asList
has type
asList : (str : String) -> AsList str
, producing a value
of the AsList
view type indexed by the value of the input
string. Functions such as this one which produce values of view types
are referred to as covering functions. Note how the values on
the left hand side of the statements in the with
block are
a literal and a function call, the (dependent) type of the right hand
side has given us information about the form of values in the left hand
side that goes beyond simply identifying which constructor was used to
produce that value.
Our View Type
The view we'll be implementing for IArray
will be very
similar to AsList
, so we'll name it accordingly. We'll also
overload the list constructors to make the redefined pattern matching
feel as list-like as possible:
data ArrayAsList : IArray n a -> Type where
Nil : ArrayAsList IArray.empty
(::) : {xs : IArray n a}
-> (head : a) -> (tail : Lazy (ArrayAsList xs)) -> ArrayAsList (arrayCons head xs)
In this definition, []
translates to a proof that the
indexed array is the empty array, and x :: xs
translates to
a proof that the indexed array can be built by calling
arrayCons
on the value x
and the array
xs
. We need not actually call arrayCons
to
make use of this type, only establish that the value could be
produced with an arrayCons
invocation.
Implementing Our Covering Function
The covering function for ArrayAsList
is pretty straight
forward, if the array has length zero, return Nil
, and if
it has a non-zero length, separate out the head and tail of the array
and cons the head onto the result of calling arrayAsList
on
the tail. We'll implement two versions, one where the length parameter
of the type is known, and another where it is erased and recovered at
runtime:
Warning
Since the head
, tail
, and
empty
operations are defined using
unsafePerformIO
under the hood, they are opaque to the type
checker, and we have to use the hideously unsafe
believe_me
function, which has type
believe_me : a -> b
. believe_me
essentially
disables the type checker, functioning much like transmute
does in Rust, and will cause serious undefined behavior if used
carelessly.
This isn't how you would usually write views, normally some aspect of the structure would allow you to actually prove the properties in question, but is representative of writing views for FFI and IO based types. Additionally, it serves to really drive home the point that views allow pattern matching over arbitrary facts about the type, and not just structural facts.
arrayAsList' : {n: Nat} -> (xs : IArray n a) -> ArrayAsList xs
arrayAsList' {n = 0} xs =
believe_me (the (ArrayAsList (the (IArray 0 a) IArray.empty)) Nil)
arrayAsList' {n = (S k)} xs =
let rest : Lazy (ArrayAsList (tail xs)) = Delay (arrayAsList' (tail xs))
in believe_me $ head xs :: rest
arrayAsList : (xs : IArray n a) -> ArrayAsList xs
arrayAsList xs =
rewrite sym $ lengthCorrect xs in arrayAsList' (convert xs)
where
convert : (ys : IArray m a) -> IArray (length ys) a
convert ys = rewrite lengthCorrect ys in ys
Using Our View
Converting to a list
Now that we have our nice arrayAsList
covering function,
lets show it off. We'll start with a simple use case just to get a feel
for it, converting an array to a list:
arrayToList : (xs : IArray n a) -> List a
arrayToList xs with (arrayAsList xs)
arrayToList IArray.empty | [] = []
arrayToList (arrayCons head xs) | (head :: tail) =
head :: (arrayToList xs | tail)
Much like our example with String
's AsList
view, this is somewhat redundant, but lets us get a feel for using the
view, and shows the equivalence to pattern matching on lists quite
well.
We can also demonstrate that it works as expected:
arrayToList (fromVect ["Idris", "is", "fun", "!"]) == ["Idris", "is", "fun", "!"]
Implementing fold
To show off a slightly more in depth application of our new view, we
can use it to implement the Foldable
interface, the basic abstraction over collections that can be consumed
one element at a time, in a fairly idiomatic way.
First, we'll get the implementation of null
out of the
way, since that just needs to compare the length against 0:
arrayNull : IArray n a -> Bool
arrayNull xs = length xs == 0
More interesting is foldr
, which consumes collections
starting from the end, otherwise known as the "right hand side" due to
the left-to-right writing convention. foldr
on lists can be
thought of as replacing the conses (::
) with the provided
function, and replacing the terminator
(Nil
/[]
) with the provided accumulator value.
Our ArrayAsList
view allows us to cleanly translate that
concept over to arrays, despite arrays having neither conses nor
terminators. Note how this mirrors arrayToList
above with
f
in place of ::
on the expression side of the
match:
arrayFoldr : (f : e -> acc -> acc) -> acc -> IArray n e -> acc
arrayFoldr f x xs with (arrayAsList xs)
arrayFoldr f x IArray.empty | [] = x
arrayFoldr f x (arrayCons head xs) | (head :: tail) =
head `f` (arrayFoldr f x xs | tail)
To demonstrate that arrayFoldr
works as expected, we can
use it to convert an array to a list:
arrayFoldr (::) [] (range 4) == [0, 1, 2, 3, 4]
Even though foldr
is sufficient to provide an
implementation of Foldable
, we will also consider
foldl
, which consumes elements starting from the beginning,
or the left hand side, of the collection. foldl
on lists
can also be thought of as replacing the conses (::
) with
the provided function, but because the list is being consumed head to
tail the order of the arguments is flipped, and instead of replacing the
end of the list with the provided accumulator, the start is replaced.
ArrayAsList
also allows us to naturally carry over this
logic to arrays:
arrayFoldl : (f : acc -> e -> acc) -> acc -> IArray n e -> acc
arrayFoldl f x xs with (arrayAsList xs)
arrayFoldl f x IArray.empty | [] = x
arrayFoldl f x (arrayCons head xs) | (head :: tail) =
arrayFoldl f (f x head) xs | tail
When ::
is left-folded over a list, it reverses the
list. We can do the same to an array to demonstrate that our
arrayFoldl
works as expected:
arrayFoldl (flip (::)) [] (fromVect ['a', 'b', 'c', 'd', 'e']) == ['e', 'd', 'c', 'b', 'a']
We can now put together our implementation of Foldable
for IArray
:
Foldable (IArray n) where
null = arrayNull
foldr = arrayFoldr
foldl = arrayFoldl
And test drive it by using the standard library toList
function, which is implemented in terms of Foldable
:
toList (fromVect ["yet", "another", "array", "of", "things"]) == ["yet", "another", "array", "of", "things"]
Other Views
While making non-list structures behave as if they were lists is certainly an interesting and very practical application of views, they can do a lot more for us. We'll explore just a few examples of how views can reframe problems.
Division as Pattern Matching
The Divides
view from Data.Primitives.Views
allows us to reframe division into pattern matching, resulting in either
a "divide by zero" case, or multiplication plus a remainder:
Tip
When using Idris's interactive editing functionality, when using the
case-split operation on the right hand side of the with
block, the compiler will automatically perform the destructuring for
you, in this case rewriting numerator
as
((denominator * div) + rem)
.
exampleDivides : (numerator, denominator : Integer) -> Maybe Integer
exampleDivides numerator denominator with (divides numerator denominator)
exampleDivides numerator 0 | DivByZero = Nothing
exampleDivides ((denominator * div) + rem) denominator | (DivBy div rem prf) =
Just div
You may have noticed that Divides
provides us with a
total view of division, explicitly handling the divide by zero
case. Another upside of the Divides
view is that it allows
us to ignore the divide by zero case when the divisor is known to be
non-zero, without losing the totality of the operation:
isEven : (x : Integer) -> Bool
isEven x with (x `divides` 2)
isEven ((2 * div) + rem) | (DivBy div rem prf) = rem == 0
This also works when the divisor isn't know at compile time, but we have a proof that it isn't zero at runtime:
Tip
In the DivByZero
branch, the prf
argument
is revealed to have type 0 = 0 -> Void
, which is an
impossible situation.
Since a value of 0 = 0
is easily constructed, we can
pass it to the prf
function to 'realize' a value of
Void
(no Void
s are actually created, since
this is an impossible situation). This can then be passed into
absurd
, which has type signature
Uninhabited a => a -> b
, and is basically
beleive_me
, but what if it is actually completely safe
because it's impossible to actually call.
absurd
is commonly used to, as the docs refer to it,
"discharge proof obligations". Basically, it gets us out of having to
implement a branch when we have proof we can't actually be in that
branch, even when that fact is a bit too non obvious for the
impossible
keyword to work.
nonZeroDivides : (x, y : Integer) -> (prf : Not (y = 0)) -> Integer
nonZeroDivides x y prf with (x `divides` y)
nonZeroDivides x 0 prf | DivByZero = absurd $ prf (the (0 = 0) Refl)
nonZeroDivides ((y * div) + rem) y prf | (DivBy div rem prf1) = div
Pattern Matching on Halves of a List
The SplitRec
view allows us to pattern match on a list by splitting it into a left
half and a right half, preserving the ordering of the elements, and
doing so recursively. This can be used to, for example, implement merge
sort in a rather ergonomic way:
Note
++
is the normal list concatenation operation. For a
given list xs
, a value of the SplitRec xs
type
with the form SplitRecPair lefts rights _ _
is a proof that
the original list xs
can be produced as the result of
evaluating lefts ++ rights
.
mergeSort : Ord a => (xs : List a) -> List a
mergeSort xs with (splitRec xs)
mergeSort [] | SplitRecNil = []
mergeSort [x] | (SplitRecOne x) = [x]
mergeSort (lefts ++ rights) | (SplitRecPair lefts rights lrec rrec) =
merge (mergeSort lefts | lrec) (mergeSort rights | rrec)
where
merge : (ys, zs : List a) -> List a
merge [] [] = []
merge [] (z :: zs) = z :: zs
merge (y :: ys) [] = (y :: ys)
merge (y :: ys) (z :: zs) =
if y < z
then y :: (merge ys (z :: zs))
else z :: (merge (y :: ys) zs)
And we can see that it works as expected:
mergeSort [4, 1, 5, 3, 2] == [1, 2, 3, 4, 5]
Pattern Matching on Both Ends of a List
The VList
view allows recursion on the middle of a list, separating out
both the first and the last element in each step:
isPalindrome : Eq a => List a -> Bool
isPalindrome xs with (vList xs)
isPalindrome [] | VNil = True
isPalindrome [x] | VOne = True
isPalindrome (x :: (xs ++ [y])) | (VCons rec) =
if x == y
then isPalindrome xs | rec
else False
For the sake of demonstrating isPalindrome
, we'll
revisit String
's AsList
, using it to make a
function that converts all the characters in a string to lowercase and
skips over periods and spaces:
prepareString : String -> List Char
prepareString str with (asList str)
prepareString "" | [] = []
prepareString (strCons c str) | (c :: x) =
if isYes $ isElem c [' ', '.']
then prepareString str | x
else toLower c :: (prepareString str | x)
And stringing them together for a demonstration:
isPalindrome (prepareString "Rat Star.") && (not . isPalindrome . prepareString) "Palindrome."
A practical inevetiability in any sufficently large codebase↩︎
Possibly sealing it into an immutable array if the language in use has good support for that↩︎
https://idris2.readthedocs.io/en/latest/tutorial/views.html↩︎
While it still takes great care to use
unsafePerformIO
in a sound manner, unlike in Haskell, laziness by default doesn't come around to bite us in Idris, being a strict-by-default language after all. UsingunsafePerformIO
in Idris in a sound manner is more comparable to, say,unsafe
in Rust, tricky, but doable.↩︎We could avoid this by using the primitive array operations, which assume bounds checking has already been performed, but
PrimIO
is a story for another day↩︎https://idris2.readthedocs.io/en/latest/tutorial/views.html↩︎
Lazy lists in functional programming languages are largely equivlant to iterators in procedural programming languages. The
AsList
view is morally equivlant to.chars()
from Rust, for instance.↩︎