A Journey Through Types
I’m giving a talk as an introduction to dependently typed programming languages. The contents of this talk can be found in this post, and the slides can be found here.
Introduction
Hi all!
My name is Calvin and I’m here to talk to you about a couple of things – mostly types, and how they can help you write correct programs! This talk aims to address some of the following:
Firstly: You should care, and you NEED to care about your software. Computers are integral to nearly every aspect of our lives, and their influence is growing daily. Whether it’s a voting machine, banking software, or a component of a vehicle, there’s always a programmer behind it somewhere.
And this fact should probably terrify you if you have ever programmed before.
An incorrect program may be annoying and lose a business a customer, or it could be life threatening if it’s something vital, like the software for a pacemaker.
We need to be careful!
Types can help us be careful, and are a good way to deal with how scary this is!
Secondly, not only are types great for writing correct programs, but they can make it a lot easier to write programs in the first place! They help the compiler help you!
This is going to be somewhat of a whirlwind introduction to the world of types, so if you get lost along the way that’s okay. Just let me know and hopefully I can clarify things. If all else fails talk to me after the fact! Sometimes it takes a while to grok mathy stuff!
What are we trying to solve?
You think that this is a normal, and necessary thing:
That’s bad. Runtime exceptions don’t actually have to exist, and it can be very bad if our pacemaker segfaults. So maybe we should avoid these problems in the first place!
I’ll talk a bit about what we can do to avoid these problems. Later we’ll look a little into how you can guarantee that the logic in your program is correct too, and not just that it does not explode at runtime.
All of this is going to involve a little help from types.
What is a type?
A type describes what a value “is”.
You’ve probably heard of types in programming languages, and probably even used them. Most of the time you see them in the context of “it tells the compiler how to store the value in memory”, and while that’s true it’s not the entire picture.
Over the years in computing science and mathematics we’ve learned that types are good for a lot more than just figuring out how to lay out bits in memory.
Types tell us how you can use values – what operations are defined on the type? Can you add things of that type? Can a function take a value of a certain type as an argument?
This can really help us write programs that make sense! And these types are excellent documentation which the compiler can ensure is accurate!
A type checker can be used to reject programs which consist of nonsense like \(357^{circles}\), and in fact types can eliminate entire classes of errors if designed properly. No more null reference exceptions!
The types you may have seen.
You might have already seen how a few languages use types. Let’s discuss some of them quickly!
Python
In Python we might have something like this:
def my_sort(xs):
if xs == []:
return xs
else:
= xs[0]
first_elem = xs[1:]
rest
= my_sort([x for x in rest if x <= first_elem])
smaller = my_sort([x for x in rest if x > first_elem])
larger
return smaller + [first_elem] + larger
def my_factorial(n):
if n == 0:
return 1
else:
return n * my_factorial(n-1)
Python likes to pretend that types aren’t a thing, so Python doesn’t tell us anything about what a function like this can do. We can pass this function whatever we want as an argument, and it may or may not fail – we don’t know until we run the program or read it very carefully.
Naming and documentation can help, but in practice, since it can’t be done automatically, enforcing good naming and documentation is really damn hard.
In a large code base it’s difficult to even know what you should pass to a function. Should it take a list, or a set, or an integer?
This factorial function only works with ints (a non-integer number will never trigger the base case), but you might not realize you’re calling it with floats until it’s too late!
Can you simply pass the result of another function to this one, or might that function return None, which this factorial function can’t handle? You can’t know for sure until you read what that other function does, and what every function that function calls does. That’s a lot of work! You can run your program, perhaps with a suite of tests, but that can easily miss a special case.
Another concern is that this function could do a bunch of secret stuff. It could throw away the argument, and read in an integer from some file – maybe it will crash if that file doesn’t exist! It could change the value of some global variable, causing it to return different values depending on the last time it was called – and this might cause other functions to behave differently as well! This can make your program a complicated web of states, which is really difficult to wrap your head around because you need to understand it in its entirety – calling any function could have a drastic effect on the behavior of your program. We’ve all been here, and it’s awful! Often better to rewrite the program than it is to debug it! It would be nice to keep things separated into nice modular compartments that don’t affect each other. That’s what functions are supposed to do, but very often they rely upon outside state so they’re not actually compartmentalized.
What if we could force functions to be compartmentalized so we can’t make these mistakes!? What if we could express what a function can and can’t do in a concise format, and then have the compiler or interpreter tell us when something could go wrong! Why should we accept runtime exceptions when we can catch these problems early on!?
Just a hint, but this is very possible! And we’re going to do it with types!
Java
In languages like Java you have to specify the types of things:
Integer factorial(Integer n) {
if (n == 0) {
return 1;
}
else {
return n * factorial(n - 1);
}
}
ArrayList<Integer> my_sort(ArrayList<Integer> xs) {
if (xs.size() == 0) {
return new ArrayList<Integer>();
}
else {
...
}
}
This little bit of added verbosity actually helps us a lot! We don’t run into issues with non-termination when we accidentally pass in a floating point value like 3.1, and we get to know a little bit about what this function can do – we can see from the types that it takes an integer value, and returns an integer value.
Some languages that do this kind of thing will perform implicit
type conversions. If we call factorial(3.1)
these languages might
convert the floating point number 3.1 to the integer value 3
without telling us about it. This might seem convenient, but
sometimes this can lead to really nasty and hard to track down bugs
when you think you’re doing one thing, but the language is hiding
these sneaky conversions behind the scenes. I’m of the opinion that
it’s better to explicitly convert the values – you don’t actually
want to do conversions that often, and when you do it’s better to
know when it’s happening, otherwise you might end up with
unexpected behavior which is really difficult to debug.
Even this Java example has problems. For instance Java is a
language with null references. A variable of any type in Java (save
for some primitive types) can have the value null
assigned to
it. You’ve probably seen null
in languages before, even Python
sort of has this with None
. The problem with null
inhabiting
every type is that it behaves very poorly with almost every
operation. Comparing null
to 0 could lead to a runtime
exception. Subtracting 1 from null
would lead to a runtime
exception. We don’t want runtime exceptions, since we might not
catch them until our application is running in production! It would
be great if the compiler could tell us when we’re doing something
that doesn’t make sense like comparing a null value to an
integer. Sometimes it makes sense to have None
values, since a
computation could have no solution, or fail for some reason, but we
need the compiler to ensure that we check for these cases. We are
notoriously bad at checking for null references, and it’s
particularly difficult and verbose when every variable can be null
.
Which leads us to the issue that a lot of people don’t like declaring types for all of their variables, thinking that this is a tedious task when the compiler can clearly see that 3 is an integer. We’ll see shortly that this extra syntax can be avoided most of the time with “type inference”, and that when we do choose to write types it can actually make writing our programs easier and quicker. There’s really no excuse not to have types!
Languages like Java are what you might think of when you think of types, and maybe that makes you think types are bad. I assure you that it’s Java that’s wrong, and not the types!
A better idea
Alright, so there are a few things that can make types better for us. First of all we should identify some important qualities that we want.
- Catch errors at compile time. If something is “wrong”, why wait for the program to run to tell us?
- Ease reading and writing programs.
- Allow us to specify properties, and guarantees within our programs. E.g., this function does not alter global state, or read from a file.
- Less verbosity when writing types. Should be easy and clean!
Haskell
So, our trip through the land of types brings us to
Haskell. Haskell is a programming language which treats types
well. The syntax may be a little different than what you’re used
to, but it’s surprisingly clean, concise, and precise. Haskell is
quite a mathematical language. Haskell is a pure language meaning
that whenever you call a function with the same inputs, it produces
the same outputs, which can help you understand your programs a lot
better. In Haskell there is no immutable state, when you give a
variable a value that value can’t change. This sounds limiting, but
it’s really not, you won’t even notice in the examples. But it
helps you undestand your programs a lot better. You only have to
look for where x
is assigned to understand what value x
has;
you need not scrutinize the entire program.
Recall the Python programs from earlier:
def my_sort(xs):
if xs == []:
return xs
else:
= xs[0]
first_elem = xs[1:]
rest
= my_sort([x for x in rest if x <= first_elem])
smaller = my_sort([x for x in rest if x > first_elem])
larger
return smaller + [first_elem] + larger
def my_factorial(n):
if n == 0:
return 1
else:
return n * my_factorial(n-1)
These might look like this in Haskell
mySort :: Ord a => [a] -> [a]
= []
mySort [] first_elem::rest) = smaller ++ [first_elem] ++ larger
mySort (where smaller = mySort [x | x <- rest, x <= first_elem]
= mySort [x | x <- rest, x > first_elem]
larger
factorial :: Integer -> Integer
0 = 1
factorial = n * factorial (n - 1) factorial n
This actually looks pretty nice! In each of these functions it does what’s called pattern matching to break down the different cases. You hardly have to write any type signatures at all, but it’s useful to write the top level signatures that you see here as it helps guide you when writing the function – it acts as a little specification and the compiler can tell you if you deviate from it. In Haskell even these can be avoided, and the compiler can still infer what the types of variables should be in most cases. After all if you write 3, then it’s probably a number. If you multiply a variable by another floating point number, then that variable has to be a float too, so the compiler could figure this out for us. This lets us be as explicit with our types as we want, but the compiler can still catch issues even if you don’t tell it the type of something. You’ll probably find that writing type signatures for functions in Haskell really helps you figure out how to write the function. It’s kind of like test driven development in a way, it gives you an idea of how you would use the function right away, which makes it easier to write the logic later.
In the sort function you’ll see what’s called a typeclass constraint, “Ord”, which stands for “ordered”, and a type variable “a”. This means that “a” can be any type as long as it implements the functions in “Ord”, which contains things like “less than”, “equal to”, and “greater than” comparisons.
This is great, because now we know exactly what we can do with the elements of the list passed into the sort function! We can compare them, and since they have an ordering we can sort them!
Now if you try to sort a list of unorderable things, like functions, the compiler will complain.
*2), lambda x] -- Causes a type error, because it doesn't make sense. mySort [factorial, (
Whereas in python it will just cause a runtime exception, which we might not know about until it’s too late!
# This causes an error when the program is running...
# We might not catch something like this until it hits production!
sorted([lambda x: x * 2, lambda x: x ** 2])
Additionally, we do need the Ord
constraint in Haskell. Otherwise we have something like this:
-- Instead of: Ord a => [a] -> [a]
mySort :: [a] -> [a]
= []
mySort [] first_elem::rest) = smaller ++ [first_elem] ++ larger
mySort (where smaller = mySort [x | x <- rest, x <= first_elem]
= mySort [x | x <- rest, x > first_elem] larger
Which causes a type error, since a
could be any type without
this constraint, which also includes unorderable types like
functions, or pictures. If the compiler lets you call mySort on a
list of something, then that list can actually be sorted, and
you’re guaranteed that things will just work! So that’s one less
thing to worry about at runtime!
Haskell is also a bit more strict about what its types mean. For instance we know that these functions can’t return “None” or “null”. In the case of the factorial function it MUST return an integer value of some kind, and in Haskell there is no “None” or “null” value under the Integer type.
These “Nothing” values are encoded in so-called “Maybe” types, i.e., types which may contain just a value of a given type, or may yield Nothing.
-- Find out where a value is in a list.
whichIndex :: Eq a => a -> [a] -> Maybe Integer
= whichIndexAcc 0
whichIndex
-- Helper function that remembers our position in the list.
whichIndexAcc :: Eq a => Integer -> a -> [a] -> Maybe Integer
= Nothing
whichIndexAcc pos value [] x::xs) = if x == value
whichIndexAcc pos value (then Just pos
else whichIndexAcc (pos+1) xs
-- A dictionary of all the important words.
dictionary :: [String]
= ["cats", "sandwiches", "hot chocolate"]
dictionary
main :: IO ()
= do entry <- getLine
main case whichIndex entry dictionary of
Just pos) => putStrLn "Your entry is at position " ++ show pos ++ " in the dictionary."
(Nothing => putStrLn "Your entry does not appear in the dictionary."
In this case you know that getIndex
can return something like a
null
value called Nothing
, but it could also return “Just” an
Integer. You have to explicitly unwrap these values to get at the
possible value, like in the case statement in main
. This might
seem tedious, but if you’re a fancy Haskell person you might use
“do” notation, which does this automatically.
-- Look up a word in the same position in a different dictionary.
dictionary :: [String]
= ["cats", "sandwiches", "hot chocolate"]
dictionary
synonyms :: [String]
= ["meows", "bread oreos", "sweet nectar"]
synonyms
moreSynonyms :: [String]
= ["floofs", "subs", "hot coco"]
moreSynonyms
getIndex :: Integer -> [a] -> Maybe a
= Nothing
getIndex _ [] 0 (x:xs) = Just x
getIndex :xs) = getIndex (n-1) xs
getIndex n (_
lookupSynonyms :: String -> Maybe (String, String)
= do index <- getIndex word dictionary
lookupSynonyms word
-- Lookup my synonyms, if anything fails return Nothing.
<- getIndex index synonyms
firstSynonym <- getIndex index moreSynonyms
secondSynonym
-- Success! Return Just the synonyms.
Just (firstSynonym, secondSynonym)
-- lookupSynonyms essentially desugars to this.
-- The compiler can help avoid this tedium!
painfulLookupSynonyms :: String -> Maybe (String, String)
= case getIndex word dictionary of
painfulLookupSynonyms word Nothing -> Nothing
Just index) -> case getIndex index synonyms of
(Nothing -> Nothing
Just first) -> case getIndex index moreSynonyms of
(Nothing -> Nothing
Just second) -> Just (first, second)
(
main :: IO ()
= do word <- getLine
main case lookupSynonym word of
Nothing -> putStrLn ("Hmmm, I don't know a synonym for " ++ word)
Just synonym) -> putStrLn ("I think " ++ word ++ "'s are a lot like " ++ synonym ++ "'s!") (
Types never really add any extra tedium, and they can often relieve it because the compiler can automatically do stuff for you.
These examples also show how input and output are encoded in the types. For example:
-- putStrLn :: IO ()
-- getLine :: IO String
main :: IO ()
= do putStrLn "What is your name?"
main <- getLine
name putStrLn ("Hello, " ++ name)
The ()
’s essentially mean “void” or “no return value,” we’re just
printing stuff so there’s nothing valuable to return. An IO
String
, like getLine
, is something which gets a string value
using IO. A function which computes its return value based on an IO
action will be forced to have an IO type as well, so you can’t hide
IO actions in functions which supposedly don’t rely upon IO.
It seems that Haskell satisfies most of our goals.
- We can catch errors at compile time. If something is “wrong”, why wait for the program to run to tell us?
- The type system lets us describe values in a fair amount of detail.
- Types don’t contain values like
null
which cause explosions at runtime.
- It eases reading and writing programs. It’s nice to know what a function can do based on a small type.
- Types help in much the same way as test driven development.
- Makes you think about the arguments you function takes, and what it returns.
- Thinking about what you can actually compute with restricted types is helpful.
- Keeps focus.
- Helps you know what a function can possibly do.
- Types point out errors when developing, such as forgetting to unwrap a Maybe value and check each of the cases.
- Types help in much the same way as test driven development.
- It allow us to specify properties, and guarantees within our programs. E.g., this function does not alter global state, or read from a file.
- Functions are “pure”, meaning they always produce the same output for the same input.
- Special actions, like IO, are labeled in the type. So you can’t use an IO value in a non-IO function.
- The IO action would cause the calling function to have an IO type. IO taints values, and can’t be hidden.
This is really great, and it’s super helpful. There’s a saying that
“if a Haskell program compiles, then it’s probably correct”
because the type system ends up preventing a lot of errors. For
instance, you never end up trying to index None
like you would in
Python. Think how much time you would save if you never ran into
that problem! Quite a lot!
However, we can do even better!
Enter dependent types.
There are some things that we just can’t do even with Haskell’s types. I can write a function to index a list
index :: Integer -> [a] -> Maybe a
index 0 [] = Nothing
index 0 (x::xs) = Just x
index n (x::xs) = index (n-1) xs
But I can’t write one that the compiler can ensure is never called with an index outside the range of our list.
-- Want the integer argument to always be in range so we don't need
-- Maybe!
index :: Integer -> [a] -> a
index 0 [] = error "Uh... Whoops, walking off the end of the list!"
index 0 (x :: xs) = x
index n (x :: xs) = index (n-1) xs
We need to somehow encode the length of the list into the type so we can only call index when the position provided is in range. We can’t do this in Haskell because it doesn’t let us have types which depend upon values (e.g., the length of a list).
It’s also not possible to encode other properties which depend upon values in the types. For instance I can’t say that a function returns a list of values which are sorted in ascending order, I can only say that a sort function also returns a list with values of the same type…
mySort :: Ord a => [a] -> [a]
= []
mySort [] first_elem::rest) = smaller ++ [first_elem] ++ larger
mySort (where smaller = mySort [x | x <- rest, x <= first_elem]
= mySort [x | x <- rest, x > first_elem] larger
It’s nice that we can specify that this function only works on lists which have orderable elements, but it would be even better if we could also say things like…
- The output list must have the same length as the input list.
- The list in the output must contain the same elements as the input list.
- The output list must be sorted in ascending order.
If we could encode these properties in the types, then if the program type checks it would prove that our sort function does the right thing.
In fact, that’s an interesting idea, isn’t it? Why don’t we make it so we can encode essentially any set of properties in our types, any proposition we can think of, and then make it so our program only type checks if it satisfies these properties. That would be a very powerful tool for ensuring the correctness of our programs! Maybe we can even use such a type checker to help us with our proofy math homework? We’ll look into this idea very shortly, but first let’s look at some basic dependent types in Idris, a programming language that is essentially Haskell with dependent types.
Dependent Types in Idris
The classic example of a dependent type is a vector. A vector is a lot like a list, but the length of the list is included in the type.
So, for example, a vector of 2 strings is a different type from a vector of 3 strings.
two_little_piggies : Vect 2 String
= ["Oinkers", "Snorkins"]
two_little_piggies
-- This would be a type error, caught at compilation:
three_little_piggies : Vect 3 String
= two_little_piggies three_little_piggies
And one thing that’s cool about this is you can actually do some computations at the type level to make more complicated, generalized functions. A classic example is appending two vectors together.
append : Vect n elem -> Vect m elem -> Vect (n + m) elem
The lower case identifiers in the type are “variables” again, in
this case meaning n
and m
can be any natural numbers, and elem
can be any type, this is because Vect
is defined as follows:
data Vect : Nat -> Type -> Type where
Nil : Vect 0 a
(::) : (x : a) -> Vect k a -> Vect (S k) a
Meaning that the type constructor Vect
takes a natural number, and
another type, in order to make a full vector type.
Idris has a lot of built in tools for generating your programs based
on their types. Since this type for append
is actually pretty
specific, Idris is able to do a lot of the work for us. Let’s walk
through how that might work.
append : Vect n elem -> Vect m elem -> Vect (n + m) elem
= ?append_rhs append xs ys
The thing on the right hand side is known as a “hole”, and this is a stand in for a value which Idris can potentially fill in for us, or it can at least tell us the type of what we should put in the hole.
Since Idris knows how types are constructed, we can have it automatically perform a case split on the first argument, leading to this:
append : Vect n elem -> Vect m elem -> Vect (n + m) elem
= ?append_rhs_1
append [] ys :: xs) ys = ?append_rhs_2 append (x
Which gives us two cases, with two holes. Idris helpfully tells us about these holes:
- + Main.append_rhs_1 [P]
__ elem : Type
`m : Nat
ys : Vect m elem
------------------------------------------
Main.append_rhs_1 : Vect (0 + m) elem
- + Main.append_rhs_2 [P]
__ elem : Type
`x : elem
m : Nat
ys : Vect m elem
k : Nat
xs : Vect k elem
----------------------------------------------
Main.append_rhs_2 : Vect ((S k) + m) elem
Above the dashed line you can see what variables are in scope where the hole is, and what types they have. Underneath we have our hole, and the type that it has.
Idris is smart, so it can automatically find values that match a
hole of a given type. For the first hole we know that it has type
Vect (0 + m) elem
, but Idris evaluates this to Vect m elem
, and
the only vector of length m
that it has in scope is ys
, so it
just happily fills this in for us, if we ask nicely!
append : Vect n elem -> Vect m elem -> Vect (n + m) elem
= ys
append [] ys :: xs) ys = ?append_rhs_2 append (x
The second hole is a bit more interesting.
- + Main.append_rhs_2 [P]
__ elem : Type
`x : elem
m : Nat
ys : Vect m elem
k : Nat
xs : Vect k elem
----------------------------------------------
Main.append_rhs_2 : Vect ((S k) + m) elem
We can see that xs
has been given the type Vect k elem
, which
means that n = S k
, since xs
is a part of the Vect n elem
argument, just with one less element since x
is split from it. S
means successor, so S k
is just the next natural number from k
,
so it’s k+1
.
Our goal is to make a vector with length S k + m
, which we can
happily ask Idris to do, and it finds:
append : Vect n elem -> Vect m elem -> Vect (n + m) elem
= ys
append [] ys :: xs) ys = x :: append xs ys append (x
… Which is exactly what we want. So how did Idris do this? Well, it realized a couple of things.
data Nat : Type where
0 : Nat -- Zero
S : Nat -> Nat -- Successor (+1)
(+) : Nat -> Nat -> Nat
+) 0 m = m
(+) (S k) m = S (k + m)
(
data Vect : Nat -> Type -> Type where
Nil : Vect 0 a
(::) : (x : a) -> Vect k a -> Vect (S k) a
First, it evaluated (S k) + m
, which turns out to be S (k +
m)
. It looked at the type constructor for a vector and saw that in
order to get a Vect (S (k + m)) elem
it would need to concatenate
an element with a Vect (k+m) elem
, which gets us two holes. One
for the element to concatenate, and one for the rest of the vector.
append : Vect n elem -> Vect m elem -> Vect (n + m) elem
= ys
append [] ys :: xs) ys = ?elem_to_concat :: ?rest_of_vect append (x
- + Main.elem_to_concat [P]
__ elem : Type
`x : elem
m : Nat
ys : Vect m elem
k : Nat
xs : Vect k elem
-----------------------------------
Main.elem_to_concat : elem
- + Main.rest_of_vect [P]
__ elem : Type
`x : elem
m : Nat
ys : Vect m elem
k : Nat
xs : Vect k elem
------------------------------------------
Main.rest_of_vect : Vect (k + m) elem
So, Idris knows of one element with the type elem
, and that’s x
,
so it can fill that in.
append : Vect n elem -> Vect m elem -> Vect (n + m) elem
= ys
append [] ys :: xs) ys = x :: ?rest_of_vect append (x
It also knows about recursion, so it knows it has this function
append
which it could call that has a type Vect n elem -> Vect m
elem -> Vect (n + m) elem
. And since Idris has a vector xs : Vect
k elem
, and a vector ys : Vect m elem
, it knows that
append xs ys : Vect (k + m) elem
Which is exactly the type of thing we need in this hole, so it can fill it in as well.
So what you just witnessed is Idris essentially writing a program, albeit a small one, for you based on a type which specified the behaviour of this program. That’s awesome, and super helpful!
We can even see how this would not work as well if we were just using lists, which don’t have the length in their type.
append : List elem -> List elem -> List elem
= []
append [] ys :: xs) ys = [] append (x
If you try to fill this in automatically, Idris will just make the function return empty lists, because it’s the easiest way to satisfy the type. If your types are not precise enough, then a number of functions will type check just fine, and Idris can’t tell which one of these possible functions you would want, it just gives you the first one it can find.
Dependent types and indexing
We can actually guarantee that a function indexing a vector stays within the bounds of the vector at compile time, too!
index : Fin len -> Vect len elem -> elem
FZ (x :: xs) = x
index FS n) (_ :: xs) = myIndex n xs index (
Fin len
is a type which represents natural numbers strictly less
than len
. So, given a vector of length len
, if we provide a
natural number greater than or equal to len
as the index, then it
would not be an element of the Fin len
type, so the program would
not type check, catching any potential bugs where you might walk
off the end of an array at compile time. Here’s a quick example:
cats : Vect 2 String
= ["The Panther", "Smoke Smoke"]
cats
-- "The Panther" : String
0 cats -- This type checks.
index
-- (input):1:9:When checking argument prf to function Data.Fin.fromInteger:
-- When using 2 as a literal for a Fin 2
-- 2 is not strictly less than 2
2 cats -- This is out of bounds, so the program won't even compile! index
There are lots of cool guarantees we can make with dependent types! As alluded to earlier, we can even use them to make specifications for how our program should behave with arbitrary propositions, and then use the type checker to ensure that our program actually follows these specifications.
Logic Primer
In order to get into this we need to do a quick primer on logic and logical proofs. In logic you have things known as propositions. A proposition is just a statement, such as “the sky is blue”, or “2 + 2 is 4”. These propositions happen to be true, but we can also have propositions which are false, such as “2 + 2 is 27”. A proposition is just something that you can propose. I might propose to you the notion that “2 + 2 is 27”, but using logical proofs we can determine that this proposition is in fact not a true statement.
So! These propositions are often represented by variables, for instance:
P
P
is a proposition. It could be anything, really…
P = "ducks are fantastic"
And I might have another proposition:
Q = "ducks are truly the worst"
Right now I’m using plain English to convey these propositions to you, but often they’ll be more mathematical statements, such as:
\[\forall n \in \mathrm{N}, \exists m \in \mathrm{N} \text{ such that } m > n\]
Propositions are built up from a set of axioms, which are just rules describing your mathematical objects, and propositions can be combined in a number of ways.
- Implications
- \(P \rightarrow Q\), meaning “if P is true, then Q must be true.”
- Conjunctions
- \(P \wedge Q\), meaning “both P and Q are true.”
- Disjunctions
- \(P \vee Q\), meaning “at least one of P or Q is true.”
- Negation
- \(\neg P\), meaning “P is false.”
- Universal quantification
- \(\forall x, P(x)\), meaning whenever we substitute any value for
x
inP
, the propositionP(x)
holds true.
- \(\forall x, P(x)\), meaning whenever we substitute any value for
- Existential quantification
- \(\exists x, P(x)\), meaning we can find an
x
that we can substitute intoP(x)
to make the proposition hold.
- \(\exists x, P(x)\), meaning we can find an
Inference / proof rules
There are some basic axioms for how you can work with these propositions. These are just rules that “make sense”. Such as modus ponens
\[p \rightarrow q, p \vdash q\]
Or conjunction elimination
\[p \wedge q \vdash p\] \[p \wedge q \vdash q\]
Or conjunction introduction
\[p, q \vdash p \wedge q\]
Curry-Howard Isomorphism
As it turns out when you start to think of your types as propositions some interesting things start to pop up…
For instance if we look at something like implication in logic…
\[P \rightarrow Q\]
This means that if I have a proof of the proposition P, then I can produce a proof of the proposition Q.
That’s very similar to a function type in something like Haskell or Idris. If I’m given a value of type P, then I can produce a value of type Q. So function application seems to be identical to modus ponens.
-> q p
Similarly in logic I might have
\[P \wedge Q\]
Which means that I have a proof of P and a proof of Q.
If you squint that’s kind of similar to:
(p, q)
Which means that I have a value of P, and a value of Q. Conjunction elimination is then just the projection of either the first or second value in the tuple:
-- P /\ Q -> P
fst :: (p, q) -> p
fst (a, b) = a
-- P /\ Q -> Q
snd :: (p, q) -> q
snd (a, b) = b
What are the values of a type then? Well, they look a lot like an existence proof of a given proposition. For instance:
const : p -> q -> p
= a const a b
The value, in this case the function which returns the first element, can be seen as a proof of the proposition \(p \rightarrow q \rightarrow p\). You take the proof of the first proposition in the chain of implications, and return it as a proof of the same proposition at the end of the implication chain. So, given a proof of \(p\) and a proof of \(q\), if we take the proof of \(p\) and discard the proof of \(q\), then we can prove \(p\). Which makes sense to me!
We can also see how the type checker can prevent us from proving false propositions. For instance, you can’t prove \(p \rightarrow q\), because there would be no way to get a proof of \(q\) from a proof of another proposition \(p\), when both \(q\) and \(p\) could be any random proposition!
bogus : p -> q
= -- What can I put here that would type check? :( bogus p
We can’t find a value of type q
, since we only have a value of type p
!
Proofs in practice
I’m going to quickly show you some basic proofs in Idris. With any luck you can at least imagine how these proofs might be extended to more complicated programs!
Idris has a type which represents equality between two things. This type is constructed, as you might expect, with the equals sign.
equality_good : 2+3 = 5
= Refl
equality_good
-- This fails to type check
equality_bad : 2+3 = 7
= Refl equality_bad
An equality like this has only one constructor, Refl
. This
equality type is roughly defined as:
data (=) : a -> b -> Type where
Refl : x = x
Which looks a little obtuse, but really all this means is that if we
want to put Refl
in a hole with some equality type, then Idris
needs to be able to determine that whatever is on the left and right
of the equals signs will evaluate to the exact same value. If Idris
can determine that, then the left and the right side are considered
to be identical, and we can replace whatever is on the left side
with whatever is on the right side and vice versa. This is
reflexivity, and it’s what Refl
stands for.
Now, with this in mind lets walk through a small, but mind bending example:
cong : (f : a -> b) -> x = y -> f x = f y
= ?cong_rhs cong f prf
cong
stands for congruence, and has a type which represents the
proposition that, if you are given a function f
, and you know that
some x
and y
are equal, then f x = f y
.
This might seem really odd and scary right now, because you have
equals signs in your types. But remember, types are propositions of
theorems, and these equals signs just means that we should be able
to use the Refl
constructor to show that both things are equal
using Idris’s internal notion of the equality of terms.
Here’s what we see when we ask Idris about our goal, cong_rhs
:
- + Main.cong_rhs [P]
__ b : Type
`a : Type
x : a
f : a -> b
y : a
prf : x = y
---------------------------
Main.cong_rhs : f x = f y
So, it looks like we need to be able to show that f x = f y
. In
the list of known values it seems that we have a proof of x=y
from
one of the arguments to cong
. And since we have a proof that
x=y
, we should be able to rewrite y
to be x
using
reflexivity. In Idris this is done by deconstructing the proof of
x=y
by pattern matching on the argument.
cong : (f : a -> b) -> x = y -> f x = f y
Refl = ?cong_rhs_1 cong f
That looks really unimpressive, but let’s see what it did to our goal:
- + Main.cong_rhs_1 [P]
__ b : Type
`a : Type
x : a
f : a -> b
-----------------------------
Main.cong_rhs_1 : f x = f x
Perfect! If we have a proof that x = y
, then Idris knows that
they’re interchangeable, and it automatically replaced y
with x
everywhere. Now we just need something with the type f x = f x
,
which is trivial, since if you look at the definition of Refl
,
that’s pretty much exactly what it does. We just need to substitute
f x
for the x
in Refl
.
Refl : x = x
-- So, if we just replace this general "x" with our "f x" we would get...
Refl : f x = f x
Refl
actually
In Idris Refl
uses implicit arguments, since it can often infer
what it should use in context, so we could just write Refl
:
cong : (f : a -> b) -> x = y -> f x = f y
Refl = Refl cong f
But we could also give it an argument explicitly.
cong : (f : a -> b) -> x = y -> f x = f y
Refl {x}) = Refl {x = f x} cong f (
I realize this is a bit confusing because there are x
’s in both
places, but the x
in the definition of Refl
is in a different
scope, and we’re just substituting our f x
for that x
, like an
argument to a function.
Slightly more complicated proof
Now that we have a proven congruence theorem we can construct some more complex proofs. Let’s write a function to do addition on natural numbers and prove that it’s associative.
In Idris natural numbers look like this:
data Nat : Type where
0 : Nat
S : Nat -> Nat
-- 0 = 0
-- S 0 = 1
-- S (S 0) = 2
-- etc...
(+) : Nat -> Nat -> Nat
+) 0 y = y
(+) (S x) y = S (x + y) (
The 0 represents 0 (it’s actually Z
, but I think writing 0 is
less confusing), and S
stands for successor, which means “plus
one”. So we have defined the set natural numbers recursively, by
adding one to the previous natural number. This gives us a unary
representation of the natural numbers that’s very nice to work
with, it’s similar to tallies.
Similarly we can define addition recursively:
- Zero plus any number is that number.
- One plus x added to y is x + y with one added to it.
Now let’s define our theorem:
plus_assoc : (x, y, z : Nat) -> x + (y + z) = (x + y) + z
= ?plus_assoc_rhs plus_assoc x y z
This just says that for all x
, y
, and z
in the natural
numbers, x
added to y + z
is the same as x + y
added to z
.
To prove this kind of thing we often use induction. We’ve actually
already seen induction in Idris. It’s just recursion. So we’ll case
split on x
, which gives us a base case where x = 0
, and a case
where x = S k
for some natural number k
.
plus_assoc : (x, y, z : Nat) -> x + (y + z) = (x + y) + z
Z y z = ?plus_assoc_rhs_1
plus_assoc S k) y z = ?plus_assoc_rhs_2 plus_assoc (
We have some interesting holes now.
- + Main.plus_assoc_rhs_1 [P]
__ y : Nat
`z : Nat
---------------------------------------------------------------
Main.plus_assoc_rhs_1 : 0 + (y + z) = (0 + y) + z
- + Main.plus_assoc_rhs_2 [P]
__ k : Nat
`y : Nat
z : Nat
-----------------------------------------------------------------------
Main.plus_assoc_rhs_2 : (S k) + (y + z) = ((S k) + y) + z
For the first one we have to just realize that when we use Refl
,
Idris will try to evaluate both sides of the equals sign
completely. Because of how plus is defined, it can actually
evaluate these partially even though we don’t know what y
and z
are. This just triggers the first case of our definition of plus,
where 0 is on the left side. So this goal is really:
- + Main.plus_assoc_rhs_1 [P]
__ y : Nat
`z : Nat
---------------------------------------------------------------
Main.plus_assoc_rhs_1 : y + z = y + z
And we can satisfy this with reflexivity.
plus_assoc : (x, y, z : Nat) -> x + (y + z) = (x + y) + z
Z y z = Refl
plus_assoc S k) y z = ?plus_assoc_rhs_2 plus_assoc (
The second hole is more complicated.
- + Main.plus_assoc_rhs_2 [P]
__ k : Nat
`y : Nat
z : Nat
-----------------------------------------------------------------------
Main.plus_assoc_rhs_2 : (S k) + (y + z) = ((S k) + y) + z
Again, Idris can still evaluate this partially, so this goal is really this:
- + Main.plus_assoc_rhs_2 [P]
__ k : Nat
`y : Nat
z : Nat
-----------------------------------------------------------------------
Main.plus_assoc_rhs_2 : S (k + (y + z)) = S ((k + y) + z)
So it looks like we need to prove associativity… Which is what we’re trying to do.
But since Idris knows about recursion, we can actually call
plus_assoc
on k
, y
, and z
to get something with the type…
+ (y + z) = (k + y) + z k
So we’re almost there, we just need to be able to apply the successor function on both sides of the equality. This is exactly what congruence does:
cong : (f : a -> b) -> x = y -> f x = f y
Refl {x}) = Refl {x = f x} cong f (
If we give cong
a function, and something with an equality type,
then cong
gives us an equality type with the function applied to
both sides. So we can do this:
plus_assoc : (x, y, z : Nat) -> x + (y + z) = (x + y) + z
Z y z = Refl
plus_assoc S k) y z = cong S (plus_assoc k y z) plus_assoc (
Which completes our proof! It’s interesting how applying a theorem,
like cong
, is literally just applying a function.
It’s also neat how recursion and induction are really just the same thing, and you can see that pretty clearly when working with something like Idris.
Tactics
Sometimes building up proof terms in this functional programming style is a bit tedious. Once you get more complicated proofs on the go it gets pretty hard to keep track of all of the types. There are other languages which use a different style of proof based on tactics, which are little commands that build up these proof terms behind the scenes for you.
These languages are interesting to work with, but the proofs are actually pretty hard to read without the proof state shown, which editors for these languages will display nicely. The proof state is just your current goal type, and it displays it in much the same fashion as Idris displays its goals.
Here’s an example of the associativity proof we just did in Coq:
Inductive nat : Type :=
| O : nat
| S : nat -> nat.
Fixpoint plus (n m : nat) : nat :=
match n with
| O => m
| S n' => S (plus n' m)
end.
Theorem plus_assoc : forall (x y z : nat), plus x (plus y z) = plus (plus x y) z.
Proof.
intros x y z. induction x as [| k].
- reflexivity.
- simpl. (* Simplify with evaluation *)
rewrite IHk. (* Use induction hypothesis to rewrite terms *)
reflexivity.
Qed.
It’s actually pretty similar, and you can maybe get some idea of how the tactics translate into the functions from before. We break things into cases much the same way with the induction tactic.
- The base case is just handled with reflexivity.
- Then, after simplifying the types by evaluation much like in Idris,
we apply the theorem inductively, and then use reflexivity to handle
the case of
x = S k
.
These tactics look pretty weird when you first see them, but if you start thinking about how they get turned into proof terms like in the Idris examples, it becomes a lot easier to understand.
Conclusion
So, that’s the end of the talk. It’s just a rough overview of why types are so magical, and why you should care about them.
I realize that this was quite the whirlwind introduction to this topic, so if you have any questions feel free to ask!