## 2015-08-06

### A proof with intervals

One of the problems with homotopy type theory at the moment is that there are few uninformed hobbyists writing partially correct blog posts about it. More generally, it's hard to find examples of simple proofs that nonetheless make use of some of the tools HoTT gives us above the intensional type theories used by most theorem provers and dependently typed programming languages today. This is my attempt to improve (or worsen) the situation.

For this post, I will use the programming language Hoq. I use this for two reasons:
• I like the functional (Agda/Idris) style of proving, particularly when exploring the possibilities of the type system. Tactics-based provers seem too removed from the underlying mathematics to give me a clear picture of what's going on.
• I wanted to use a language built for HoTT, rather than one that has been modified to use it. Idris is incompatible with HoTT due to the K axiom, and, though Agda can be run K-less these days, I don't know whether there are other things it is missing (my guess being that there are).
My only other options were the mutually related cubical and cubicaltt. Though these are well documented, their lack of comfort features like implicit arguments and LHS pattern matching makes them rather painful to work in. Hoq has barely any documentation (examples/basics.hoq is all I can find), so this post should help with that.

The objective of this exercise is to prove right distributivity of boolean or over and by first proving left distributivity and commutativity, then combining them in the obvious way. The fact that we can combine them in the obvious way is the focus of this post. This contrasts with what we'd do in an intensional language, as we will see.
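As a sanity check (and for readers without Hoq to hand), the boolean identities involved are small enough to verify exhaustively. Here is a minimal Python sketch, with `or_` and `and_` mirroring the definitions we're about to write:

```python
from itertools import product

# Exhaustive check of the three identities used in this post:
# commutativity of or, left distributivity of or over and, and the
# right distributivity we ultimately want.
def or_(x, y):
    return x or y

def and_(x, y):
    return x and y

for x, y, z in product([False, True], repeat=3):
    # commutativity
    assert or_(x, y) == or_(y, x)
    # left distributivity of or over and
    assert or_(x, and_(y, z)) == and_(or_(x, y), or_(x, z))
    # right distributivity, which follows by flipping the arguments
    assert or_(and_(x, y), z) == and_(or_(x, z), or_(y, z))
```

Of course, the point of the post is not whether these hold, but how the last one can be obtained from the first two without a fresh case analysis.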

First, we define some basic things, which should really be in some prelude module. This will hopefully give an introduction to the syntax, which is similar to that of Idris and Agda. I've refrained from defining the most general versions of ext and flip for simplicity (and to help Hoq infer implicit arguments).
-- Booleans
data Bool = False | True

or : Bool -> Bool -> Bool
or False False = False
or _ _ = True

and : Bool -> Bool -> Bool
and True True = True
and _ _ = False

-- Convenient alias: at p = \i -> p @ i
at : {A : Type} -> {x y : A} -> x = y -> I -> A
at _ _ _ p i = p @ i

-- Equality proofs
ext : {A B : Type} -> {f g : A -> B} -> ((a : A) -> f a = g a) -> f = g
ext _ _ _ _ p = path (\i a -> p a @ i)

idp : {A : Type} -> {a : A} -> a = a
idp _ a = path (\_ -> a)

-- Combinator
flip : {A B C : Type} -> (A -> B -> C) -> B -> A -> C
flip _ _ _ f y x = f x y

Also introduced implicitly is the interval type I, seen above as the lambda variable i. An equality proof p between terms of type A is basically a wrapper around a function of type I -> A, and the underlying function is accessed using p @ i. I is supposed to be analogous to ${\leq}|0,1|{\geq}$ (written by people other than me as $[0,1]$), with end points left and right which, crucially, we can't pattern match on. If p : x = y, then p @ left is x and p @ right is y.
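To build intuition, here is a deliberately naïve Python model — an analogy only, not Hoq's actual semantics — in which a path is just a function on the unit interval, and `at(p, i)` plays the role of `p @ i`:

```python
# Toy model: the endpoints of the interval stand in for Hoq's left and
# right, which we only ever apply paths to, never pattern match on.
left, right = 0.0, 1.0

def idp(a):
    """The constant (identity) path at a, like Hoq's idp."""
    return lambda i: a

def at(p, i):
    """Access the underlying function of a path, like p @ i."""
    return p(i)

# A path p : x = y satisfies at(p, left) == x and at(p, right) == y;
# for idp, both endpoints are the same point.
p = idp(42)
assert at(p, left) == 42 and at(p, right) == 42
```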

In order to proceed, we prove the required facts about or in the standard way:
orCommutative : (x y : Bool) -> or x y = or y x
orCommutative False False = idp
orCommutative _ _ = idp

orLeftDistr : (x y z : Bool) -> or x (and y z) = and (or x y) (or x z)
orLeftDistr False False _ = idp
orLeftDistr False _ False = idp
orLeftDistr False True True = idp
orLeftDistr True _ _ = idp

As you can see, we use idp (the identity path) where we might expect Refl on the right-hand side.

We could just prove right distributivity in the same way, but that would be terribly boring. The point of this article is really to introduce a method that can be used in more complex situations, where we want to avoid code duplication. In intensional type theory, the proof we want might look something like this (using equational reasoning syntax from examples/Paths.hoq):
orRightDistr' : (x y z : Bool) -> or (and x y) z = and (or x z) (or y z)
orRightDistr' x y z =
or (and x y) z
==< orCommutative (and x y) z >==
or z (and x y)
==< orLeftDistr z x y >==
and (or z x) (or z y)
==< pmap (\w -> and w (or z y)) (orCommutative z x) >==
and (or x z) (or z y)
==< pmap (and (or x z)) (orCommutative z y) >==
and (or x z) (or y z)
!qed

This is rather repetitive, though. Each time we want to apply orCommutative, we have to say exactly where to apply it in the expression, which involves writing out the expression over and over again.

We'll proceed with the main proof in small steps. To put us on steady ground, we prove or = flip or:
orFlippedEq : or = flip or
orFlippedEq = ext (\x -> ext (\y -> orCommutative x y))

Now, at last, we get to a HoTT-specific part. Here, we prove that the identity we're trying to prove is equal to the type of orLeftDistr. Naturally, we produce an equality between equality types:
orRightDistrEqOrLeftDistr : (x y z : Bool) ->
(or x (and y z) = and (or x y) (or x z)) =
(or (and y z) x = and (or y x) (or z x))
orRightDistrEqOrLeftDistr x y z =
path (\i -> (orFlippedEq @ i) x (and y z) =
and ((orFlippedEq @ i) x y) ((orFlippedEq @ i) x z))

The expression inside the lambda abstraction is simply the left side of the required equality, but with or replaced by orFlippedEq @ i. orFlippedEq @ left is or and orFlippedEq @ right is flip or, which give us the required end points.

The final thing we need to introduce is coe (short for “coerce”, presumably). Its type is (A : I -> Type) -> (i : I) -> A i -> (j : I) -> A j, meaning that it takes a line of types (such as the underlying function of a type equality, extracted with at) and moves the given value along it from i to j. Here, we use it simply to transport our proof of left distributivity into a proof of right distributivity:
orRightDistr : (x y z : Bool) -> or (and x y) z = and (or x z) (or y z)
orRightDistr x y z =
coe (at (orRightDistrEqOrLeftDistr z x y)) left (orLeftDistr z x y) right

And that's the result we wanted.

## 2015-04-08

### The language of types

It's been a while since I posted here. That can be attributed partially to the increased workload I had after starting university, and partially to the fact that most of my ideas had reached either a conclusion or a dead end by the end of summer. But now it's revision time, and I've been able to think about the things I wrote here. Since last post, I've learnt a lot about type theory via Idris. In this post, I apply some of that knowledge.

My aim is to present a type-theoretic interpretation of spoken language, taking advantage of the Curry-Howard correspondence. On the one hand, type theory lends itself very well to the declarative claims of spoken language. It's possible to define predicates as types, making the claim each predicate expresses visible to the outside world. Also, it's well documented that logical connectives are easily defined as type families. However, the somewhat closed nature of types, in comparison to sets, makes it difficult to work out what types there should be, and how they should be used.

My main idea is to encode verbs as type families and nouns as values. For now, we'll consider all nouns to be of a single, simple type. Notionally, every noun we could want to talk about is specified as its own value constructor, which takes no arguments. Constructors of most real-world verb types are specified similarly, but construct values only of a specific type. For example, taking Lojban's {ciska} (“write”), we make the type family:\begin{align} \mathbf{data}\ &\mathrm{Ciska} : \mathrm{Noun}\to\mathrm{Noun}\to\mathrm{Noun}\to\mathrm{Noun}\to\mathrm{Type}\ \mathbf{where}\\ &\mathrm{MkCiska0} : \mathrm{Ciska}\ \mathit{me}\ \mathit{this\_post}\ \mathit{my\_blog}\ \mathit{keyboard}\\ &\mathrm{MkCiska1} : \mathrm{Ciska}\ \mathit{me}\ \mathit{a\_tweet}\ \mathit{Twitter}\ \mathit{keyboard}\\ &\dots \end{align}Notice that these simple value constructors essentially correspond to events. It has happened that I have written a tweet on Twitter using a keyboard, so the event of that happening constitutes a proof. In this way, we can consider Lojban's {nu} as a wrapper over the contained bridi (type), fixing the type parameters. Of course, we already know that {ka} is used for creating functions resulting in types, and many type constructors take this kind of argument. I'm not sure yet about {du'u} and {si'o}, and {ni} is probably something different again.

Since we now have a reïfied view of statements, it makes sense to quantify them. This is in fact what we are doing with dependently typed Πs and Σs. In implication statements, for example, we are saying that for all proofs of the hypothesis, there exists a proof of the conclusion using that proof. “every time A, B” is logically equivalent to “A ⇒ B”, and is proven using a function of type $A\to B$. We can similarly use dependent pairs for existential quantification, just as in dependently typed programming.
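The correspondence described above can be sketched in any functional setting; here is a hedged Python toy (all names invented for illustration) in which proofs are plain values, implications are functions between proofs, and existentials are witness–proof pairs:

```python
# Toy Curry-Howard encoding. "every time A, B" is a function from proofs
# of A to proofs of B; "there is an A such that B" is a dependent pair.

# A proof of "every time it rains, the ground is wet": a function that
# turns any proof of rain into a proof of wet ground.
def rain_implies_wet(proof_of_rain):
    return ("wet ground, because of", proof_of_rain)

# A proof of "something barks": a pair of a witness and a proof about
# that specific witness.
something_barks = ("rex", "proof that rex barks")

# Using them: apply the implication to a proof of the hypothesis...
wet = rain_implies_wet("proof of rain")
# ...and project the pair to recover the witness and its proof.
witness, proof = something_barks
```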

Quantifiers like “almost all” and “almost always when”, naturally, are more difficult to handle, and types don't particularly help us with them. They're probably best dealt with theoretically by having a function of type $A\to\mathrm{Maybe}\ B$, mapping it over a list of inhabitants of A, then interpreting the output list (in some context-sensitive way). How useful this notion is, I don't know.

Anyway, the main point of the latter part of this post was that type families, as opposed to my previous idea of referent sets, are the things that should be quantified over. This becomes apparent when we have polyadic verbs. In my new language, it used to be the case that I'd express “a dog catches a ball” as something like “gerku fasu'o fasu'o kavbu fesu'o fasu'o bolci” (with all those “su'o” usually being implicit). This would literally be read as “a dog is a catching thing; a caught thing is a ball”. However, the problem is that the dog and the ball aren't obviously involved in the same event, and the symmetry of the sentence structure means that we can't make “kavbu” be the main verb. Now, I propose to say “gerku su'o fafa kavbu su'o fefa bolci su'o”, literally “a dog is the catcher in at least one catching event in which a ball is caught”.

## 2014-08-29

### Intervals

The current interval notation is notorious for clashing with other notations and for having weird unmatched brackets (the two cases being mutually exclusive). For instance, $x\in[0,1]$ is ambiguous as to whether $x$ is a fuzzy logic value (a member of an interval) or a strict boolean value (a member of a list). That's quite a large ambiguity for an otherwise useful notation. Also, the unacquainted can easily confuse whether a bracket marks an open end or a closed end. I've always managed to remember, but that's no excuse to keep things as they are.

I propose a simple alternative: replace $($ with ${<}|$ and $[$ with ${\leq}|$, and reflect for the right-hand end. So, our fuzzy logic interval would be ${\leq}|0,1|{\geq}$. Intervals of integers can keep the double-dot notation, so ${\leq}|0..1|{\geq}$ represents the same thing as the list $[0,1]$. It's a list because it corresponds to a portion of $\mathbb Z$, which I consider a list (see last post for justification).

That's about it. The best way to approximate this in ASCII is by writing things like “<=|0,1|=>”. It's better to keep the ‘=’ next to the ‘|’, so that one avoids the misparse “(<=|0,1|>) =”. And a note for $\LaTeX$: put the ‘<’-like symbols in curly braces to avoid them spacing as if they were infix operators.
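For the record, here is what that brace trick looks like in $\LaTeX$ source (a minimal sketch):

```latex
% The braces stop \leq and \geq from spacing as infix relations.
${\leq}|0,1|{\geq}$  % closed interval, i.e. [0,1] in the old notation
${<}|0,1|{>}$        % open interval, i.e. (0,1) in the old notation
% Without braces, $\leq|0,1|\geq$ gets relation spacing around \leq.
```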

## 2014-08-28

### Zip lists

I've laid off the maths for a while in preparation for this post. You may remember that the last post ended in a rather unsatisfactory manner. Here, I try to rectify the situation. The heart of the issue is that I want to be able to express two different types of equality between functions: the weaker set-wise complete closure and the stronger point-wise extensional equivalence.

I define a function $f$ as being completely closed over a set $S$ iff it satisfies these conditions (in non-mapping notation):\begin{aligned} \forall x\in S.\,& f\ x\in S \\ \forall y\in S.\,\exists x\in S.\,& f\ x=y \end{aligned}In other words, the domain and image of $f$ are equal. If $S$ is a finite set, $f$ essentially reorders the elements. If $S$ is infinite, there may be other things that $f$ could do. For example, if $f=\text{succ}\wedge S=\mathbb Z$, $f$ is a kind of translation function. In any case, $f$ is surjective. My aim is to be able to write this in a concise and generalisable way in mathematical notation.

If two functions are extensionally equivalent, they do the same job, and thus one can replace the other without changing the logic of the situation. However, they may have different computational properties, and hence can't be considered to be totally equal. The conditions for this relation between $f$ and $g$ with respect to set $S$ can be written:$\forall x\in S.\,f\ x=g\ x$This, too, I want to have a concise and generalisable way of writing, since it is a very common relation.
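For finite sets, both relations can be checked directly; here is a small Python sketch of the two definitions above (finite $S$ only — the infinite cases obviously can't be checked this way):

```python
# completely_closed: the domain and image of f coincide on S.
# ext_equiv: f and g agree pointwise on S.
def completely_closed(f, S):
    image = {f(x) for x in S}
    return image == set(S)

def ext_equiv(f, g, S):
    return all(f(x) == g(x) for x in S)

S = {0, 1, 2, 3}
rot = lambda x: (x + 1) % 4        # reorders the elements of S

assert completely_closed(rot, S)
assert not completely_closed(lambda x: 0, S)
# Extensionally equivalent on S, despite different definitions:
assert ext_equiv(rot, lambda x: (x + 5) % 4, S)
```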

So, why are these related? Well, let's start naïvely applying the principles of the notation without variables. For extensional equivalence, it is intuitive to write $f=g$. Now, we want to limit that to some set? Maybe we say $(f=g)\ S$. That expands to $f\ S=g\ S$ by the fork rule (since equality doesn't act upon functions). But, by implicit mapping, $f\ S$ and $g\ S$ are sets, so correspondences between individual elements are lost. All we've equated are the images for the functions, given a domain. If $g$ is the identity function, the statement is that $f$ is completely closed over $S$.

As the title suggests, the solution lies in using lists, rather than sets. These retain an order, so there is still correspondence between individual elements in the domain and image. In fact, a function can be specified as a pair of lists, as well as a set of pairs. To say that $f$ and $g$ are extensionally equivalent over domain $S$, we just have to make sure that $S$ is cast to be a list. In symbols, $(f=g)\ \text{list}\ S$.

Both $(f=g)\ \text{list}\ S$ and $(f=g)\ S$ are useful. $D\ {}_\cdot^2=2\cdot{}$ is an example of the former, where the implicit argument for the expression is $\text{list}\ \mathbb C$. $(\Pi=(!))\ \text{list}\ \mathbb N$ is an example of where the list argument cannot be reliably inferred, and has a significant effect on the truth of the statement. Finally, $(\text{floor}\ \frac 2 \iota=\iota)\ \mathbb N$, where $\iota$ is the identity function (an example of a non-bijective completely closed function).

List can occur in other situations, too. For example, $\pm$ can be formalised by saying that it returns a 2-element list (as too does $\sqrt \iota$, but not ${}_\cdot^{2/1}$; see an earlier post). Explicitly, $x\pm y=[x+y,x-y]$. This gives $\cos(A\pm B)=\cos A\cdot\sin B\mp\sin A\cdot\cos B$, which boils down to an equivalence between two lists. In fact, lists are so useful that it may be worth assuming that the standard sets are lists, which can be converted to sets as needed. This makes more sense, too, because the $\text{list}$ function essentially introduces new information, whereas functions should lose information (like $\text{set}$ does on lists). Also, in general, expressions like $\text{head}\ \text{list}\ S$ are meaningless. However, given the axiom of choice, $\text{list}$ can be defined for any set, it's just that some operations on it are meaningless.

## 2014-08-15

### Phonotactics

This is the last piece of documentation I need before I can start creating words. That's not to say that I'll go about word creation particularly quickly. For the time being, brivla are likely to be taken from Lojban without any attempt at making them fit the correct word patterns. Remember that these rules don't apply to cmevla.

Phonotactically, but not phonemically, I distinguish between the onset and coda of a syllable. Codas can contain zero or one approximant followed by zero or one nasal. There are some further restrictions:
• If followed by an obstruent, a nasal must have the same place of articulation.
• Sequences “ij” and “uw” are not allowed unless immediately followed by a vowel.
Onsets are where most difficulty arises. There are a lot of rules to make, but with some guidelines, the number of explicit rules needed can be lessened. The main rule is that all onsets must be pronounceable at the start of a word (i.e., after a pause). This implies that all onsets start with obstruents; sonorants would be part of the preceding coda. There are two other important rules:
• If an onset is deemed pronounceable, so is its reflection.
• If an onset is deemed pronounceable, so is its revoicing.
The functions “reflect” and “revoice” are both self-inverse, and defined by these tables:

| original | p | b | t | d | f | s | z | m | n | w | l | i | e | a |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| reflection | k | g | t | d | x | s | z | r | n | j | l | u | o | a |

| original | p | t | k | f | s | x | m | n | r | w | l | j | o | i | a | u | e |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| revoicing | b | d | g | f | z | x | m | n | r | w | l | j | o | i | a | u | e |

The name “reflect” refers to the fact that, for its input, it picks the letter on the opposite side of the corresponding table about a vertical line of symmetry. “revoice” is a more obvious name. If one imagines the consonant table being 3-dimensional, revoice is similar to reflect, but works on a different axis (the voice axis, rather than the place-of-articulation axis).
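The tables can be transcribed as plain dictionaries, which makes the self-inverse claim easy to check (a sketch; the pairs are read straight off the tables above, with the reverse pairs added to close each map):

```python
# Each table lists pairs; self-inverseness means adding (b, a) for
# every listed (a, b) yields a consistent map.
reflect_pairs = [("p","k"), ("b","g"), ("t","t"), ("d","d"), ("f","x"),
                 ("s","s"), ("z","z"), ("m","r"), ("n","n"), ("w","j"),
                 ("l","l"), ("i","u"), ("e","o"), ("a","a")]
revoice_pairs = [("p","b"), ("t","d"), ("k","g"), ("f","f"),
                 ("s","z"), ("x","x")] + [(c, c) for c in "mnrwljoiaue"]

def closure(pairs):
    m = {}
    for a, b in pairs:
        m[a], m[b] = b, a
    return m

reflect = closure(reflect_pairs)
revoice = closure(revoice_pairs)

# Both functions are self-inverse, as claimed...
assert all(reflect[reflect[c]] == c for c in reflect)
assert all(revoice[revoice[c]] == c for c in revoice)
# ...and the coronals reflect to themselves.
assert all(reflect[c] == c for c in "tdszn")
```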

These restrictions have some interesting corollaries. For instance, it can be seen that coronal (alveolar) consonants reflect to themselves. This gives them more freedom than labial and dorsal (velar) consonants, which lives up to our expectations of them. Sonorants, by revoicing to themselves, also enjoy similar freedom. ‘f’ and ‘x’ also revoice to themselves, but are probably heavily restricted, anyway.

To simplify things further, ‘o’, ‘a’ and ‘e’ are mutually considered phonotactically equivalent. ‘i’ and ‘u’ are special cases because they are close to ‘j’ and ‘w’. For example, I don't expect ‘pji’ to be valid, since it's too similar to ‘pi’. However, ‘pja’, ‘pju’, ‘pwi’ and ‘tji’ are all valid, being distinct enough from ‘pa’, ‘pu’, ‘pi’ and ‘ti’, respectively. Of course, for ‘pja’, we need to test ‘bja’, ‘kwa’ and ‘gwa’ before we accept it (and similar for the others).

Beyond that, there is only one more easily-stated rule I can think of: onsets cannot mix voicings. All of the other phonotactic rules are left, unfortunately, to common sense. That's implicitly my common sense, but I promise to not screw up!

## 2014-08-05

### Morphology

Before moving onto the main morphology, I'd like to get cmevla out of the way. These are practically the same as dotside Lojban's cmevla. They start with a pause (‘.’) and end with a consonant or schwa followed by a pause. The requirement for a consonant/schwa comes about because pauses can arbitrarily occur between words, so cmevla need a feature that binds them together. When using Shwa, IPA or similar orthographies, any letters can be used in cmevla (except pauses and glottal stops), even those that are not part of the language. Phonotactic rules are also done away with.

All other words begin with an obstruent (“pbtdkgfszh”) or pause, and end with a normal vowel (“oiaue”). In other words, word breaks occur in and only in these environments:
• at a pause or glottal stop (which is part of neither word)
• at the gap between a normal vowel and an adjacent following obstruent
This resembles Ceqli, but has a major difference: obstruents can occur in the middle of a word, in which case they must be preceded by a sonorant (“mnrwlj”) or schwa (“y”). This allows much easier importation of loan words. Also – it nearly goes without saying – the scheme here is much simpler than Lojban's, with tosmabru and all. I don't rely on stress, which frees things up for songwriters. I suggest that the reason why there are barely any Lojban versions of existing songs, whether performed or not, is that it's so difficult to get the stress right (I've tried it a few times).
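The two word-break environments above make segmentation mechanical; here is a hedged Python sketch (the example strings are just letter sequences for illustration):

```python
# Word breaks occur at a pause/glottal stop ('.', part of neither word)
# and at the gap between a normal vowel and a following obstruent.
OBSTRUENTS = set("pbtdkgfszh")
VOWELS = set("oiaue")

def split_words(stream):
    words = []
    for chunk in stream.split("."):       # rule 1: break at pauses
        if not chunk:
            continue
        start = 0
        for i in range(1, len(chunk)):
            # rule 2: break between a normal vowel and an obstruent
            if chunk[i - 1] in VOWELS and chunk[i] in OBSTRUENTS:
                words.append(chunk[start:i])
                start = i
        words.append(chunk[start:])
    return words

# Both rules recover the same segmentation here: 'i' (a vowel) followed
# by 'g' (an obstruent) forces a break even without a pause.
assert split_words("tjerligerku") == ["tjerli", "gerku"]
assert split_words("tjerli.gerku") == ["tjerli", "gerku"]
```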

Other than the vowel-obstruent restriction, native words are free in form. There are no formal equivalents to lujvo, with more tanru-like structures taking their place. To keep the classes separate, brivla are required to have at least 1 nasal, and cmavo & tags have none. Like how one can recognise types of Lojban words after little practice, there shouldn't be problems telling the two types apart.

I haven't decided on an equivalent to the gismu creation algorithm yet, though I aim to have a Lojban-style one. With brivla being free-form, I'll need some way of deciding what concepts get a short word, and which get longer ones. I will likely base this on the length of the source words (by some metric).

Just to give an idea of what loan words would look like, here are some language names:
| native | cmevla | IPA-style cmevla | brivla |
|---|---|---|---|
| .lojban. | .lozban. | .loʒban. | .lonzba |
| Ceqli | .tjerlis. | .tʃeŋlis. | tjerli |
| English | .irglis. | .ɪŋɡlɪʃ. | .irgli |
| Esperanto | .espelanton. | .esperanton. | .enspelanto |
| Español | .espanjol. | .espaɲol. | .enspanjo |

There are, of course, ISO code names, but I haven't worked out that system yet. Ceqli doesn't have a code, anyway. Also, stress and tone can be marked in any way that seems fit. For brivla, these have no phonemic value, but they may have in cmevla.

## 2014-08-04

### Phonology

This should be a steady post. I post it now because I'm sick of using Lojban's FA tags when giving examples for my language. I need words, and for that I need phonemes, phonotactic rules and a writing system.

Ideally, the writing system would be Shwa (used phonemically, because things like assimilation are optional, and can vary between speakers). However, a romanisation scheme is needed. That is all I'll document here; Shwa and IPA substitutions should be obvious. Here are the consonants:
|  | Labial | Coronal | Dorsal | Glottal |
|---|---|---|---|---|
| Plosive | p b | t d | k g | . |
| Fricative | f | s z | h | (h) |
| Nasal | m | n | r |  |
| Approximant | w | l | j |  |

There are a few sticking points. Firstly, ‘r’ represents /ŋ/. This is just an ASCIIism derived from the shape of the lowercase letter (‘m’ → ‘n’ → ‘r’). I want newcomers to be able to write in the language without having to fiddle with their keyboard layout (though it is fun).

Then, there are the glottal consonants. ‘.’ is almost exactly the same as ‘.’ in Lojban. Notably, the dotside conventions for cmevla carry over as expected. With ‘h’, I decided to stick to the convention of many natural languages, which tend to have either /x/ or /h/, but not both. In Lojban, they're roughly in complementary distribution anyway. Instead of using Lojban's ‘'’, I use approximants to join vowels. It's your choice which of [x] and [h] you want to use, and you can mix them in your speech. Even English speakers may prefer [x] in certain circumstances.

The final thing to mention is about approximants. Unlike Lojban, I use on-glides regularly, so distinguishing between vowels and semivowels becomes necessary. Also, it should be noted that the approximants' positions on the chart are more to make the abstractions work, rather than to give phonetic information. ‘w’ and ‘j’ are the semivowels of ‘u’ and ‘i’, respectively. ‘l’ is any lateral or rhotic.

Now, the vowels:
|  | Front | Central | Back |
|---|---|---|---|
| Close | i |  | u |
| Mid | e | y | o |
| Open |  | a |  |

These are essentially the vowels of Lojban, and serve similar purposes. ‘y’ represents the schwa, which is morphologically distinct from the other vowels. In an IPAisation, it can be written as ‘ə’, but you may also want to change ‘e’ and ‘o’ to ‘ɛ’ and ‘ɔ’ to avoid them looking too similar.

The order of the vowels is “oiaue”. Tracing this out makes a star shape, thus guaranteeing maximum phonetic distance between vowels close in that sequence. The subsequence “iau” is also used (as it is in Lojban). You may or may not have noticed that the various classes of phonemes have different numbers of phonemes each. For instance, there are 3 approximants, 4 fricatives, 5 vowels and 6 plosives. These can come together to make sequences of semantically similar cmavo.

Where the Latin alphabet needs to be transcribed (like when using ISO 3166-1 codes for countries), the code letters are usually given sounds based on the ASCII orthography. I'll have to go into the morphology before describing this system, though. I will cover morphology and phonotactics soon, in that order.