
2014-08-29

Intervals

The current interval notation is notorious for clashing with other notations and for having weird unmatched brackets (the two cases being mutually exclusive). For instance, \(x\in[0,1]\) is ambiguous as to whether \(x\) is a fuzzy logic value (being a member of an interval) or a strict boolean value (being a member of a list). That's quite a large ambiguity for this otherwise useful notation. Also, the unacquainted can easily confuse whether a bracket is for an open end or a closed end. I've always managed to remember, but that's no excuse to keep things as they are.

I propose a simple alternative: replace \((\) with \({<}|\) and \([\) with \({\leq}|\), and reflect for the right-hand end. So, our fuzzy logic interval would be \({\leq}|0,1|{\geq}\). Intervals of integers can keep the double-dot notation, so \({\leq}|0..1|{\geq}\) represents the same thing as the list \([0,1]\). It's a list because it corresponds to a portion of \(\mathbb Z\), which I consider a list (see last post for justification).

That's about it. The best way to approximate this in ASCII is by writing things like “<=|0,1|=>”. It's better to keep the ‘=’ next to the ‘|’ at both ends; writing “<=|0,1|>=” invites the misparse “(<=|0,1|>) =”. And a note for \(\LaTeX\): put the ‘<’-like symbols in curly braces to avoid them being spaced as if they were infix operators.

2014-08-28

Zip lists

I've laid off the maths for a while in preparation for this post. You may remember that the last post ended in a rather unsatisfactory manner. Here, I try to rectify the situation. The heart of the issue is that I want to be able to express two different types of equality between functions: the weaker set-wise complete closure and the stronger point-wise extensional equivalence.

I define a function \(f\) as being completely closed over a set \(S\) iff it satisfies these conditions (in non-mapping notation):\[\begin{aligned}
\forall x\in S.\,& f\ x\in S \\
\forall y\in S.\,\exists x\in S.\,& f\ x=y
\end{aligned}\]In other words, the domain and image of \(f\) are equal. If \(S\) is a finite set, \(f\) essentially reorders the elements. If \(S\) is infinite, there may be other things that \(f\) could do. For example, if \(f=\text{succ}\wedge S=\mathbb Z\), \(f\) is a kind of translation function. In any case, \(f\) is surjective. My aim is to be able to write this in a concise and generalisable way in mathematical notation.
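For a finite \(S\), those two conditions just say that the image of \(f\), viewed as a set, equals \(S\). A minimal Haskell sketch of that check (the name completelyClosed and the examples are my own, not from the post):

    import Data.List (nub, sort)

    -- Completely closed over a finite domain: the image, as a set, equals the domain.
    completelyClosed :: Ord a => (a -> a) -> [a] -> Bool
    completelyClosed f domain = asSet (map f domain) == asSet domain
      where asSet = sort . nub

    closedExample, openExample :: Bool
    closedExample = completelyClosed (\x -> 4 - x) [1, 2, 3 :: Int]   -- True (a permutation)
    openExample   = completelyClosed succ          [1, 2, 3 :: Int]   -- False (image is {2,3,4})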

If two functions are extensionally equivalent, they do the same job, and thus one can replace the other without changing the logic of the situation. However, they may have different computational properties, and hence can't be considered to be totally equal. The conditions for this relation between \(f\) and \(g\) with respect to set \(S\) can be written:\[
\forall x\in S.\,f\ x=g\ x
\]This, too, I want to have a concise and generalisable way of writing, since it is a very common relation.

So, why are these related? Well, let's start naïvely applying the principles of the notation without variables. For extensional equivalence, it is intuitive to write \(f=g\). Now we want to limit that to some set, so maybe we say \((f=g)\ S\). That expands to \(f\ S=g\ S\) by the fork rule (since equality doesn't act upon functions). But, by implicit mapping, \(f\ S\) and \(g\ S\) are sets, so correspondences between individual elements are lost. All we've equated are the images of the two functions over a given domain. If \(g\) is the identity function, the statement is that \(f\) is completely closed over \(S\).

As the title suggests, the solution lies in using lists, rather than sets. These retain an order, so there is still correspondence between individual elements in the domain and image. In fact, a function can be specified as a pair of lists, as well as a set of pairs. To say that \(f\) and \(g\) are extensionally equivalent over domain \(S\), we just have to make sure that \(S\) is cast to be a list. In symbols, \((f=g)\ \text{list}\ S\).
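Haskell's ZipList applicative captures this list cast quite directly: mapping both functions over the same list keeps the results paired up, so they can be compared element by element rather than image-wise. A minimal sketch (the name eqOverList is mine):

    import Control.Applicative (ZipList(..))

    -- "Casting the domain to a list" keeps results paired up element-by-element,
    -- so two functions can be compared point-wise rather than image-wise.
    eqOverList :: Eq b => (a -> b) -> (a -> b) -> [a] -> Bool
    eqOverList f g domain =
      and . getZipList $ ((==) <$> (f <$> zs)) <*> (g <$> zs)
      where zs = ZipList domain

    -- eqOverList (* 2) (\x -> x + x) [0 .. 100 :: Integer]  ==  True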

Both \((f=g)\ \text{list}\ S\) and \((f=g)\ S\) are useful. \(D\ {}_\cdot^2=2\cdot{}\) is an example of the former, where the implicit argument for the expression is \(\text{list}\ \mathbb C\). \((\Pi=(!))\ \text{list}\ \mathbb N\) is an example of where the list argument cannot be reliably inferred, and has a significant effect on the truth of the statement. Finally, \((\text{floor}\ \frac 2 \iota=\iota)\ \mathbb N\), where \(\iota\) is the identity function (an example of a non-bijective completely closed function).

Lists can occur in other situations, too. For example, \(\pm\) can be formalised by saying that it returns a 2-element list (as too does \(\sqrt \iota\), but not \({}_\cdot^{2/1}\); see an earlier post). Explicitly, \(x\pm y=[x+y,x-y]\). This gives \(\cos(A\pm B)=\cos A\cdot\cos B\mp\sin A\cdot\sin B\), which boils down to an equivalence between two lists. In fact, lists are so useful that it may be worth assuming that the standard sets are lists, which can be converted to sets as needed. This makes more sense, too, because the \(\text{list}\) function essentially introduces new information, whereas functions should lose information (like \(\text{set}\) does on lists). Also, in general, expressions like \(\text{head}\ \text{list}\ S\) are meaningless. However, given the axiom of choice, \(\text{list}\) can be defined for any set; it's just that some operations on the result are meaningless.
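The \(\pm\) definition translates literally, reading the operators conventionally rather than with my flipped operands. A minimal sketch (the name plusMinus is mine):

    -- x ± y as a two-element list, reading + and - conventionally.
    plusMinus :: Num a => a -> a -> [a]
    plusMinus x y = [x + y, x - y]

    -- plusMinus 3 2 == [5, 1]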

2014-07-22

Filtration as implication

\(
\newcommand{\d}{\mathbin\cdot}
\newcommand{\abs}{\text{abs}}
\)In Marianesque mathematical notation, all mathematical statements are functions acting upon an implicit argument. This argument is taken to be the domain of the statement, and is hence a set. Implicit mapping does its magic, and everything is how we'd expect it. But often we want to make claims about part of the domain. In conventional mathematics, we'd say things like \(n>3\implies {}_\d^2n>9\). Indeed, we can still do that, but there are alternatives.

In this article, Jakub Marian defines square brackets to denote the filter function. It is defined thus for sets:\[
[p]\ xs = \{x:x\in xs\wedge p\ x\}
\]How do we use it? Well, Jakub translates the classical:\[
{}_\d^2x+{}_\d^2y=1\implies\abs\ (x\d y)\in\left[0,\frac 1 2\right]
\]into:\[
\abs\ (x\d y)\ [{}_\d^2x+{}_\d^2y=1]\subseteq\left[0,\frac 1 2\right]
\]where \(x\) and \(y\) are essentially the fst and snd functions, and the implicit argument is the set of all pairs of real numbers. I use \(\abs\) rather than vertical bars for two reasons: vertical bars pair up ambiguously (an opening bar looks just like a closing one), and I plan to have other uses for them in conjunction with brackets (to make syntax that looks similar to Haskell's quasi-quoters). Most notably, I need a notation for zip lists.
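To make the filtration idea concrete, here is a hedged Haskell sketch of the same claim over a finite grid standing in for the set of all pairs of reals; the grid, the tolerance, and all of the names are mine, purely for illustration:

    -- Filtration as restriction of the (implicit) domain: keep only the pairs
    -- satisfying the predicate, then state the claim over what is left.
    filterSet :: (a -> Bool) -> [a] -> [a]
    filterSet p = filter p

    -- A finite stand-in for the set of all pairs of reals.
    pairs :: [(Double, Double)]
    pairs = [ (x, y) | x <- grid, y <- grid ]  where grid = [-1, -0.9 .. 1]

    -- Approximately on the unit circle (exact equality is hopeless in floats).
    onCircle :: (Double, Double) -> Bool
    onCircle (x, y) = abs (x * x + y * y - 1) < 1e-9

    claim :: Bool
    claim = all (\(x, y) -> abs (x * y) <= 0.5) (filterSet onCircle pairs)   -- True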

Getting back on topic, I think that this notation can be taken further. Notice how the \(\abs\ (x\d y)\) and \(\left[0,\frac 1 2\right]\) have become split up. This is not ideal. So, we can rephrase the whole thing as this:\[
\left(\abs\ (x\d y)\in\left[0,\frac 1 2\right]\right)\ [{}_\d^2x+{}_\d^2y=1]
\]I'd like to say that this has some new and interesting properties, but I can't find any right now. It does, though, give a different way of thinking about implication. It's more visual than the pure logical version, given that it uses sets (which can act like spaces).

I'd like to make some notes about relations. To make things work well, usual equality – along with the greater-than and less-than relations – is not defined for functors. This is because equality between functions makes more sense when the domain is considered. We should be able to say \((\Gamma\ \text{succ}=\text{fac})\ \mathbb{N}\). But if equality also worked on sets, our claim would become \(\Gamma\ \text{succ}\ \mathbb{N}=\text{fac}\ \mathbb{N}\), which is a weaker claim than we wanted. Hence, we have to map from the inside out, mapping over the functions, then the set on the outside. \(\equiv\) is provided for equality between anything other than functions.

An alternative that always uses \(\equiv\) is to cast sets to zip lists before forking. \((\Gamma\ \text{succ}\equiv\text{fac})\ \text{zipList}\ \mathbb{N}\) is still the stricter claim. This is because lists maintain their order, so elements can be compared pairwise. However, I'm not sure whether it's realistic to expect people to remember to cast sets to lists just to allow forking. I think, though, that this is preferable, since the result is \(\top\) or \(\bot\), not \(\{\top\}\), \(\{\top,\bot\}\) or \(\{\bot\}\) as it would be with the set solution. Maybe we should default to zip lists, even for things like \(\mathbb{C}\), then cast them to sets (or just non-zip lists) as necessary.

2014-07-04

Multi-valued functions

\(
\newcommand{\r}{\rightarrow}
\newcommand{\R}{\mathbb R}
\newcommand{\d}{\mathbin{\cdot}}
\)In the last post, I mentioned that the integration function returns the set of all functions of which the argument is the derivative. To reïterate, \(\int : (\R\r\R)\r\{\R\r\R\}\). There's nothing strange about a function returning a set of values, so everything's good so far.

What we need are ways of dealing with these results. The typeclass that helps us here is the monad. For our purposes, a monad \(m\) is just an applicative functor with a suitable \(\text{join} : m\ (m\ a)\r m\ a\) function. For sets, this function takes a set of sets, then concatenates the inner sets. For functions, we infer from the type \((r\r(r\r a))\r(r\r a)\), equivalently \((r\r r\r a)\r r\r a\), that \(\text{join}\ f\ x = f\ x\ x\). The function monad isn't immediately relevant, but it's there.
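In Haskell terms (with lists standing in for sets, since Data.Set has no Monad instance), the two joins look like this:

    import Control.Monad (join)

    -- For the list (here, standing in for set) monad: concatenate the inner collections.
    joinedSets :: [Int]
    joinedSets = join [[1, 2], [3], [4, 5]]          -- [1,2,3,4,5]

    -- For the function (reader) monad: join f x = f x x.
    joinedFn :: Int
    joinedFn = join (+) 5                            -- 10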

Now we need to define some more multi-valued functions. The obvious candidates are roots. Superscript-operator pairs are defined by
\[\begin{aligned}
{}_*^1x &{}= x \\
{}_*^nx &{}= x*{}_*^{1-n}x
\end{aligned}\]for non-zero natural \(n\). This, for some operators and arguments, can be extended to cover non-natural \(n\), but should still return a value of the same type as \(x\). Thus, these cannot be multi-valued functions. Instead, roots written with the radical symbol, which are not defined this way, can be defined as multi-valued functions in the obvious way.
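Reading the flipped index \(1-n\) as the conventional \(n-1\), a superscript-operator pair over a natural number is just a right-nested fold of the operator across \(n\) copies of the argument. A minimal Haskell sketch (the name opPower is mine):

    -- {}_*^n x : combine n copies of x with the operator *, for n >= 1.
    opPower :: (a -> a -> a) -> Int -> a -> a
    opPower op n x = foldr1 op (replicate n x)

    cubeOfTwo :: Integer
    cubeOfTwo = opPower (*) 3 2            -- 8, i.e. the cube of 2

    addThree :: Integer
    addThree = opPower (.) 3 (+ 1) 0       -- 3, i.e. (+ 1) composed three times, applied to 0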

It may be worth defining an \(\text{inv}\) function which returns the set of all possible inverse functions, rather than just the simplest one (as \({}_\circ^{1-0}\) does, by the above logic). So \(\int = \text{inv}\ D\) and \(\sqrt{\large\iota} = \text{inv}\ {}_\d^2\).

We have some functions, so let's try something simple. We can work out \(\sqrt[4]x\) by doing \(\text{join}\ \sqrt{\sqrt x}\):
\[\begin{aligned}
\sqrt[4]{16} &{}= \text{join}\ \sqrt{\sqrt{16}} \\
&{}= \text{join}\ \sqrt{\{4,4-0\}} \\
&{}= \text{join}\ \{\{2,2-0\},\{2\d i,2\d i-0\}\} \\
&{}= \{2,2-0,2\d i,2\d i-0\}
\end{aligned}\]Notice that I used implicit mapping, but implicit joining isn't a thing. We could want a set of sets, and implicit joining would screw that up.
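The same computation goes through with the list monad standing in for sets; the complex square roots are computed numerically, so the results are only accurate up to floating-point noise (all names are mine):

    import Data.Complex (Complex)
    import Control.Monad (join)

    -- Both square roots of a complex number, as a two-element "set" (list).
    sqrts :: Complex Double -> [Complex Double]
    sqrts z = [s, negate s]  where s = sqrt z

    -- Fourth roots: map sqrts over the square roots, then join (concatenate).
    fourthRoots :: Complex Double -> [Complex Double]
    fourthRoots = join . fmap sqrts . sqrts

    -- fourthRoots 16 ≈ [2, -2, 2i, -2i], up to floating-point noise.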

Oh, and here's a relevant Red Dwarf joke:
In answering the question “What is a monoid in the category of endofunctors?” – write bigger – there are various words that need to be defined. What is a category, what is one of endofunctors, why is it one of endofunctors, and why is it so frequently linked with monoids?

What the hell is a monoid?
And there concludes the worst monad tutorial ever.

2014-07-02

Back to the old way, part 2

\(
\newcommand{\id}{{\large\iota}}
\newcommand{\d}{\mathbin{\cdot}}
\newcommand{\R}{\mathbb{R}}
\newcommand{\r}{\rightarrow}
\)Remember that division and subtraction have their operands switched. I explained it in my first post, but it's worth reïterating because things would be rather confusing on here if you forgot. Remember also that they bind right-first.

Say we have:
\[
\frac{Dx}{Dy} = x + \sin x
\]and want to find an expression for \(y\). We can do it in a pretty orthodox way:
\[\begin{aligned}
\frac{Dx}{Dy} &{}= (\id+\sin)\ x \\
Dy &{}= (\id+\sin)\ x\d Dx \\
y &{}\in \int((\id+\sin)\ x\d Dx) \\
&{}\in \int\left(D\left(\left(\frac{2}{{}_\d^2}+\cos-0\right)\ x\right)\right) \\
&{}\in \left(\frac{2}{{}_\d^2}+\cos-0\right)\ x + \R \\
&{}\in \frac{2}{{}_\d^2x}+\cos x-\R
\end{aligned}\]The Jakub Marian method of integration doesn't need a designated variable, but here I have emulated one. The bonus, of course, is that our intuitions get formal grounding; we really were multiplying by \(Dx\).
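Read conventionally, the result says \(y = x^2/2 - \cos x + C\). A quick numerical sanity check of that antiderivative in Haskell (the step size and test points are arbitrary choices of mine):

    -- Candidate solution (with C = 0) and the right-hand side of the equation.
    y, rhs :: Double -> Double
    y   x = x * x / 2 - cos x
    rhs x = x + sin x

    -- Central-difference approximation of the derivative of y.
    dy :: Double -> Double
    dy x = (y (x + h) - y (x - h)) / (2 * h)  where h = 1e-6

    -- Should all be tiny: the numerical derivative of y matches x + sin x.
    residuals :: [Double]
    residuals = [ abs (dy x - rhs x) | x <- [-3, -2 .. 3] ]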

Separation of variables is an obvious extension, where one of the intermediate steps leaves integrals on both sides. So, consider:
\[\begin{aligned}
\frac{Dx}{Dy} &{}= 2\d x\d y \\
\frac{y}{Dy} &{}= 2\d x\d Dx \\
\int\frac{y}{Dy} &{}= \int(2\d x\d Dx) \\
\ln y &{}\in {}_\d^2x+\R \\
y &{}\in \exp{}_\d^2x\d(0,\infty)
\end{aligned}\]Notice that, like Jakub, I interpret indefinite integration as the complete inverse of differentiation. \(\int : (\R\r\R)\r\{\R\r\R\}\). I haven't thought of a consistent and satisfying way of writing the result of this, which is why I end up switching from \(=\) to \(\in\) suddenly.

Back to the old way

\(
\newcommand{\R}{\mathbb{R}}
\newcommand{\C}{\mathbb{C}}
\newcommand{\r}{\rightarrow}
\newcommand{\d}{\mathbin{\cdot}}

\)Leibniz' way of expressing the derivative of one implicit function with respect to another was to write \(\frac{dy}{dx}\) – intuitively the infinitesimal change in \(y\) divided by the infinitesimal change in \(x\), giving a gradient. Of course, when division is flipped, this becomes \(\frac{dx}{dy}\), putting the variables in the familiar \((x,y)\) order.

So, now imagine that \(x\) isn't the independent variable, and we instead have an unnamed independent variable that can be used via \(\large\iota\). The derivative of \(x\) is just known as \(Dx\), and the derivative of \(y\), with respect to the independent variable, is \(Dy\).

Let's do something easy. Let \(y = x + \sin x\). Notice that \(\sin x\) becomes \(\text{sin}\circ x\) when we evaluate the expression, since \(x\) is an \(\R\r\R\) function (or \(\C\r\C\) or whatever). Also, the whole thing is forked over \(+\). We can rewrite the equation as \(y = ({\large\iota} + \sin)\ x\). Then we can differentiate, using the chain rule: \(Dy = (1 + \cos)\ x\d Dx\). Magically, we're left with:
\[
\frac{Dx}{Dy} = 1 + \cos x
\]Whereas it's a sin in normal mathematics to throw around solitary \(dx\)s, \(Dx\)s are just ordinary functions. You can do whatever you want with them.

Let's try something a little more interesting. Consider:
\[\begin{aligned}
{}_\d^2x+{}_\d^2y&{}=1&& \\
2\d x\d Dx+2\d y\d Dy&{}=0&& \quad \text{applied }D\text{ to both sides} \\
\frac{y}{x}+\frac{Dx}{Dy}&{}=0&& \quad \text{applied }(2\d y\d Dx/)\text{ to both sides} \\
\frac{Dx}{Dy}&{}=\frac{y}{x}-0&& \quad \text{applied }\left(\frac{y}{x}-\right)\text{ to both sides}
\end{aligned}\]Notice that curried functions are useful for documentation. They give us the option to be more symbolic, rather than using verbose phrases like “divide through by ...” which have to be translated before we can see them interacting with the equation at hand.

2014-07-01

Forcing rank

\(
\newcommand{\len}{\text{len}}
\newcommand{\r}{\rightarrow}
\newcommand{\d}{\mathbin{\cdot}}
\)By default, functions that act upon functors are not mapped at all. So, for instance, \(\len\ \{\{1,4,6,9\},\{2\},\{5,7\}\} = 3\). But what if what we actually wanted was \(\{1,2,4\}\)? (Notice that the set has been resorted. Notice also that if two of the inner sets had the same length, the result would have been shorter than the initial set, since duplicates are removed.)

This is where we use the prefixed subscript. This acts differently based on whether or not the number is negative (it must be an integer, by the way). For positive \(n\), \({}_nf\) is the function \(f\) applied to the parts of rank \(n\) with respect to the functor \(f\) handles. \({}_{n-0}f = {}^n\text{fmap}\ f\). \({}_0f\) is \(f\) applied as if \(f\) didn't handle the functor it does (the values it gets will be implicitly lifted).

But what is rank? I should have explained this before. The rank of a value, with respect to some functor, is how many times functors of that type can be “unwrapped” from the value. For example, a value of type \(\{\{\mathbb{N}\}\}\) (as in the first example) has rank 2, since it is wrapped twice in the Set functor. A value of type \((a\r b\r c)\r(a\r b)\r a\r c\) has rank 3, since it has been wrapped in the Function functors \(((a\r b\r c)\r)\), \(((a\r b)\r)\) and \((a\r)\). This is the type of the \(S\) combinator, \(S\ f\ g\ x = f\ x\ (g\ x)\), which, as you can see, takes 3 arguments.
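In Haskell terms, applying one rank lower is one extra layer of fmap. A sketch of the opening example, with lists standing in for sets (so I sort and deduplicate by hand; all names are mine):

    import Data.List (nub, sort)

    nested :: [[Int]]
    nested = [[1, 4, 6, 9], [2], [5, 7]]

    -- Applied at the top (no mapping): the length of the outer collection.
    outerLen :: Int
    outerLen = length nested                          -- 3

    -- Applied one rank down: one fmap, then dedupe/sort to mimic a set.
    innerLens :: [Int]
    innerLens = sort (nub (fmap length nested))       -- [1,2,4]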

So, to achieve the result of our first example, we can do either \({}_1\len\ A\) (applying to the rank-1 cells) or \({}_{1-0}\len\ A\) (applying to the cells 1 rank below the given value's rank). We can also distinguish between \({}^2D\left({}_\d^3\right)\) and \({}_1{}^2D\left({}_\d^3\right)\):
\[
 \begin{aligned}
{}^2D\left({}_\d^3\right) &{}= (DD)\left({}_\d^3\right) \quad & {}_1{}^2D\left({}_\d^3\right) &{}= \text{fmap}\ {}^2\ D\ {}_\d^3 \\
&{}= D\left(D\left({}_\d^3\right)\right) & &{}= {}^2\left(D\left({}_\d^3\right)\right) \\
&{}= D\left(3\d{}_\d^2\right) & &{}= {}^2\left(3\d{}_\d^2\right) \\
&{}= (6\d{}) & &{}= \left(3\d{}_\d^2\right)\left(3\d{}_\d^2\right) \\
&&&{}= 3\mathbin{\d}{}{}_\d^2\left(3\d{}_\d^2\right) \\
&&&{}= 27\d{}_\d^4
\end{aligned}
\]Okay, it's rare to use this explicitly. But it's there.

Multiplicative groups

When using this notation, I always write multiplication explicitly, so the descriptor “multiplicative” makes no sense for structures whose binary operator is left unwritten.

The implicit operator is application. Therefore, we may want to make applicative groups, where applying one element to another produces a new element. So, we have elements of type \(t\), where \(t = t\rightarrow t\). But this is bad. It means that our elements have infinite type, and that opens us up to Curry's paradox.

Instead, we give all elements type \(a\rightarrow a\), and make a compositional group. For example, let's look at \(C_3\), with elements \(\{e,f,g\} : \{G\rightarrow G\}\). The group table gives us ways to rewrite any composition of these functions, like \(f\circ g = e\).

So, what happens when we don't write the \(\circ\) operator? Implicit mapping comes into play. In \(fg\), \(f : G\rightarrow G\) expects a \(G\), but gets \(g : G\rightarrow G\). (It's important to note that the \(G\)s in each type signature are the same. \(G\) can't match any type expression non-trivially containing \(G\).) Fortunately, \((G\rightarrow)\) is a functor, so \(fg\) becomes \(\text{fmap}\ f\ g\), which is \(f\circ g\).
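A concrete version of that \(C_3\), taking the carrier \(G\) to be residues mod 3 and each element an endofunction; composition plays the role of the unwritten operator (a hedged sketch, with all names mine):

    -- Elements of C3 as endofunctions on G = residues mod 3.
    e, f, g :: Int -> Int
    e x = x `mod` 3
    f x = (x + 1) `mod` 3
    g x = (x + 2) `mod` 3

    -- "fg" is fmap f g for the (G ->) functor, i.e. f . g; here f . g = e.
    check :: Bool
    check = all (\x -> (f . g) x == e x) [0, 1, 2]    -- True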

This way of describing group-like structures happens to be very similar to the way of defining monoids (and hence groups) in category theory. A monoid is defined as a category with a single object and some arrows, one of which is the identity arrow. Necessarily, all of the arrows go from and to the object. In my example, the object corresponds to type \(G\), and the elements each correspond to an arrow.


(I wish I had a digit negative 1 right now!)

One thing to note is that one should not use this notation unless the structure has been proven associative. Function composition is associative, so using it as the operation in a non-associative structure would be wrong.

2014-06-29

Implicit mapping

The last post was about what happens when one tries to apply a function to a value that is too small (not wrapped in an applicative functor). This post is about what happens when the value is too big (wrapped in too many functors). Jakub Marian defines (when translated into my notation):
\[\begin{aligned}
(f\ g)\ x &{}= f\ (g\ x) \\
f\ A &{}= \{f\ x : x\in A\}
\end{aligned}\]I posted the applicative functor definition in the last post, but here we only need a regular functor. As far as I understand it, a functor is what comes directly from category theory, and an applicative functor just adds some functions that they nearly always have. There are very few functors that cannot be made into applicative functors.

\(
\newcommand{\fmap}{\text{fmap}}
\)A functor \(f\) provides us with just:
\[
\fmap : (a\rightarrow b)\rightarrow f\ a\rightarrow f\ b
\]For sets, \(\fmap\) applies a function to every element of the set (eliminating duplicates afterwards), and for functions, it composes the given function (type \((a\rightarrow b)\)) to the left of the functor's function (type \(f\ a\), which is \(t\rightarrow a\)). I encourage you to see why the types work for the latter example.
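Both of those fmaps exist in Haskell more or less verbatim; Data.Set.map has to stand in for fmap on sets, since Set isn't a Functor instance in the typeclass sense. A minimal sketch:

    import qualified Data.Set as Set

    -- fmap for sets: apply the function to every element; duplicates collapse.
    setExample :: Set.Set Int
    setExample = Set.map (`div` 2) (Set.fromList [1 .. 6])   -- fromList [0,1,2,3]

    -- fmap for functions ((t ->)): compose on the left.
    fnExample :: Int -> Int
    fnExample = fmap (* 2) (+ 1)                              -- \x -> 2 * (x + 1)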

\(
\newcommand{\neg}{\text{neg}}
\newcommand{\R}{\mathbb R}
\)So, post done? Pretty much. It's maybe worth going through the example of \(\neg\ {}_\cdot^{\neg\ 1}D\), where \(\neg : \R\rightarrow\R\) is the negation function and \(D : (\R\rightarrow\R)\rightarrow \R\rightarrow\R\) is the differentiation function (and the superscript-subscript notation is described in the initial post). Of course, limiting to real numbers is arbitrary; I just needed a number type.

Firstly, we try to apply \(\neg : \R\rightarrow\R\) to \({}_\cdot^{\neg\ 1} : \R\rightarrow\R\). \(\R\rightarrow\R\) doesn't match \(\R\), but it does match \(f\ \R\), where \(f = (a\rightarrow)\) (here with \(a = \R\)). So, we transform \(\neg\ {}_\cdot^{\neg\ 1}\) into \(\fmap\ \neg\ {}_\cdot^{\neg\ 1}\), which is \(\neg\circ{}_\cdot^{\neg\ 1}\), the negative reciprocal function. Let's call it \(\text{negrec} : \R\rightarrow\R\).

Next, we try to apply that to \(D : (\R\rightarrow\R)\rightarrow \R\rightarrow\R\). \(D\)'s type certainly doesn't match the expected \(\R\), but it does match \(h\ (g\ \R)\), where \(g = (b\rightarrow)\) (where \(b = \R\)) and \(h = (c\rightarrow)\) (where \(c = \R\rightarrow\R\)). Note that it doesn't match \(i\ \R\) where \(i = ((\R\rightarrow\R)\rightarrow\R\rightarrow)\) because \(\rightarrow\) is not associative (it binds to the right). So, \(\text{negrec}\ D\) becomes \(\fmap\ (\fmap\ \text{negrec})\ D\), or \((\text{negrec}\circ{})\circ D\).

That's as far as we can go, so let's apply this function to, say, \(\exp\) and \(0\). \[\begin{aligned}
((\text{negrec}\circ{})\circ D)\ \exp\ 0
&{}= ((\text{negrec}\circ{})\ (D\ \exp))\ 0 \\
&{}= (\text{negrec}\circ\exp)\ 0 \\
&{}= \text{negrec}\ (\exp 0) \\
&{}= \text{negrec}\ 1 \\
&{}= \neg\ 1
\end{aligned}\]And that tells us that the gradient of the normal to the exponential curve at 0 is negative 1.
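A numerical sanity check of that conclusion, with a finite-difference approximation standing in for \(D\) (so the answer is only close to negative 1; the names and step size are mine):

    -- Negative reciprocal, and a crude numerical differentiation operator.
    negrec :: Double -> Double
    negrec = negate . recip

    numD :: (Double -> Double) -> Double -> Double
    numD f x = (f (x + h) - f (x - h)) / (2 * h)  where h = 1e-6

    -- ((negrec .) . numD) exp 0 : gradient of the normal to exp at 0.
    normalSlope :: Double
    normalSlope = ((negrec .) . numD) exp 0          -- ≈ -1.0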

Implicit value lifting

Jakub Marian's notation without variables has a definition I call “the fork rule”:
\[
(f\star g)\ x = f\ x\star g\ x
\]It is also made clear that if either \(f\) or \(g\) is a constant, the constant should be preserved. For instance, the following has to hold:
\[
({\large\iota}+1)\ 2 = {\large\iota}\ 2+1\ 2 = 2 + 1 = 3
\]where \(\large\iota\) is the identity function. Obviously, the thing we need to sort out is \(1\ 2=1\).
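In Haskell, the fork rule for functions is exactly liftA2 for the reader applicative. A minimal sketch with two genuine functions (the name fork is mine); the constant case is what the rest of this post sorts out:

    import Control.Applicative (liftA2)

    -- The fork rule for functions: (f ★ g) x = f x ★ g x.
    fork :: (b -> c -> d) -> (a -> b) -> (a -> c) -> a -> d
    fork = liftA2

    example :: Integer
    example = fork (+) (^ 2) negate 3    -- 6, i.e. 9 + (-3)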

My first idea was that all constants should be turned into constant functions. Essentially, “1” becomes a function that always yields the value 1 (but not the function 1). However, I realised that this was unsatisfactory. Of minor importance was the fact that it felt weird to never have access to the underlying object. But the main point is that I want to generalise the ideas of constants and constant functions to elements and single-element sets. If every constant were a function, it would also need to be a set, and a zip list, and every other similar structure we want to use. There would have to be some order to the wrapping of all of these structures, but there is no natural order.

So instead I say this: values can be implicitly lifted wherever an applicative functor is expected. As an introduction, the applicative functor \(f : \text{Type} \rightarrow \text{Type}\) has definitions for these functions:
\[\begin{aligned}
\text{fmap}\ &: (a\rightarrow b)\rightarrow f\ a\rightarrow f\ b \\
\text{pure}\ &: a\rightarrow f\ a \\
{<}{\ast}{>}\ &: f\ (a\rightarrow b)\rightarrow f\ a\rightarrow f\ b
\end{aligned}\]So \(f\) acts like a wrapper on other types, an idea played on by this functor/applicative/monad introduction. We won't need monads here (yet), but they are related.

For this post, we're only interested in \(\text{pure}\). For \(\text{Set}\), \(\text{pure}\ a = \{a\}\), and for types \((t\rightarrow)\) (functions from arbitrary type \(t\)), \(\text{pure}\ a = \lambda x.a\). So there we have our singleton set and constant function connected. (Note for functional programmers: sets are indeed functors in maths, since any things of the same type in maths can be equated. Mathematicians don't care for such worthless notions as computability.)

To address \(1\ 2 = 1\), we'll write it out with an explicit application operator, \($\). So, we consider \(1\mathbin$2\). \($\) has type \((a\rightarrow b)\rightarrow a\rightarrow b\), so we're hoping that \(1\) fits \(a\rightarrow b\) and \(2\) fits \(a\). \(2\) does indeed fit \(a\) (anything does), so we need \(1\) to fit \(\mathbb{R}\rightarrow b\). It doesn't. But it does fit \(b\), and \((\mathbb{R}\rightarrow)\) is an applicative functor. So we transform \(1\mathbin$2\) into \(\text{pure}\ 1\mathbin$2\), which is \((\lambda x.1)\ 2\), which is \(1\). Yay!
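The same derivation in Haskell, where pure for the function applicative is the constant function (a minimal sketch; names are mine):

    import Control.Applicative (liftA2)

    -- pure for the (Double ->) applicative is a constant function, so the
    -- lifted 1, applied to 2, gives back 1.
    lifted :: Integer
    lifted = (pure 1 :: Double -> Integer) 2          -- 1

    -- Combined with the fork rule: (ι + 1) 2 = ι 2 + pure 1 2 = 3.
    forked :: Double
    forked = liftA2 (+) id (pure 1) 2                 -- 3.0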

So we can now use constants where functions were wanted, elements where sets were wanted and values where maybes were wanted. That seems enough for this post.

2014-06-28

J meets Haskell, for maths

This is the introductory post on a few of my ideas for mathematical notation. Some are more useful or realistic than others (with no particular correlation between the two). It's based on Jakub Marian's notation without variables, so big thanks to Jakub.

Here are some ways in which my notation differs from Jakub's:
  • Juxtaposition means function application. So it's \(f\,x\) as much as it's \(\sin x\). This is the only feasible notation for application of curried functions, other than having some explicit left-associative infix operator. And because I plan on using functions a lot (as JM does), brevity of application is important. Function application is the closest-binding operation.
  • Juxtaposition doesn't mean multiplication. So it's \(2\cdot x\) as much as it's \(2\cdot 3\). There's nothing special and fundamental about multiplication, so it doesn't have particularly special notation.
  • Implicit mapping is better-defined. When applying a function to some functor that it doesn't handle, the function is \(\text{fmap}\)ed until it can match its argument. Otherwise, it's not \(\text{fmap}\)ed at all. There's a bit more to it, and it probably deserves its own post.
  • Implicit mapping is controlled via subscripted numbers before the function. The numbers work in the same way as array ranks in J (the " conjunction), but also work for mapping over other functors, like functions. It's rare to actually use these, but they're there for anyone who needs them.
  • Exponentiation is generalised, and the exponent is written before the base. \(_\cdot^2x\) (notice the dot) denotes the square of \(x\), and \(_\circ^\infty\cos 0\) denotes the fixpoint of \(\cos\), starting from \(0\) (\(\circ\) is the explicit function composition operator). The subscript-superscript pair is considered a function, so the usual precedence and mapping rules for function application apply. I haven't decided how this handles non-associative operators.
  • Subtraction and division have their operands switched. I wrote a Quora answer about this, so I won't explain it here, yet. This also requires that both are right-associative (though fixity of division is not used much).
    • This also applies to the \(\bmod\) function, now also written in prefix (like any other function with a name made of letters). \(\bmod 2\ 5=1\). An alternative is to use \(\mathbin\%\) (ugly) or repurpose \(\mathbin|\) (like J). Both of those are written with \mathbin in \(\LaTeX\), which adds spacing.
    • Limits of integration are also swapped.
  • Elision of arguments implies partial application, not use of a default argument. So it's \(2-0\), like it's \(\frac{2}{1}\) (remember, they both got flipped, so \(2-0\) is negative 2). This may seem annoying, but being able to use partially-applied infix operators has its own rewards.
  • Oversetting and undersetting can be used for composition around infix operators. \(x\mathbin{\underset\log+}y=\log x+\log y=\log(x\cdot y)=x\mathbin{\overset\log\cdot}y\). Ironically, this nifty non-linear notation comes from two J conjunctions. This is possibly most useful for defining weaker equality operators, as seen in modular arithmetic (\(5\mathbin{\underset{\bmod 2}=}3\)); there's a small Haskell sketch of the undersetting idea just after this list.
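As an aside, undersetting around an infix operator corresponds to Haskell's on combinator from Data.Function; a minimal sketch (the name underLog is mine, and the two sides agree only up to floating-point error):

    import Data.Function (on)

    -- x (+ under log) y = log x + log y, which is (+) `on` log.
    underLog :: Double -> Double -> Double
    underLog = (+) `on` log

    -- underLog 100 1000 == log (100 * 1000), up to floating-point error.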
Numerals are separate from everything I've listed here, and are not really relevant. For the record, I prefer balanced base twelve at the moment, but I'm going to try balanced base twenty-four soon. Among other advantages, balanced bases let small negative numbers be written with a single character, rather than three (as I propose).

Of all these, switching of parameters is probably the most jarring and incompatible with standard maths notation. It takes a while to get used to, so I'll try to spare you from it as much as possible on this blog. The other large differences mostly don't clash with existing notation, and others don't have counterparts in the current system.