Monday, April 8, 2013

The Reduction Engine

This is a first, rough version of the reduction engine:

The reduction engine on one page

Usage

  • Inserting:
    • Ctrl+Click makes a new tree.
    • +, -, * Insert an arithmetic operator.
    • S, K, I, Y Insert a combinator.
    • . Insert a function application.
    • 0..9, # Insert a number.
  • Modifying:
    • e Partially apply
    • C Clear subtree
    • i Insert argument
  • Reducing:
    • r Reduce at current node
  • Applicatives (see explanation below -- basically work like lambdas):
    • b Bury
    • c Collect
  • Moving around:
    • Down Move to child
    • Up Move to parent
    • Left Cycle selection left
    • Right Cycle selection right
    • Click to select a particular node.
  • Dragging things around visually:
    • Shift+Up,Down,Left,Right Move subtree.
    • Drag with mouse also moves subtree.
    • f Reformat subtree.

The list to the right allows you to jump to past editing states.

General Description

The reduction engine is a small, interactive editing environment for playing around with combinatory expressions as well as applicative expressions. Applicative expressions as they are handled in the reduction engine are one of its unique features, and are described below.

The use of "holes" in the tree is inspired by holes in Agda.

Applicative expressions generalize lambda expressions

How to think about functors and applicative functors

A description of applicative functors in Haskell

Conor McBride, Ross Paterson. Idioms: applicative programming with effects

Functors

A functor is like a way of operating on data that has been wrapped up inside another abstraction. That is, it takes ordinary operations on plain data, and makes it so they can operate on the imaginary data inside the abstraction.

The List functor is a good example: it gives a way of operating on elements inside a list. Say f is the operation "add 2 to a number", and we have a list

[ 1, 2, 3 ]

The list functor gives us a way to take f and make it work on the elements of the list, giving

[ 3, 4, 5 ]
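
In Haskell, this power is fmap; a quick sketch with the built-in list functor:

fmap (+ 2) [ 1, 2, 3 ]
-- gives [ 3, 4, 5 ]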

Another example is the function functor. If I have a function

g  =  x -> x * 7

and I want to apply f to this, the function functor says that the way we do this is to move the abstraction (x ->) one place further to the outside, giving us

x  ->  (x * 7) + 2
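
Haskell's function functor works exactly this way: fmap on functions is composition, so lifting f over g is a one-liner:

-- fmap for functions is (.), so this is x -> (x * 7) + 2
fmap (+ 2) (\x -> x * 7)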

That's all for functors: the power to operate on data inside another abstraction. Applicatives will add another power.

Applicatives

Applicatives have 2 powers:

  1. Operate on data inside the abstraction -- the same power as functors have.

  2. Mix abstracted data from 2 sources.

I have omitted one of the powers usually ascribed to Applicatives. More on that below.

Mixing abstracted data from 2 sources

Supposing we have two lists,

[ 1, 2, 3 ]
[ 4, 5, 6 ]

and we would like, Matlab-style, to add them together element-by-element, to get:

[ 5, 7, 9 ]

It's a pretty common thing to want to do, which is why Matlab supports it so pervasively. The problem is, with just power #1, we can't bring two lists together. So power #2 says: you can take an operation on two things (like adding two numbers together) and provide for it as arguments two functor-embedded pieces of data.
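
In Haskell this power is liftA2 (or the <*> operator). One hedge: Haskell's default list Applicative pairs every combination of elements, so the element-by-element, Matlab-style reading needs the ZipList wrapper:

import Control.Applicative (ZipList(..), liftA2)

sums :: [Int]
sums = getZipList (liftA2 (+) (ZipList [ 1, 2, 3 ]) (ZipList [ 4, 5, 6 ]))
-- gives [ 5, 7, 9 ]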

So now we have a way to operate on functor-embedded objects one at a time, and two at a time.

Now if you imagine an expression

       f
      / \
    [x]  g
        / \
       y  [z]

where [x] and [z] are inside the functor, y is a plain, unfunctored object, and f and g are some operations, and say we would like to evaluate it. We apply power #1 to evaluate g(y, z), and the result ends up inside the functor. Now f has two children inside the functor, so we apply power #2 to evaluate f(x, ...), and there we go. And if every expression you could ever want to evaluate can be represented as a binary tree, then by applying just powers #1 and #2 you can evaluate any applicative expression regardless of how polluted with functors it is -- provided we stick to only one kind of functor (and no nesting).
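
Here is that evaluation sketched in Haskell, with Maybe standing in for the functor and (+) and (*) as hypothetical choices for f and g:

import Control.Applicative (liftA2)

result :: Maybe Int
result = liftA2 (+) (Just 10) inner   -- power #2: both of f's children are in the functor
  where
    y     = 3                         -- the plain, unfunctored object
    inner = fmap (y *) (Just 4)       -- power #1: evaluate g(y, z) inside the functor
-- result is Just 22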

How did the data get inside the functor in the first place?

Usually, applicatives are assigned a third power, which is the power to take a plain object and bring it into the functor in the simplest way possible, e.g. 2 becomes [ 2 ]. This power is essential to writing generic, recursive programs with applicatives, but it is a little superfluous in the context of the Reduction Engine, because every functor, applicative or not, ought to have some way of getting data inside it; otherwise you wouldn't be trying to operate with that functor in the first place.
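
In Haskell this third power is pure, and "the simplest way possible" looks like:

pure 2 :: [Int]       -- [ 2 ]
pure 2 :: Maybe Int   -- Just 2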

So what you can imagine is, we have these big expression trees,

         .
        / \
       .   \
      / \   \
             .
            / \
           .
          / \

and somehow, despite our best efforts to keep them out, functors manage to slip in there -- maybe as subtly as the risk of finding a zero when dividing. But when we have the applicative powers #1 and #2, we can go on evaluating the expression as if nothing ever happened, and only at the end, when we want to look at what we finally got, do we see that there was a functor involved in there somewhere.
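
As a sketch of that story in Haskell, take the division risk with Maybe as the functor that slips in (safeDiv is a hypothetical helper, not a standard function):

import Control.Applicative (liftA2)

safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv a b = Just (a `div` b)

-- Maybe slips in at the divisions, but powers #1 and #2 let us keep
-- evaluating as if nothing ever happened.
expr :: Maybe Int
expr = fmap (+ 1) (liftA2 (+) (safeDiv 10 2) (safeDiv 9 3))
-- expr is Just 9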

Applicatives in the Reduction Engine

The preceding discussion sort of gives the impression that the applicative powers can be applied "automatically" to the tree, Matlab-style, without the programmer having to type them in. In reality, no programming language supports that fully, though some have varying degrees of syntactic sugar that take a bite out of the problem. Unfortunately, there is actually a good reason why programming languages don't do this, though that reason is not quite as good as people usually believe.

The reason is basically that sometimes human judgment is required. The applicative combinators can't be applied automatically because it is not always clear what the programmer actually meant. If I write

    f
   / \
 [x] [y]

where [x] and [y] are trapped in a functor, let's say the same functor, your instinct is to apply power #2, but on reflection you see that there are actually 6 different things the programmer might have meant:

  1. Apply power #2, mixing the two functors.
  2. Apply power #1, treating [x] as inside a functor but [y] as an aggregate object (think [1, 3] + length(ls)).
  3. Apply power #1 the other way, treating [x] as aggregate and [y] as inside.
  4. Treat both [x] and [y] as aggregate.
  5. Treat [x] as inside and [y] as aggregate initially, until you get inside [x], then treat [y] as inside (think f([1,2], [3,4]) = [[f(1,3), f(1,4)], [f(2,3), f(2,4)]]).
  6. Treat [y] as inside and [x] as aggregate until getting inside [y], then treat [x] as inside.

That is five too many for the computer to pick one! Let us say, to simplify the matter, that f is not capable of operating on aggregates, which eliminates 2, 3 and 4 but leaves 1, 5 and 6. You can think of the difference between them as the difference between "lining up" the functors, and two kinds of overlapping:

x -- y    Lining up

x --
  -- y    x surrounds y

  -- y    y surrounds x
x --
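
For lists, the three remaining readings can all be written out explicitly; a sketch using the pair constructor as f so the grouping stays visible:

import Control.Applicative (ZipList(..), liftA2)

xs, ys :: [Int]
xs = [ 1, 2 ]
ys = [ 3, 4 ]

linedUp :: [(Int, Int)]          -- choice 1: line the functors up
linedUp = getZipList (liftA2 (,) (ZipList xs) (ZipList ys))
-- [ (1,3), (2,4) ]

xSurroundsY :: [[(Int, Int)]]    -- choice 5: x surrounds y
xSurroundsY = fmap (\x -> fmap (\y -> (x, y)) ys) xs
-- [ [(1,3),(1,4)], [(2,3),(2,4)] ]

ySurroundsX :: [[(Int, Int)]]    -- choice 6: y surrounds x
ySurroundsX = fmap (\y -> fmap (\x -> (x, y)) xs) ys
-- [ [(1,3),(2,3)], [(1,4),(2,4)] ]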

In general, if we have an n-ary meeting of arguments (f(x1, x2, ...)) each x_i of which is wrapped in k_i functors, which we assume to be all amicable:

[ ]   [ ]          [ ]
[ ]   [ ]          [ ]
...   ...    ...   ...
 x1    x2           xn

(here [ ] represents some functor that one of the xs is wrapped in, symbolized by it being "on top"), the possible meanings correspond to the different ways to draw non-overlapping horizontal lines through the functors, where every functor is hit by exactly one line, and the lines may "miss" functors so long as they do not cross. That is quite a lot of ways.

The idea behind the Reduction Engine -- and I'm very tired now so I'll have to save where exactly that idea comes from for another post -- is that you can solve most of those problems by making the functors artificially incompatible. Let us say we have

   f
  / \
[x] [y]

where [x] is embedded in the α functor and [y] is embedded in the β functor, and say that α and β are incompatible in such a way that we cannot perform power #2 on them. Then the only things we could end up with are an α β or a β α -- choices 5 and 6. So that's a little better.

Then, if the functors are power-2 compatible, we say, well, the programmer could have made these incompatible if they wanted to, so, since they didn't, we'll assume that they wanted to line them up, leaving only one possibility.

So the problem is almost solved. The only thing left is that we need a way to decide between α β and β α. The way the reduction engine addresses this is something that makes no conceptual sense, but, having thought about it, I have decided it makes exactly as much sense as lambda expressions, which, this has led me to believe, also make no conceptual sense. But they are very useful.

The solution is that in the Reduction Engine, after applying an applicative expression like f(x, y), you have a series of "recollect" steps:

  Cβ
   |
  Cα
   |
   f
  / \
[x] [y]

(C stands for "collect"). Then, on the argument side, you balance the recollect steps with some "bury" steps.

  Cβ
   |
  Cα
   |
   f
  / \
Bα  Bβ
 |   |
[x] [y]

What you are doing with the "bury" steps is marking certain arguments as "going inside" the functor. This is how we eliminated options 2, 3 and 4 from the big list above. When you say Bα(x), you are saying: let's treat x as belonging to the α functor, and only the α functor.

What's going on in the recollect steps is less clear. Nominally, a recollect step "unmarks" the mark that the bury step imposed. E.g., if you wrote

Cα
 |
Bα
 |
[x]

you are marking and then immediately unmarking x, so this is the same as just x. But the recollect steps are playing another crucial role: the order of the recollect steps decides the order of overlap of the functors. If you collect α first and then β, you are asking for a β α, and if you do it the other way around, you are asking for an α β. Ambiguity solved.
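
To make the two outcomes concrete, here is a sketch of the two nestings the collect order chooses between, with α = Maybe and β = [] (the Reduction Engine itself is not tied to any particular functors):

-- a "β α": the β functor (lists) ends up on the outside
betaAlpha :: [Maybe Int]
betaAlpha = fmap (\y -> fmap (\x -> x + y) (Just 1)) [ 10, 20 ]
-- [ Just 11, Just 21 ]

-- an "α β": the α functor (Maybe) ends up on the outside
alphaBeta :: Maybe [Int]
alphaBeta = fmap (\x -> fmap (\y -> x + y) [ 10, 20 ]) (Just 1)
-- Just [ 11, 21 ]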

The reason I believe that this makes no conceptual sense is that if you cover up the recollect steps with your hand for a moment and just look at what lies below them,

   f
  / \
Bα  Bβ
 |   |
[x] [y]

you will see that, even though you have written the complete application of f, it is as yet totally unclear what is actually going to happen at that junction. Is this tree an α β or a β α? Consider that in a typed language like Haskell, you would have a hard time indeed typing this expression, not knowing which functor was on the outside and which on the inside.

If you don't think that that is odd enough, consider a deeper tree where the bury and recollect steps are far apart:

      Cβ 
       | 
      Cα 
       |
       h
      / \
     g   w
    / \
   f   z
  / \
Bα  Bβ
 |   |
[x] [y]

Now the actual computations that happen at the f junction -- and, in a typed language, its type -- are determined by something far up the tree. What is going on?

If you think about this for a while, I hope you can convince yourself that this is exactly the mechanics of lambda expressions. Collect is lambda, bury is a variable, and "artificially making functors incompatible" is alpha conversion. The only odd part is that variables have something subordinate to them in the expression tree; other than that, it is literally exactly the same.

That also sort of resolves the typing issue. We can think of bury as if it literally stripped off the functor type, making it totally invisible. In that interpretation, at the f junction, f has its normal, unfunctored type, so the type is not, as we first feared, determined by something far up the tree. This is the same mechanics as open terms in lambda expressions -- we don't say that in \x -> x + 2 the subterm x + 2 has type Int -> Int just because it has a free variable in it; we say it is of type Int, oh and by the way it's also an open term. This is just another kind of open term -- an applicative open term.

But it gets better than that. Not only have we made a new notion of open terms, but regular old lambdas are just a special case where the function functor is used. So what we have really done is opened up the sphere of open terms from being tied to just one applicative functor to encompassing all applicative functors.

Conclusion

I would love to see the principles behind the reduction engine in a mainstream programming language, or even an academic one like Agda. One of the things I've always liked about Agda is its ability to reason about open terms, so you don't have to write a complete program to play around with some data and evaluate things. Allowing general applicative open terms just makes it better. And, I'll have to think more about it, but I believe that open applicative terms are a useful restriction of Scala's delimited continuations (which are more general) that could simplify writing applicative expressions. These are much, much more flexible than idiom brackets.
