Saturday, May 7, 2011

Corresponding

(The examples here work with the version of insidefunctor tagged as "v2")

Unfortunately I couldn't do this cleanly outside the library, so the changes are made in insidefunctor itself.

Levels are no longer used to "line up" eaches. So, for example,

> library(insidefunctor)

> `%+.%` = fmap(`+`)

> `%/.%` = fmap(`/`)

> x = c(1, 2, 3)

> y = c(4, 5, 6)

> .[z] = each(x) %+.% each(y)

> z


  • 5  6  7

  • 6  7  8

  • 7  8  9



> .[w] = each(x) %+.% pond(y)

> w

5  7  9

where "pond" stands for "corresponding" and is chosen because no one would use the word pond for anything else.
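For readers without the library installed, the difference between the two results can be sketched in base R. This is only an analogy, assuming that two independent eaches behave like a nested loop and that pond behaves like positional pairing; it says nothing about how insidefunctor is implemented.

```r
x = c(1, 2, 3)
y = c(4, 5, 6)

# Two independent "each"es: every element of x meets every element of y,
# giving one vector of sums per element of x
z = lapply(x, function(xi) xi + y)

# "Corresponding": elements are paired by sequential position only
w = mapply(`+`, x, y)
```

Here z is a list of three vectors (5 6 7, 6 7 8, 7 8 9) and w is the single vector 5 7 9, matching the output above.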

But pond() is much more flexible than this, because the definition of "corresponding" is itself flexible. Here it means "having the same sequential position", but it could be taken to mean just about anything else. Like, for example, a linearly-interpolated lookup table:

> `%near%` = function(y, x) {

> UseMethod("%near%")

> }

> looktable = function(dep, ind) {

> reorder = order(ind)

> dep = dep[reorder]

> ind = ind[reorder]

> attr(dep, "indvar") = ind

> class(dep) = c("looktable", class(dep))

> dep

> }

> `%near%.looktable` = function(y, x) {

> pond(approx(attr(y, "indvar"), y, x)$y)

> }

All it does is translate linear nearness into sequential correspondence, i.e. exactly what approx() does in the first place. Then you can use it like this:

> .[u] = each(1:5) %+.% reeval(runif(1))

> u

1.29927378566936  2.63695163559169  3.90192435029894  4.71558610373177  5.38200953858905

> v = looktable((0:6)^2, 0:6)

> v

0  1  4  9  16  25  36

> .[w] = v %near% u

> w

1.89782135700807  7.18475817795843  15.3134704520926  22.4402749335859  29.2021049244795

> .[z] = (v %near% u) %/.% each(u)

> z

1.46067855592911  2.72464541290166  3.92459439940689  4.75874566595812  5.42587387017808
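For comparison, here is a sketch of the same lookup and division using base R only. The values of u are fixed stand-ins chosen for illustration (they are not the random values shown above), so that the interpolated results can be checked by hand.

```r
# Lookup table: dependent values v at independent positions ind
ind = 0:6
v = ind^2

# Fixed stand-ins for u (illustrative only)
u = c(1.5, 2.5, 3.5)

# approx() linearly interpolates v at each u, so w[i] corresponds to u[i]
w = approx(ind, v, xout = u)$y

# Once the correspondence is positional, division is ordinary vector division
z = w / u
```

With these inputs w comes out as 2.5, 6.5, 12.5, each halfway between two neighbouring squares.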

These could of course have been done just as easily in straight-up R. The only difference is grammar. You can generalize corresponding to other kinds of lookups, like, say, functions.

This is a little trickier because if you were to say

> each(u) %/.% the(sin)

when the(sin) is called it has no idea what axis it runs along, i.e. it can't sequentially correspond at that time. Luckily the definition of correspondence is flexible, since it is implemented through several generic functions:

> corresponds = function(arg, inside) {

> UseMethod("corresponds")

> }

> alignable = function(arg, inside) {

> UseMethod("alignable")

> }

> corresponding = function(arg, inside, i) {

> UseMethod("corresponding")

> }

With defaults suitable for sequential correspondence:

> corresponds.correspondence = function(arg, inside) {

> if (is.null(arg$ref)) {

> TRUE

> }

> else {

> identical(arg$axis, inside$axis)

> }

> }

> alignable.correspondence = function(arg, inside) {

> identical(arg$axis, inside$axis)

> }

> corresponding.each = function(arg, inside, i) {

> arg$items[[i]]

> }

So all we need to do is provide methods to tell apply.functor.each that the(sin) corresponds to each(u), that they are alignable, and, given an element of u, to find the corresponding element in the(sin).

> the = function(func) {

> the = list(

> func = func

> )

> class(the) = c('the', class(the))

> the

> }

> corresponds.the = function(the, inside) {

> TRUE

> }

> alignable.the = function(the, inside) {

> TRUE

> }

> corresponding.the = function(the, inside, i) {

> the$func(inside$items[[i]])

> }

Now see if that works:

> u = c(1, 2, 3)

> u

1  2  3

> .[v] = each(u) %/.% the(sin)

> v

1.18839510577812  2.19950034058923  21.2585021872116
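The dispatch idea can be tried without the library using a toy version. Everything prefixed toy_ below, and the operator %toy/%, is hypothetical and written only for this sketch; it assumes the left operand is always the "each" that defines the axis, which the real apply.functor.each presumably does not require.

```r
# Toy containers (hypothetical stand-ins for each() and the())
toy_each = function(items) {
  structure(list(items = as.list(items)), class = "toy_each")
}
toy_the = function(func) {
  structure(list(func = func), class = "toy_the")
}

# Generic: given position i along `inside`, what does `arg` contribute?
toy_corresponding = function(arg, inside, i) {
  UseMethod("toy_corresponding")
}
# An each contributes its i-th item
toy_corresponding.toy_each = function(arg, inside, i) {
  arg$items[[i]]
}
# A the(func) contributes func applied to the i-th item of the axis
toy_corresponding.toy_the = function(arg, inside, i) {
  arg$func(inside$items[[i]])
}

# Lift a binary function so it walks along the left operand's axis
toy_fmap = function(func) {
  function(lhs, rhs) {
    sapply(seq_along(lhs$items), function(i) {
      func(toy_corresponding(lhs, lhs, i), toy_corresponding(rhs, lhs, i))
    })
  }
}

`%toy/%` = toy_fmap(`/`)
v = toy_each(c(1, 2, 3)) %toy/% toy_the(sin)
```

The result v equals c(1, 2, 3) / sin(c(1, 2, 3)), the same numbers as in the output above.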

Alas we still could not do

> plot(each(seq(0, 1, len = 100)), the(sin))

because the elements would not be collected at the time of calling plot(). Perhaps another functor could fix this.

Sunday, May 1, 2011

"Inside" Functors -- Evaluating things more than once

(The examples here work with the version of insidefunctor tagged as "v1")

I ran into an interesting problem using "inside" functors.

Something is wrong in the following code (well, depending on what you thought it should do).

> library(insidefunctor)

> `%+.%` = fmap(`+`)

> x = seq(0, 10, len = 50)

> plot(x, collect(each(x) %+.% runif(1)))

It's clear that in constructions like each(x) + y, y is only going to be evaluated once. Of course, the preceding example could have been written

> plot(x, collect(each(x) %+.% each(runif(length(x)))))

but I think that that is not as grammatically pretty.
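The "evaluated once" behaviour is easy to see in base R. The sketch below assumes a five-point x and a fixed seed purely for illustration: the first version computes the random offset a single time and reuses it, the second draws a fresh one per element.

```r
set.seed(1)
x = seq(0, 10, len = 5)

# The offset is evaluated once, so every element is shifted by the same amount
once = runif(1)
shifted = sapply(x, function(xi) xi + once)
offsets_once = shifted - x

# Re-evaluating runif(1) for each element gives independent offsets
jittered = sapply(x, function(xi) xi + runif(1))
offsets_each = jittered - x
```

Plotting shifted against x gives a straight line, which is what the first plot() call above produces; jittered gives the noisy line that was presumably intended.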

But, since we solved the last grammatical problem with a hacky use of inside-functors, why not try the same trick? Say we define an inside functor meval (for multiple-evaluations) that behaves like this:

  • meval(expr) returns a promise to evaluate expr
  • func(meval(expr)) returns a promise to evaluate func(expr)
  • collect(meval(expr)) evaluates it finally.

That is, the unevaluated chain keeps growing until it is finally collected, at which point a value results.
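Stripped of the functor machinery, the three rules above amount to composing closures. The following is a minimal sketch with hypothetical helper names (delayed, lift, collect_delayed), not the insidefunctor API; a counter stored in an environment stands in for runif(1) so that the repeated evaluation is visible and deterministic.

```r
# Rule 1: capture an expression without evaluating it
delayed = function(expr) {
  expr = substitute(expr)   # the unevaluated expression
  env = parent.frame()      # where it should eventually be evaluated
  list(callback = function() eval(expr, env))
}

# Rule 2: applying a function extends the chain instead of running it
lift = function(func, d) {
  force(func)
  force(d)
  list(callback = function() func(d$callback()))
}

# Rule 3: collecting runs the whole chain, afresh on every call
collect_delayed = function(d) d$callback()

counter = new.env()
counter$n = 0
tick = delayed({ counter$n = counter$n + 1; counter$n })
tick10 = lift(function(v) v * 10, tick)

first = collect_delayed(tick10)
second = collect_delayed(tick10)
```

Each collect re-runs the counter expression, so first is 10 and second is 20.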

So let's define that.

> meval = function(expr, level=1) {

> expr = substitute(expr)

> env = parent.frame()  # evaluate in the caller's environment, not inside meval

> callback = function () {

> eval(expr, env)

> }

> make.meval(callback, level=level, depth=1)

> }

> make.meval = function(callback, level, depth) {

> functor = inside.functor(level, depth)

> functor$callback = callback

>

> class(functor) = c('meval', class(functor))

>

> functor

> }

> apply.functor.meval = function(

> inside,

> func,

> args,

> caller

> )

> {

> our.level = level(inside)

>

> args.boxed = args

> for (i in seq_along(args.boxed)) {

> arg = args.boxed[[i]]

>

> if (!(is.inside.functor(arg) && level(arg)>=our.level)) {

> args.boxed[[i]] = list(

> callback = function() {

> arg

> }

> )

> }

> }

> max.depth = max(sapply(args.boxed, depth))

>

> callback = function() {

> piece.args = lapply(args.boxed, function (arg) {

> arg$callback()

> })

> caller(func, piece.args)

> }

>

> make.meval(

> callback,

> level = our.level,

> depth = max.depth

> )

> }

> collect.end.meval = function(inside) {

> inside$callback()

> }

And test it.

> promise = meval(runif(1))

> collect(promise)

0.633877807762474

> collect(promise)

0.236430999357253

Works so far. Now try the motivating example:

> plot(x, collect.all(each(x, l = 2) %+.% meval(runif(1))))

Oh god no it's this problem again.
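The problem in question is that R closures capture variables, not values. A minimal base-R sketch of the failure and of the usual workaround (make_callback is a hypothetical helper name used only here):

```r
# Closures made in a loop share one variable, so all see its final value
callbacks = list()
for (i in 1:3) {
  arg = i * 10
  callbacks[[i]] = function() arg
}
broken = sapply(callbacks, function(f) f())

# A factory function gives each closure its own arg; force() evaluates
# the lazy argument immediately so later changes to i cannot leak in
make_callback = function(arg) {
  force(arg)
  function() arg
}
fixed_callbacks = list()
for (i in 1:3) {
  fixed_callbacks[[i]] = make_callback(i * 10)
}
fixed = sapply(fixed_callbacks, function(f) f())
```

Here broken is 30 30 30, while fixed is 10 20 30.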

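Every callback closes over the same loop variable arg rather than its value at the time the callback was created, so arg isn't being remembered in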

> args.boxed[[i]] = list(

> callback = function() {

> arg

> }

> )

so the fix is to give each callback its own scope and force() arg inside it:

> apply.functor.meval = function(

> inside,

> func,

> args,

> caller

> )

> {

> our.level = level(inside)

>

> args.boxed = args

> for (i in seq_along(args.boxed)) {

> arg = args.boxed[[i]]

>

> if (!(is.inside.functor(arg) && level(arg)>=our.level)) {

> args.boxed[[i]] = (function(arg) {

> force(arg)

> list(

> callback = function() {

> arg

> }

> )

> })(arg)

> }

> }

> max.depth = max(sapply(args.boxed, depth))

>

> callback = function() {

> piece.args = lapply(args.boxed, function (arg) {

> arg$callback()

> })

> caller(func, piece.args)

> }

>

> make.meval(

> callback,

> level = our.level,

> depth = max.depth

> )

> }

Which is ugly but works. Then:

> plot(x, collect.all(each(x, l = 2) %+.% meval(runif(1))))

Now the real challenge is to understand why the above code works, but interchanging the levels (i.e. making the each() happen before the meval()) does not:

> plot(x, collect.all(each(x) %+.% meval(runif(1), l = 2)))

And, given that you obviously wanted it to go the first way (why else would you have used meval()?), is there any way to modify the semantics so that only the first way makes sense, and is that a good idea? Which brings us to...

About those levels...

They're yucky. Also note that the call to collect.all in the preceding example is really doing 2 collects, even though the functors are only ever written 1 deep.

The reason is that expressions like

> x = c(1, 2, 3)

> y = c(4, 5)

> collect.all(each(x, l = 2) %+.% each(y))


  • 5  6

  • 6  7

  • 7  8



behave like (inserting for the xs)

> collect.all(

> each(x)

> %+.%

> each(

> lapply(x, function(x.) each(y))

> )

> )


  • 5  6

  • 6  7

  • 7  8



which behaves like (inserting for the ys)

> collect.all(

> each(

> lapply(x, function(x.)

> each(

> lapply(y, function(y) x.)

> )

> )

> )

> %+.%

> each(

> lapply(x, function(x.) each(y))

> )

> )


  • 5  6

  • 6  7

  • 7  8



i.e. only when the levels are the same do the eaches "line up" and remain a single each. When the levels are different they "miss" each other and become two nested eaches. This is by design but it still feels messy.
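The two expansion steps can be mimicked with plain lists in base R. This is a sketch of the shapes involved, assuming nested eaches behave like nested lists; the names x_boxed and y_boxed are made up for the illustration.

```r
x = c(1, 2, 3)
y = c(4, 5)

# Step 1: y "misses" x, so it is repeated once per element of x
y_boxed = lapply(x, function(x.) y)

# Step 2: each element of x is repeated once per element of y
x_boxed = lapply(x, function(x.) sapply(y, function(y.) x.))

# Now both sides have the same nested shape and can be added piecewise
nested = Map(`+`, x_boxed, y_boxed)
```

The result is a list of three vectors (5 6, 6 7, 7 8), one inner vector per element of x, which is why collecting it takes two collects.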

Suppose we were to bring back the suggestion of the name "corresponding" that we mentioned earlier:

> each(x) %+.% corresponding(y)

would stand for the case where the levels are identical; in any other case the levels would be assumed to be different and the functors would "overlap".

The advantage to this notation is that only when the word "each" is actually used is another level introduced. Plus it aligns more closely with English.