2013/08/03

A new R trick ... for me at least

What we're going to be talking about today are dynamic argument lists for functions. Specifically, how to unpack and prepare them in R using ..., list(), and do.call().

Biased by Matlab and varargin

Initially, I based my use of ... in R on my experience with Matlab's varargin. Using varargin, Matlab functions can have a signature of:

function f(varargin)
% do stuff here

Functions that use varargin are responsible for processing its contents, which is easy since it is simply a cell array. Thus, it can be "unpacked" and modified using cell array methods.

function out = f(varargin)
arg1 = varargin{1};
arg2 = varargin{2};
out = arg1*arg2; % Matlab returns values through the output variable, not return()

At call, arguments captured by varargin can be specified as an expanded cell array:

args = {foo, bar}
f(args{:})

As a matter of fact, functions that do not use varargin can also be called this way, since Matlab effectively interprets an expanded cell array as a comma-separated list.

This comes in handy when you have a mixture of required and optional arguments for a function.

f(arg, opts{:})

Back to R ...

I used to think ... was analogous to varargin since:

  • it captures all function arguments not explicitly named in the function signature
  • the number of arguments it captures can vary

However, unlike varargin:

  • ... is a special R language expression/object
  • it usually needs to be converted to a list to access the arguments (names and/or values) that it captures

The former point is both a strength and a quirk of R, as it allows arguments encapsulated in ... to be passed on to additional functions:

f = function(x, ...) {
  y = g(x, ...)
  return(y)
}
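
To make the pass-through concrete, here is a minimal sketch. The inner function g() and its scale and offset arguments are made up for illustration; they are not from any package.

```r
# Hypothetical inner function: scales x, then shifts it.
g = function(x, scale = 1, offset = 0) {
  x * scale + offset
}

# f() knows nothing about scale or offset; it just forwards them.
f = function(x, ...) {
  g(x, ...)
}

r1 = f(2)                         # g() falls back on its defaults
r2 = f(2, scale = 3)              # scale travels through ...
r3 = f(2, offset = 1, scale = 3)  # named arguments can come in any order
```

The nice part is that f() never has to be updated when g() gains new optional arguments.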

The latter point above (unpacking ...) is actually easy to do:

f = function(x, ...) {
  args = list(...) # contains a=1, b=2
  return(args$a * args$b)
}
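
A slightly fuller sketch of unpacking, using a hypothetical describe_args() helper of my own. It assumes at least one argument is supplied, since ..1 (R's shorthand for the first element of ...) fails otherwise.

```r
# Hypothetical helper: inspect whatever arrives through ...
describe_args = function(...) {
  args = list(...)         # evaluate and collect the arguments
  list(
    count = length(args),  # how many were supplied
    names = names(args),   # argument names, if any were given
    first = ..1            # first element of ..., without building a list
  )
}

info = describe_args(a = 1, 2, b = "foo")
```

Here info$names comes back with an empty string for the unnamed second argument, which is handy when checking what the caller actually named.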

Where confusion arises for many is that ... is essentially immutable (cannot be changed). While conceptually a list(), you can't modify it directly using list accessors:

f = function(x, ...) {
  ...[[1]] = 3 # this produces an error, as would ...$var and ...[1]
  y = g(x, ...)
  return(y)
}

So, what if I wanted to unpack the arguments in ..., check/change their values, and repackage them for another function call? Since ... is immutable, the code below would throw an error.

f = function(x, ...) {
  args = list(...) # unpack, contains a='foo'
  args$a = 'bar'

  ... = args # ERROR!

  y = g(x, ...)
  return(y)
}

Also, there isn't a way (that I've found yet) to unroll a list() object in R into a comma-separated list like you can with a cell array in Matlab.

# this totally doesn't work
args = list(a=1, b='foo')
result = f(args[*]) # making up syntax here. would be nice, no?

As it turns out, ... doesn't even come into play here. In fact, you need to use a rather deep R concept - calls.

Whenever a function is used in R, a call is produced, which is an unprocessed expression that is then interpreted by the underlying engine. Why the delay? Only the creators/developers of R can fully detail why, but it does allow for some neat effects - e.g. the automatic labeling of plots.
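
Here is a small sketch of what a call looks like when you hold on to it, and of the labeling trick. The label_of() function and the height_cm variable are my own hypothetical names. As far as I can tell, plot() builds its default axis labels with a similar substitute()/deparse() combination.

```r
# quote() captures a call without evaluating it.
cl = quote(sum(1, 2, 3))
cl_class = class(cl)  # the captured object is a "call"
cl_value = eval(cl)   # evaluate it later, on demand

# substitute() recovers the expression the caller typed for an argument,
# which is how a function can derive a label from its input.
label_of = function(x) {
  deparse(substitute(x))
}

height_cm = c(150, 160, 170)
lab = label_of(height_cm)  # the variable's name as a string
```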

To package a programmatically generated argument list one uses the do.call() function:

result = do.call('fun', list(arg1, arg2, etc, etc))

where the first argument is the function to call (either the function itself or its name as a string), and the second argument is a list of arguments to pass along. For all intents and purposes, the R statement above is equivalent to the Matlab statement below.

result = fun(args{:}) % where args = {arg1, arg2, etc, etc}
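
A quick runnable sketch with base R's paste(), just to show that named list elements become named arguments and that the function can be given either way:

```r
# Build the argument list programmatically; sep is a named argument.
args = list("a", "b", "c", sep = "-")

joined_by_name = do.call("paste", args)   # function given as a string
joined_by_fun  = do.call(paste, args)     # function given as an object
direct         = paste("a", "b", "c", sep = "-")  # the equivalent direct call
```

All three results should be identical.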

Thus, the process to unpack ..., check/modify an argument, and repack for another function call becomes:

f = function(x, ...) {
  args = list(...) # unpack, contains a='foo'
  args$a = 'bar'   # change argument "a"

  # repack arguments for call to g(); list(x) keeps a vector x as a single argument
  y = do.call(g, c(list(x), args))
  return(y)
}
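
For completeness, a self-contained sketch of the whole round trip. The g() below is a hypothetical stand-in so the example actually runs. Note that x is wrapped in list() before combining, so that a vector x stays a single argument instead of being spliced into several.

```r
# Hypothetical target function, used only for illustration.
g = function(x, a = "foo", sep = ":") {
  paste(x, a, sep = sep)
}

f = function(x, ...) {
  args = list(...)                 # unpack
  if (identical(args$a, "foo")) {  # check
    args$a = "bar"                 # modify
  }
  do.call(g, c(list(x), args))     # repack and call g()
}

out1 = f("id", a = "foo")             # a is rewritten before g() sees it
out2 = f("id", a = "baz", sep = "-")  # other arguments pass through untouched
out3 = f(c("p", "q"), a = "foo")      # a vector x is still one argument
```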

I must credit this epiphany to the following StackOverflow question and answer: http://stackoverflow.com/questions/3414078/unpacking-argument-lists-for-ellipsis-in-r

Written with StackEdit.
