Odd Hypothesis: 2014/08

As of late, I’ve been making heavy use of Reference Classes in R. They are easier for me to wrap my mind around since they adopt a usage style more like “traditional” OOP languages like Java. Primarily, object methods are part of the class definition and accessed via the instantiated object.

For instance:
With S3/S4 classes, you define an object. Then you define separate generic functions that operate on the object:

# class and object method definition
myClass = setClass('myClass', ...)
print.myClass = function(x){...}

# so then ...
obj = myClass(...)
print(obj)

With Reference classes, you define an object and therein the methods the object employs:

# class and object method definition
myClass = setRefclass('myClass', 
    fields  = list(), 
    methods = list(print=function(){...}))

# so then ...
obj = myClass(...)
obj$print()

In the grand scheme of things, both ways of defining objects and their methods are pretty much equivalent. From a coding perspective, the S3/S4 style allows for object methods to be defined separately of the object class (e.g. in separate files if one prefers). The appeal of Reference Classes is that the objects they define know what methods they have.

Privacy issues

The one aspect of OOP in R that I’ve been trying to work out is how to implement private methods and fields - i.e. only visible/usable from within the scope of the object, thus not user callable/mutable. There is (currently) no official way to specify these in base R.

A little research reveals that the best one can do is obfuscate.

Roxygen2 will not specify the existence of a RefClass method if it lacks a docstring, but it will still be available to the user if they introspect the object interactively (ala the tab key if using RStudio).
The best suggestions I’ve come across are:

build a package around your RefClass definition and use non-exported package functions for private methods
define private methods as functions within the public methods they are used in

Option 1 is probably the better of the two, as it lets R’s namespace rules do the dirty work. However, it does require writing functions of the form:

privateFun = function(obj, ...) {
    # do stuff
    obj$field <<- newValue
}

Option 2 would likely require much code replication or, at the very least source()-ing the requisite code where ever a private function is required. A very far from ideal development/debugging situation.

From a high level view, Reference Classes are environments with added bells and whistles. What’s interesting is say I defined a class like so:

myClass = setRefClass('myClass',
  fields = list(
     pubField = 'character',
    .prvField = 'character'
  ),
  methods = list(
     pubMethod = function(){print('public')},
    .prvMethod = function(){print('private')}
  )
)

Notice, that the “prvField” field and “prvMethod” use a . to prefix the name. In R this is a way of creating a “hidden” variable – akin to hidden files on a *nix OS.

When I try to ls() the resultant object, I get:

> obj = myClass()
> ls(env = obj)
[1] "getClass" "pubField"

So, it is within the realm of possibilities!

Another alternative that I thought of was to make a field of the object an environment and then place private elements there. Again, making things private via obfuscation (and more typing). However, users could still access said elements by:

obj$private$field

R6 - a new hope

As I was putting the finishing touches on this post I read a great post by Romain Francois entitled “Pro Grammar and Devel Hoper” (kudos on the pun). Towards the end he links to an Rpub posted by Winston Chang entitled “Introduction to R6” which piqued my interest.

R6 is a new OOP system provided by the R6 package - posted to CRAN ~~just 4 days ago~~ about a month ago. While similar to the existing Reference Class system (objects are specially wrapped environments), it also provides separation of public and private elements - exactly what I was looking for! Performance tests also show that R6 is faster and more memory efficient than Reference Classes.

Suffice it to say, I’ll be checking R6 out.

Written with StackEdit.

I recently discovered a powerful use for R expression()’s

Say you are trying to fit some experimental data to the following nonlinear equation:

K y 0 e u ( x - t l ) K + y 0 ( e u * ( x - t l ) - 1 ) + b 1 + (b 0 - b 1) e - k x + b 2 x

$\frac{K y_0 e^{u (x-tl)}}{K + y_0 (e^{u*(x-tl)}-1)} + b_1 + (b_0 - b_1) e^{-k x} + b_2 x$
with the independent variable x using nlminb() as the minimization optimizer.

This sort of work is significantly improved (i.e. faster with better convergence) if an analytical gradient vector and a Hessian matrix for the objective function are provided. This means a lot of partial differentiation of the model equation with respect to each parameter.
To get these partial derivatives one could:

Review some calculus and derive them by hand
Feed the equation into an online engine like Wolfram Alpha and copy/paste the results

The discovery I made is that there is a purely R solution via the combination of expression()’s and the functions D() and all.vars().

The all.vars() function extracts all variable and parameter names from an expression as a character vector. For example:

> all.vars(expression(b1 + (b0 - b1)*exp(-k*x) + b2*x))
[1] "b1" "b0" "k"  "x"  "b2"

The D() function takes two arguments: an expression to differentiate and a character specifying the variable term to differentiate by:

> D(expression(b1 + (b0 - b1)*exp(-k*x) + b2*x), 'x')
b2 - (b0 - b1) * (exp(-k * x) * k)

The following code produces a list of the partial derivatives of the above equation with respect to each variable/parameter.

# the model equation
expr = expression(
    (K*y0*exp(u*(x-tl)))/(K + y0*(exp(u*(x-tl))-1)) + 
        b1 + (b0 - b1)*exp(-k*x) + b2*x
)

sapply(all.vars(expr), function(v){
  D(expr, v)
})

# returns:
# $K
# y0 * exp(u * (x - tl))/(K + y0 * (exp(u * (x - tl)) - 1)) - (K * 
#     y0 * exp(u * (x - tl)))/(K + y0 * (exp(u * (x - tl)) - 1))^2
# 
# $y0
# K * exp(u * (x - tl))/(K + y0 * (exp(u * (x - tl)) - 1)) - (K * 
#     y0 * exp(u * (x - tl))) * (exp(u * (x - tl)) - 1)/(K + y0 * 
#     (exp(u * (x - tl)) - 1))^2
# 
# $u
# K * y0 * (exp(u * (x - tl)) * (x - tl))/(K + y0 * (exp(u * (x - 
#     tl)) - 1)) - (K * y0 * exp(u * (x - tl))) * (y0 * (exp(u * 
#     (x - tl)) * (x - tl)))/(K + y0 * (exp(u * (x - tl)) - 1))^2
# ...

Each element of the list returned by the sapply() statement above is itself an expression. Evaluation of each will give rows of the Jacobian matrix, which we’ll subsequently need to compute the gradient:

p0 = c(y0=0.01, u=0.3, tl=5, K=2, b0=0.01, b1=1, b2=0.001, k=0.1)
x = seq(0,10)

# notice that t() is applied to put parameters on rows
J = t(sapply(all.vars(expr), function(v, env){
  eval(D(expr, v), env)
}, env=c(as.list(p0), list(x=x))))

J
# returns:
#             [,1]          [,2]          [,3]          [,4] ...
# K  -4.367441e-06 -5.298871e-06 -6.067724e-06 -6.218461e-06 ...
# y0  2.248737e-01  3.033101e-01  4.089931e-01  5.512962e-01 ...
# u  -1.118747e-02 -1.207174e-02 -1.220845e-02 -1.097079e-02 ...
# x   1.006712e-01  9.148428e-02  8.327519e-02  7.598662e-02 ...
# tl -6.712481e-04 -9.053805e-04 -1.220845e-03 -1.645619e-03 ...
# b1  0.000000e+00  9.516258e-02  1.812692e-01  2.591818e-01 ...
# b0  1.000000e+00  9.048374e-01  8.187308e-01  7.408182e-01 ...
# k   0.000000e+00  8.957890e-01  1.621087e+00  2.200230e+00 ...
# b2  0.000000e+00  1.000000e+00  2.000000e+00  3.000000e+00 ...

Since x is the independent variable, the row corresponding to it can be safely removed from the Jacobian:

J = J[names(p0),,drop=F]
J
# returns:
#             [,1]          [,2]          [,3]          [,4] ...
# y0  2.248737e-01  3.033101e-01  4.089931e-01  5.512962e-01 ...
# u  -1.118747e-02 -1.207174e-02 -1.220845e-02 -1.097079e-02 ...
# tl -6.712481e-04 -9.053805e-04 -1.220845e-03 -1.645619e-03 ...
# K  -4.367441e-06 -5.298871e-06 -6.067724e-06 -6.218461e-06 ...
# b0  1.000000e+00  9.048374e-01  8.187308e-01  7.408182e-01 ...
# b1  0.000000e+00  9.516258e-02  1.812692e-01  2.591818e-01 ...
# b2  0.000000e+00  1.000000e+00  2.000000e+00  3.000000e+00 ...
# k   0.000000e+00  8.957890e-01  1.621087e+00  2.200230e+00 ...

The gradient vector is simply the inner product of the Jacobian and a vector of residuals:

gr = -J %*% r

For the Hessian, the full form in Gibbs-Einstein notation is:

H j k = \partial f i \partial p j \partial f i \partial p k + r i \partial 2 r i \partial p j \partial p k

$H_{jk} = \frac{\partial f_i}{\partial p_j} \frac{\partial f_i}{\partial p_k} + r_i \frac{\partial^2 r_i}{\partial p_j \partial p_k}$

The first term is simply

JTJ $\mathbf{J}^T \mathbf{J}$ . The second term is typically called the “Hessian of the residuals” and is referred to as a matrix

B $\mathbf{B}$ . I’m still trying to wrap my head around what it actually is in vector notation.

Thankfully, in optimization cases where the initial guess is near the optimum, the behavior of the system should be “linear enough” that one can ignore the second term:

H \approx J T J

$\mathbf{H} \approx \mathbf{J}^T \mathbf{J}$

The equivalent R code would be:

H = J %*% t(J)

(because linear algebra in R is a little strange, the transpose is applied to the second Jacobian)

Putting it all together:

# the model equation
expr = expression(
    (K*y0*exp(u*(x-tl)))/(K + y0*(exp(u*(x-tl))-1)) + 
        b1 + (b0 - b1)*exp(-k*x) + b2*x
)

p0 = c(y0=0.01, u=0.3, tl=5, K=2, b0=0.01, b1=1, b2=0.001, k=0.1)
x = seq(0,48,by=0.25)

# let's say these are the residuals
r = runif(length(x))

# magic syntax that converts an equation expression into a jacobian matrix
J = t(sapply(all.vars(expr), function(v, env){
  eval(D(expr, v), env)
}, env = c(as.list(p0), list(x=x))))

# and then a gradient vector
gr = -J %*% r

# and then an approximate Hessian matrix
H = J %*% t(J)

Extending this further, one can now write a generic model object like so:

ModelObject = setRefClass('ModelObject', 
  fields = list(
    name = 'character',
    expr = 'expression'
  ),
  methods = list(
    value = function(p, data){
      eval(.self$expr, c(as.list(p), as.list(data)))
    },
    jacobian = function(p, data){
      J = t(sapply(all.vars(.self$expr), function(v, p, data){
              eval(D(.self$expr, v), c(as.list(p), as.list(data)))
            }, p=p, data=data))

      return(J[names(p),,drop=F])
    },
    gradient = function(p, data){
        r = data$y - value(p, data)
        return(-jacobian(p, data) %*% r)
    },
    hessian = function(p, data){
      J = jacobian(p, data)
      return(J %*% t(J))
    }
  )
)

which is instantiated with simply an expression and can be used to provide gradient and Hessian functions to nlminb():

> xy = list(x=seq(0,10,by=0.25), y=dnorm(seq(0,10,by=0.25),10,2)) # test data
> p0 = c(y0=0.01, u =0.2, l=5, A=log(1.5/0.01))
> mo = ModelObject(
         name = 'gompertz', 
         expr = expression( y0*exp(A*exp(-exp((u*exp(1)/A)*(l-x)+1))) )
       )

> fit = nlminb(p0, function(p, data){
        r = data$y - mo$value(p,data)
        return(r %*% r)
    }, gradient = mo$gradient, hessian = mo$hessian, data=xy)

> fit$par
         y0           u           l           A 
0.001125766 1.345796890 3.646340494 5.408138500 

> fit$message
[1] "relative convergence (4)"

> plot(xy, main='Fit Results'); lines(xy$x, mo$value(fit$par, xy))

Painless!

Written with StackEdit.

Odd Hypothesis

2014/08/23

R OOP - a little privacy please?

Privacy issues

R6 - a new hope

2014/08/20

Optimizing with R expressions