2012/06/13

Making rApache load rJava

Here at work I've been developing webapps that use R as the backend computational framework.  The list of parts to get this running is pretty lightweight.
I'm not going to cover how to set these things up here; there is pretty good documentation around the web and on rApache's site.  Instead, I'm going to talk about a hair-pulling setback I encountered early on.

Problem

R scripts run behind rApache cannot load rJava without throwing an HTTP 500 error.

Details

Specifically, if you look at the error_log file you see something like the following:
Loading required package: rJava
Error : .onLoad failed in loadNamespace() for 'rJava', details:
  call: dyn.load(file, DLLpath = DLLpath, ...)
  error: unable to load shared object '/usr/local/lib64/R/library/rJava/libs/rJava.so':
  libjvm.so: cannot open shared object file: No such file or directory
Error: package 'rJava' could not be loaded

Running the same R script from
  • a user login session ... no problem.
  • behind PHP (via a system() call) ... no problem.

Suffice it to say, this had me really, really stumped.  Stumped enough to give up temporarily and settle for calling R code that needed rJava via a PHP-to-shell intermediary.  Of course, that got confusing and unscalable quite quickly, forcing me to find a real solution.

So I started digging and found one unanswered post on the rApache Google Group relating to this problem dating back to 2010 (it's answered now, with my solution as detailed below).  Not helpful.

More digging produced this post, which pointed me in the direction of the LD_LIBRARY_PATH variable, which apparently you shouldn't mess with directly unless you want a lot of R pain.


Using the following one line test script:
cat(Sys.getenv('LD_LIBRARY_PATH'), '\n')

I quickly determined that rApache was NOT setting this variable, or anything else defined in
R/etc/ldpaths

before creating an instance of R.

According to the folks who work on RStudio, this variable has to be set before R starts for rJava to initialize correctly, i.e. to be able to find libjvm.so.
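To make that requirement concrete, here is a minimal sketch of a check for whether any directory on a colon-separated search path (such as LD_LIBRARY_PATH) actually contains libjvm.so. The function name has_libjvm is made up for illustration; it is not part of R, rJava, or rApache.

```shell
#!/bin/sh
# has_libjvm is a hypothetical helper name, not part of R, rJava, or rApache.
# It walks a colon-separated search path and prints the first directory
# that contains libjvm.so. Returns 1 if no directory does.
has_libjvm() {
    search_path=$1
    old_ifs=$IFS
    IFS=:
    for dir in $search_path; do
        if [ -n "$dir" ] && [ -e "$dir/libjvm.so" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Example use from a login shell:
#   has_libjvm "$LD_LIBRARY_PATH" || echo "libjvm.so is not on LD_LIBRARY_PATH"
```

Run against the value printed by the one-line R test script, this should come back empty-handed under rApache when the variable is unset.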


So how do you do this in an Apache process?  I know that using a SetEnv directive in httpd.conf is a dead end.  Thankfully, folks at the Ubuntu forums found a way.

Solution

Here's my modification of the Ubuntu forum solution.


Step 1:
Add a file to:

/etc/ld.so.conf.d

called:

rApache_rJava.conf

with just a single line:

/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64/server/

which happens to be the directory that contains libjvm.so on my server.

Step 2:
As root, run:
/sbin/ldconfig

Step 3:
Restart Apache
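
The three steps can be sketched as a small shell helper. This is only an illustration under a few assumptions: write_jvm_ldconf is a made-up name, the JVM path is the one from my server and will differ on yours, and apachectl is assumed to be how Apache gets restarted on your system.

```shell
#!/bin/sh
# write_jvm_ldconf is a hypothetical helper name. It writes the one-line
# rApache_rJava.conf file from Step 1 into the given directory (default:
# /etc/ld.so.conf.d), and refuses to do so if the JVM directory does not
# actually contain libjvm.so.
write_jvm_ldconf() {
    jvm_dir=$1
    conf_dir=${2:-/etc/ld.so.conf.d}
    if [ ! -e "$jvm_dir/libjvm.so" ]; then
        echo "no libjvm.so in $jvm_dir" >&2
        return 1
    fi
    printf '%s\n' "$jvm_dir" > "$conf_dir/rApache_rJava.conf"
}

# Typical use, as root (the path is the one from this post; yours will differ):
#   write_jvm_ldconf /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64/server
#   /sbin/ldconfig          # Step 2: rebuild the linker cache
#   apachectl restart       # Step 3 (assumption: or your distribution's service command)
```

The existence check is there because a typo in the path fails silently otherwise: ldconfig will happily accept a directory that holds no libjvm.so.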

Wrap-up

After all this rigamarole, it appears that I can load packages that depend on rJava from within rApache, i.e. lines like
library(rJava)

no longer complain and I'm not getting any more HTTP 500 errors as a result, which makes me happy for the moment.  How long this happiness lasts remains to be seen: R scripts within rApache still don't see an LD_LIBRARY_PATH variable, but at least the parent Apache process knows where to find libjvm.so.
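
One way to double-check that the linker cache really picked up the new directory is to filter the output of ldconfig -p. A minimal sketch follows; lib_in_cache is a made-up helper name, and it assumes the usual "name (flags) => path" line layout that ldconfig -p prints.

```shell
#!/bin/sh
# lib_in_cache is a hypothetical helper name. It reads `ldconfig -p` output
# on stdin and prints the resolved path of every entry whose library name
# (the first field) exactly matches the first argument.
lib_in_cache() {
    awk -v lib="$1" '$1 == lib { print $NF }'
}

# Example use:
#   /sbin/ldconfig -p | lib_in_cache libjvm.so
```

If this prints nothing after running ldconfig, the .conf file is probably pointing at the wrong directory.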