2012/06/13

Making rApache load rJava

Here at work I've been in the business of developing webapps using R as the backend computational framework.  The list of parts needed to get this running is pretty lightweight.
I'm not going to cover how to set these things up here; there is pretty good documentation around the web and on rApache's site.  Instead, I'm going to talk about a hair-pulling setback I encountered early on.

Problem

R scripts run behind rApache cannot load rJava without throwing an HTTP 500 error.

Details

Specifically, if you look at the error_log file you see something like the following:
Loading required package: rJava
Error : .onLoad failed in loadNamespace() for 'rJava', details:
  call: dyn.load(file, DLLpath = DLLpath, ...)
  error: unable to load shared object '/usr/local/lib64/R/library/rJava/libs/rJava.so':
  libjvm.so: cannot open shared object file: No such file or directory
Error: package 'rJava' could not be loaded

Running the same R script from
  • a user login session ... no problem.
  • behind PHP (via a system() call) ... no problem.

Suffice it to say, this had me really, really stumped.  Stumped enough to give up temporarily and settle for calling R code that needed rJava via a PHP-to-shell intermediary.  Of course, that got confusing and unscalable quite quickly, forcing me to find a real solution.

So I started digging and found one unanswered post on the rApache Google Group relating to this problem dating back to 2010 (it's answered now, with my solution as detailed below).  Not helpful.

More digging produced this post, which pointed me in the direction of the LD_LIBRARY_PATH variable, which apparently you shouldn't mess with directly unless you want a lot of R pain.


Using the following one-line test script:
cat(Sys.getenv()['LD_LIBRARY_PATH'], '\n')

I quickly determined that rApache was NOT setting this variable, or anything else defined in
R/etc/ldpaths

before creating an instance of R.

According to the folks who work on RStudio, this variable needs to be set before R starts for rJava to initialize correctly, i.e. to be able to find libjvm.so.
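One way to confirm this from outside R is to read the environment of the Apache process itself. The sketch below assumes a Linux system where /proc/<pid>/environ is readable (usually only as root). env_var_from_environ is a made-up helper name, and the process name httpd in the example is an assumption; it may be apache2 or something else on your distribution.

```shell
#!/bin/sh
# Print the value of one variable from a NUL-separated environ file,
# such as /proc/<pid>/environ on Linux. Prints nothing if the variable
# is not present. env_var_from_environ is a hypothetical helper name.
env_var_from_environ() {
    environ_file=$1
    var_name=$2
    # Entries in an environ file are NUL-terminated; turn them into lines,
    # then strip the "NAME=" prefix from the matching line.
    tr '\0' '\n' < "$environ_file" | sed -n "s/^${var_name}=//p"
}

# Example (as root; assumes the Apache parent process is named httpd
# and that pgrep is installed):
#   pid=$(pgrep -o httpd)
#   env_var_from_environ "/proc/$pid/environ" LD_LIBRARY_PATH
```

If the example prints nothing, the Apache process was started without LD_LIBRARY_PATH, which matches what the test script reports from inside R.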


So how do you do this in an Apache process?  I know that using a SetEnv directive in httpd.conf is a dead end.  Thankfully, folks at the Ubuntu forums found a way.

Solution

Here's my modification of the Ubuntu forum solution.


Step 1:
Add a file to:

/etc/ld.so.conf.d

called:

rApache_rJava.conf

with just a single line:

/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64/server/

which happens to be the direct parent path to libjvm.so on my server.
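The path will almost certainly be different on your server. A rough sketch for locating it, assuming your JVMs live under /usr/lib/jvm (a common default, but still an assumption) and using a hypothetical helper named find_libjvm_dirs:

```shell
#!/bin/sh
# List every directory under a search root that contains libjvm.so.
# find_libjvm_dirs is a hypothetical helper; /usr/lib/jvm is only a
# common default location and may differ on your system.
find_libjvm_dirs() {
    search_root=${1:-/usr/lib/jvm}
    # -L follows symlinks, since JVM directories are often symlinked.
    find -L "$search_root" -name libjvm.so -type f 2>/dev/null |
    while IFS= read -r lib; do
        dirname "$lib"
    done | sort -u
}

# Example:
#   find_libjvm_dirs /usr/lib/jvm
```

If more than one directory comes back (several JVMs installed, or more than one VM variant per JVM), pick the one belonging to the Java installation R was configured against.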

Step 2:
As root, run:
/sbin/ldconfig

Step 3:
Restart Apache
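The three steps can be sketched as a small script. write_jvm_ld_conf is a hypothetical helper, not something that ships with rApache or rJava. It takes the conf directory as a parameter so you can try it against a scratch directory before touching /etc. The apachectl command in the example is an assumption; use whatever restarts Apache on your system.

```shell
#!/bin/sh
# Write a one-line ld.so.conf.d entry for the directory that holds libjvm.so.
# write_jvm_ld_conf is a hypothetical helper. The conf directory is a
# parameter so the function can be tried against a scratch directory first.
write_jvm_ld_conf() {
    jvm_dir=$1
    conf_dir=${2:-/etc/ld.so.conf.d}
    conf_file=$conf_dir/rApache_rJava.conf

    # Refuse to register a directory that does not actually contain libjvm.so.
    if [ ! -e "$jvm_dir/libjvm.so" ]; then
        echo "no libjvm.so in $jvm_dir" >&2
        return 1
    fi
    printf '%s\n' "$jvm_dir" > "$conf_file"
}

# As root (steps 1 to 3):
#   write_jvm_ld_conf /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/amd64/server
#   /sbin/ldconfig
#   apachectl restart    # assumption: your Apache control command may differ
```

The check for libjvm.so is there only to catch a mistyped path before it ends up in the linker configuration.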

Wrap-up

After all this rigamarole it appears that I can load packages that depend on rJava from within rApache - i.e.  lines like
library(rJava)

no longer complain, and I'm not getting any more HTTP 500 errors as a result, which makes me happy for the moment.  How long this happiness lasts remains to be seen.  R scripts within rApache still don't see an LD_LIBRARY_PATH variable, but at least the parent Apache process knows where to find libjvm.so.
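A quick way to check that last claim without going through Apache is to ask the dynamic linker directly. The sketch below filters ldd output for unresolved libraries. unresolved_libs is a made-up helper name, and the rJava.so path in the example is the one from the error log above, so adjust it to your R library location.

```shell
#!/bin/sh
# Read `ldd` output on stdin and print the names of libraries the dynamic
# linker could not resolve. unresolved_libs is a hypothetical helper name.
unresolved_libs() {
    # ldd reports a missing dependency as: "<name> => not found"
    awk '/=> not found/ { print $1 }'
}

# Example, run with LD_LIBRARY_PATH unset so the shell sees roughly
# what the Apache process sees:
#   env -u LD_LIBRARY_PATH ldd /usr/local/lib64/R/library/rJava/libs/rJava.so | unresolved_libs
```

After running ldconfig, libjvm.so should no longer appear in the output. Other libraries may still be listed (for example libR.so, if it is only reachable through R's own ldpaths), so treat this as a check on libjvm.so specifically rather than a full health check.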

3 comments:

  1. wow..great. Thank you very much. I was facing the same issue. Your post saved my time.

  2. Were you able to load JDBC driver successfully? I am having problems while loading the JDBC driver.

    See below;

    I am getting a message 'Empty reply from server' - while Loading JDBC driver

    Code:

    [root@localhost R]# cat hello.R
    library("plyr")
    library("ggplot2")
    library("reshape")
    library("scales")
    require("scales")
    library("RJDBC")
    JDBC("com.mysql.jdbc.Driver", "/usr/share/java/mysql-connector-java.jar", identifier.quote="`")
    print("Printing data...")
    [root@localhost R]#


    Response:

    [perf@localhost src]$ curl http://localhost/rapachetest
    curl: (52) Empty reply from server
    [perf@localhost src]$


    Apache Logs

    [root@localhost logs]# pwd
    /etc/httpd/logs
    [root@localhost logs]# tail -f *_log
    [Fri Jul 06 17:07:22 2012] [error] [client 127.0.0.1] rApache Notice!

    I have tested this code using R terminal and it works well. Could you please help?

    Regards
    Santosh

    1. Hi Santosh,

      Sorry for the long delay.

      I tested the following script

      ================================
      setContentType('text/html')

      tic = proc.time()['elapsed']
      cat('Loading package RJDBC ... ')
      library(RJDBC)
      toc = proc.time()['elapsed'] - tic
      cat('OK', sprintf('(%.2f s)', toc), '\n')

      cat('DONE\n')
      ================================

      and it returned just fine. So it seems on my server the hack to load rJava works to enable RJDBC.

      It's hard to tell from your error log what R is complaining about. Can you provide more information?
