Without root access, run R with tuned BLAS when it is linked with reference BLAS
why my way does not work
First, shared libraries on UNIX are designed to mimic the way archive libraries work (archive libraries were there first). In particular, that means that if you have libfoo.so and libbar.so, both defining symbol foo, then whichever library is loaded first is the one that wins: all references to foo from anywhere within the program (including from libbar.so) will bind to libfoo.so's definition of foo.
This mimics what would happen if you linked your program against libfoo.a and libbar.a, where both archive libraries defined the same symbol foo. More info on archive linking here.
It should be clear from the above that if libblas.so.3 and libopenblas.so.0 define the same set of symbols (which they do), and if libblas.so.3 is loaded into the process first, then routines from libopenblas.so.0 will never be called.
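A minimal sketch of the first-loaded-wins rule, assuming a Linux system with a C compiler available as cc. The library names follow the libfoo.so and libbar.so of the explanation above; the helper name interpose_demo and the tiny generated sources are made up for illustration.

```shell
# interpose_demo FIRST SECOND   (each argument is "foo" or "bar")
# Build libfoo.so and libbar.so (both define foo), link a program against
# them in the given order, run it, and print which foo() libbar's bar() reached.
interpose_demo() (
    set -e
    work=$(mktemp -d)
    trap 'rm -rf "$work"' EXIT
    cd "$work"

    printf '%s\n' '#include <stdio.h>' \
        'void foo(void) { puts("foo from libfoo"); }' > libfoo.c
    printf '%s\n' '#include <stdio.h>' \
        'void foo(void) { puts("foo from libbar"); }' \
        'void bar(void) { foo(); }' > libbar.c
    printf '%s\n' 'void bar(void);' \
        'int main(void) { bar(); return 0; }' > main.c

    cc -shared -fPIC -o libfoo.so libfoo.c
    cc -shared -fPIC -o libbar.so libbar.c
    # --no-as-needed keeps both libraries as dependencies even though main
    # only references bar(); the -l order determines the load order.
    cc -o main main.c -L. -Wl,--no-as-needed "-l$1" "-l$2"
    LD_LIBRARY_PATH=. ./main
)
```

With a typical toolchain, `interpose_demo foo bar` should report "foo from libfoo" even though the call is made inside libbar.so, while `interpose_demo bar foo` should report "foo from libbar".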
Second, you've correctly decided that since R directly links against libR.so, and since libR.so directly links against libblas.so.3, it is guaranteed that libopenblas.so.0 will lose the battle.
However, you erroneously decided that Rscript is better, but it's not: Rscript is a tiny binary (11K on my system; compare to 2.4MB for libR.so), and approximately all it does is exec R. This is trivial to see in strace output:
strace -e trace=execve /usr/bin/Rscript --default-packages=base --vanilla /dev/null
execve("/usr/bin/Rscript", ["/usr/bin/Rscript", "--default-packages=base", "--vanilla", "/dev/null"], [/* 42 vars */]) = 0
execve("/usr/lib/R/bin/R", ["/usr/lib/R/bin/R", "--slave", "--no-restore", "--vanilla", "--file=/dev/null", "--args"], [/* 43 vars */]) = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=89625, si_status=0, si_utime=0, si_stime=0} ---
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=89626, si_status=0, si_utime=0, si_stime=0} ---
execve("/usr/lib/R/bin/exec/R", ["/usr/lib/R/bin/exec/R", "--slave", "--no-restore", "--vanilla", "--file=/dev/null", "--args"], [/* 51 vars */]) = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=89630, si_status=0, si_utime=0, si_stime=0} ---
+++ exited with 0 +++
Which means that by the time your script starts executing, libblas.so.3 has already been loaded, and the libopenblas.so.0 that will be loaded as a dependency of mmperf.so will not actually be used for anything.
is it possible at all to make it work
Probably. I can think of two possible solutions:
- Pretend that libopenblas.so.0 is actually libblas.so.3.
- Rebuild the entire R package against libopenblas.so.
For #1, you need to ln -s libopenblas.so.0 libblas.so.3, then make sure that your copy of libblas.so.3 is found before the system one, by setting LD_LIBRARY_PATH appropriately.
This appears to work for me:
mkdir /tmp/libblas
# pretend that libc.so.6 is really libblas.so.3
cp /lib/x86_64-linux-gnu/libc.so.6 /tmp/libblas/libblas.so.3
LD_LIBRARY_PATH=/tmp/libblas /usr/bin/Rscript /dev/null
Error in dyn.load(file, DLLpath = DLLpath, ...) :
unable to load shared object '/usr/lib/R/library/stats/libs/stats.so':
/usr/lib/liblapack.so.3: undefined symbol: cgemv_
During startup - Warning message:
package ‘stats’ in options("defaultPackages") was not found
Note how I got an error (my "pretend" libblas.so.3 doesn't define symbols expected of it, since it's really a copy of libc.so.6). The error itself is the evidence that my copy, not the system library, was picked up.
You can also confirm which version of libblas.so.3 is getting loaded this way:
LD_DEBUG=libs LD_LIBRARY_PATH=/tmp/libblas /usr/bin/Rscript /dev/null |& grep 'libblas\.so\.3'
91533: find library=libblas.so.3 [0]; searching
91533: trying file=/usr/lib/R/lib/libblas.so.3
91533: trying file=/usr/lib/x86_64-linux-gnu/libblas.so.3
91533: trying file=/usr/lib/jvm/java-7-openjdk-amd64/jre/lib/amd64/server/libblas.so.3
91533: trying file=/tmp/libblas/libblas.so.3
91533: calling init: /tmp/libblas/libblas.so.3
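The recipe for #1 could be wrapped in a small helper along these lines. This is only a sketch: the function name shadow_soname and its argument order are invented here, and the target should be given as an absolute path so that the symbolic link resolves from any directory.

```shell
# shadow_soname TARGET SONAME DIR
# Create DIR/SONAME as a symbolic link to TARGET (an absolute path) and print
# an LD_LIBRARY_PATH value that puts DIR ahead of any existing entries.
shadow_soname() {
    target=$1 soname=$2 dir=$3
    [ -e "$target" ] || { echo "missing: $target" >&2; return 1; }
    mkdir -p "$dir" || return 1
    ln -sf "$target" "$dir/$soname" || return 1
    # Append the old value only if it is non-empty, so that no empty entry
    # (which the loader treats as the current directory) sneaks in.
    printf '%s\n' "$dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}
```

A possible use is `LD_LIBRARY_PATH=$(shadow_soname "$PWD/libopenblas.so.0" libblas.so.3 /tmp/libblas) Rscript /dev/null`, which limits the changed search path to that one command.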
For #2, you said:
I have no root access on machines I want to test, so actual linking to OpenBLAS is impossible.
but that seems to be a bogus argument: if you can build libopenblas, surely you can also build your own version of R.
Update:
You mentioned in the beginning that libblas.so.3 and libopenblas.so.0 define the same symbol, what does this mean? They have different SONAME, is that insufficient to distinguish them by the system?
The symbols and the SONAME have nothing to do with each other.
You can see symbols in the output from readelf -Ws libblas.so.3 and readelf -Ws libopenblas.so.0. Symbols related to BLAS, such as cgemv_, will appear in both libraries.
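To list the overlap rather than eyeball two long symbol tables, the readelf output can be filtered and intersected. The sketch below assumes the usual eight-column layout of readelf -Ws (Num, Value, Size, Type, Bind, Vis, Ndx, Name); the function names are made up for illustration.

```shell
# defined_functions
# Read `readelf -Ws` output on stdin and print the names of the global or weak
# functions the library defines (undefined references are skipped), sorted.
defined_functions() {
    awk '$4 == "FUNC" && ($5 == "GLOBAL" || $5 == "WEAK") && $7 != "UND" {
             name = $8
             sub(/@.*/, "", name)   # drop any @VERSION suffix
             print name
         }' | sort -u
}

# common_functions LIB_A LIB_B
# Print the function names defined by both shared libraries.
common_functions() {
    a=$(mktemp) && b=$(mktemp) || return 1
    readelf -Ws "$1" | defined_functions > "$a"
    readelf -Ws "$2" | defined_functions > "$b"
    comm -12 "$a" "$b"
    rm -f "$a" "$b"
}
```

Running `common_functions libblas.so.3 libopenblas.so.0` (with the paths adjusted to where the two files live) would then print the names that both libraries define, cgemv_ among them if the claim above holds on your system.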
Your confusion about SONAME possibly comes from Windows. The DLLs on Windows are designed completely differently. In particular, when FOO.DLL imports symbol bar from BAR.DLL, both the name of the symbol (bar) and the DLL from which that symbol was imported (BAR.DLL) are recorded in FOO.DLL's import table.
That makes it easy to have R import cgemv_ from BLAS.DLL, while MMPERF.DLL imports the same symbol from OPENBLAS.DLL.
However, that makes library interpositioning hard, and works completely differently from the way archive libraries work (even on Windows).
Opinions differ on which design is better overall, but neither system is likely to ever change its model.
There are ways for UNIX to emulate Windows-style symbol binding: see RTLD_DEEPBIND in the dlopen man page. Beware: these are fraught with peril, likely to confuse UNIX experts, are not widely used, and are likely to have implementation bugs.
Update 2:
you mean I compile R and install it under my home directory?
Yes.
Then when I want to invoke it, I should explicitly give the path to my version of executable program, otherwise the one on the system might be invoked instead? Or, can I put this path at the first position of environment variable $PATH to cheat the system?
Either way works.
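The $PATH variant works because the shell searches the listed directories from left to right and runs the first match. A small sketch for checking which file would win; the helper name first_on_path is invented, and $HOME/R/bin in the usage note is only a hypothetical install location.

```shell
# first_on_path DIR NAME
# Print the file the shell would run for NAME if DIR were prepended to PATH.
# The body runs in a subshell, so the caller's PATH is left unchanged.
first_on_path() (
    PATH="$1:$PATH"
    command -v "$2"
)
```

For instance, `first_on_path "$HOME/R/bin" R` should print the path of the private build rather than /usr/bin/R, provided an executable named R exists in that directory.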
*********************
Solution 1:
*********************
Thanks to Employed Russian, my problem is finally solved. The investigation required some important skills in Linux system debugging and patching, and I consider what I learned a great asset. Here I post a solution and correct several points in my original post.
1. About invoking R
In my original post, I mentioned that there are two ways to launch R: via R or via Rscript. However, I wrongly exaggerated their difference. Let's now investigate their start-up processes with an important Linux debugging facility, strace (see man strace). Lots of interesting things happen after we type a command in the shell, and we can use
strace -e trace=process [command]
to trace all system calls involving process management. As a result we can watch the fork, wait, and execution steps of a process. As @Employed Russian shows, it is also possible to trace only a single system call by name, for example execve for the execution steps.
For R we have
~/Desktop/dgemm$ time strace -e trace=execve R --vanilla < /dev/null > /dev/null
execve("/usr/bin/R", ["R", "--vanilla"], [/* 70 vars */]) = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=5777, si_status=0, si_utime=0, si_stime=0} ---
execve("/usr/lib/R/bin/exec/R", ["/usr/lib/R/bin/exec/R", "--vanilla"], [/* 79 vars */]) = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=5778, si_status=0, si_utime=0, si_stime=0} ---
+++ exited with 0 +++
real 0m0.345s
user 0m0.256s
sys 0m0.068s
while for Rscript we have
~/Desktop/dgemm$ time strace -e trace=execve Rscript --default-packages=base --vanilla /dev/null
execve("/usr/bin/Rscript", ["Rscript", "--default-packages=base", "--vanilla", "/dev/null"], [/* 70 vars */]) = 0
execve("/usr/lib/R/bin/R", ["/usr/lib/R/bin/R", "--slave", "--no-restore", "--vanilla", "--file=/dev/null"], [/* 71 vars */]) = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=5822, si_status=0, si_utime=0, si_stime=0} ---
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=5823, si_status=0, si_utime=0, si_stime=0} ---
execve("/usr/lib/R/bin/exec/R", ["/usr/lib/R/bin/exec/R", "--slave", "--no-restore", "--vanilla", "--file=/dev/null"], [/* 80 vars */]) = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=5827, si_status=0, si_utime=0, si_stime=0} ---
+++ exited with 0 +++
real 0m0.063s
user 0m0.020s
sys 0m0.028s
We have also used time to measure the start-up time. Note that
- Rscript is about 5.5 times faster than R. One reason is that R loads 6 default packages on start-up, while Rscript loads only the base package, as requested by --default-packages=base. But it is still much faster even without this setting.
- In the end, both start-up processes are directed to $(R RHOME)/bin/exec/R, and in my original post I have already used readelf -d to show that this executable loads libR.so, which is linked with libblas.so.3. According to @Employed Russian's explanation, the BLAS library loaded first wins, so there is no way my original method can work.
- To run strace successfully, we have used the file /dev/null as input file and output file where necessary. For example, Rscript demands an input file, while R demands both. We feed the null device to both to make the command run smoothly and keep the output clean. The null device is a special file: reading from it yields nothing, and anything written to it is discarded.
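The two properties of the null device are easy to check directly; a minimal sketch (the variable names are arbitrary):

```shell
# Reading from /dev/null hits end-of-file immediately, so zero bytes arrive.
# tr strips the padding that some wc implementations put around the count.
bytes_read=$(wc -c < /dev/null | tr -d ' ')

# Writing to /dev/null succeeds, but the data is discarded.
echo "this line goes nowhere" > /dev/null
write_status=$?

echo "bytes read: $bytes_read, write status: $write_status"
```

This prints "bytes read: 0, write status: 0", which is why /dev/null can stand in for both the script file and the output file in the strace runs above.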
2. Cheat R
Now, since libblas.so.3 will be loaded anyway, the only thing we can do is provide our own version of this library. As I said in the original post, if we have root access this is really easy: run update-alternatives --config libblas.so.3 and the system completes the switch for us. But @Employed Russian offers an awesome way to cheat the system without root access: check how R finds the BLAS library on start-up, and make sure our version is found before the system default! To monitor how shared libraries are found and loaded, use the environment variable LD_DEBUG.
There are a number of Linux environment variables with prefix LD_, as documented in man ld.so. These variables can be set in front of a command to change the run-time behaviour of a program. Some useful variables include:
- LD_LIBRARY_PATH for setting the run-time library search path;
- LD_DEBUG for tracing look-up and loading of shared libraries;
- LD_TRACE_LOADED_OBJECTS for displaying all libraries loaded by a program (behaves similarly to ldd);
- LD_PRELOAD for forcibly injecting a library into a program at the very start, before all other libraries are looked for;
- LD_PROFILE and LD_PROFILE_OUTPUT for profiling one specified shared library. R users who have read section 3.4.1.1 sprof of Writing R Extensions should recall that this is used for profiling compiled code from within R.
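As an illustration of LD_TRACE_LOADED_OBJECTS, its ldd-style output ("NAME => PATH (ADDRESS)" per line) can be filtered for a single library. The helper name resolved_lib is invented for this sketch, and the usage line assumes a glibc-based Linux system where /bin/ls is dynamically linked.

```shell
# resolved_lib NAME
# Read ldd-style lines ("NAME => PATH (ADDRESS)") on stdin and print the PATH
# that NAME resolved to, if any.
resolved_lib() {
    awk -v lib="$1" '$1 == lib && $2 == "=>" { print $3 }'
}
```

For example, `LD_TRACE_LOADED_OBJECTS=1 /bin/ls | resolved_lib libc.so.6` lists the dependencies of ls instead of running it and keeps only the file chosen for libc.so.6.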
The use of LD_DEBUG can be seen by:
~/Desktop/dgemm$ LD_DEBUG=help cat
Valid options for the LD_DEBUG environment variable are:
libs display library search paths
reloc display relocation processing
files display progress for input file
symbols display symbol table processing
bindings display information about symbol binding
versions display version dependencies
scopes display scope information
all all previous options combined
statistics display relocation statistics
unused determined unused DSOs
help display this help message and exit
To direct the debugging output into a file instead of standard output a filename can be specified using the LD_DEBUG_OUTPUT environment variable.
Here we are particularly interested in using LD_DEBUG=libs. For example,
~/Desktop/dgemm$ LD_DEBUG=libs Rscript --default-packages=base --vanilla /dev/null |& grep blas
5974: find library=libblas.so.3 [0]; searching
5974: trying file=/usr/lib/R/lib/libblas.so.3
5974: trying file=/usr/lib/i386-linux-gnu/i686/sse2/libblas.so.3
5974: trying file=/usr/lib/i386-linux-gnu/i686/cmov/libblas.so.3
5974: trying file=/usr/lib/i386-linux-gnu/i686/libblas.so.3
5974: trying file=/usr/lib/i386-linux-gnu/sse2/libblas.so.3
5974: trying file=/usr/lib/i386-linux-gnu/libblas.so.3
5974: trying file=/usr/lib/jvm/java-7-openjdk-i386/jre/lib/i386/client/libblas.so.3
5974: trying file=/usr/lib/libblas.so.3
5974: calling init: /usr/lib/libblas.so.3
5974: calling fini: /usr/lib/libblas.so.3 [0]
shows the various attempts that R made to locate and load libblas.so.3. So if we can provide our own version of libblas.so.3 and make sure R finds it first, the problem is solved.
Let's first make a symbolic link libblas.so.3 in our working directory that points to the OpenBLAS library libopenblas.so, then prepend our working directory to LD_LIBRARY_PATH (and export it):
~/Desktop/dgemm$ ln -sf libopenblas.so libblas.so.3
~/Desktop/dgemm$ export LD_LIBRARY_PATH=$(pwd):$LD_LIBRARY_PATH ## put our working path at top
Now let's check again the library loading process:
~/Desktop/dgemm$ LD_DEBUG=libs Rscript --default-packages=base --vanilla /dev/null |& grep blas
6063: find library=libblas.so.3 [0]; searching
6063: trying file=/usr/lib/R/lib/libblas.so.3
6063: trying file=/usr/lib/i386-linux-gnu/i686/sse2/libblas.so.3
6063: trying file=/usr/lib/i386-linux-gnu/i686/cmov/libblas.so.3
6063: trying file=/usr/lib/i386-linux-gnu/i686/libblas.so.3
6063: trying file=/usr/lib/i386-linux-gnu/sse2/libblas.so.3
6063: trying file=/usr/lib/i386-linux-gnu/libblas.so.3
6063: trying file=/usr/lib/jvm/java-7-openjdk-i386/jre/lib/i386/client/libblas.so.3
6063: trying file=/home/zheyuan/Desktop/dgemm/libblas.so.3
6063: calling init: /home/zheyuan/Desktop/dgemm/libblas.so.3
6063: calling fini: /home/zheyuan/Desktop/dgemm/libblas.so.3 [0]
Great! We have successfully cheated R.
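Rather than reading through the whole LD_DEBUG trace, the "calling init" lines can be reduced to the list of files that were actually loaded. A small sketch, with initialised_libs as a made-up helper name:

```shell
# initialised_libs PATTERN
# Read LD_DEBUG=libs output on stdin and print each distinct file whose
# "calling init" line contains PATTERN.
initialised_libs() {
    awk -v pat="$1" '/calling init:/ && index($NF, pat) { print $NF }' | sort -u
}
```

Since LD_DEBUG writes to standard error, a possible use is `LD_DEBUG=libs Rscript --default-packages=base --vanilla /dev/null 2>&1 | initialised_libs blas`. If the trick worked, only the copy under the working directory should be listed.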
3. Experiment with OpenBLAS
~/Desktop/dgemm$ Rscript --default-packages=base --vanilla mmperf.R
GFLOPs = 8.77
Now, everything works as expected!
4. Unset LD_LIBRARY_PATH (to be safe)
It is good practice to unset LD_LIBRARY_PATH after use.
~/Desktop/dgemm$ unset LD_LIBRARY_PATH
*********************
Solution 2:
*********************
Here we offer another solution, exploiting the environment variable LD_PRELOAD mentioned in Solution 1. The use of LD_PRELOAD is more "brutal", as it forces a given library to be loaded into the program before any other library, even before the C library libc.so! This is often used for urgent patching in Linux development.
As shown in part 2 of the original post, the shared BLAS library libopenblas.so has SONAME libopenblas.so.0. An SONAME is the internal name that the dynamic library loader looks for at run time, so we need to make a symbolic link to libopenblas.so with this SONAME:
~/Desktop/dgemm$ ln -sf libopenblas.so libopenblas.so.0
then we export it:
~/Desktop/dgemm$ export LD_PRELOAD=$(pwd)/libopenblas.so.0
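Note that a full path to libopenblas.so.0 needs to be fed to LD_PRELOAD for a successful load, even if libopenblas.so.0 is under $(pwd). A bare file name without a slash is looked up through the library search path, not in the current directory.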
Now we launch Rscript and check what happens using LD_DEBUG:
~/Desktop/dgemm$ LD_DEBUG=libs Rscript --default-packages=base --vanilla /dev/null |& grep blas
4860: calling init: /home/zheyuan/Desktop/dgemm/libopenblas.so
4860: calling init: /home/zheyuan/Desktop/dgemm/libopenblas.so
4865: calling init: /home/zheyuan/Desktop/dgemm/libopenblas.so
4868: calling fini: /home/zheyuan/Desktop/dgemm/libopenblas.so [0]
4870: calling init: /home/zheyuan/Desktop/dgemm/libopenblas.so
4869: calling init: /home/zheyuan/Desktop/dgemm/libopenblas.so
4867: calling fini: /home/zheyuan/Desktop/dgemm/libopenblas.so [0]
4860: find library=libblas.so.3 [0]; searching
4860: trying file=/usr/lib/R/lib/libblas.so.3
4860: trying file=/usr/lib/i386-linux-gnu/i686/sse2/libblas.so.3
4860: trying file=/usr/lib/i386-linux-gnu/i686/cmov/libblas.so.3
4860: trying file=/usr/lib/i386-linux-gnu/i686/libblas.so.3
4860: trying file=/usr/lib/i386-linux-gnu/sse2/libblas.so.3
4860: trying file=/usr/lib/i386-linux-gnu/libblas.so.3
4860: trying file=/usr/lib/jvm/java-7-openjdk-i386/jre/lib/i386/client/libblas.so.3
4860: trying file=/usr/lib/libblas.so.3
4860: calling init: /usr/lib/libblas.so.3
4860: calling init: /home/zheyuan/Desktop/dgemm/libopenblas.so
4874: calling init: /home/zheyuan/Desktop/dgemm/libopenblas.so
4876: calling init: /home/zheyuan/Desktop/dgemm/libopenblas.so
4860: calling fini: /home/zheyuan/Desktop/dgemm/libopenblas.so [0]
4860: calling fini: /usr/lib/libblas.so.3 [0]
Comparing with what we saw in Solution 1, where we cheated R with our own version of libblas.so.3, we can see that
- libopenblas.so.0 is loaded first, hence found first by Rscript;
- after libopenblas.so.0 is found, Rscript goes on searching for and loading libblas.so.3. However, this has no effect, by the "first come, first served" rule explained in the original answer.
Good, everything works, so we test our mmperf.c program:
~/Desktop/dgemm$ Rscript --default-packages=base --vanilla mmperf.R
GFLOPs = 9.62
The outcome 9.62 is bigger than the 8.77 we saw in the earlier solution merely by chance. Since this is only a test that OpenBLAS is in use, we do not repeat the experiment many times for a more precise result.
Then, as usual, we unset the environment variable at the end:
~/Desktop/dgemm$ unset LD_PRELOAD
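Instead of exporting LD_PRELOAD and unsetting it afterwards, the variable can be scoped to a single command, so nothing is left behind in the shell. The sketch below also turns a relative library path into an absolute one; run_with_preload is a made-up helper, not something provided by R or OpenBLAS.

```shell
# run_with_preload LIB COMMAND [ARGS...]
# Run COMMAND with LD_PRELOAD set to the absolute path of LIB, for that
# command only; the calling shell's environment is left untouched.
run_with_preload() {
    lib=$1
    shift
    [ -f "$lib" ] || { echo "no such library: $lib" >&2; return 1; }
    dir=$(cd "$(dirname "$lib")" && pwd) || return 1
    env LD_PRELOAD="$dir/$(basename "$lib")" "$@"
}
```

With this, `run_with_preload ./libopenblas.so.0 Rscript --default-packages=base --vanilla mmperf.R` would be the one-shot equivalent of the export, run, unset sequence above.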