For faster R use OpenBLAS


From: http://www.stat.cmu.edu/~nmv/2013/07/09/for-faster-r-use-openblas-instead-better-than-atlas-trivial-to-switch-to-on-ubuntu/

Installing additional BLAS libraries on Ubuntu

For Ubuntu, there are currently three BLAS options that are easy to switch between: "libblas" (the reference BLAS), "libatlas" (the ATLAS BLAS), and "libopenblas" (OpenBLAS). Their package names are:

$ apt-cache search libblas
libblas-dev - Basic Linear Algebra Subroutines 3, static library
libblas-doc - Basic Linear Algebra Subroutines 3, documentation
libblas3gf - Basic Linear Algebra Reference implementations, shared library
libatlas-base-dev - Automatically Tuned Linear Algebra Software, generic static
libatlas3gf-base - Automatically Tuned Linear Algebra Software, generic shared
libblas-test - Basic Linear Algebra Subroutines 3, testing programs
libopenblas-base - Optimized BLAS (linear algebra) library based on GotoBLAS2
libopenblas-dev - Optimized BLAS (linear algebra) library based on GotoBLAS2

Since libblas already comes with Ubuntu, we only need to install the other two for our tests. (NOTE: In the following command, omit 'libatlas3gf-base' if you don't want to experiment with ATLAS.)

$ sudo apt-get install libopenblas-base libatlas3gf-base

Switching between BLAS libraries

Now we can switch between the different BLAS options that are installed:

$ sudo update-alternatives --config libblas.so.3gf
There are 3 choices for the alternative libblas.so.3gf (providing /usr/lib/libblas.so.3gf).

  Selection    Path                                       Priority   Status
------------------------------------------------------------
* 0            /usr/lib/openblas-base/libopenblas.so.0     40        auto mode
  1            /usr/lib/atlas-base/atlas/libblas.so.3gf    35        manual mode
  2            /usr/lib/libblas/libblas.so.3gf             10        manual mode
  3            /usr/lib/openblas-base/libopenblas.so.0     40        manual mode

Press enter to keep the current choice[*], or type selection number:

    Side note: If the above returned:

    update-alternatives: error: no alternatives for libblas.so.3gf

    Try

    $ sudo update-alternatives --config libblas.so.3

    instead. See the comments at the end of the post for further details.

From the selection menu, I picked 3, so it now shows that choice 3 (OpenBLAS) is selected:

$ sudo update-alternatives --config libblas.so.3gf
There are 3 choices for the alternative libblas.so.3gf (providing /usr/lib/libblas.so.3gf).

  Selection    Path                                       Priority   Status
------------------------------------------------------------
  0            /usr/lib/openblas-base/libopenblas.so.0     40        auto mode
  1            /usr/lib/atlas-base/atlas/libblas.so.3gf    35        manual mode
  2            /usr/lib/libblas/libblas.so.3gf             10        manual mode
* 3            /usr/lib/openblas-base/libopenblas.so.0     40        manual mode
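The interactive menu is convenient for a one-off change, but the same selection can be made without a prompt using update-alternatives --set, which takes the alternative name and a target path. The sketch below only prints the command it would run (a dry run). The nicknames openblas, atlas and reference, the helper names blas_path and set_blas_cmd, and the ALT_NAME variable are all made up for illustration; the paths are the ones listed in the menu above and may differ on your system.

```shell
#!/bin/sh
# Map a hypothetical nickname to the library path shown in the menu above.
blas_path() {
    case "$1" in
        openblas)  echo /usr/lib/openblas-base/libopenblas.so.0 ;;
        atlas)     echo /usr/lib/atlas-base/atlas/libblas.so.3gf ;;
        reference) echo /usr/lib/libblas/libblas.so.3gf ;;
        *)         echo "unknown BLAS: $1" >&2; return 1 ;;
    esac
}

# Print (not run) the command that would select the given BLAS.
# ALT_NAME is an assumption: libblas.so.3gf as in this post; some
# releases register the alternative as libblas.so.3 instead.
set_blas_cmd() {
    alt_name=${ALT_NAME:-libblas.so.3gf}
    path=$(blas_path "$1") || return 1
    echo "sudo update-alternatives --set $alt_name $path"
}

set_blas_cmd openblas
```

Once the printed command looks right, it can be pasted into the terminal or run with eval "$(set_blas_cmd openblas)".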

And we can pull the same trick to choose between LAPACK implementations. From the output we can see that OpenBLAS does not provide a new LAPACK implementation, but ATLAS does:

$ sudo update-alternatives --config liblapack.so.3gf
There are 2 choices for the alternative liblapack.so.3gf (providing /usr/lib/liblapack.so.3gf).

  Selection    Path                                         Priority   Status
------------------------------------------------------------
* 0            /usr/lib/atlas-base/atlas/liblapack.so.3gf    35        auto mode
  1            /usr/lib/atlas-base/atlas/liblapack.so.3gf    35        manual mode
  2            /usr/lib/lapack/liblapack.so.3gf              10        manual mode

So we will do nothing in this case, since OpenBLAS is supposed to use the reference implementation (which is already selected).
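Because the alternative may be registered under either the ".so.3gf" or the ".so.3" name depending on the release, a script can probe for the right one before switching. This is a minimal sketch that relies on update-alternatives --list returning a non-zero status for an unknown name; the helper name find_alt is hypothetical.

```shell
#!/bin/sh
# Print the first alternative name registered for a library base name,
# trying the ".so.3gf" spelling first and then ".so.3".
find_alt() {
    for name in "$1.so.3gf" "$1.so.3"; do
        if update-alternatives --list "$name" >/dev/null 2>&1; then
            echo "$name"
            return 0
        fi
    done
    return 1
}

# Report the name to use for each library, or a note if none is registered.
for base in libblas liblapack; do
    find_alt "$base" || echo "no alternative registered for $base" >&2
done
```

The printed name is what you would pass to update-alternatives --config on that machine.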

Checking that R is using the right BLAS

Now we can check that everything is working by starting R in a new terminal:

$ R
R version 3.0.1 (2013-05-16) -- "Good Sport"
Copyright (C) 2013 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
...snip...
Type 'q()' to quit R.
>

Great. Let's see if R is using the BLAS and LAPACK libraries we selected. To do so, we open another terminal so that we can run a few more shell commands. First, we find the PID of the R process we just started. Your output will look something like this:

$ ps aux | grep exec/R
1000 18065 0.4 1.0 200204 87568 pts/1 Sl+ 09:00 0:00 /usr/lib/R/bin/exec/R
root 19250 0.0 0.0 9396 916 pts/0 S+ 09:03 0:00 grep --color=auto exec/R

The PID is the second number on the '/usr/lib/R/bin/exec/R' line. To see which BLAS and LAPACK libraries are loaded with that R session, we use the "list open files" command:

$ lsof -p 18065 | grep 'blas\|lapack'
R 18065 nathanvan mem REG 8,1 9342808 12857980 /usr/lib/lapack/liblapack.so.3gf.0
R 18065 nathanvan mem REG 8,1 19493200 13640678 /usr/lib/openblas-base/libopenblas.so.0

As expected, the R session is using the reference LAPACK (/usr/lib/lapack/liblapack.so.3gf.0) and OpenBLAS (/usr/lib/openblas-base/libopenblas.so.0).
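If you repeat this check after every switch, the two steps (find the PID, filter the lsof output) can be wrapped up. The sketch below defines a small filter, hypothetically named blas_libs, that keeps only the library path from each matching lsof line, and demonstrates it on the two lines shown above. The live invocation in the trailing comment assumes pgrep and lsof are installed and that exactly one R session is running.

```shell
#!/bin/sh
# Read `lsof -p PID` output on stdin and print the path (last column)
# of every line that mentions blas or lapack.
blas_libs() {
    awk '/blas|lapack/ { print $NF }'
}

# Demo on the two lines from the lsof output above.
blas_libs <<'EOF'
R 18065 nathanvan mem REG 8,1 9342808 12857980 /usr/lib/lapack/liblapack.so.3gf.0
R 18065 nathanvan mem REG 8,1 19493200 13640678 /usr/lib/openblas-base/libopenblas.so.0
EOF

# Against a live session (assumption: a single running R process):
#   lsof -p "$(pgrep -f exec/R)" | blas_libs
```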

Testing the different BLAS/LAPACK combinations

I used Simon Urbanek's most recent benchmark script. To follow along, first download it to your current working directory:

$ curl http://r.research.att.com/benchmarks/R-benchmark-25.R -O

And then run it:

$ cat R-benchmark-25.R | time R --slave
Loading required package: Matrix
Loading required package: lattice
Loading required package: SuppDists
Warning message:
In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE, :
  there is no package called SuppDists
...snip...

Oops. I don't have the SuppDists package installed. I can easily install it via Michael Rutter's Ubuntu PPA:

$ sudo apt-get install r-cran-suppdists

Now Simon's script works wonderfully. Full output:

$ cat R-benchmark-25.R | time R --slave
Loading required package: Matrix
Loading required package: lattice
Loading required package: SuppDists
Warning messages:
1: In remove("a", "b") : object 'a' not found
2: In remove("a", "b") : object 'b' not found

R Benchmark 2.5
===============
Number of times each test is run__________________________: 3

I. Matrix calculation
---------------------
Creation, transp., deformation of a 2500x2500 matrix (sec): 1.36566666666667
2400x2400 normal distributed random matrix ^1000____ (sec): 0.959
Sorting of 7,000,000 random values__________________ (sec): 1.061
2800x2800 cross-product matrix (b = a' * a)_________ (sec): 1.777
Linear regr. over a 3000x3000 matrix (c = a \ b')___ (sec): 1.00866666666667
--------------------------------------------
Trimmed geom. mean (2 extremes eliminated): 1.13484335940626

II. Matrix functions
--------------------
FFT over 2,400,000 random values____________________ (sec): 0.566999999999998
Eigenvalues of a 640x640 random matrix______________ (sec): 1.379
Determinant of a 2500x2500 random matrix____________ (sec): 1.69
Cholesky decomposition of a 3000x3000 matrix________ (sec): 1.51366666666667
Inverse of a 1600x1600 random matrix________________ (sec): 1.40766666666667
--------------------------------------------
Trimmed geom. mean (2 extremes eliminated): 1.43229160585452

III. Programmation
------------------
3,500,000 Fibonacci numbers calculation (vector calc)(sec): 1.10533333333333
Creation of a 3000x3000 Hilbert matrix (matrix calc) (sec): 1.169
Grand common divisors of 400,000 pairs (recursion)__ (sec): 2.267
Creation of a 500x500 Toeplitz matrix (loops)_______ (sec): 1.213
Escoufier's method on a 45x45 matrix (mixed)________ (sec): 1.32600000000001
--------------------------------------------
Trimmed geom. mean (2 extremes eliminated): 1.23425893178325

Total time for all 15 tests_________________________ (sec): 19.809
Overall mean (sum of I, II and III trimmed means/3)_ (sec): 1.26122106386747
--- End of test ---

134.75user 16.06system 1:50.08elapsed 137%CPU (0avgtext+0avgdata 1949744maxresident)k
448inputs+0outputs (3major+1265968minor)pagefaults 0swaps

The elapsed time at the very bottom is the part that we care about. With OpenBLAS and the reference LAPACK, the script took 1 minute and 50 seconds to run. By changing the selections with update-alternatives, we can test R with ATLAS (3:21) or R with the reference BLAS (9:13). For my machine, OpenBLAS is a clear winner.
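To put numbers on "clear winner", the elapsed times can be converted to seconds and divided. The sketch below pulls the elapsed field out of a summary line in the format that time printed above, then computes how many times longer ATLAS and the reference BLAS took than OpenBLAS. The helper names elapsed_of, to_seconds and slowdown are hypothetical, and to_seconds only handles the m:ss form seen in this post, not runs of an hour or more.

```shell
#!/bin/sh
# Pull the wall-clock field out of a time summary line,
# e.g. "... 1:50.08elapsed ..." -> "1:50.08".
elapsed_of() {
    sed -n 's/.* \([0-9:.]*\)elapsed.*/\1/p'
}

# Convert m:ss (optionally with a fractional part) to seconds.
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 60 + $2 }'
}

# How many times longer the first run took than the second.
slowdown() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.2f\n", a / b }'
}

echo "134.75user 16.06system 1:50.08elapsed 137%CPU" | elapsed_of
echo "ATLAS vs OpenBLAS:     $(slowdown 3:21 1:50)x"
echo "reference vs OpenBLAS: $(slowdown 9:13 1:50)x"
```

With the times quoted above (1:50, 3:21 and 9:13), this works out to roughly 1.83x for ATLAS and 5.03x for the reference BLAS.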
