Using R — .Call(“hello”)

This entry is part 10 of 14 in the series Using R

In an introductory post on R APIs to C code, Calling C Code ‘Hello World!’, we explored the .C() function with some ‘Hello World!’ baby steps.  In this post we will make a leap forward by implementing the same functionality using the .Call() function.

Is .Call() better than .C()?

A heated but friendly conversation took place on the r-devel email forum this past March about R’s copying of arguments and the merits of .C() and .Call().  It is perhaps best to just include a highlight from this exchange.  Here is Simon Urbanek responding to Hervé Pagès:

The important differences between the two R interfaces to C code are summarized here:

.C()

  • allows you to write simple C code that knows nothing about R
  • only simple data types can be passed
  • all argument type conversion and checking must be done in R
  • all memory allocation must be done in R
  • all arguments are copied locally before being passed to the C function (memory bloat)

.Call()

  • allows you to write simple R code
  • allows for complex data types
  • allows for a C function return value
  • allows C function to allocate memory
  • does not require wasteful argument copying
  • requires much more knowledge of R internals
  • is the recommended, modern approach for serious C programmers
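
To make the first list concrete, here is a minimal sketch of what a routine written for .C() tends to look like. The name count_greeting and its argument names are made up for illustration and are not taken from the earlier post. The function knows nothing about R: a character vector arrives as char **, integers arrive as int *, the return type is void, and the result has to be written into memory that the R side already allocated.

```c
#include <string.h>

/* Hypothetical .C()-style routine (name and arguments are illustrative).
   Every argument is a pointer to a simple C type, nothing is returned,
   and the answer is written into the caller-supplied count array. */
void count_greeting(char **greeting, int *n, int *count)
{
    for (int i = 0; i < *n; i++) {
        /* strlen() returns size_t; .C() integers are plain int */
        count[i] = (int) strlen(greeting[i]);
    }
}
```

From R this would be reached with something like .C("count_greeting", as.character(x), as.integer(length(x)), count = integer(length(x))), which is where the type conversion, allocation and copying from the list above come in.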

To allow readers to compare for themselves how difficult or easy it is to switch from .C() to .Call() we will re-implement our three “Hello World!” examples using the .Call() interface.

Getting used to SEXP

The first thing you have to embrace when using the .Call() interface is the new way of dealing with R objects inside your C code.  Excellent introductory information and example code is available here:

  • Calling C code from R (Sigal Blay, 2004)
  • Calling other languages from R (R.M. Ripley, 2009)
  • R API cheat sheet (Simon Urbanek, 2012)

In preparation for working with .Call() you will want to familiarize yourself with the location of R’s include files.  In a Unix shell, R RHOME prints the directory where R is installed, and the header files usually sit in the include directory beneath it.

Here is what that directory contains:

Rconfig.h       various configuration flags
Rdefines.h      lots of macros of interest, includes Rinternals.h
Rembedded.h     function declarations for embedding R in C programs
R_ext           directory of include files for specific data types, etc.
R.h             includes all the files found in R_ext
Rinterface.h    provides hooks for external GUIs
Rinternals.h    core R data structures
Rmath.h         math constants and function declarations
Rversion.h      version string components
S.h             macros for S/R compatibility

With the .Call() interface, the C function must be declared to return SEXP, a pointer to a SEXPREC, or S-expression record.  We’ll get the definition of SEXP and everything else we need by including both R.h and Rdefines.h in our code.  So here is the C code for our first, brain dead C function, helloA1.c:

Note that, even though we are returning R_NilValue (aka NULL), the function is declared to be of type SEXP.  The function will always be of type SEXP, as will any arguments.  It will be up to the C code to convert other data types into and out of SEXP.  As in the previous post, you should compile this code with R CMD SHLIB helloA1.c.  Here is the very simple R function we need to add to wrappers.R:

Finally, what does it look like when invoked from R?

Whew!  That was a lot of complexity just to run “Hello World!”.  However, the value of this complexity will become apparent as we move forward.
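
One way to picture why a single pointer type can carry every kind of R object is a record with a type tag in front of its payload. The sketch below is a toy in plain C, not R’s actual SEXPREC: every toy_ name is invented here, and the real definitions live in Rinternals.h. It only shows the pattern you will keep repeating: take the generic pointer, check what it holds, and only then pull out a C value.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-ins, NOT R's real definitions: a type tag plus a payload. */
typedef enum { TOY_NILSXP, TOY_INTSXP, TOY_STRSXP } toy_type;

typedef struct toy_sexprec {
    toy_type type;              /* which kind of object this record holds */
    int length;                 /* number of elements */
    union {
        const int *ints;        /* valid when type == TOY_INTSXP */
        const char **strings;   /* valid when type == TOY_STRSXP */
    } data;
} toy_sexprec;

typedef toy_sexprec *toy_sexp;  /* like SEXP: a pointer to the record */

/* Number of characters in the first string, or -1 if x is not a
   non-empty string vector. The tag is checked before the payload is used. */
int toy_first_nchar(toy_sexp x)
{
    if (x == NULL || x->type != TOY_STRSXP || x->length < 1) {
        return -1;
    }
    return (int) strlen(x->data.strings[0]);
}
```

The real API follows the same shape, with the tag check and the extraction done through the macros and functions declared in the R headers rather than by reaching into the struct.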

PROTECT against garbage collection

One of the things R does well is pick up the garbage we leave lying around.  (If you’ve ever lived through a garbage haulers’ strike you know this is a good thing.)  Unused objects are disposed of after they are no longer needed (i.e. after there are no more active references to them) to free up memory.  As we write C code that uses R functions and structures we need to make sure that R knows when it should not toss something out and, after we are done, when it is again OK.  This is done with the PROTECT and UNPROTECT macros.
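
The bookkeeping behind PROTECT and UNPROTECT can be pictured as a stack of pointers. What follows is a toy model in plain C, not R’s implementation: the toy_ names are invented for this sketch, and real code would include the R headers and use PROTECT() and UNPROTECT() themselves. It only shows the discipline of popping exactly as many entries as you pushed.

```c
#include <stddef.h>

#define TOY_STACK_SIZE 16        /* arbitrary size for this sketch */

static void *toy_stack[TOY_STACK_SIZE];
static int toy_top = 0;          /* number of pointers currently protected */

/* Stand-ins for two R objects a C function might allocate. */
static int toy_greeting, toy_count;

/* Push a pointer so an imagined collector would leave it alone. */
void *toy_protect(void *obj)
{
    if (toy_top < TOY_STACK_SIZE) {
        toy_stack[toy_top++] = obj;
    }
    return obj;
}

/* Pop the n most recently protected pointers. */
void toy_unprotect(int n)
{
    toy_top = (n > toy_top) ? 0 : toy_top - n;
}

/* Return 1 if obj is currently on the stack, 0 otherwise. */
int toy_is_protected(const void *obj)
{
    for (int i = 0; i < toy_top; i++) {
        if (toy_stack[i] == obj) {
            return 1;
        }
    }
    return 0;
}

/* Two protects, two unprotects: the stack depth is unchanged on exit. */
int toy_balanced_call(void)
{
    int depth_on_entry = toy_top;
    toy_protect(&toy_greeting);
    toy_protect(&toy_count);
    toy_unprotect(2);
    return toy_top - depth_on_entry;
}

/* Two protects, one unprotect: one entry is left behind. */
int toy_leaky_call(void)
{
    int depth_on_entry = toy_top;
    toy_protect(&toy_greeting);
    toy_protect(&toy_count);
    toy_unprotect(1);
    return toy_top - depth_on_entry;
}
```

In this model toy_balanced_call() returns 0 while toy_leaky_call() returns 1. In real code R typically reports that kind of mismatch as a stack imbalance when the .Call() returns.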

Here is our next iteration of “Hello World!” where we will allocate space for an R character vector, assign our greeting to the first element and then return the vector:

Note that we allocate memory for a character vector of length # with NEW_CHARACTER(#).  It is worth taking a look in the R include files to see how this and similar macros are defined:

So NEW_CHARACTER(n) is just a macro wrapper around allocVector(STRSXP, n): we could have used allocVector(STRSXP,1) instead of NEW_CHARACTER(1), and you will see plenty of the former in R source code and packages.  Similarly you can grep for “_ELT” or “mkChar” and learn about those.  There really isn’t any definitive source for information and you will have to get comfortable googling, poking around source code examples, examining the R include files and even checking the R-devel mailing list to get a sense of the R functions that are available for getting C code to work with R objects.  I would recommend spending some time with Rinternals.h and Rdefines.h.

After running R CMD SHLIB we will again create a very simple wrapper and then run the code from R:

Double Whew!  So far it still seems like .Call() is a big headache.  But we haven’t really tried to do anything in our C code yet.  The complexity/benefit balance evens out a little in our final example.

Casting about in the R header files

The title of this section really says it all.  As you start to do more in your C code you will need to learn how to cast character strings into SEXP objects, SEXP objects into integers, etc.  There is a finite, but large, amount to know before you become expert.  The links in the “Getting used to SEXP” section above have excellent examples, as does Programming with Data: Using and Extending R by Dirk Eddelbuettel.

Here is our last “Hello World!” example, the one that counts the characters in incoming greetings.  This example shows how R macros defined in Rdefines.h are used to extract elements from a vector, how vector elements are cast into char and int, and how you need to UNPROTECT the same number of elements that you placed on the PROTECT stack.

After R CMD SHLIB, here is the wrapper and the R session:

Yes, it’s still at the double Whew! level but we did some worthwhile things like allocate space for R objects and correctly harness garbage collection.  If there were any halfway decent API docs for all this I would have no hesitation in recommending the .Call() interface to anyone writing C code.  As it is, however, there will be a painful learning curve.  If all you are doing is processing a vector of numbers and returning a simple scalar or vector result then the .C() interface will certainly be much easier — assuming you can take the memory hit.  If, on the other hand, you are doing things like using a C library to convert a bunch of raw data into more complex structures then you are going to have to learn to do things the R way.

But there is hope!  In the next post we will investigate using the Rcpp package to simplify this robust but complex interface to C code.  Hopefully we won’t have to become C++ wizards to do so.

Example Packages using .Call()

The .Call() interface is heavily used in many R packages.  Along with poring over the Writing R Extensions document, it is important to have some example code to work from.  Here is a running list of the packages I found with useful example code:

  • Rcsdp — R interface to the CSDP semidefinite programming library.

More Information

Hadley Wickham has written an excellent tutorial on using the .Call() interface.
