Programming over R
R is a very fluid language amenable to meta-programming, or alterations of the language itself. This has allowed the late user-driven introduction of a number of powerful features such as magrittr pipes, the foreach system, futures, data.table, and dplyr. Please read on for some small meta-programming effects we have been experimenting with.
Meta-Programming
Meta-programming is a powerful tool that allows one to re-shape a programming language or write programs that automate parts of working with a programming language.
Meta-programming itself has the central contradiction that one hopes nobody else is doing meta-programming, but that they are instead dutifully writing referentially transparent code that is safe to perform transformations over, so that one can safely introduce their own clever meta-programming. For example: one would hate to lose the ability to use a powerful package such as future because we already “used up all the referential transparency” for some minor notational effect or convenience.
That being said, R is an open system and it is fun to play with the notation. I have been experimenting with different notations for programming over R for a while, and thought I would demonstrate a few of them here.
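To make the referential-transparency concern above concrete, here is a minimal base R sketch. It shows a function that captures the code of its argument rather than its value, so two variables holding the same value produce different results. The helper name show_code is hypothetical, chosen only for this illustration.

```r
# Hypothetical helper: captures the unevaluated argument expression
# instead of its value (non-standard evaluation).
show_code <- function(x) deparse(substitute(x))

a <- 1
b <- 1

identical(a, b)   # TRUE: the two values are interchangeable
show_code(a)      # "a"
show_code(b)      # "b"  (same value, different result)

# Expressions can also be stored as data and evaluated later.
e <- quote(a + b)
class(e)          # "call"
eval(e)           # 2
```

Any tool that rewrites code by replacing a variable with its value would change what show_code returns, which is why stacking several such transformations calls for care.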
Let Blocks
We have been using let to code over non-standard evaluation (NSE) packages in R for a while now. This allows code such as the following:
library("dplyr")
library("wrapr")

d <- data.frame(x = c(1, NA))
cname <- 'x'
rname <- paste(cname, 'isNA', sep = '_')

let(list(COL = cname, RES = rname),
    d %>% mutate(RES = is.na(COL)))
#    x x_isNA
# 1  1  FALSE
# 2 NA   TRUE
let is in fact quite handy notation that will work in a non-deprecated manner with both dplyr 0.5 and dplyr 0.6. It is how we are future-proofing our current dplyr workflows.
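The idea behind let-style substitution can be sketched in base R alone. The function below, let_sketch, is a hypothetical illustration and not the wrapr implementation: it captures the block of code, replaces placeholder symbols with the symbols named in the alias list, and evaluates the result in the caller's environment.

```r
# Hypothetical sketch of let-style substitution (not the wrapr implementation).
let_sketch <- function(alias, expr) {
  env <- parent.frame()                  # where the caller's variables live
  expr <- substitute(expr)               # capture the code without evaluating it
  mapping <- lapply(alias, as.name)      # the string "x" becomes the symbol x
  expr <- do.call(substitute, list(expr, mapping))  # swap placeholders for symbols
  eval(expr, env)
}

d <- data.frame(x = c(1, NA))
cname <- 'x'

# COL stands in for whichever column name cname holds.
res <- let_sketch(list(COL = cname), with(d, is.na(COL)))
print(res)
```

One limitation of this sketch: substitute only replaces symbols inside expressions. It does not rename argument labels such as the RES in mutate(RES = ...), which the let example above does handle.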
Unquoting
dplyr 0.6 is introducing a new execution system (alternately called rlang or tidyeval) which uses a notation more like the following (but with fewer parentheses, and with the ability to control the left-hand side of an in-argument assignment):
beval(d %>% mutate(x_isNA = is.na((!!cname))))
The inability of this demonstration to re-map the left-hand side of the apparent assignment is because the “(!! )” notation doesn’t successfully masquerade as a lexical token valid on the left-hand side of assignments or function argument bindings.
And there was an R language proposal for a notation like the following (but without the quotes, and with some care to keep it syntactically distinct from other uses of “@”):
ateval('d %>% mutate(@rname = is.na(@cname))')
beval and ateval are just curiosities implemented to try and get a taste of the new dplyr notation, and we don’t recommend using them in production; their ad-hoc demonstration implementations are just not powerful enough to supply a uniform interface. dplyr itself seems to be replacing a lot of R’s execution framework to achieve stronger effects.
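For comparison, base R already offers a form of unquoting through bquote, where .( ) splices a value into a quoted expression. The sketch below is only an analogy for the right-hand-side and left-hand-side distinction discussed above; it is not the rlang/tidyeval mechanism, and it controls the result column name with ordinary string indexing rather than any special syntax.

```r
d <- data.frame(x = c(1, NA))
cname <- 'x'
rname <- paste(cname, 'isNA', sep = '_')

# Right-hand side: .() splices the symbol named by cname into the expression.
rhs <- bquote(is.na(.(as.name(cname))))
print(rhs)   # is.na(x)

# Left-hand side: the target name is a string label, not an expression,
# so it is supplied by string indexing rather than by unquoting.
d[[rname]] <- eval(rhs, d)
print(d)
```

The asymmetry is visible here: the right-hand side is an expression that can be rewritten, while the left-hand name has to be supplied through a different route.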
Write Arrow
We are experimenting with “write arrow” (a deliberate homophone of “right arrow”). It allows the convenient storing of a pipe result into a variable chosen by name.
library("dplyr")
library("replyr")

'x' -> whereToStoreResult

7 %>% sin %>% cos %->_% whereToStoreResult

print(x)
## [1] 0.7918362
Notice that the result of the pipeline, cos(sin(7)), is stored in the variable “x”, not in a variable named “whereToStoreResult”. “whereToStoreResult” names, parametrically, where the value is stored.
This allows code such as the following:
for(i in 1:3) {
  i %->_% paste0('x', i)
}
(Please run the above to see the automatic creation of variables named “x1”, “x2”, and “x3”, storing the values 1, 2, and 3 respectively.)
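The effect described above can be approximated in base R with assign. The operator name %into% below is hypothetical and is not the replyr implementation; it is a minimal sketch, assuming that storing into the caller's environment is the desired behavior.

```r
# Hypothetical operator (not the replyr implementation): store `value`
# in the caller's environment under the name held in the string `name`.
`%into%` <- function(value, name) {
  assign(name, value, envir = parent.frame())
  invisible(value)
}

whereToStoreResult <- 'x'
cos(sin(7)) %into% whereToStoreResult
print(x)

# The target name can be computed on the fly.
for (i in 1:3) {
  i %into% paste0('x', i)
}
print(c(x1, x2, x3))
```

Because the right-hand operand is evaluated to a string, the destination can be picked at run time, which ordinary assignment with <- or -> does not allow.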
We know left to right assignment is heterodox; but the notation is very slick if you are consistent with it, and add in some formatting rules (such as insisting on a line break after each pipe stage).
Conclusion
One wants to use meta-programming with care. In addition to bringing in desired convenience, it can have unexpected effects and interactions deeper in a language or when exposed to other meta-programming systems. This is one reason why a “seemingly harmless” proposal such as “user defined unary functions” or “at unquoting” takes so long to consider. This is also why new language features are best tried in small packages first (so users can easily choose to include them or not in their larger workflow), to drive public request for comments (RFC) processes or to allow the ideas to evolve rather than be frozen at their first good idea. A great example of community-accepted change is Haskell’s switch from request chaining IO to monadic IO; the first IO system “seemed inevitable” until it was completely replaced.