R’s Scoping
[Update, 10 September 2010: I didn’t study Radford Neal’s example closely enough before making an even bigger mess of things. I’d like to blame it on HTML formatting, which garbled Radford’s formatting and destroyed everyone else’s examples, but I was actually just really confused about what was going on in R. So I’m scratching most of the blog entry and my comments, and replacing them with Radford’s example and a pointer to the manual.]
A Better Mousetrap
There’s been an ongoing discussion among computational statisticians about writing something better than R, in terms of both speed and comprehensibility:
- Andrew Gelman: The Future of R
- Julien Cornebise (via Christian Robert): On R Shortcomings
- Radford Neal: Two Surprising Things about R (following up his earlier series, Design flaws in R)
Radford Neal’s Example
Radford’s example had us define two functions,
> f = function () {
+   g = function () a+b
+   a = 10
+   g()
+ }
> h = function () {
+   a = 100
+   b = 200
+   f()
+ }
> b=3
> h()
[1] 13
This illustrates what’s going on, assuming you can parse R. I see it, I believe it. The thing to figure out is why a=10 was picked up in the call to g() in f, but b=200 was not picked up in the call to f() in h. Instead, the global assignment b=3 was picked up.
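One way to see why a and b are treated differently is to ask R where each function was defined. The following is a minimal sketch, not part of Radford's original example: the names top, here, res, and f_defined_at_top are made up for illustration, and it assumes the code is run at top level in a fresh session.

```r
top <- environment()          # the environment this script runs in
b <- 3

f <- function() {
  here <- environment()       # the frame created by this call to f
  g <- function() a + b
  a <- 10
  # g was defined inside this call, so its enclosing environment is this frame
  list(value = g(), g_defined_in_f = identical(environment(g), here))
}

h <- function() {
  a <- 100
  b <- 200
  f()
}

res <- h()
res$value                     # 13: a comes from f's frame, b from the top level
res$g_defined_in_f            # TRUE

# f itself was defined at top level, so h's local b is never on its lookup path
f_defined_at_top <- identical(environment(f), top)
f_defined_at_top              # TRUE
```

So g finds a=10 because g was created inside the call to f, while f finds nothing from h because f was created outside h.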
RTFM
Even after I RTFM-ed, I was still confused.
- Venables, W. N., D. M. Smith and the R Core Development Team. 2010. Introduction to R 2.11.1.
It has a section 10.7 titled “Scope”, but I found their example
cube <- function(n) {
  sq <- function() n*n
  n*sq()
}
and the following explanation confusing,
The variable n in the function sq is not an argument to that function. Therefore it is a free variable and the scoping rules must be used to ascertain the value that is to be associated with it. Under static scope (S-Plus) the value is that associated with a global variable named n. Under lexical scope (R) it is the parameter to the function cube since that is the active binding for the variable n at the time the function sq was defined. The difference between evaluation in R and evaluation in S-Plus is that S-Plus looks for a global variable called n while R first looks for a variable called n in the environment created when cube was invoked.
I was particularly confused by the “environment created when cube was invoked” part, because I couldn’t reconcile it with Radford’s example.
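To make the manual's point concrete, here is a small sketch that runs the cube example alongside a global n. The global value 100 is an arbitrary choice for illustration, not something from the manual.

```r
n <- 100                      # a global n that sq() could, in principle, pick up

cube <- function(n) {
  sq <- function() n * n      # n is free in sq
  n * sq()
}

result <- cube(2)
result                        # 8: sq() sees cube's argument n = 2, not the global n
```

Under the S-Plus rule described in the quoted passage, sq() would instead read the global n, giving 2 * 100 * 100.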
Let’s consider a slightly simpler example without nested function calls.
> j = 10
> f = function(x) j*x
> f(3)
[1] 30
> j = 12
> f(3)
[1] 36
This shows it can’t be the value of j at the time f is defined, because it changes when I change j later. I think it’s actually determining how it’s going to find j when it’s defined. If there’s a binding for j that’s lexically in scope (not just defined in the current environment), it’ll use that binding. If not, it’ll keep looking outward through the enclosing environments, ending with the global environment and the attached packages; the caller’s environment is never consulted. And things that go on in subsequent function definitions and calls, as Radford’s example illustrates, don’t count.
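A quick way to check which environment supplies j is to call f from a function that has its own local j. This is only a sketch; the wrapper name caller and the value 1000 are made up for illustration.

```r
j <- 10
f <- function(x) j * x

caller <- function() {
  j <- 1000                   # local to caller; f never looks here
  f(3)
}

first <- caller()
first                         # 30: f uses the j where f was defined

j <- 12                       # rebinding j in f's defining environment is seen
second <- caller()
second                        # 36
```

The caller's j is ignored both times, while a change to the j next to f's definition shows up immediately.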
Am I the only one who finds this confusing? At least with all your help, I think I finally understand what R’s doing.
13 Responses to “R’s Scoping”
September 9, 2010 at 1:21 pm
Hey, Bob–you should be posting this stuff on our main blog now!
September 10, 2010 at 12:27 pm
HQ’s still working out brand management issues. I think a post like this one would’ve made sense on your blog. I’ll start posting there soon.
I’m both excited and intimidated by the size of your audience.
Luckily, I don’t mind being wrong in public (once per topic). Especially when I can get tutelage from the likes of Radford Neal!
September 9, 2010 at 2:11 pm
I believe you’re incorrect about scoping in R, as the following example shows:
> f <- function(x) { y g f(4)
Error in g(x) : object ‘y’ not found
As in most languages, it’s possible to create global variables in R, which is what your example shows. However, functions effectively use lexical scope, if you define that as ‘called functions won’t accidentally see my variables’.
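The example in this comment lost text to HTML formatting, so what follows is only a guess at its general shape, not a recovery of the original. The value assigned to y and the body of g are assumptions, and the sketch assumes no global y exists in the session.

```r
f <- function(x) {
  y <- 2                      # assumed value; the original was stripped
  g(x)
}

g <- function(x) x * y        # assumed body; y is free in g

# g was defined at top level, so it cannot see the y that is local to f
result <- tryCatch(f(4), error = function(e) conditionMessage(e))
result                        # an error message saying y was not found
```

If R used the caller's variables, g would have found y = 2 inside f and returned a number instead of failing.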
Personally I *love* the R language. I know there’s a lot of talk about redesigning it or replacing it somehow, but I’m skeptical that it’s a good idea.
September 10, 2010 at 11:56 am
Thanks. I updated the body of the blog post to point to the comments.
I think the function definition got garbled somehow (or maybe it’s just an unfamiliar R syntax convention).
September 9, 2010 at 3:32 pm
You’re wrong about R’s scoping rules. It uses lexical scoping.
Here’s an example demonstrating this:
> f = function ()
+ { g = function () a+b
+ a = 10
+ g()
+ }
>
> h = function ()
+ { a = 100
+ b = 200
+ f()
+ }
>
> b = 3
> print(h())
[1] 13
The expression a+b is evaluated with b from the global environment, and a from the lexically enclosing environment of g. The b inside h is not seen even though with dynamic scoping it would take precedence over the global b.
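For contrast, dynamic scoping can be imitated by walking the call stack by hand. The sketch below is an illustration only: dyn_lookup is a hypothetical helper written for this comparison, not a base R function, and g_lexical and g_dynamic are made-up names for two versions of g.

```r
# Hypothetical helper: look a name up in the calling frames, nearest caller first
dyn_lookup <- function(name) {
  n <- sys.nframe()                       # frame number of this call
  for (i in rev(seq_len(n - 1L))) {       # callers, most recent first
    frame <- sys.frame(i)
    if (exists(name, envir = frame, inherits = FALSE)) {
      return(get(name, envir = frame, inherits = FALSE))
    }
  }
  get(name)                               # fall back to ordinary lookup
}

f <- function() {
  g_lexical <- function() a + b
  g_dynamic <- function() dyn_lookup("a") + dyn_lookup("b")
  a <- 10
  c(lexical = g_lexical(), dynamic = g_dynamic())
}

h <- function() {
  a <- 100
  b <- 200
  f()
}

b <- 3
out <- h()
out                                       # lexical = 13, dynamic = 210
```

The dynamic version picks up a = 10 from f and b = 200 from h, which is the result the original post mistakenly expected R to produce.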
September 10, 2010 at 1:25 am
Looks like you’ve tripped over lambda calculus and closures, things that are extremely common in many languages (particularly functional languages) but NOT in the world of Java and C derivatives. This is one of the best features of Javascript, in my opinion far more useful than the prototyping that gets more attention. And one of the most obvious shortcomings in Java (although generics were a nice alternative that reduced the need for closures in some cases). Even Java’s granddaddy, Smalltalk, has these features. Perhaps the confusion (between your interpretation of the problem and Radford’s) stems from something akin to Javascript’s slightly flawed implementation of closures whereby variables in the topmost scope are actually global but all other variables are properly scoped.
September 10, 2010 at 11:54 am
Ironic, given that I used to teach programming language theory and write about denotational semantics! And I got my feet wet in professional programming by integrating the C implementation of Javascript (ECMAScript, technically) into SpeechWorks’s semantic interpreter!!!
As you say, there’s really nothing like a closure in C or Java. About as close as I get is writing search algorithms with a continuation-passing style.
September 10, 2010 at 10:04 am
Here’s an even simpler example:
> f <- function(x) { y g f(4)
Error in g(x) : object ‘y’ not found
September 10, 2010 at 4:06 pm
Super-simple example of lexical scoping in R:
> x g f <- function() {x f()
[1] “A”
If R were dynamically scoped, the ‘x’ in g() would take its value from the calling environment, where it is ‘B’. However, because R is lexically scoped, it comes from the environment where g() is defined, where it is ‘A’.
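The code in this comment was also mangled by HTML formatting. Going by the explanation and the printed result, it was probably close to the following sketch, though the exact original text is not recoverable and this layout is an assumption.

```r
x <- "A"
g <- function() x             # x is free in g; g is defined next to x = "A"

f <- function() {
  x <- "B"                    # local to f; g does not look here
  g()
}

result <- f()
result                        # "A"
```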
September 10, 2010 at 6:43 pm
> This is also why I’m still unclear about Radford’s example, because the a=10
> was part of the environment when g() was called in h, but b=200 was not part
> of the environment when f() was called in h.
The difference is that a=10 is part of the environment where g() was DEFINED in f. But the b=200 is not part of the environment where f() is DEFINED. That unbound variables take their values from the defining, rather than calling, environment is what makes R (and most other languages) lexically scoped.
September 10, 2010 at 6:53 pm
> This shows it can’t be the value of j at the time f is defined, because
> it changes when I change j later. I think it’s actually determining how
> it’s going to find j when it’s defined.
Right. This example is no more mysterious than referencing an instance variable in Java. If the variable’s value is changed, then subsequent references will see this change. In your example, f() and j are defined in the same environment. This is where the free variable j in f() is bound. When you change j’s value in that environment, f() picks it up.
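The instance-variable analogy can be pushed a little further with a closure that keeps j private. This is a rough sketch under made-up names (make_scaler, scale, set_j), not anything from the original post.

```r
make_scaler <- function() {
  j <- 10                                   # plays the role of an instance variable
  list(
    scale = function(x) j * x,              # reads j from the enclosing frame
    set_j = function(value) j <<- value     # rebinds j in that same frame
  )
}

s <- make_scaler()
before <- s$scale(3)
before                                      # 30

s$set_j(12)
after <- s$scale(3)
after                                       # 36: scale sees the updated binding
```

As in the j example, the function holds on to the binding rather than to the value j had when the function was created.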
September 12, 2010 at 6:45 pm
Thanks for the explanation in the previous comment.
Java’s a bit more restrictive. For instance, you can’t copy the R style and write:
You have to declare the variable a to be a static class variable, or you have to define a local variable before the anonymous inner class and declare it final. And there’s no way to do the equivalent of R’s attaching a list, which promotes a data structure to local variable. Turns out that doesn’t quite work the way I was thinking it did in R, either. For instance,
but it works if there’s not already a value.
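The code for this comment did not survive, so here is a small sketch of the attach behavior it seems to describe. By default attach() puts the list second on the search path, behind the global environment, so an existing variable of the same name wins. The names a, b, vals, and "vals_demo" are invented for illustration, and the sketch assumes no b already exists in the workspace.

```r
a <- 1                                      # a already has a value
vals <- list(a = 2, b = 3)

# R notes that `a` is masked by the global environment
attach(vals, name = "vals_demo")

a_seen <- a                                 # 1: the existing a wins
b_seen <- b                                 # 3: no prior b, so the list's b is found

detach("vals_demo", character.only = TRUE)  # clean up the search path
```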
September 13, 2010 at 12:09 pm
From Christian Robert’s latest blog post on R, Simply Start Over and Build Something Better, I found this amazing snippet:
Cool!