Decorators in R

Jun 7, 2016 4 min read R

I recently became interested in the idea of writing decorators in R, as the title of this blog post may suggest. A decorator is a function that takes another function as input and returns a version of it with extended behavior. One use case is when you find yourself writing several functions that each share some small piece of functionality or code. A reasonable example is as follows.

Consider the case where you are writing several functions in R, each of which takes some non-trivial amount of time to compute. You’d like to time each of them. One approach would be the following:
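A minimal sketch of this first approach, assuming two hypothetical functions, slow_sum and slow_mean, that stand in for the expensive computations. Note that the timing code is written out in full inside each one:

```r
# Hypothetical functions used for illustration only.
# The timing logic is duplicated in each function body.
slow_sum <- function(n) {
  start <- Sys.time()
  result <- sum(as.numeric(seq_len(n)))
  elapsed <- difftime(Sys.time(), start, units = "secs")
  message("slow_sum took ", format(elapsed))
  result
}

slow_mean <- function(n) {
  start <- Sys.time()
  result <- mean(as.numeric(seq_len(n)))
  elapsed <- difftime(Sys.time(), start, units = "secs")
  message("slow_mean took ", format(elapsed))
  result
}

slow_sum(1e6)
slow_mean(1e6)
```

This works, but any change to how timings are reported has to be made in every function.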

Another approach is to use a decorator. Let’s define our R decorator like so:
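A sketch of what such a decorator could look like. The name timer is my own choice for illustration; it takes a function f and returns a new function that calls f, reports the elapsed time, and passes the result through:

```r
# timer() is a hypothetical name. It wraps f in a closure that times each call.
timer <- function(f) {
  force(f)  # evaluate f now so the closure captures the intended function
  function(...) {
    start <- Sys.time()
    result <- f(...)
    elapsed <- difftime(Sys.time(), start, units = "secs")
    message("Elapsed: ", format(elapsed))
    result  # return whatever the wrapped function returned
  }
}
```

The timing logic now lives in one place, and the wrapped function does not need to know about it.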

And then we can apply our decorator:
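For instance, assuming a timing decorator named timer (a hypothetical name, defined again here so the snippet runs on its own) and a plain function slow_sum with no timing code in it:

```r
# Hypothetical timing decorator, repeated so this block is self-contained.
timer <- function(f) {
  force(f)
  function(...) {
    start <- Sys.time()
    result <- f(...)
    message("Elapsed: ", format(difftime(Sys.time(), start, units = "secs")))
    result
  }
}

# A plain function that knows nothing about timing.
slow_sum <- function(n) sum(as.numeric(seq_len(n)))

# Applying the decorator returns a new, timed version of the function.
timed_sum <- timer(slow_sum)
timed_sum(10)  # returns 55 and emits an "Elapsed: ..." message
```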

For additional style points, we can define an infix operator to represent this operation:
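One way to sketch this is with a user-defined infix operator. The operator name %decorate% is an arbitrary choice of mine, as is timer; R allows any operator of the form %name% to be defined as a two-argument function:

```r
# Hypothetical timing decorator, repeated so this block is self-contained.
timer <- function(f) {
  force(f)
  function(...) {
    start <- Sys.time()
    result <- f(...)
    message("Elapsed: ", format(difftime(Sys.time(), start, units = "secs")))
    result
  }
}

# Infix operator: the left operand is the decorator, the right is the function.
`%decorate%` <- function(decorator, f) decorator(f)

slow_sum <- function(n) sum(as.numeric(seq_len(n)))
timed_sum <- timer %decorate% slow_sum
timed_sum(10)  # returns 55 and emits an "Elapsed: ..." message
```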

The chief benefit from my perspective is that things are now a little more modular. I could change the decorator to write out the timings, or some other message, into a log if I wished. This would then modify every function that I decorate, which is better in my estimation than going into each function and changing whatever the shared functionality is. Less duplicated code is a good thing. The downsides include more cognitive overhead induced by the new syntax, and I could envision certain scenarios in which this approach obfuscates the behavior of a given function.

A digression on scoping rules in R

The ability to write closures in R enables this technique. An excellent reference on the subject comes from Hadley Wickham. One potential source of confusion in this context is R's scoping rules. If you used the decorator to create an object that the decorated function depends on, the decorated function would fail with an error that the object can't be found. This is because R uses lexical scoping rather than dynamic scoping: a function looks up its free variables in the environment where it was defined, not in the environment it was called from. Here's an example of where this would fail:
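A minimal sketch of the failure, using hypothetical names add_a and use_a and assuming no variable called a exists in the workspace. The decorator defines a inside its wrapper, but the decorated function looks for a in the environment where it was defined and never sees it:

```r
# Hypothetical decorator that creates `a` inside the wrapper function.
add_a <- function(f) {
  force(f)
  function(...) {
    a <- 10  # local to this call of the wrapper
    f(...)   # f does not look in the wrapper's environment for `a`
  }
}

# `a` is a free variable here; it is looked up where use_a was defined.
use_a <- function(x) x + a

decorated <- add_a(use_a)

# Capture the error instead of stopping the script.
result <- tryCatch(decorated(1), error = function(e) e)
print(conditionMessage(result))  # reports that object 'a' was not found
```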

To make this work, you'd have to change the enclosure of f to the execution environment of the anonymous function in the decorator.
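A sketch of that change, again with the hypothetical names add_a and use_a. Inside the wrapper, environment(f) <- environment() re-points a local copy of f at the wrapper's own environment:

```r
# Hypothetical decorator that makes `a` visible to the decorated function.
add_a <- function(f) {
  force(f)
  function(...) {
    a <- 10
    # Set the enclosure of a local copy of f to this call's environment.
    environment(f) <- environment()
    f(...)
  }
}

use_a <- function(x) x + a

decorated <- add_a(use_a)
decorated(1)  # returns 11
```

Only the copy of f inside the wrapper is modified; use_a itself keeps its original enclosure. The trade-off is that the copy no longer searches the environment where the function was originally defined first, which could matter for functions that rely on variables defined there.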

Now, when f does not find the variable a in its own scope, it searches its enclosure, where a is defined.

For more on this, see ?environment for the following:

Environments consist of a frame, or collection of named objects, and a pointer to an enclosing environment. The most common example is the frame of variables local to a function call; its enclosure is the environment where the function was defined (unless changed subsequently). The enclosing environment is distinguished from the parent frame: the latter (returned by parent.frame) refers to the environment of the caller of a function. Since confusion is so easy, it is best never to use ‘parent’ in connection with an environment (despite the presence of the function parent.env).

When get or exists search an environment with the default inherits = TRUE, they look for the variable in the frame, then in the enclosing frame, and so on.

The global environment .GlobalEnv, more often known as the user’s workspace, is the first item on the search path. It can also be accessed by globalenv(). On the search path, each item’s enclosure is the next item.

The object .BaseNamespaceEnv is the namespace environment for the base package. The environment of the base package itself is available as baseenv().

If one follows the chain of enclosures found by repeatedly calling parent.env from any environment, eventually one reaches the empty environment emptyenv(), into which nothing may be assigned.

The replacement function parent.env<- is extremely dangerous as it can be used to destructively change environments in ways that violate assumptions made by the internal C code. It may be removed in the near future.

The replacement form of environment, is.environment, baseenv, emptyenv and globalenv are primitive functions.

System environments, such as the base, global and empty environments, have names as do the package and namespace environments and those generated by attach(). Other environments can be named by giving a “name” attribute, but this needs to be done with care as environments have unusual copying semantics.

Finishing thoughts

I don’t know if I’ll actually make use of this technique in the future, but it seemed fun and fulfilled a use case for me earlier today, so I figured I’d spend some time writing up the process.

Josh Weinstock

Assistant Professor at Emory