Hot Takes for R

Why you should stop pointing and accept equality.

Jun 28, 2020

The arrow assignment operator <- is useless. Before I’m crucified by the R community, hear me out and read this post.

Every time I read code written by an academic, lecturer or someone who uses R frequently, I come across the arrow symbol <- used for assignment of variables. Never in my career have I seen someone systematically use the equals symbol = across their code.

Benefits of the arrow

The arrow <- mirrors how assignment works in R: the value on the right-hand side of the operator is assigned to the name on the left. Hence the arrow makes a lot of sense. We can also do it the other way around, for instance:

3 -> x
y <- 5
cat("x is", x, "and y is", y, "\n")

## x is 3 and y is 5

So the arrow has a benefit when teaching programming: if you’re a beginner, it is obvious which way around variables are assigned. If you’re not a beginner, it might reinforce this knowledge so that you don’t make mistakes.

You can also use the arrow inside a function call to assign variables, for example:

system.time(x <- solve(matrix(rnorm(100^2), 100, 100)))

##    user  system elapsed 
##   0.002   0.000   0.004

Now we can view x separately, even though it was assigned inside the call to system.time.

x[1:5, 1:5]

##             [,1]        [,2]        [,3]       [,4]        [,5]
## [1,]  0.50730275 -0.35703351 -0.39847262  0.7788050  0.14551130
## [2,] -0.18092188  0.23194703  0.11982541 -0.4136690 -0.04548487
## [3,] -0.09788994  0.08614508  0.10585201 -0.1223789 -0.01548020
## [4,]  0.06537141 -0.16506482  0.09142846  0.1438222  0.02654319
## [5,]  0.03935626 -0.05008914 -0.09419370  0.2286113 -0.03929192

This is perhaps its most useful application, and one that a bare = cannot replicate: inside a function call, = is interpreted as matching a named argument to the value you’re passing, not as assignment.
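To make the distinction concrete, here is a minimal sketch using base R only. The wrapper function demo() and the variable names are made up for illustration; it simply checks which variables exist after calling mean() with = and with <- inside the call.

```r
# Hypothetical helper: runs both styles inside its own environment so the
# check is not affected by variables already in your workspace.
demo = function() {
  env = environment()

  # `=` inside a call matches mean()'s argument named x; nothing is assigned.
  mean(x = c(1, 2, 3))
  equals_created_x = exists("x", envir = env, inherits = FALSE)

  # `<-` inside a call assigns y in the calling environment, then passes it on.
  mean(y <- c(4, 5, 6))
  arrow_created_y = exists("y", envir = env, inherits = FALSE)

  c(equals_created_x = equals_created_x, arrow_created_y = arrow_created_y)
}

result = demo()
print(result)

# system.time() has no argument called x, so a bare `=` is rejected as an
# unused argument rather than treated as assignment.
attempt = tryCatch(system.time(x = 1), error = function(e) e)
print(inherits(attempt, "error"))
```

Wrapping the assignment in an extra pair of parentheses, as in system.time((x = 1)), turns it back into an ordinary expression, but that is easy to miss when reading code.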

The arrow also has historical significance, since R’s predecessor, S, used <- exclusively. This R-bloggers post explains that S was based on an older language called APL, which was designed for a keyboard that had an arrow key exactly like <-. But our keyboards now only have a key for =, right?

Why you should accept the equals sign

But I’m here today to tell you not to use <- and to use = instead. Start by asking yourself why you use the arrow. Maybe you have historical reasons and used R before 2001 or, more likely, you’re following a convention for coding in R that even style guides recommend.

Firstly, no other programming language uses the arrow, or at least none of the most frequently used ones such as Python, MATLAB, C++, Julia, JavaScript, etc. So if you’re like me and use R alongside other programming languages, why would you bother using <- instead of =? Wouldn’t you like consistency across the languages you write in, at least so that your muscle memory doesn’t have to change depending on whether you’re fitting a neural network in Python or a GAM in R?

Okay, fair enough: maybe you don’t mind switching coding styles depending on what language you’re writing in. After all, you are going to be changing a lot more than just the assignment operator. So what other benefits does = have?

There is a button for it on the keyboard.
Consistency between function arguments and assignment.
Increased readability and neatness, since it has fewer characters (admittedly, this is subjective).
Similarity with the equality operator (==).
No confusion between, for example, x<-2 (\(x=2\)) and x < -2 (\(x < -2\)).
Consistency with mathematics itself.
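The whitespace pitfall in the list above can be checked directly. This is a small sketch in base R; the variable names are chosen here for illustration only.

```r
x = 5
x<-2                      # no spaces: parsed as assignment, so x becomes 2
after_tight = x

x = 5
is_less = x < -2          # with spaces: a comparison, x is left untouched
after_spaced = x

print(c(after_tight = after_tight, after_spaced = after_spaced))
print(is_less)

# The parser's view of each expression: the function being called.
tight_op  = as.character(quote(x<-2)[[1]])
spaced_op = as.character(quote(x < -2)[[1]])
print(c(tight_op, spaced_op))
```

The two expressions differ only in whitespace, yet one silently overwrites x while the other returns a logical value.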

In general, I prefer to use the equals assignment operator over the arrow, because I like to code in more than just one language.

The neat full stop

While I’m on the subject of the arrow, using a full stop in a variable name brings a lot of confusion. This one is a lot less controversial than disregarding the arrow in my opinion. We can name a variable in R as

some.variable = 1

This looks neat! But in other languages, this would throw an error. Why is that? Languages like Python use . as the attribute access operator: it is used exclusively to access members of an object, class or module, so you cannot use it in variable names. But R doesn’t have this problem, right?

When defining an S3 class in R, you can override some default functions (such as print or plot) with a new function that handles these operations differently for your specific S3 class. To do this for an S3 class called mys3class, you would write new functions as follows:

print.mys3class = function(x, ...){
  ...
}
plot.mys3class = function(x, ...){
  ...
}
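For a version you can run, here is a minimal sketch of the same pattern. The constructor new_mys3class() and the value field are invented for illustration; only the print.mys3class naming pattern comes from the text above.

```r
# Hypothetical constructor: tags a list with the class "mys3class".
new_mys3class = function(value) {
  structure(list(value = value), class = "mys3class")
}

# The name generic.class is what makes this the print method for mys3class.
print.mys3class = function(x, ...) {
  cat("<mys3class> with value", x$value, "\n")
  invisible(x)
}

obj = new_mys3class(42)
printed = capture.output(print(obj))   # print() dispatches on class(obj)
print(printed)
```

Calling print(obj), or just typing obj at the console, goes through print.mys3class, which is why a dot in an ordinary variable name can be mistaken for a method.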

Look familiar? So full stops do have a purpose in R apart from assigning neat variable names. For me, I don’t like using full stops for the main reason I don’t like using the <- operator: consistency. If I’m using <- or ., it will be for a specific purpose where I can’t use = or _ (however, these are rules that I’ve broken myself, and you can probably find instances of it in my portfolios).

So whilst neither the arrow (<-) for assignment nor the full stop (.) for variable naming is completely useless, better alternatives do exist. However, if you value your code looking neat above all else and aren’t bothered by cross-language consistency, then you can use R’s exclusive <-, or its inconsistent ., without issue.

plot(1:5, 1:5)


Daniel Williams

CDT Student

I have a PhD in statistics/machine learning/data science/AI (whatever you would like to call it) from the University of Bristol, under the COMPASS CDT. I previously studied for a master’s in mathematics at the University of Exeter. My research was primarily on truncated density estimation and unnormalised models. But I am also interested in AI more generally, including all the learnings: Machine, Deep and Reinforcement (as well as some others!).