Hot Takes for R
Why you should stop pointing and accept equality.
The arrow assignment operator <- is useless. Before I’m crucified by the R community, hear me out and read this post.
Every time I read code written by an academic, lecturer or someone who uses R frequently, I come across the arrow symbol <- used for assignment of variables. Never in my career have I seen someone systematically use the equals symbol = across their code.
Benefits of the arrow
The arrow <- is often justified by how assignment works in R: the value on the right-hand side of the operator is assigned to the variable on the left, so the arrow makes a lot of sense. We can also do it the other way around, for instance:
3 -> x
y <- 5
cat("x is", x, "and y is", y, "\n")
## x is 3 and y is 5
So the arrow has a benefit when teaching programming: if you’re a beginner, it is obvious which way around variables are assigned. If you’re not a beginner, it might reinforce this knowledge so that you don’t make mistakes.
You can also use the arrow inside a function call to assign variables, for example:
system.time(x <- solve(matrix(rnorm(100^2), 100, 100)))
## user system elapsed
## 0.002 0.000 0.004
Now we can view x separately, even though it was assigned inside the call to system.time.
x[1:5, 1:5]
## [,1] [,2] [,3] [,4] [,5]
## [1,] 0.50730275 -0.35703351 -0.39847262 0.7788050 0.14551130
## [2,] -0.18092188 0.23194703 0.11982541 -0.4136690 -0.04548487
## [3,] -0.09788994 0.08614508 0.10585201 -0.1223789 -0.01548020
## [4,] 0.06537141 -0.16506482 0.09142846 0.1438222 0.02654319
## [5,] 0.03935626 -0.05008914 -0.09419370 0.2286113 -0.03929192
This is perhaps its most useful application, which you cannot do with a bare =. Inside a function call, the = sign is used strictly for matching an argument name with the value you’re passing through.
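To make the difference concrete, here is a small sketch of how the two operators behave inside a call. The variable names a, b and res are only for illustration, and the braces at the end are a workaround I’m adding rather than something the examples above rely on.

```r
# Using <- inside a call assigns in the calling environment.
elapsed <- system.time(a <- sum(1:10))
a
## [1] 55

# A bare = inside a call is parsed as a named argument, so this fails:
# system.time() has no argument called b.
res <- tryCatch(system.time(b = sum(1:10)), error = function(e) "error")
res
## [1] "error"

# Wrapping the expression in braces turns = back into an assignment.
elapsed <- system.time({ b = sum(1:10) })
b
## [1] 55
```

So = can be made to work here, but only with extra braces or parentheses, which is why the arrow is the more convenient choice in this one situation.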
The arrow also has historical significance, since R’s predecessor, S, used <- exclusively. This R-bloggers post explains that S was based on an older language called APL, which was designed on a keyboard that had an arrow key exactly like <-. But our keyboards now only have a key for =, right?
Why you should accept the equals sign
But I’m here today to tell you not to use <- and to use = instead. Start by asking yourself why you use the arrow. Maybe you have historical reasons and used R before 2001 or, more likely, you’re following a convention for coding in R that even style guides recommend.
Firstly, none of the most frequently used programming languages, such as Python, MATLAB, C++, Julia and JavaScript, use the arrow. So if you’re like me and use R alongside other programming languages, why would you bother using <- instead of =? Wouldn’t you like consistency across the languages you write in, at least so that your muscle memory doesn’t have to change depending on whether you’re fitting a neural network in Python or a GAM in R?
Okay, fair enough, maybe you don’t mind switching coding styles depending on what language you’re writing in; after all, you are going to be changing a lot more than just the assignment operator. So what other benefits does = have?
- There is a button for it on the keyboard.
- Consistency between function arguments and assignment.
- Increased readability and neatness since it has fewer characters (admittedly, this is subjective).
- Similarity with the equality operator (==).
- No confusion between, for example, x<-2 (\(x=2\)) and x < -2 (\(x < -2\)).
- Consistency with mathematics itself.
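The whitespace point in the list is easy to check for yourself. This short sketch uses an illustrative starting value of 5 and a helper variable is_less that I’ve introduced only for the demonstration.

```r
x = 5

# With spaces, this is a comparison: is x less than -2?
is_less = x < -2
is_less
## [1] FALSE

# Without spaces, the parser reads <- as the assignment operator,
# so x is silently overwritten with 2.
x<-2
x
## [1] 2
```

With = for assignment, a missing space in x < -2 can no longer change the value of x by accident.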
In general, I prefer to use the equals assignment operator over the arrow, because I like to code in more than just one language.
The neat full stop
While I’m on the subject of the arrow, using a full stop in a variable name brings a lot of confusion. This one is a lot less controversial than disregarding the arrow in my opinion. We can name a variable in R as
some.variable = 1
This looks neat! But in other languages, this would throw an error. Why is that? Languages like Python use . as the attribute access operator, and you use it exclusively to access members of an object or class, so you cannot use it in variable names. But R doesn’t have this problem, right?
When defining an S3 class in R, you can override some default functions (such as print or plot) with a new function that handles these default operations in a different way for your specific S3 class. To do this for an S3 class called mys3class, you would write new functions as follows:
print.mys3class = function(x, ...){
...
}
plot.mys3class = function(x, ...){
...
}
Look familiar? So full stops do have a purpose in R apart from assigning neat variable names. I don’t like using full stops for the same reason I don’t like using the <- operator: consistency. If I’m using <- or ., it will be for a specific purpose where I can’t use = or _ (however, these are rules that I’ve broken myself, and you can probably find instances of it in my portfolios).
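To show what the full stop actually does in a method name, here is a minimal sketch of S3 dispatch. The constructor new_mys3class and the value field are hypothetical names I’ve made up for the illustration; only the print.mys3class naming pattern comes from the example above.

```r
# Hypothetical constructor: tags a list with the class "mys3class".
new_mys3class = function(value) {
  structure(list(value = value), class = "mys3class")
}

# S3 method: the part after the full stop is the class name,
# so print() dispatches here for objects of class "mys3class".
print.mys3class = function(x, ...) {
  cat("<mys3class> value:", x$value, "\n")
  invisible(x)
}

obj = new_mys3class(42)
print(obj)
## <mys3class> value: 42
```

Because R reads the name as generic.class, a variable or function that merely happens to contain a full stop can look like a method to a reader, which is the confusion I’d rather avoid by using _ instead.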
So whilst neither the arrow (<-) for assignment nor the full stop (.) for variable naming is completely useless, better alternatives do exist. However, if you value your code looking neat above all else, and aren’t bothered by cross-language consistency, then you can use R’s exclusive <-, or its inconsistent ., without issue.
plot(1:5, 1:5)