Statistical Computing 1

This set of portfolios contains the notes that were submitted for the first statistical computing module in the COMPASS CDT.

To view these portfolios, please see the menu bar on the left.

These notes focus primarily on using R. R has many advantages for data science, but my personal favourite is its simplicity. In other programming languages such as Python, data is spread across many different structures from different libraries (pandas data frames, NumPy arrays, etc.), whereas R has a consistent set of built-in data structures that is less likely to cause confusion.

For this reason, R is a good beginner's language and does not take long to learn. It is not perfect though: it can be quite slow in some cases, and it is not as widely used as Python.

The basics of R are covered first, along with good programming practices such as reproducibility and using GitHub.