---
title: "Getting Started With R, RStudio, and R Notebooks"
author: Professors Brett Gordon, Robert McDonald, and Elena Prager
subtitle: "Kellogg School of Management"
output: html_notebook
---



This document will help you complete the following steps:

1. Install R.
2. Install RStudio.
3. Install Rmarkdown and make an R Notebook.
4. Complete Assignment #0 (due the first day of class).

If you need help, you may contact KIS at 847-467-2100
(Global Hub) or 312-503-0159 (Wieboldt) or by email at
kis@kellogg.northwestern.edu. They have tested these instructions and
should be able to help. Please include a screenshot of any errors you encounter.


# Install R 

Download and run the installer for your operating system. If the installer asks questions, accept the defaults unless you have a reason for doing otherwise.

* **Windows:** [https://cran.r-project.org/bin/windows/base/](https://cran.r-project.org/bin/windows/base/). At the top of the page, click on the link to the executable file under "Download R 3.5.1 for Windows" or similar. *The version number may be different when you read this.*

* **Mac OS X:** [https://cran.r-project.org/bin/macosx/](https://cran.r-project.org/bin/macosx/). Under "Latest release", look for the link on the left to download the package "R-3.5.1.pkg" or similar. *The version number may be different when you read this.*

*Are you running an old version of Mac OS X?* The current versions of R require OS X 10.11 or higher. If you are using an older version of OS X, scroll down the page to the instructions under "Binaries for legacy OS X systems." Follow the directions to install either R 3.3.3 (10.9 or higher) or R 3.2.1
(10.6-10.8). These older versions will probably be sufficient for
your needs at Kellogg, although this is not guaranteed. However, if you are running a version of OS X
older than 10.11, we strongly recommend updating the operating system first.


# Install RStudio

RStudio and R are not the same thing! RStudio provides convenient access to R through a graphical interface, as well as numerous other benefits. *You need to install both R and RStudio, and we recommend installing R first.*

RStudio installers for Windows, OS X, and Linux are available at [https://www.rstudio.com/products/rstudio/download/](https://www.rstudio.com/products/rstudio/download/#download). Download and run the installer for your operating system.
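
Once both installers have finished, a quick check in the RStudio console can confirm that RStudio has found R. The lines below are only a sketch: the minimum version `3.5.0` is an assumption based on the version mentioned in the download instructions above, not an official course requirement.

```{r}
## print the full version string of the R installation that RStudio is using
R.version.string

## compare the running version against a minimum (3.5.0 is an assumed threshold)
minimum_version <- "3.5.0"
is_recent_enough <- getRversion() >= minimum_version
is_recent_enough  ## TRUE means the installed R is at least the assumed minimum
```

If the last line shows `FALSE`, re-running the installer from the Install R section should bring R up to date.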


# Optional: Create a new RStudio Project

An **RStudio project** helps you manage all the code and data associated with a particular task. You can create projects to organize your work in different ways. We recommend creating one RStudio project for each assignment in each course as this will give you the most flexibility. When you create a project, RStudio simply creates a file with a ".Rproj" suffix in whatever location you deem to be the project folder. 

To create a new project in RStudio:

1. From the File menu, select New Project.
2. Click New Directory.
3. Click New Project.
4. Type the name of the new folder in which you want to store all course assignments and data. RStudio will automatically create this folder for you. For example, you might call this folder *strt-analytics-assignments*.
5. Specify the folder in which RStudio should create this project folder.
6. Click on Create Project.

RStudio will create the project and open it. Information about the project is stored in the project directory in a new file called *strt-analytics-assignments.Rproj*. The easiest way to open the project in the future is to double-click on this .Rproj file. In the steps above, you can alternatively choose to place the new project in an existing folder, rather than creating a new one for the project.

**You should store files for each course within this project directory.** When you work on these files, you should first open the project in RStudio. 
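
To confirm that a project is open, you can check the working directory from the console. This is a small sketch using base R only; it assumes that RStudio sets the working directory to the project folder when a project is opened, which is its default behavior.

```{r}
## show the folder that R is currently working in
getwd()

## list any .Rproj files in that folder
rproj_files <- list.files(pattern = "\\.Rproj$")
rproj_files

## TRUE suggests that the working directory is a project folder
length(rproj_files) > 0
```

If no .Rproj file is listed, the project is probably not open; double-clicking the .Rproj file should fix that.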


# R Notebooks

An **R Notebook** is a tool for combining text, chunks of code, and output into a single, pretty document. The code for generating a notebook is stored in a file with a **.Rmd file extension**, which you can open and edit in RStudio. You can write code and text in the notebook inside RStudio, and execute (a.k.a. run) chunks of code one at a time or all in one go. When you execute code within the notebook, the results appear beneath the code. If you are familiar with Stata, you can think of a notebook as putting together your do-file (code), comments, and output all in one file.

When you're done creating your notebook in RStudio with all the code and text you want, you can output it to one of several nicely formatted file formats. For our purposes, HTML files will suffice; they can then be opened in a web browser (which is probably how you opened this notebook output), or printed to PDF if you wish.

Technical digression: R also allows you to use *knitr* with LaTeX to output PDF files with more sophisticated formatting. LaTeX is a powerful typesetting program usually used for scientific papers, but it is extremely finicky (i.e., it often throws cryptic errors and refuses to output files), so we are not using it for this class.


# Update R and install Rmarkdown

Before you're able to make a notebook, you will **need to have Rmarkdown working** on your computer. This requires your **version of R to be up to date**, and then installing Rmarkdown. So, launch RStudio and in the console (the left or bottom-left pane, labeled "Console" in a tiny font in its top-left corner) type the following lines of code, with each line followed by the *Enter* key:
```{r, eval=FALSE}
install.packages("installr")
library(installr)
updateR()
```
When you run the *updateR()* command, a dialog box might pop up. If it informs you that your version of R is already up to date, then you are good to go. If not, it will either automatically update R, or you will need to select the option *Update R* (sometimes written as *R (updateR)*) and follow the prompts to get the latest version of R. 
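
One caveat: the installr package is designed for Windows, so Mac and Linux users would typically update R by downloading the current installer from CRAN instead. A hedged sketch of how the commands above might be guarded so that they only run on Windows follows; the guard is an illustration, not part of the tested course instructions.

```{r, eval=FALSE}
## .Platform$OS.type is "windows" on Windows and "unix" on macOS and Linux
if (.Platform$OS.type == "windows") {
  install.packages("installr")
  installr::updateR()  ## same updater as above, called without library()
} else {
  message("installr targets Windows; download the latest R installer from https://cran.r-project.org/ instead")
}
```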

You are now ready to install Rmarkdown:
```{r, eval=FALSE}
install.packages("rmarkdown")
```
RStudio might ask you, "Do you want to install from sources the package which needs compilation?" If that happens, just type *y* in the console and press *Enter*, and the installation should proceed smoothly.

You are now ready to use Rmarkdown to create a well-formatted document that knits together R code, text, and output.
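
If you want to double-check that the installation worked, something like the following can be run in the console. It is a minimal sketch that only reports whether R can find the rmarkdown package; the wording of the messages is made up for this illustration.

```{r}
## requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise
rmarkdown_installed <- requireNamespace("rmarkdown", quietly = TRUE)

if (rmarkdown_installed) {
  message("rmarkdown version ", as.character(packageVersion("rmarkdown")), " is installed")
} else {
  message("rmarkdown was not found; try install.packages(\"rmarkdown\") again")
}
```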


# Using R notebooks

**To open an existing notebook**, you can double-click on the corresponding .Rmd file, which will open it up in RStudio, where you can directly edit the notebook.

**To create a new notebook** in RStudio:

1. From the File menu, select New File and then R Notebook.
2. This will open a new .Rmd file, which you can then save to your computer using File and Save As.
3. To tell R to make this into an HTML file, the first few lines of your notebook should look something like this:

*(triple dash here, see .Rmd file)*

title: "Assignment #: Title"

author: Firstname Lastname

subtitle: "Analytics for Strategy"

output: html_notebook

*(triple dash here, see .Rmd file)*


You can then begin typing text, code, etc.
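
If you are unsure whether your header is written correctly, one way to check is to let R read it back. The sketch below writes the header shown above, including the triple dashes, to a temporary file and parses it with `rmarkdown::yaml_front_matter()`. The file name `header-demo.Rmd` is hypothetical and used only for this illustration; with your own notebook you would pass its path instead.

```{r}
## hypothetical file in a temporary folder, used only for this illustration
demo_file <- file.path(tempdir(), "header-demo.Rmd")
writeLines(c(
  "---",
  "title: \"Assignment #: Title\"",
  "author: Firstname Lastname",
  "subtitle: \"Analytics for Strategy\"",
  "output: html_notebook",
  "---",
  "",
  "Text, code, etc. go below the header."
), demo_file)

## read the header back as a named list (requires the rmarkdown package)
header <- rmarkdown::yaml_front_matter(demo_file)
header$title   ## the title as R understood it
header$output  ## the output format requested in the header
```

An error at the `yaml_front_matter()` step usually points to a typo in the header, such as a missing quotation mark or a missing line of dashes.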

Here is what code looks like once it is executed and output to a notebook:
```{r}
## this is a comment line (anything that follows a # symbol on the same line is treated by R as a text comment, not code)
## let's make sure R knows that 1+1=2
1+1  ## you can also put a comment on the same line as code, after the code
## let's do something more exciting and plot a parabola:
x <- c(-5,-4,-3,-2,-1,0,1,2,3,4,5)  ## the <- symbols tell R that it should create an object called x that stores a series of integers from -5 to 5
y <- x^2  ## this line tells R to create an object called y that takes each element of x and squares it
x  ## this line tells R to just show us what's in x (not recommended for large objects!)
y  ## you can already guess what this line does
plot( x, y )  ## this line tells R to create a scatter plot of y on x
rm(x,y)  ## this line deletes x and y from memory; if we now try to display them, R will give us an error
x
```
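
As a side note, the same vector can be built more compactly, and you can test whether an object exists without triggering an error. The chunk below is a sketch of these variations on the example above; the names `matches_original` and `x_still_exists` are introduced here only for illustration.

```{r}
x <- -5:5  ## the colon operator creates the integers from -5 to 5
y <- x^2   ## squares each element of x, as before

## check that y holds the same values as in the example above
matches_original <- all(y == c(25, 16, 9, 4, 1, 0, 1, 4, 9, 16, 25))
matches_original

rm(x, y)  ## delete x and y from memory

## exists() returns TRUE or FALSE instead of stopping with an error
x_still_exists <- exists("x", inherits = FALSE)
x_still_exists
```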


When working in a notebook inside RStudio, **you will need to surround chunks of code with "flags" that tell R where your code begins and ends**. If you open the .Rmd file used to generate this notebook in RStudio, you will see those flags surrounding the greyed-out areas that indicate code. You can also use a shortcut instead of the somewhat clumsy ``` flags by clicking the *Insert Chunk* button on the toolbar or by pressing *Ctrl+Alt+I* (*Cmd+Option+I* on a Mac).

To make R actually run a chunk of code that you have written in your notebook, you will need to tell it to run (a.k.a. execute) the chunk by clicking the *Run* button within the chunk or by placing your cursor inside it and pressing *Ctrl+Shift+Enter* (*Cmd+Shift+Enter* on a Mac).


# Outputting notebooks to HTML

When you save the notebook, an HTML file containing the code and output will be saved alongside it with the extension *.nb.html* (click the *Preview* button or press *Ctrl+Shift+K* to preview the HTML file). You can also set the right side of your screen in RStudio to preview the document (instead of the preview opening in a new window) by clicking the *gear* button next to the *Preview* button (both located directly above the code) and then selecting the option *Preview in Viewer Pane*. The preview in the right pane will auto-update every time you save the notebook (using the *Save* button or good old *Ctrl+S*). In order to get all the output from your code to appear in the HTML file or its preview, make sure you execute all the code before saving.
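
The HTML file can also be produced from the console rather than through the buttons. The following is a rough sketch, not the workflow required for this class: it writes a tiny notebook to a temporary folder and converts it with `rmarkdown::render()`. The file name `demo-notebook.Rmd` and its contents are hypothetical, and the chunk is marked `eval=FALSE` so that it is shown here without being run.

```{r, eval=FALSE}
## hypothetical file name in a temporary folder, used only for this illustration
demo_file <- file.path(tempdir(), "demo-notebook.Rmd")
fence <- strrep("`", 3)  ## three backticks, built here to avoid typing a literal code flag

writeLines(c(
  "---",
  "title: \"Demo\"",
  "output: html_notebook",
  "---",
  "",
  paste0(fence, "{r}"),
  "1 + 1",
  fence
), demo_file)

## render() needs pandoc, which RStudio bundles; skip the step if it is missing
if (rmarkdown::pandoc_available()) {
  html_file <- rmarkdown::render(demo_file, quiet = TRUE)  ## returns the output path
  file.exists(html_file)
}
```

With your own notebook, you would replace `demo_file` with the path to your .Rmd file.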


# Complete Assignment #0

You are now ready to create R notebooks! It is now time to complete Assignment #0, which is available on Canvas. This assignment will serve three purposes:

1. Make sure that everything installed correctly.

2. Install additional packages that you will need throughout the quarter. These are essentially plug-ins that give you access to extra functionality that is not automatically included in the base installation of R.

3. Get you comfortable with R basics and provide a quick refresher of linear regression and of coding.

**If you find Assignment #0 difficult**, you are strongly encouraged to brush up on R and/or regression concepts **prior to the start of the quarter**. Further resources to help you do this at home can be found on the course site. Information about in-person R help will be circulated to the class listserv.
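
As a taste of the kind of code involved, here is a minimal linear regression in R. It is not taken from the assignment; it is a sketch that uses `mtcars`, a small data set on cars that ships with R, and regresses fuel efficiency (`mpg`) on weight (`wt`).

```{r}
## fit a linear regression of miles per gallon on car weight
fit <- lm(mpg ~ wt, data = mtcars)

coef(fit)               ## the estimated intercept and slope
summary(fit)$r.squared  ## the share of variation in mpg explained by weight
```

If you can read this chunk and interpret the slope and the R-squared, you are in good shape for the refresher.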