Week 14

Big Data

Monday, Apr 6

Learning objectives

  • Explore R’s memory architecture with package lobstr
  • Understand copy-on-modify for atomic vectors, lists, and data frames
  • Read big data into R
  • Manipulate big data with package multidplyr
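
As a small illustration of the copy-on-modify objective above, the sketch below uses lobstr to compare object addresses before and after a modification. It assumes the lobstr package is installed; the variable names are illustrative only.

```r
library(lobstr)

# Atomic vector: y initially binds to the same object as x (no copy is made).
x <- c(1, 2, 3)
y <- x
same_before <- obj_addr(x) == obj_addr(y)

# Modifying y triggers a copy; x keeps its original values.
y[[3]] <- 4
same_after <- obj_addr(x) == obj_addr(y)

# List: modifying one element makes a shallow copy, so the
# untouched elements are still shared between l1 and l2.
l1 <- list(1, 2, 3)
l2 <- l1
l2[[3]] <- 4
shared_elem <- obj_addrs(l1)[[1]] == obj_addrs(l2)[[1]]

# ref() prints the reference tree, which makes the sharing visible.
ref(l1, l2)
```

Comparing addresses is only one way to observe this behavior; base R's tracemem() is an alternative when lobstr is not available.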


Databases and SQL

Wednesday, Apr 8

Learning objectives

  • Understand DBMS and terminology
  • Connect to a SQL database
  • Query a database in R
  • Connect dplyr to a database
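
The objectives above can be sketched end to end with an in-memory SQLite database. This is a minimal example under stated assumptions: it assumes the DBI, RSQLite, dplyr, and dbplyr packages are installed, and it uses the built-in mtcars data purely as a stand-in table. The course may use a different backend or data set.

```r
library(DBI)
library(dplyr)

# Connect to a temporary in-memory SQLite database (assumed backend).
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Copy a data frame into the database as a table.
dbWriteTable(con, "mtcars", mtcars)

# Query with SQL directly.
sql_res <- dbGetQuery(
  con,
  "SELECT cyl, COUNT(*) AS n FROM mtcars GROUP BY cyl ORDER BY cyl"
)

# Same query through dplyr: tbl() creates a lazy reference to the table,
# and collect() runs the generated SQL and pulls the result into R.
dplyr_res <- tbl(con, "mtcars") %>%
  count(cyl) %>%
  arrange(cyl) %>%
  collect()

dbDisconnect(con)
```

Calling show_query() on the lazy table before collect() displays the SQL that dplyr generates, which is a convenient way to compare the two approaches.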


Lab 11

Friday, Apr 10



Exercise of the week

  1. Add Salaries from package Lahman as a table to your in-memory database.

  2. Compute the total payroll for each team in 2016 and display the 5 teams with the highest payroll. Which team had the lowest payroll in that year?

  3. Which 10 teams had the highest winning percentage since 1990?

  4. Combine the Batting and Salaries tables. Take a look at ?dplyr::join.
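
As a hint for the last item, the sketch below shows how two dplyr join verbs behave on toy data. The tibbles and their values are hypothetical and made up for illustration; only the key column names playerID and yearID mirror the Lahman tables. It is not a solution to the exercise.

```r
library(dplyr)

# Hypothetical toy tables (values are invented for illustration).
batting_toy <- tibble(
  playerID = c("p1", "p2", "p3"),
  yearID   = c(2016L, 2016L, 2016L),
  HR       = c(10L, 25L, 3L)
)
salaries_toy <- tibble(
  playerID = c("p1", "p2"),
  yearID   = c(2016L, 2016L),
  salary   = c(500000, 2000000)
)

# inner_join() keeps only rows with a match in both tables.
inner <- inner_join(batting_toy, salaries_toy, by = c("playerID", "yearID"))

# left_join() keeps every row of the left table; unmatched rows get NA.
left <- left_join(batting_toy, salaries_toy, by = c("playerID", "yearID"))
```

Which join is appropriate depends on whether players without a recorded salary should remain in the combined table.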