# Week 10

## Lab 8

Open lab

#### Objectives

• Build a Makefile

## Profiling and Parallelization

#### Learning objectives

• Parallelize using sockets with makeCluster()
• Parallelize loops with packages doMC and foreach
• Interpret results from code profiling and implement changes
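
As a minimal sketch of the first objective, the snippet below starts a socket cluster with makeCluster() from the base parallel package and spreads a trivial task across it. The worker count of two and the squaring task are illustrative assumptions, not taken from the course slides.

```r
library(parallel)

# Start a socket (PSOCK) cluster; two workers is an arbitrary choice here
cl <- makeCluster(2)

# A deliberately simple task: square each input on the workers
squares <- parLapply(cl, 1:8, function(i) i^2)

# Release the worker processes once the work is done
stopCluster(cl)

unlist(squares)
```

Calling stopCluster() matters because socket workers are separate R processes that otherwise keep running in the background.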

#### Materials

• Slides: HTML, Rmd
• Supplementary:
  • Getting Started with doMC and foreach vignette
  • Nesting foreach loops vignette
  • profvis guide

## Big Data

#### Learning objectives

• R’s memory architecture with package lobstr
• Understand copy-on-modify for atomic vectors, lists, and data frames
• Reading big data into R
• Manipulating big data with package multidplyr
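
Copy-on-modify for an atomic vector can be sketched with base R alone. The example below uses tracemem() rather than lobstr, which is a substitution made here so the snippet needs no extra packages; the variable names are hypothetical, and tracing is skipped if the R build lacks memory profiling.

```r
x <- c(1, 2, 3)
y <- x                 # y binds to the same vector; nothing is copied yet

# tracemem() reports when the traced object is duplicated
# (only available if R was built with memory profiling)
tracing <- capabilities("profmem")
if (tracing) tracemem(x)

y[1] <- 10             # modifying y forces a copy, so x is left untouched

if (tracing) untracemem(x)

x                      # still 1 2 3
y                      # 10 2 3
```

When tracing is available, the assignment to y[1] prints a tracemem line showing the old and new addresses, which is the copy being made.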

## Exercise of the week

For which vector lengths x does lobstr::obj_size(integer(x)) take exactly half as many bytes as lobstr::obj_size(numeric(x)), once the fixed overhead of an empty vector is subtracted from both? Does that relationship ever hold consistently?

Starting from length 0, we can see that both empty vectors have the same size:

```r
lobstr::obj_size(integer(0))
lobstr::obj_size(numeric(0))
```

```
## 48 B
## 48 B
```


Based on the code below, can you deduce how R allocates memory for these vectors as they grow?

```r
diff(sapply(0:100, function(x) lobstr::obj_size(integer(x))))
```

```
##   [1]  8  0  8  0 16  0  0  0 16  0  0  0 16  0  0  0 64  0  0  0  0  0  0
##  [24]  0  0  0  0  0  0  0  0  0  8  0  8  0  8  0  8  0  8  0  8  0  8  0
##  [47]  8  0  8  0  8  0  8  0  8  0  8  0  8  0  8  0  8  0  8  0  8  0  8
##  [70]  0  8  0  8  0  8  0  8  0  8  0  8  0  8  0  8  0  8  0  8  0  8  0
##  [93]  8  0  8  0  8  0  8  0
```

```r
diff(sapply(0:100, function(x) lobstr::obj_size(numeric(x))))
```

```
##   [1]  8  8 16  0 16  0 16  0 64  0  0  0  0  0  0  0  8  8  8  8  8  8  8
##  [24]  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8
##  [47]  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8
##  [70]  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8
##  [93]  8  8  8  8  8  8  8  8
```
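
One way to explore the question directly is to subtract the empty-vector size from each measurement and compare the two types length by length. This is only a sketch: the helper size_bytes() and the range 1:100 are assumptions made here, and the fallback to utils::object.size() exists solely so the code runs where lobstr is not installed.

```r
# Size of an object in bytes; uses lobstr when installed, otherwise falls
# back to utils::object.size() (an assumption so the sketch runs anywhere)
size_bytes <- function(obj) {
  if (requireNamespace("lobstr", quietly = TRUE)) {
    as.numeric(lobstr::obj_size(obj))
  } else {
    as.numeric(utils::object.size(obj))
  }
}

lengths_to_check <- 1:100

# Bytes used beyond the overhead of an empty vector, for each length
int_bytes <- sapply(lengths_to_check, function(x) size_bytes(integer(x))) -
  size_bytes(integer(0))
num_bytes <- sapply(lengths_to_check, function(x) size_bytes(numeric(x))) -
  size_bytes(numeric(0))

# Ratio of integer to numeric storage; 0.5 means "exactly half"
ratio <- int_bytes / num_bytes

# Lengths at which the integer vector takes exactly half the space
lengths_to_check[ratio == 0.5]
```

Plotting ratio against lengths_to_check, for example with plot(lengths_to_check, ratio, type = "s"), makes the allocation pattern easier to see than the raw differences.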
