Week 10

Lab 8

Monday, Oct 28

Open lab

Objectives

  • Finalize Task 3
  • Build Makefile
  • Add and revise your analysis

Profiling and Parallelization

Tuesday, Oct 29

Learning objectives

  • Parallelize using sockets with makeCluster()
  • Parallelize loops with packages doMC and foreach
  • Interpret results from code profiling and implement changes
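
As a rough illustration of the first objective, the following sketch starts a socket cluster with makeCluster() and spreads a toy task across its workers. The function slow_square and the choice of two workers are assumptions made for this example, not part of the course materials.

```r
library(parallel)  # ships with base R

# Hypothetical slow task, used only for illustration
slow_square <- function(i) {
  Sys.sleep(0.05)
  i^2
}

# Start a socket (PSOCK) cluster with two workers
cl <- makeCluster(2)

# Distribute the inputs across the workers; results come back in input order
res_par <- parLapply(cl, 1:8, slow_square)

# Release the workers once the work is done
stopCluster(cl)

# Sequential reference for comparison
res_seq <- lapply(1:8, slow_square)
identical(res_par, res_seq)
```

Wrapping each version in system.time() is one simple way to check whether the parallel run pays off; for very small tasks, the cost of starting the workers can outweigh the gain.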

Materials


Big Data

Thursday, Oct 31

Learning objectives

  • R’s memory architecture with package lobstr
  • Understand copy-on-modify for atomic vectors, lists, and data frames
  • Reading big data into R
  • Manipulating big data with package multidplyr
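
A minimal sketch of copy-on-modify, assuming package lobstr is installed. The object names x, y, l1, and l2 are chosen for this example only.

```r
library(lobstr)  # assumed to be installed

x <- c(1, 2, 3)
y <- x                      # y binds to the same object; nothing is copied yet
same_before <- obj_addr(x) == obj_addr(y)

y[[3]] <- 4                 # modifying y forces R to copy the vector first
same_after <- obj_addr(x) == obj_addr(y)

c(before = same_before, after = same_after)

# Lists are copied shallowly: elements that were not modified stay shared
l1 <- list(1, 2, 3)
l2 <- l1
l2[[3]] <- 4
ref(l1, l2)                 # prints the addresses of both lists and their elements
```

Comparing addresses before and after the assignment shows when a copy is actually made; x itself is left unchanged by the modification of y.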

Materials


Exercise of the week

For which vector lengths x is lobstr::obj_size(integer(x)) consistently half the size in bytes of lobstr::obj_size(numeric(x)), once the initial overhead has been subtracted from both? Does this ever happen?

Starting from length 0, we can see that

lobstr::obj_size(integer(0))
## 48 B
lobstr::obj_size(numeric(0))
## 48 B

are both 48 bytes. More information about this initial memory overhead is available here.

Based on the code below, can you deduce how R handles this data in memory?

diff(sapply(0:100, function(x) lobstr::obj_size(integer(x))))
##   [1]  8  0  8  0 16  0  0  0 16  0  0  0 16  0  0  0 64  0  0  0  0  0  0
##  [24]  0  0  0  0  0  0  0  0  0  8  0  8  0  8  0  8  0  8  0  8  0  8  0
##  [47]  8  0  8  0  8  0  8  0  8  0  8  0  8  0  8  0  8  0  8  0  8  0  8
##  [70]  0  8  0  8  0  8  0  8  0  8  0  8  0  8  0  8  0  8  0  8  0  8  0
##  [93]  8  0  8  0  8  0  8  0
diff(sapply(0:100, function(x) lobstr::obj_size(numeric(x))))
##   [1]  8  8 16  0 16  0 16  0 64  0  0  0  0  0  0  0  8  8  8  8  8  8  8
##  [24]  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8
##  [47]  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8
##  [70]  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8
##  [93]  8  8  8  8  8  8  8  8
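
One possible way to explore the question is to subtract the size of the empty vector and compare the remaining bytes directly. This is only a sketch, assuming package lobstr is installed. The names lens, sizes, int_payload, and num_payload are made up for this example, and the exact sizes may depend on your platform and R version.

```r
library(lobstr)  # assumed to be installed

lens <- 0:100

# Object sizes in bytes for each vector length
int_bytes <- sapply(lens, function(x) as.numeric(obj_size(integer(x))))
num_bytes <- sapply(lens, function(x) as.numeric(obj_size(numeric(x))))

# Subtract the size of the empty vector to isolate the bytes used for data
int_payload <- int_bytes - int_bytes[1]
num_payload <- num_bytes - num_bytes[1]

sizes <- data.frame(
  length      = lens,
  int_payload = int_payload,
  num_payload = num_payload,
  ratio       = int_payload / num_payload   # NaN at length 0 (0 / 0)
)

# Lengths at which the integer payload is exactly half the numeric payload
sizes$length[which(sizes$ratio == 0.5)]
```

Inspecting the ratio column next to the two diff() outputs above may help you decide whether "consistently" holds for every length or only for some.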