Two packages for working with on-disk data may suit what we are doing better, because they would let us work with **really** big datasets that do not fit in memory:

* [disk.frame](https://www.brodrigues.co/blog/2019-09-03-disk_frame/)
* [matter](https://www.bioconductor.org/packages/devel/bioc/vignettes/matter/inst/doc/matter.pdf)