Process and format amplicon count data
Abstract
This notebook formats 16S rRNA amplicon counts that have been mapped to HAMBI 16S genes.
1 Setup
Libraries and global variables
Set up some directories
Untar and decompress
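A minimal sketch of the unpacking step, using only base R. The archive name, its contents, and `ampdir_demo` are hypothetical stand-ins created inside `tempdir()` so the example runs on its own; the notebook's real archive and `ampdir` are defined in its setup chunk and are not reproduced here.

```r
# Build a tiny demo archive (hypothetical contents) so the example is self-contained
src <- file.path(tempdir(), "demo_src")
dir.create(src, showWarnings = FALSE)
writeLines("#Name\tLength", file.path(src, "sample1.rpkm"))

archive <- file.path(tempdir(), "demo_amplicon.tar.gz")
old_wd <- setwd(src)  # store relative paths inside the archive
utils::tar(archive, files = "sample1.rpkm", compression = "gzip", tar = "internal")
setwd(old_wd)

# Untar and decompress into a scratch directory (stand-in for `ampdir`)
ampdir_demo <- file.path(tempdir(), "demo_amplicon")
dir.create(ampdir_demo, showWarnings = FALSE)
utils::untar(archive, exdir = ampdir_demo)

# List what was extracted
extracted <- list.files(ampdir_demo, pattern = "\\.rpkm$", recursive = TRUE)
```

`utils::untar()` detects gzip compression itself, so no separate decompression call is needed in this sketch.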
2 Reading and light formatting of data
Coverage data
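# Collect every *.rpkm coverage file under ampdir, searching subdirectories
ampfiles <- fs::dir_ls(
  path = ampdir,
  all = FALSE,
  recurse = TRUE,
  type = "file",
  glob = "*.rpkm",
  regexp = NULL,
  invert = FALSE,
  fail = TRUE
)

# Read all files into one table; lines starting with "#" are skipped
# and the source path of each row is stored in the file_name column
ampslurped <- readr::read_tsv(
  ampfiles,
  comment = "#",
  col_names = c(
    "strainID",
    "Length",
    "Bases",
    "Coverage",
    "count",
    "RPKM",
    "Frags",
    "FPKM"
  ),
  col_types = "cddddddd",
  id = "file_name"
)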
Format the data nicely
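The formatting code itself is not shown here, so the following is only a sketch of what such a step could look like, assuming `dplyr` and `tibble` are available. The toy table `ampslurped_demo`, its file names, strain identifiers, and counts are all made up for illustration, and the choice to derive a `sample` column from `file_name` and keep only `strainID` and `count` is an assumption, not the notebook's actual method.

```r
library(dplyr)

# Toy stand-in for `ampslurped`; every value below is invented
ampslurped_demo <- tibble::tibble(
  file_name = c("data/S02.rpkm", "data/S01.rpkm", "data/S01.rpkm"),
  strainID  = c("HAMBI_0001", "HAMBI_0002", "HAMBI_0001"),
  count     = c(5, 30, 10)
)

ampformatted_demo <- ampslurped_demo %>%
  # Derive a sample name by dropping the directory and the .rpkm extension
  mutate(sample = tools::file_path_sans_ext(basename(file_name))) %>%
  # Keep only the columns needed downstream (an assumption)
  select(sample, strainID, count) %>%
  arrange(sample, strainID)
```

Deriving the sample name from the `id` column keeps the link between each row and its source file without storing full paths.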
Clean up temporary directory
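A minimal base R sketch of the cleanup, assuming the extracted files live in a scratch directory. `tmp_demo` is a hypothetical directory created here only so the example is runnable; in the notebook the directory to remove would be whichever one the setup chunk created.

```r
# Create a throwaway directory with one file (hypothetical stand-in)
tmp_demo <- file.path(tempdir(), "amplicon_tmp_demo")
dir.create(tmp_demo, showWarnings = FALSE)
invisible(file.create(file.path(tmp_demo, "sample1.rpkm")))

# Remove the directory and everything inside it
unlink(tmp_demo, recursive = TRUE)

# Confirm that the directory is gone
still_there <- dir.exists(tmp_demo)
```

Since the notebook already uses `fs`, `fs::dir_delete()` would be an equivalent choice; `unlink()` is used here only to avoid an extra dependency in the sketch.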