
This function prepares the countMatrix and metadata for fitting by converting the countMatrix to long format and joining it with the metadata. Optionally, it can filter out low-count rows, apply a custom transformation to the countMatrix, and normalize it (median ratio normalization or TMM).
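As a rough illustration of what "long format joined with metadata" means, the base R sketch below reshapes a toy count matrix and merges it with a toy metadata table. It is not the function's actual implementation. The `sampleID` join column and the `condition` column are assumptions made for this example; `geneID` and `kij` are the documented defaults of `groupID` and `response_name`.

```r
# Toy inputs (hypothetical): 2 genes x 3 samples
counts <- matrix(
  c(10, 0, 5, 3, 8, 1),
  nrow = 2,
  dimnames = list(c("gene1", "gene2"), c("s1", "s2", "s3"))
)
# Assumed metadata layout: one row per sample, keyed by "sampleID"
metadata <- data.frame(
  sampleID  = c("s1", "s2", "s3"),
  condition = c("A", "A", "B")
)

# Wide -> long: one row per (gene, sample) pair
long <- as.data.frame(as.table(counts), stringsAsFactors = FALSE)
names(long) <- c("geneID", "sampleID", "kij")

# Attach the sample-level covariates to every count
data2fit <- merge(long, metadata, by = "sampleID")
head(data2fit)
```

Each row of the result carries one count together with the covariates of its sample, which is the shape most model-fitting functions expect.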

Usage

prepareData2fit(
  countMatrix,
  metadata,
  response_name = "kij",
  groupID = "geneID",
  row_threshold = 0,
  transform = NULL,
  normalization = NULL
)

Arguments

countMatrix

Count matrix.

metadata

Metadata data frame.

response_name

String referring to the target variable name that is being modeled and predicted (default: "kij").

groupID

String referring to the group variable name (default: "geneID").

row_threshold

Numeric threshold: rows in which all counts are below this value are removed (default: 0). This filtering is applied before transformation and normalization.

transform

A custom R expression to apply to each element of the countMatrix, provided as a character string. For example, to apply a log transformation, use "log(x)", where x represents each element of the countMatrix. See the examples for more details. The transformation is applied before normalization (if normalization is not NULL).

normalization

Character vector specifying the normalization method to use (default: NULL; possible choices: c('MRN', 'TMM')) - MRN: median ratio normalization - TMM: Trimmed Mean of M-values
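The documented order of operations (row filtering, then transformation, then normalization) can be sketched in base R as follows. This is a minimal sketch under stated assumptions, not the package's code: the helper `apply_transform`, the use of `>=` for the threshold comparison, and the DESeq-style median-of-ratios computation of size factors are all illustrative choices.

```r
# Toy count matrix (hypothetical): 3 genes x 3 samples
counts <- matrix(
  c(10, 0, 5, 20, 0, 9, 40, 1, 22),
  nrow = 3,
  dimnames = list(paste0("gene", 1:3), paste0("s", 1:3))
)

# 1. Row filtering: drop rows whose counts are all below the threshold
#    (assumption: a row is kept if at least one count is >= row_threshold)
row_threshold <- 2
filtered <- counts[rowSums(counts >= row_threshold) > 0, , drop = FALSE]

# 2. Transformation: evaluate a character expression with x bound to the matrix
#    (hypothetical helper, mirroring the documented transform = "sqrt(x + 1)")
apply_transform <- function(mat, expr) {
  eval(parse(text = expr), envir = list(x = mat))
}
transformed <- apply_transform(filtered, "sqrt(x + 1)")

# 3. Median ratio normalization (DESeq-style sketch):
#    size factor of a sample = median over genes of value / geometric mean
log_geo_means <- rowMeans(log(transformed))
usable <- is.finite(log_geo_means)  # skip genes with a zero value
size_factors <- apply(transformed, 2, function(col) {
  exp(median((log(col) - log_geo_means)[usable]))
})
normalized <- sweep(transformed, 2, size_factors, "/")
normalized
```

With the default `row_threshold = 0` no row can be removed by this rule, since counts are never negative; a positive threshold is needed to drop low-count genes.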

Value

Data frame suitable for fitting.

Examples

# Initialize variables and create mock RNA-Seq data
list_var <- init_variable()
#> Variable name should not contain digits, spaces, or special characters.
#> If any of these are present, they will be removed from the variable name.
mock_data <- mock_rnaseq(list_var, n_genes = 3, 2,2)
#> Building mu_ij matrix
#> INFO: 1 genes have all(mu_ij) < 1, indicating very low counts. Consider removing them for future analysis using prepareData2fit with row_threshold = 10. To detect them in future experiment, try increasing sequencing depth.
#> k_ij ~ Nbinom(mu_ij, dispersion)
#> Counts simulation: Done
# Prepare data for fitting with log transformation
data2fit <- prepareData2fit(mock_data$counts, mock_data$metadata, transform = "log(x)")
# Prepare data for fitting with custom expression
data2fit <- prepareData2fit(mock_data$counts, mock_data$metadata, transform = "sqrt(x + 1)")