
In RNA-seq analysis, the statistical power to detect expression changes depends on key experimental parameters such as sequencing depth and the number of replicates. To help select suitable values for these parameters, we introduce HTRfit, a statistical framework underpinned by computational simulation. HTRfit is also compatible with DESeq2 outputs, so that results from both tools can be evaluated within the same framework.

HTRfit simulation workflow

In this modeling framework, the counts \(K_{ij}\) for gene i and sample j are generated from a negative binomial distribution, parameterized by a fitted mean \(\mu_{ij}\) and a gene-specific dispersion parameter \(dispersion_i\). The fitted mean \(\mu_{ij}\) is determined by a parameter, \(q_{ij}\), which is proportionally related to the sum of all effects specified using init_variable() or add_interaction(). If basal gene expressions are provided, the \(q_{ij}\) values are scaled accordingly using the gene-specific basal expression value (\(bexpr_i\)). The coefficients \(\beta_i\) represent the natural-log fold changes for gene i for each column of the model matrix X. The dispersion parameter \(dispersion_i\) defines the relationship between the variance of the observed counts and their mean. In simpler terms, it quantifies how far we expect the observed counts to deviate from the mean value for each gene. In addition, HTRfit allows sequencing depth to be controlled through a sample-specific scalar (\(s_j\)) applied to the \(\mu_{ij}\) value.
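The generative model described above can be illustrated with a minimal Python sketch. It is not HTRfit's implementation and does not use its API. The helper names (`fitted_mean`, `sample_poisson`, `sample_negative_binomial`) are hypothetical, and several details are assumptions made only for illustration: a log link, so that \(q_{ij}\) is the exponential of the summed effects; \(bexpr_i\) entering as an additive term on the log scale; \(s_j\) multiplying the mean; and the convention that the variance equals \(\mu + dispersion \cdot \mu^2\). HTRfit may define the dispersion differently (for example as the inverse of this quantity), so check the package documentation before relying on this convention.

```python
import math
import random


def fitted_mean(x_row, beta, bexpr=0.0, depth_scale=1.0):
    """Assumed form: mu_ij = s_j * exp(bexpr_i + sum_k x_jk * beta_ik)."""
    log_q = bexpr + sum(x * b for x, b in zip(x_row, beta))
    return depth_scale * math.exp(log_q)


def sample_poisson(lam, rng):
    """Knuth's multiplication method.

    lam is processed in chunks of at most 500 so that exp(-chunk) does not
    underflow; a sum of independent Poisson draws is again Poisson.
    """
    total = 0
    while lam > 0:
        chunk = min(lam, 500.0)
        lam -= chunk
        threshold = math.exp(-chunk)
        product = rng.random()
        while product > threshold:
            total += 1
            product *= rng.random()
    return total


def sample_negative_binomial(mu, alpha, rng):
    """Gamma-Poisson mixture with mean mu and variance mu + alpha * mu**2.

    alpha is the dispersion under the convention assumed in this sketch.
    """
    if alpha <= 0:
        return sample_poisson(mu, rng)
    # random.gammavariate takes (shape, scale); shape * scale equals mu.
    rate = rng.gammavariate(1.0 / alpha, alpha * mu)
    return sample_poisson(rate, rng)


if __name__ == "__main__":
    rng = random.Random(42)
    beta = [math.log(2.0)]      # one effect: a 2-fold change (natural-log scale)
    dispersion = 0.1            # gene-specific dispersion (assumed convention)
    basal = math.log(50.0)      # basal expression on the log scale (assumption)
    for label, x_row in [("control", [0.0]), ("treated", [1.0])]:
        mu = fitted_mean(x_row, beta, bexpr=basal, depth_scale=1.0)
        counts = [sample_negative_binomial(mu, dispersion, rng) for _ in range(5)]
        print(label, round(mu, 1), counts)
```

Under these assumptions, doubling `depth_scale` doubles the expected count of every gene in that sample, while increasing `dispersion` widens the spread of the counts around the same mean.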