Abstract

staRoxy is a dedicated R package designed to streamline oxylipidomics abundance data analysis [1]. It provides functions for data cleaning, normalization, statistical modeling, and integrated visualization for exploratory and differential analyses, as well as specialized tools for profile comparison and precursor identification and enrichment. This vignette describes the recommended workflow and core functionalities of the package.

Funding

The development of the staRoxy package has been partially supported by the São Paulo Research Foundation (FAPESP; grant number 2025/00242-1) and FAPESP/CEPID-Redoxoma (grant number 2013/07937-8).

Acknowledgments

We thank Nicholas A. Brown, Giuseppe G. F. Leite, Daniela Bizinelli, Vinícius A. Pereira, and other friends for their valuable feedback and discussions during the development of the package. Special thanks to Cynthia D. Diniz for the logo design.

Citation

If you use staRoxy in published research, please cite: [1]

Several other R packages are available for lipidomics and metabolomics analyses, including MetaboAnalystR, xcms, MSnbase, LipidMS, and lipidr, among others.

Overview

Lipidomics experiments generate quantitative matrices in which lipids are measured across multiple biological conditions. Final concentration units are typically nmol/ml or nmol/mg, although they may vary depending on the analytical method or adaptations used. Proper analysis requires systematic preprocessing, normalization, quality control, and statistical modeling.

The staRoxy package provides a reproducible framework for oxylipidomics abundance data analysis, originally developed for an oxylipin profiling protocol [1]. It integrates:

The workflow is designed to move from raw abundance matrices to biologically interpretable results in a standardized and transparent manner.

Setup, Initialization, and Data Import

First, install the package. staRoxy is currently not available on the Comprehensive R Archive Network (CRAN). The development version can be installed via BiocManager, which will also install the required dependencies:

if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("pefjordao/staRoxy")

Then, load the package:

library(staRoxy)
#> ── staRoxy v0.2.0 ───────────────────────────── University of São Paulo - USP ──
#> ℹ Oxylipidomics Abundance Data Analysis Pipeline
#> Developed by: Pedro Henrique F. Jordão & Eduardo M. Reis
#> Official Repository: <https://github.com/pefjordao/staRoxy>
#> ────────────────────────────────────────────────────────────────────────────────

The staRoxy package requires two input files:

  1. A oxylipid abundance data frame (data_path). Supported formats include .xlsx, .xls, .csv (comma or semicolon separated), .tsv, .txt, or .dat.
  2. A metadata data frame (meta_path) describing sample annotations.

Input Data Structure

Abundance Data Frame File

  • First column: Should be named “oxylipin”, and contain the names of all identified oxylipin species.
  • Remaining columns: Should correspond to individual samples.
  • Values: The first row should contain sample names, and the subsequent rows should contain the quantitative abundance values for each oxylipin.

Note: Missing (NA), non-detected (below the limit of detection, LOD), or non-quantified values (below the limit of quantification, LOQ) should be left empty. Zero values are automatically converted to NA.

Metadata file

  • First column: Should be named “sample” and contain the name of each sample. Sample names must match exactly those provided in the abundance data frame file.
  • Second column: Should be named “group” and contain the experimental group associated with each sample.
  • Additional optional columns may be added to specify covariates (e.g., batch, sex, age).

Oxylipin Nomenclature

Oxylipin nomenclature is notoriously inconsistent across the literature. To ensure reproducibility, staRoxy adopts a standardized naming convention based on the Cayman Chemical catalog and LIPID MAPS guidelines [1]. Abbreviated names are used throughout the package to improve readability in plots and tables.

Automatic Harmonization

For the 125 core oxylipins identified in the reference protocol [1], staRoxy includes an automatic cleaning engine. During data import, the package detects and fixes common encoding artifacts (such as corrupted Greek letters \(\alpha\), \(\beta\), \(\gamma\) and symbols like \(\Delta\)) that frequently occur when exporting data from mass spectrometry software to Excel.

Manual Entry and Unicode Escape Sequences

While the automatic engine covers the standard protocol, using Unicode escape sequences is highly recommended for all users and becomes essential when adding new features or custom precursors to the database.

  • For the 125 Core Oxylipins: Even though staRoxy can fix these names automatically, providing them in Unicode (or standardized format) simplifies the processing and avoids any ambiguity.

  • For New/Custom Oxylipins: If you are expanding the database with your own features (see the Customizing the Database section below), you must use Unicode escape sequences. Since the automatic engine may not recognize custom naming patterns, Unicode is the only way to ensure a “perfect string match” and prevent errors when moving files between Windows (Latin-1) and Mac/Linux (UTF-8).

Symbol Meaning Unicode Sequence Example
\(\alpha\) Alpha \u03b1 "PGF2\u03b1"
\(\beta\) Beta \u03b2 "8-iso-15-keto-PGF2\u03b2"
\(\gamma\) Gamma \u03b3 "13-HOTrE-\u03b3"
\(\Delta\) Delta \u0394 "15-deoxy-\u039412,14-PGJ2"

Example Data

The staRoxy package provides three example datasets for testing and demonstration:

Note: All datasets were analyzed using staRoxy (v0.2.0) and was described in the original protocol [1]. For the purposes of this vignette, we will focus on the cellular pellet dataset.

You can load these datasets into your environment using the data function:

data(data_oxy_lps_pellet)
data(metadata_oxy_lps_pellet)

Creating the staRoxy Object

The pipeline begins by importing the abundance data frame and the metadata, ensuring proper matching between samples and their annotations. If either file (abundance data frame or metadata) contains samples not present in the other, these will be automatically removed and reported to the user. In addition, samples containing 100% missing values (NA) are automatically excluded to prevent issues in downstream analyses. This process is performed using the read_oxy function, which creates an object that stores the data and is used throughout the downstream analysis pipeline:

staRoxy_object <- read_oxy(data_path = data_oxy_lps_pellet,
                           meta_path = metadata_oxy_lps_pellet)
#> 
#> ── staRoxy Data Summary ────────────────────────────────────────────────────────
#> ✔ Successful upload of 16 features across 8 samples.
#> 
#> ── Experimental Groups: ──
#> 
#> • Control: n = 4
#> • LPS: n = 4
#> ────────────────────────────────────────────────────────────────────────────────

Missing Value Filtering and Log2 Transformation

Before statistical modeling, the data must be filtered and transformed to meet the assumptions of linear models. The filter_oxy function provides three filtering strategies:

  1. Type I: \(\ge\)prop valid values in AT LEAST one group.
  2. Type II: \(\ge\)prop valid values across ALL samples.
  3. Type III: strict mode. NO missing values allowed.

The proportion of valid values can be adjusted using the prop argument (default = 0.5).

staRoxy_object <- filter_oxy(staRoxy_object,
                             type = "I",
                             prop = 0.5)
#> ℹ Type I: >50% valid values in AT LEAST one group.
#> ✔ Filtering complete: 15 oxylipins remaining.

Note: Type I filtering favors the retention of oxylipins that are exclusive to one of the analyzed groups.

After filtering, transform_oxy performs:

staRoxy_object <- transform_oxy(staRoxy_object)
#> ✔ Log2 transformation complete.
#> ℹ All oxylipins retained: no zero variance or full NA profiles detected.

Quality Control

Before conducting statistical tests, it is essential to evaluate the global structure and robustness of the data. This step ensures that the data meets the assumptions required for robust linear modeling.

Data Distribution

The function check_data_distribution provides a diagnostic quality-control report by:

  • Assessing sample size per group.
  • Identifying critical group imbalance.
  • Evaluating data scale compatibility with log2 transformation.
  • Emitting warnings regarding potential statistical power limitations.

This function assesses whether the data have been appropriately transformed to the log2 scale using the Pearson’s skewness coefficient (\(\gamma_1\)) [3], which measures the asymmetry of the distribution relative to its mean:

\[\gamma_1 = E\left[\left(\frac{X-\mu}{\sigma}\right)^3\right] = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^3 / n}{s^3}\]

Where:

  • \(x_i\): abundance of an individual oxylipin in sample \(i\).
  • \(\bar{x}\): sample mean.
  • \(n\): number of observations.
  • \(s\): standard deviation.
  • \((x_i - \bar{x})^3\): third central moment; preserves sign to indicate skew direction (positive or negative tail).
  • \(|\gamma_1| < 1.2\): acceptable distribution (likely already log2-transformed).
  • \(|\gamma_1| > 1.2\): strong skewness; log2 transformation is recommended if not already performed (to avoid double transformation).

To assess group imbalance, this function evaluates the ratio between the largest and smallest group sizes in order to identify potential biases in linear modeling. The imbalance ratio (\(R\)) [4] is defined as:

\[R = \frac{\max(n_{group})}{\min(n_{group})}\]

check_data_distribution(staRoxy_object)
#> 
#> ── Model Advisor Report ────────────────────────────────────────────────────────
#> ℹ Testing distribution symmetry using Pearson's skewness coefficient (γ1).
#> ℹ Scale Check: Distribution is reasonably symmetric (γ1 = -0.55).
#> ✔ Data scale looks appropriate for linear modeling (likely log-transformed).
#> ℹ Checking sample sizes per group:
#> ✔ Group 'Control': n = 4. Sample size is adequate.
#> ✔ Group 'LPS': n = 4. Sample size is adequate.
#> ℹ Imbalance Ratio (R): 1x.
#> ✔ Experimental design is balanced (R <= 2.0). Assumptions are likely met.
#> ────────────────────────────────────────────────────────────────────────────────

If \(R > 2.0\) (e.g., one group is at least twice the size of another), the function issues a warning indicating that caution is required when interpreting variance-related results, as unbalanced designs may affect the homogeneity of dispersion and impact downstream statistical analyses.

Missing Data Diagnostic

After initial filtering, abundance data may still contain missing values (NA). Therefore, it is important to assess the origin of these missing values, distinguishing between Missing Not At Random (MNAR), typically associated with values below the Limit of Detection (LOD) or Limit of Quantification (LOQ), and Missing At Random (MAR), which often arise from experimental or technical variability.

The plot_na_diagnostic function provides two visualization modes. With mode = "cor", it generates a scatter plot relating Mean Log2 Abundance to the Proportion of Missing Values. A negative correlation, where oxylipins with a higher proportion of NA are observed at lower abundances, suggests an MNAR pattern:

plot_na_diagnostic(staRoxy_object,
                   mode = "cor")

Alternatively, with mode = "dist", the function generates plots relating Log2 Abudance to either Density or Cumulative Fraction. If the distribution of oxylipins with NA values (TRUE) shows higher density or is shifted toward lower abundances, this indicates that these features are primarily detected at low concentrations and become missing below the LOD or LOQ, consistent with an MNAR pattern:

plot_na_diagnostic(staRoxy_object,
                   mode = "dist")

Note: In highly sparse datasets where no oxylipins reach 100% detection, the function automatically partitions features based on median missingness. Features are then compared between “High Missingness” and “Low/Average Missingness” groups, ensuring that diagnostic analyses remain informative even in sparse datasets.

Based on this diagnostic, it is possible to determine how missing values should be handled in downstream functions that require explicit NA treatment, such as plot_pca_oxy, plot_heatmap_oxy, profile_test, limma_oxy, and loading_dissimilarity (see below). In each function, the na_method argument defines the imputation strategy to be used, either "minprob" (for MNAR-type missingness) or "rf" (for MAR-type missingness).

Exploratory Analysis

Exploratory analysis allows inspection of the overall structure of the data prior to hypothesis testing and may also help identify oxylipins with distinct patterns between experimental groups.

Analyzed and Exclusive Oxylipins

The function get_exclusives_oxy identifies oxylipins that were detected in only one of the analyzed groups. Note that data imputation is performed only in analyses that require it, and the original data frame remains unchanged throughout.

get_exclusives_oxy(staRoxy_object)
#> 
#> ── Exclusive Oxylipins Detection ───────────────────────────────────────────────
#> ℹ Group 'Control': No exclusive oxylipins found.
#> ✔ Group 'LPS': 7 exclusive oxylipins found.
#>   - 13-oxo-ODE
#>   - 9-oxo-ODE
#>   - 9-HODE
#>   - 11-HEPE
#>   - 14(15)-EpETE
#>   - 13-HDoHE
#>   - PGE3
#> ────────────────────────────────────────────────────────────────────────────────

Conversely, the function get_analyzed_oxy returns the names of all oxylipins retained after filtering:

get_analyzed_oxy(staRoxy_object)
#> ℹ Found 15 oxylipins detected in the dataset.
#>  [1] "12-HHTrE"             "13-oxo-ODE"           "9-oxo-ODE"           
#>  [4] "9-HODE"               "11-HEPE"              "14(15)-EpETE"        
#>  [7] "11-HETE"              "15-HETrE"             "15-deoxy-Δ12,14-PGD2"
#> [10] "13-HDoHE"             "15-keto-PGE2"         "PGE3"                
#> [13] "PGE2"                 "PGD2"                 "PGF2α"

Descriptive Statistics

Descriptive statistics can be computed using the descriptives_oxy function, which returns oxylipin identifiers, experimental groups, number of valid values, number of missing values (NA), mean, median, standard deviation, standard error of the mean, and coefficient of variation. The sort_by argument allows sorting the results in decreasing order based on a selected metric:

descriptives_oxy(staRoxy_object,
                 sort_by = "mean")
#> 
#> ── Descriptive Statistics ──────────────────────────────────────────────────────
#> ℹ Sorting by: "mean" | Groups: "LPS, Control"
#>               Feature   group n_valid n_na  mean median   sd  sem cv_perc
#>          14(15)-EpETE     LPS       4    0 10.15  10.09 0.36 0.18    3.58
#>                  PGE2     LPS       4    0 10.02   9.69 0.71 0.36    7.11
#>              13-HDoHE     LPS       4    0  9.70   9.86 0.39 0.19    4.02
#>              12-HHTrE     LPS       4    0  9.56   9.46 0.38 0.19    4.01
#>               11-HEPE     LPS       4    0  9.40   9.37 0.20 0.10    2.14
#>               11-HETE     LPS       4    0  9.29   9.31 0.10 0.05    1.08
#>                  PGD2     LPS       4    0  8.92   8.93 0.21 0.11    2.40
#>                 PGF2α     LPS       4    0  8.67   8.53 0.38 0.19    4.42
#>                  PGE3     LPS       3    1  8.49   8.22 0.53 0.31    6.30
#>          15-keto-PGE2     LPS       4    0  8.26   7.90 0.82 0.41    9.99
#>             9-oxo-ODE     LPS       2    2  7.75   7.75 0.08 0.05    0.99
#>                9-HODE     LPS       3    1  7.34   7.34 0.19 0.11    2.61
#>          15-keto-PGE2 Control       3    1  7.12   7.09 0.19 0.11    2.65
#>              12-HHTrE Control       4    0  7.07   6.93 0.58 0.29    8.16
#>            13-oxo-ODE     LPS       4    0  7.06   7.17 0.46 0.23    6.53
#>                  PGE2 Control       4    0  7.05   7.02 0.21 0.11    3.00
#>                 PGF2α Control       3    1  6.88   6.77 0.21 0.12    3.06
#>              15-HETrE     LPS       4    0  6.53   6.53 0.11 0.06    1.70
#>               11-HETE Control       4    0  6.03   5.91 0.29 0.15    4.88
#>                  PGD2 Control       3    1  5.89   5.95 0.23 0.13    3.95
#>  15-deoxy-Δ12,14-PGD2     LPS       3    1  4.14   4.04 0.19 0.11    4.54
#>              15-HETrE Control       2    2  3.86   3.86 0.25 0.18    6.45
#>  15-deoxy-Δ12,14-PGD2 Control       3    1  3.53   3.51 0.21 0.12    5.84
#>               11-HEPE Control       0    4   NaN     NA   NA   NA      NA
#>              13-HDoHE Control       0    4   NaN     NA   NA   NA      NA
#>            13-oxo-ODE Control       0    4   NaN     NA   NA   NA      NA
#>          14(15)-EpETE Control       0    4   NaN     NA   NA   NA      NA
#>                9-HODE Control       0    4   NaN     NA   NA   NA      NA
#>             9-oxo-ODE Control       0    4   NaN     NA   NA   NA      NA
#>                  PGE3 Control       0    4   NaN     NA   NA   NA      NA
#> ────────────────────────────────────────────────────────────────────────────────

The group argument can be used to restrict the output to a specific experimental group:

descriptives_oxy(staRoxy_object,
                 group = "LPS")
#> 
#> ── Descriptive Statistics ──────────────────────────────────────────────────────
#> ℹ Sorting by: "feature" | Groups: "LPS"
#>               Feature group n_valid n_na  mean median   sd  sem cv_perc
#>               11-HEPE   LPS       4    0  9.40   9.37 0.20 0.10    2.14
#>               11-HETE   LPS       4    0  9.29   9.31 0.10 0.05    1.08
#>              12-HHTrE   LPS       4    0  9.56   9.46 0.38 0.19    4.01
#>              13-HDoHE   LPS       4    0  9.70   9.86 0.39 0.19    4.02
#>            13-oxo-ODE   LPS       4    0  7.06   7.17 0.46 0.23    6.53
#>          14(15)-EpETE   LPS       4    0 10.15  10.09 0.36 0.18    3.58
#>              15-HETrE   LPS       4    0  6.53   6.53 0.11 0.06    1.70
#>  15-deoxy-Δ12,14-PGD2   LPS       3    1  4.14   4.04 0.19 0.11    4.54
#>          15-keto-PGE2   LPS       4    0  8.26   7.90 0.82 0.41    9.99
#>                9-HODE   LPS       3    1  7.34   7.34 0.19 0.11    2.61
#>             9-oxo-ODE   LPS       2    2  7.75   7.75 0.08 0.05    0.99
#>                  PGD2   LPS       4    0  8.92   8.93 0.21 0.11    2.40
#>                  PGE2   LPS       4    0 10.02   9.69 0.71 0.36    7.11
#>                  PGE3   LPS       3    1  8.49   8.22 0.53 0.31    6.30
#>                 PGF2α   LPS       4    0  8.67   8.53 0.38 0.19    4.42
#> ────────────────────────────────────────────────────────────────────────────────

Principal Component Analysis

Principal Component Analysis (PCA) is a fundamental step in omics data analysis. It provides a global overview of:

  • Sample clustering.
  • Group separation.
  • Outlier sample detection.
  • Overall data quality.

The function plot_pca_oxy generates a publication-ready PCA plot allowing customization of colors and graphical parameters:

# You can define group-specific colors for visualization
my_colors <- c(
  "LPS" = "#EE7942", 
  "Control"  = "#698B22"
)

plot_pca_oxy(staRoxy_object,
             colors = my_colors)
#> ! Removed 7 oxylipins with zero variance/data in at least one group.
#> ✔ Filtering complete: 8/15 oxylipins retained.
#> ℹ Imputing via 'minprob': assuming missingness is due to low concentration (MNAR).

Note: Missing values (NA) are handled through a multi-step quality control pipeline. Oxylipins are retained only if they have valid measurements in at least 50% of samples (≥50%) in at least one experimental group (user-adjustable). Missingness is assessed via the plot_na_diagnostic function, distinguishing between Missing Not At Random (MNAR) and Missing At Random (MAR). For MNAR-type oxylipins, missing values values are imputed using the group-wise arithmetic mean (equivalent to the geometric mean on the log2 scale). When an entire group lacks detection, values are imputed as below the Limit of Detection (LOD) or failed to meet the Limit of Quantification (LOQ) criteria, using the global minimum for that oxylipin, with stochastic noise sampled from a normal distribution (10% of the global standard deviation) to preserve variance and ensure numerical stability. For MAR-type oxylipins, missing values are imputed using a Random Forest approach (via the missForest package), leveraging the global metabolic profile. Finally, oxylipins completely absent from any experimental group (when require_all_groups = TRUE) or features with zero variance after imputation are automatically removed. This prevents mathematical singularities in downstream multivariate analyses and improves model robustness and biological interpretability.

Z-score Heatmap

The function plot_heatmap_oxy generates a publication-ready heatmap, allowing customization of colors and graphical parameters.

The function applies a \(Z\)-score transformation per oxylipin, placing all oxylipins on the same scale. In the resulting heatmap:

  • A value of \(0\) represents the mean abundance of that oxylipin
  • Positive values indicate values above the mean
  • Negative values indicate values below the mean
plot_heatmap_oxy(staRoxy_object,
                 colors = my_colors)
#> ! Removed 7 oxylipins with zero variance/data in at least one group.
#> ✔ Filtering complete: 8/15 oxylipins retained.
#> ℹ Imputing via 'minprob': assuming missingness is due to low concentration (MNAR).

Note: Missing values (NA) are handled through a multi-step quality control pipeline. Oxylipins are retained only if they have valid measurements in at least 50% of samples (≥50%) in at least one experimental group (user-adjustable). Missingness is assessed via the plot_na_diagnostic function, distinguishing between Missing Not At Random (MNAR) and Missing At Random (MAR). For MNAR-type oxylipins, missing values values are imputed using the group-wise arithmetic mean (equivalent to the geometric mean on the log2 scale). When an entire group lacks detection, values are imputed as below the Limit of Detection (LOD) or failed to meet the Limit of Quantification (LOQ) criteria, using the global minimum for that oxylipin, with stochastic noise sampled from a normal distribution (10% of the global standard deviation) to preserve variance and ensure numerical stability. For MAR-type oxylipins, missing values are imputed using a Random Forest approach (via the missForest package), leveraging the global metabolic profile. Finally, oxylipins completely absent from any experimental group (when require_all_groups = TRUE) or features with zero variance after imputation are automatically removed. This prevents mathematical singularities in downstream multivariate analyses and improves model robustness and biological interpretability.

For hierarchical clustering, a fully imputed matrix is used to compute Euclidean distances. Original NA positions are then restored for visualization and displayed in a distinct color (default: "grey90"), distinguishing true non-detections from low-abundance measurements. NA values can be hidden from the plot using the hide_na = TRUE argument. However, note that hierarchical clustering will be recomputed accordingly:

plot_heatmap_oxy(staRoxy_object,
                 colors = my_colors,
                 hide_na = TRUE)
#> ! Removed 7 oxylipins with zero variance/data in at least one group.
#> ✔ Filtering complete: 8/15 oxylipins retained.
#> ℹ Imputing via 'minprob': assuming missingness is due to low concentration (MNAR).

Sample names can also be hidden using show_sample_names = FALSE for cleaner visualization:

plot_heatmap_oxy(staRoxy_object,
                 colors = my_colors,
                 show_sample_names = FALSE)
#> ! Removed 7 oxylipins with zero variance/data in at least one group.
#> ✔ Filtering complete: 8/15 oxylipins retained.
#> ℹ Imputing via 'minprob': assuming missingness is due to low concentration (MNAR).

Oxylipin Profile Analysis

The function profile_test allows direct comparison of oxylipin abundance patterns across experimental groups, providing a multivariate overview of oxylipidomic signatures.

The data are standardized so that each oxylipin has a mean of \(0\) and a variance of \(1\) (\(Z\)-score scaling). This step is essential because oxylipins present at very high concentrations could otherwise dominate the analysis, masking biologically relevant signals from lower-abundance molecules.

The function then performs a PERMANOVA analysis [5] to test whether the global oxylipin composition differs between experimental groups. This test provides the following metrics:

profile_test(staRoxy_object)
#> ! Removed 7 oxylipins with zero variance/data in at least one group.
#> ✔ Filtering complete: 8/15 oxylipins retained.
#> ℹ Imputing via 'minprob': assuming missingness is due to low concentration (MNAR).
#> 
#> ── Multivariate Profile Analysis (PERMANOVA) ───────────────────────────────────
#> ℹ Groups: Control, LPS | Oxylipins: 8 | Permutations: 10000
#> ℹ Total Variance Explained (R²): 0.885 | P-value: 0.026
#> ────────────────────────────────────────────────────────────────────────────────

When more than two groups are present and the global test is statistically significant, the function automatically performs pairwise PERMANOVA comparisons with Bonferroni correction, allowing identification of the specific groups responsible for the observed differences.

For visualization, if plot = TRUE, the function generates a PCoA (Principal Coordinates Analysis) plot based on Euclidean distance, which should produce results equivalent to those obtained with PCA. This visualization summarizes the behavior of oxylipin profiles across conditions and is particularly useful for identifying coordinated changes in oxylipid mediators and for comparing global oxylipidomic signatures between biological groups:

profile_test(staRoxy_object,
             colors = my_colors,
             plot = TRUE)
#> ! Removed 7 oxylipins with zero variance/data in at least one group.
#> ✔ Filtering complete: 8/15 oxylipins retained.
#> ℹ Imputing via 'minprob': assuming missingness is due to low concentration (MNAR).
#> 
#> ── Multivariate Profile Analysis (PERMANOVA) ───────────────────────────────────
#> ℹ Groups: Control, LPS | Oxylipins: 8 | Permutations: 10000
#> ℹ Total Variance Explained (R²): 0.885 | P-value: 0.026
#> ────────────────────────────────────────────────────────────────────────────────

Note: Missing values (NA) are handled through a multi-step quality control pipeline. Oxylipins are retained only if they have valid measurements in at least 50% of samples (≥50%) in at least one experimental group (user-adjustable). Missingness is assessed via the plot_na_diagnostic function, distinguishing between Missing Not At Random (MNAR) and Missing At Random (MAR). For MNAR-type oxylipins, missing values values are imputed using the group-wise arithmetic mean (equivalent to the geometric mean on the log2 scale). When an entire group lacks detection, values are imputed as below the Limit of Detection (LOD) or failed to meet the Limit of Quantification (LOQ) criteria, using the global minimum for that oxylipin, with stochastic noise sampled from a normal distribution (10% of the global standard deviation) to preserve variance and ensure numerical stability. For MAR-type oxylipins, missing values are imputed using a Random Forest approach (via the missForest package), leveraging the global metabolic profile. Finally, oxylipins completely absent from any experimental group (when require_all_groups = TRUE) or features with zero variance after imputation are automatically removed. This prevents mathematical singularities in downstream multivariate analyses and improves model robustness and biological interpretability.

Differential Abundance Analysis

This section demonstrates how to perform differential abundance analysis of oxylipins between experimental groups in order to identify Differentially Abundant Oxylipins (DAO).

The function limma_oxy uses the limma (Linear Models for Microarray Data) framework [6], which is a gold standard for the analysis of experiments with relatively small sample sizes. This framework applies an empirical Bayes approach that, in simple terms, borrows information across all oxylipins to stabilize variance estimates and improve the robustness of statistical tests. More details can be found in the original publication.

This function automatically fits a linear model for each oxylipin and generates all possible pairwise comparisons between experimental groups:

stats <- limma_oxy(staRoxy_object)
#> ! Removed 7 oxylipins with zero variance/data in at least one group.
#> ✔ Filtering complete: 8/15 oxylipins retained.
#> ℹ Imputing via 'minprob': assuming missingness is due to low concentration (MNAR).
#> ✔ limma differential analysis complete.

To inspect the comparisons performed between experimental groups (referred to as “contrasts”), the function list_contrasts can be used:

list_contrasts(stats)
#> 
#> ── Experimental Contrasts ──────────────────────────────────────────────────────
#> ℹ Found 1 comparison in this analysis:
#> • Contrast 1: Control vs. LPS
#> ────────────────────────────────────────────────────────────────────────────────

Note: Missing values (NA) are handled through a multi-step quality control pipeline. Oxylipins are retained only if they have valid measurements in at least 50% of samples (≥50%) in at least one experimental group (user-adjustable). Missingness is assessed via the plot_na_diagnostic function, distinguishing between Missing Not At Random (MNAR) and Missing At Random (MAR). For MNAR-type oxylipins, missing values values are imputed using the group-wise arithmetic mean (equivalent to the geometric mean on the log2 scale). When an entire group lacks detection, values are imputed as below the Limit of Detection (LOD) or failed to meet the Limit of Quantification (LOQ) criteria, using the global minimum for that oxylipin, with stochastic noise sampled from a normal distribution (10% of the global standard deviation) to preserve variance and ensure numerical stability. For MAR-type oxylipins, missing values are imputed using a Random Forest approach (via the missForest package), leveraging the global metabolic profile. Finally, oxylipins completely absent from any experimental group (when require_all_groups = TRUE) or features with zero variance after imputation are automatically removed. This prevents mathematical singularities in downstream multivariate analyses and improves model robustness and biological interpretability.

Differential Abundance Analysis Visualization

Rank Plot

Next, it is possible to visualize the Log2 Fold-change (Log2FC), Standard Error of the Mean (SEM), and adjusted \(P\)-values (Benjamini–Hochberg correction) using the function plot_rank_oxy:

plot_rank_oxy(stats,
              contrast = 1,
              top_n = 15,
              rank_by = "logFC")

By default, the function ranks oxylipins according to the absolute Log2FC, but they can also be ranked according to the adjusted \(P\)-value:

plot_rank_oxy(stats = stats,
              contrast = 1,
              top_n = 15,
              rank_by = "p")

Violin and Box Plots

Another way to visualize oxylipin abundance across groups is through boxplots and violin plots. The function plot_violin_oxy allows visualization of the abundance of a selected oxylipin across all experimental groups, typically displayed on the log2 scale, with customizable colors and graphical parameters:

plot_violin_oxy(staRoxy_object,
                stats,
                oxylipin = "PGD2",
                colors = my_colors,
                show_sig = TRUE,
                sig_y_nudge = 0.4)
#> ! 1 missing/infinite value(s) removed.

Abundance can also be visualized on the original (non-log2) scale using the log_scale = FALSE argument, and to define the measurement unit prior to transformation using the unit argument:

plot_violin_oxy(staRoxy_object,
                stats,
                oxylipin = "PGD2",
                colors = my_colors,
                show_sig = TRUE,
                sig_y_nudge = 0.6,
                log_scale = FALSE,
                unit = "fmol/mg")
#> ! 1 missing/infinite value(s) removed.

Multiple oxylipins may also be displayed simultaneously by providing a vector of feature names to the oxylipin argument. In this mode, significance bars are automatically disabled to avoid visual clutter.

target_oxylipins <- c("11-HETE", "PGD2", "PGE2", "15-HETrE")

plot_violin_oxy(staRoxy_object,
                stats,
                oxylipin = target_oxylipins,
                colors = my_colors)
#> ! 3 missing/infinite value(s) removed.
#> ℹ Significance bars are disabled when plotting multiple oxylipins simultaneously.

Correlation Analysis

Certain oxylipin species may exhibit patterns of interest that warrant further investigation. To support this, staRoxy provides two approaches for correlation analysis: between two oxylipins within the same group, or for a single oxylipin across two experimental groups.

The function cor_within_oxy computes correlations between two oxylipins within the same group using log2-transformed abundance values:

cor_within_oxy(staRoxy_object,
               oxylipin1 = "11-HETE",
               oxylipin2 = "PGE2",
               group = "LPS",
               colors = my_colors)
#> 
#> ── Intra-group Correlation Analysis ────────────────────────────────────────────
#> ℹ Comparison: '11-HETE' vs. 'PGE2' | Group: 'LPS'
#> ℹ Method: SPEARMAN (Normal=FALSE, Outliers=TRUE)
#> ℹ R = 1 | P = 0.0833 | Result: No significant correlation
#> ────────────────────────────────────────────────────────────────────────────────

Alternatively, the function cor_between_oxy allows comparison of the log2-transformed abundance of a single oxylipin between two experimental groups:

cor_between_oxy(staRoxy_object,
                oxylipin = "11-HETE",
                group1 = "LPS",
                group2 = "Control")
#> 
#> ── Inter-group Correlation Analysis ────────────────────────────────────────────
#> ℹ Lipid: '11-HETE' | Comparison: 'LPS' vs. 'Control'
#> ℹ Method: SPEARMAN (Normal=TRUE, Outliers=TRUE)
#> ℹ R = -0.6 | P = 0.417 | Result: No significant correlation
#> ────────────────────────────────────────────────────────────────────────────────

Note: Both functions assess normality and outliers to automatically select the appropriate correlation method (Pearson or Spearman) when method = "auto" (default). Alternatively, the method can be specified manually ("pearson" or "spearman").

Loading Pattern Dissimilarity Analysis

The staRoxy package provides a method to assess the stability of biological architecture between two conditions by comparing oxylipin contributions to principal component structure.

While conventional PCA focuses on sample clustering to identify global differences between conditions, the loading_dissimilarity function shifts the focus to the biological architecture of the system. It evaluates whether the underlying relationships between oxylipins, their relative contributions to the system’s variance, remain stable or undergo reorganization between two conditions.

In this analysis, we perform independent PCAs for each experimental group. For each group \(g \in \{1, 2\}\), the loadings (\(L\)) of principal component \(i\) (default: PC1, specified by pc_index = 1) are extracted as:

\[L_{g,i} = [w_{1,i}, w_{2,i}, \dots, w_{j,i}, \dots, w_{n,i}]^T\]

Here, \(L_{g,i}\) denotes the loading vector for group \(g\) and component \(i\), \(w_j\) represents the loading of the \(j\)-th oxylipin, \(n\) is the number of oxylipins shared between groups after filtering, and \(T\) denotes the transpose operator.

Similarity between groups is quantified by correlating the loading vectors using Pearson or Spearman correlation (method = "auto" by default). The resulting global similarity metric (\(R^2\)) is defined as:

\[R^2 = \text{cor}(L_{1,i}, L_{2,i})^2\]

This function performs PCA independently for each group. Because Eigenvector signs in PCA are arbitrary, the function automatically detects and corrects axis inversion by reflecting one component when necessary, ensuring meaningful comparisons. If \(\text{cor}(L_{1,i}, L_{2,i}) < 0\), the sign of \(L_{2,i}\) is internally inverted to ensure that comparisons reflect the magnitude of association.

Note: This approach is conceptually related to Tucker’s Congruence Coefficient (\(\phi\)), a standard metric of factor similarity [7].

High correlation indicates that oxylipin contributions are preserved between groups (stable biological structure), whereas low correlation suggests a reorganization of lipid relationships, where oxylipins may change their relative importance or interaction patterns.

Divergence is defined as:

\[\text{Divergence} = 1 - R^2\]

Statistical significance is assessed using a Fisher \(Z\)-transformation to test whether the observed correlation (\(r_{obs}\)) differs from a predefined stability threshold (r_stability, default = 0.6):

\[z = \text{atanh}(r) = \frac{1}{2} \ln \left( \frac{1 + r}{1 - r} \right)\]

Where:

\[Z = \frac{z_{obs} - z_{stability}}{SE}\]

Where:

The \(P\)-value for divergence is then computed from the standard normal distribution:

\[P(\text{div}) = \Phi(Z)\] Where:

In addition, loading_dissimilarity generates a scatter plot comparing loadings between groups. Points are colored according to \(\Delta\)Loading, and oxylipins with \(\Delta\)Loading ≥ threshold (default = 0.3) are labeled. Features in quadrants II and IV represent oxylipins with the strongest divergence in contribution between groups:

loading_dissimilarity(staRoxy_object,
                      group1 = "LPS",
                      group2 = "Control",
                      pc_index = 1)
#> ! Removed 7 oxylipins with zero variance/data in at least one group.
#> ✔ Filtering complete: 8/15 oxylipins retained.
#> ℹ Imputing via 'minprob': assuming missingness is due to low concentration (MNAR).
#> 
#> ── Loading Pattern Dissimilarity Analysis ──────────────────────────────────────
#> ℹ Comparison: 'LPS' vs 'Control' (PC1)
#> ℹ Method: SPEARMAN (Normal=FALSE, Outliers=TRUE)
#> ! Reflection: Eigenvector sign was inverted to maximize congruence.
#> ℹ Correlation (R): 0.095 | Global Congruence (R²): 0.9%
#> ℹ Divergence (1-R²): 99.1% | Significance: P(div) = 0.091
#> ────────────────────────────────────────────────────────────────────────────────

Note: Missing values (NA) are handled through a multi-step quality control pipeline. Oxylipins are retained only if they have valid measurements in at least 50% of samples (≥50%) in at least one experimental group (user-adjustable). Missingness is assessed via the plot_na_diagnostic function, distinguishing between Missing Not At Random (MNAR) and Missing At Random (MAR). For MNAR-type oxylipins, missing values values are imputed using the group-wise arithmetic mean (equivalent to the geometric mean on the log2 scale). When an entire group lacks detection, values are imputed as below the Limit of Detection (LOD) or failed to meet the Limit of Quantification (LOQ) criteria, using the global minimum for that oxylipin, with stochastic noise sampled from a normal distribution (10% of the global standard deviation) to preserve variance and ensure numerical stability. For MAR-type oxylipins, missing values are imputed using a Random Forest approach (via the missForest package), leveraging the global metabolic profile. Finally, oxylipins completely absent from any experimental group (when require_all_groups = TRUE) or features with zero variance after imputation are automatically removed. This prevents mathematical singularities in downstream multivariate analyses and improves model robustness and biological interpretability.

Precursor Identification

It is possible to classify and quantify the number of oxylipins belonging to each precursor class using the classify_oxy function:

classify_oxy(staRoxy_object,
             return_mode = "summary")
#> 
#> ── Oxylipin Classification Summary ─────────────────────────────────────────────
#> ℹ Counts represent unique species detected per precursor/group.
#>        Precursor Control LPS Total_Unique_Detected
#>   ARA (20:4 n-6)       7   7                     7
#>  DGLA (20:3 n-6)       1   1                     1
#>   DHA (22:6 n-3)       0   1                     1
#>   EPA (20:5 n-3)       0   3                     3
#>    LA (18:2 n-6)       0   3                     3

The return_mode = "summary" argument (default) can be set to "species" to return a binary table, where \(1\) indicates Presence and \(0\) indicates Absence within each group:

classify_oxy(staRoxy_object,
             return_mode = "species")
#> 
#> ── Species Presence/Absence Matrix ─────────────────────────────────────────────
#> ℹ 1 = Presence (detected in group) | 0 = Absence
#>              Oxylipin       Precursor Control LPS
#>               11-HETE  ARA (20:4 n-6)       1   1
#>              12-HHTrE  ARA (20:4 n-6)       1   1
#>  15-deoxy-Δ12,14-PGD2  ARA (20:4 n-6)       1   1
#>          15-keto-PGE2  ARA (20:4 n-6)       1   1
#>                  PGD2  ARA (20:4 n-6)       1   1
#>                  PGE2  ARA (20:4 n-6)       1   1
#>                 PGF2α  ARA (20:4 n-6)       1   1
#>              15-HETrE DGLA (20:3 n-6)       1   1
#>              13-HDoHE  DHA (22:6 n-3)       0   1
#>               11-HEPE  EPA (20:5 n-3)       0   1
#>          14(15)-EpETE  EPA (20:5 n-3)       0   1
#>                  PGE3  EPA (20:5 n-3)       0   1
#>            13-oxo-ODE   LA (18:2 n-6)       0   1
#>                9-HODE   LA (18:2 n-6)       0   1
#>             9-oxo-ODE   LA (18:2 n-6)       0   1

Note: Oxylipin identifiers should match the nomenclature adopted in the reference protocol [1], as naming conventions often vary across the literature. The complete list of supported identifiers can be accessed using the show_database_oxy function. To improve readability, abbreviated names are used throughout the package (see the Oxylipin Nomenclature section for details).

Precursor Enrichment Analysis

The function oxy_ora performs an Over-representation Analysis (ORA) to determine whether the diversity of oxylipins detected in the analyzed groups deviates from the expected biological distribution.

ORA is performed using a hypergeometric test (one-vs-all) following established approaches in the literature [8]. This analysis compares the observed diversity of each precursor class with its theoretical frequency in the reference universe (125 oxylipins by defalt) [1]. The enrichment magnitude is summarized by the Enrichment Score (\(ES\)), calculated as:

\[ES = \frac{k / n}{K / N}\]

Where:

An \(ES > 1\) indicates that a precursor class is over-represented in the analyzed group relative to the reference universe, whereas \(ES < 1\) indicates under-representation.

The statistical significance of the enrichment is determined using the hypergeometric test:

\[P(X \geq k) = \sum_{i=k}^{\min(n, K)} \frac{\binom{K}{i} \binom{N-K}{n-i}}{\binom{N}{n}}\]

Where:

In addition, the function generates a Precursor Enrichment plot, which displays the \(ES\) for each precursor class across groups.

oxy_ora(staRoxy_object)
#> 
#> ── Detailed Over-representation analysis (ORA) ─────────────────────────────────
#> ✔ Significant ENRICHMENT found:
#> ARA (20:4 n-6) (Control): ES = 1.99x | P = 0.0131
#> ────────────────────────────────────────────────────────────────────────────────

Note: ORA can also be performed between two specific groups when more than two experimental groups are present. This can be achieved using the argument compare_two_groups = TRUE and specifying the groups with the arguments group1 and group2.

Customizing the Oxylipin Database

By default, ORA and precursor classification are performed using a reference universe composed of the 125 core oxylipins defined in the reference protocol [1]. However, different analytical platforms may target broader or more restricted oxylipin panels. To ensure accurate enrichment statistics and precursor classification, the reference universe should reflect the actual coverage of the experimental assay.

The current reference universe can be inspected using the get_oxy_universe function, which returns a data frame containing standardized oxylipin names and their corresponding precursor classes:

get_oxy_universe()
#>                            Oxylipin       Precursor
#> 1                  tetranor-12-HETE  ARA (20:4 n-6)
#> 2                          12-HHTrE  ARA (20:4 n-6)
#> 3                        13-oxo-ODE   LA (18:2 n-6)
#> 4                           9-HOTrE  ALA (18:3 n-3)
#> 5                         9-oxo-ODE   LA (18:2 n-6)
#> 6                        13-HOTrE-γ  GLA (18:3 n-6)
#> 7                          13-HOTrE  ALA (18:3 n-3)
#> 8                            9-HODE   LA (18:2 n-6)
#> 9                           10-HODE   LA (18:2 n-6)
#> 10                          12-HODE   LA (18:2 n-6)
#> 11                          13-HODE   LA (18:2 n-6)
#> 12                      9(10)-EpOME   LA (18:2 n-6)
#> 13                     12(13)-EpOME   LA (18:2 n-6)
#> 14                           9-HODA   OA (18:1 n-9)
#> 15                          10-HODA   OA (18:1 n-9)
#> 16                      9,10-DiHOME   LA (18:2 n-6)
#> 17                     12,13-DiHOME   LA (18:2 n-6)
#> 18             15-deoxy-Δ12,14-PGJ2  ARA (20:4 n-6)
#> 19                       15-oxo-ETE  ARA (20:4 n-6)
#> 20                           5-HEPE  EPA (20:5 n-3)
#> 21                           9-HEPE  EPA (20:5 n-3)
#> 22                           8-HEPE  EPA (20:5 n-3)
#> 23                          11-HEPE  EPA (20:5 n-3)
#> 24                          12-HEPE  EPA (20:5 n-3)
#> 25                          15-HEPE  EPA (20:5 n-3)
#> 26                          18-HEPE  EPA (20:5 n-3)
#> 27                     14(15)-EpETE  EPA (20:5 n-3)
#> 28                     17(18)-EpETE  EPA (20:5 n-3)
#> 29                           5-HETE  ARA (20:4 n-6)
#> 30                           8-HETE  ARA (20:4 n-6)
#> 31                           9-HETE  ARA (20:4 n-6)
#> 32                          11-HETE  ARA (20:4 n-6)
#> 33                          12-HETE  ARA (20:4 n-6)
#> 34                     14 / 15-HETE  ARA (20:4 n-6)
#> 35                          16-HETE  ARA (20:4 n-6)
#> 36                          17-HETE  ARA (20:4 n-6)
#> 37                          18-HETE  ARA (20:4 n-6)
#> 38                          19-HETE  ARA (20:4 n-6)
#> 39                          20-HETE  ARA (20:4 n-6)
#> 40                      5(6)-EpETrE  ARA (20:4 n-6)
#> 41                      8(9)-EpETrE  ARA (20:4 n-6)
#> 42                    11(12)-EpETrE  ARA (20:4 n-6)
#> 43                    14(15)-EpETrE  ARA (20:4 n-6)
#> 44                       15-oxo-EDE  EDA (20:2 n-6)
#> 45                          5-HETrE   MA (20:3 n-9)
#> 46                         15-HETrE DGLA (20:3 n-6)
#> 47            2,3-dinor-8-iso-PGF2α  ARA (20:4 n-6)
#> 48                   10-Nitrooleate   OA (18:1 n-9)
#> 49                    9-Nitrooleate   OA (18:1 n-9)
#> 50                    tetranor-PGDM  ARA (20:4 n-6)
#> 51                             PGA2  ARA (20:4 n-6)
#> 52                             PGB2  ARA (20:4 n-6)
#> 53                             PGJ2  ARA (20:4 n-6)
#> 54             15-deoxy-Δ12,14-PGD2  ARA (20:4 n-6)
#> 55                      12-oxo-LTB4  ARA (20:4 n-6)
#> 56                             LTB4  ARA (20:4 n-6)
#> 57                     6-trans-LTB4  ARA (20:4 n-6)
#> 58              6-trans-12-epi-LTB4  ARA (20:4 n-6)
#> 59                      12-epi-LTB4  ARA (20:4 n-6)
#> 60                      8,15-DiHETE  ARA (20:4 n-6)
#> 61                     14,15-DiHETE  EPA (20:5 n-3)
#> 62                      8,9-DiHETrE  ARA (20:4 n-6)
#> 63                       17-oxo-DHA  DHA (22:6 n-3)
#> 64                   2,3-dinor-TXB2  ARA (20:4 n-6)
#> 65                          4-HDoHE  DHA (22:6 n-3)
#> 66                          7-HDoHE  DHA (22:6 n-3)
#> 67                          8-HDoHE  DHA (22:6 n-3)
#> 68                         10-HDoHE  DHA (22:6 n-3)
#> 69                         11-HDoHE  DHA (22:6 n-3)
#> 70                         13-HDoHE  DHA (22:6 n-3)
#> 71                         14-HDoHE  DHA (22:6 n-3)
#> 72                    16 / 17-HDoHE  DHA (22:6 n-3)
#> 73                         19-HDoHE  DHA (22:6 n-3)
#> 74                         20-HDoHE  DHA (22:6 n-3)
#> 75                       7(8)-EpDPE  DHA (22:6 n-3)
#> 76                     10(11)-EpDPE  DHA (22:6 n-3)
#> 77                     13(14)-EpDPE  DHA (22:6 n-3)
#> 78                     16(17)-EpDPE  DHA (22:6 n-3)
#> 79                     19(20)-EpDPE  DHA (22:6 n-3)
#> 80                       17-oxo-DPA  DPA (22:5 n-3)
#> 81                             PGE3  EPA (20:5 n-3)
#> 82                     15-keto-PGE2  ARA (20:4 n-6)
#> 83                             LXA5  EPA (20:5 n-3)
#> 84                             RvE1  EPA (20:5 n-3)
#> 85                         epi-LXA4  ARA (20:4 n-6)
#> 86                      15-epi-LXA4  ARA (20:4 n-6)
#> 87                             LXB4  ARA (20:4 n-6)
#> 88                  20-hydroxy-LTB4  ARA (20:4 n-6)
#> 89                             PGD2  ARA (20:4 n-6)
#> 90       13,14-dihydro-15-keto-PGD2  ARA (20:4 n-6)
#> 91                             PGE2  ARA (20:4 n-6)
#> 92       13,14-dihydro-15-keto-PGE2  ARA (20:4 n-6)
#> 93                    15-keto-PGF2α  ARA (20:4 n-6)
#> 94              8-iso-15-keto-PGF2β  ARA (20:4 n-6)
#> 95                            PGF3α  EPA (20:5 n-3)
#> 96                      8-iso-PGF3α  EPA (20:5 n-3)
#> 97               15-F2t-Isoprostane  ARA (20:4 n-6)
#> 98  11β-13,14-dihydro-15-keto-PGF2α  ARA (20:4 n-6)
#> 99                        11β-PGF2α  ARA (20:4 n-6)
#> 100                   15-keto-PGF1α DGLA (20:3 n-6)
#> 101                      8-iso-PGE1 DGLA (20:3 n-6)
#> 102                            PGD1  ARA (20:4 n-6)
#> 103                           PGF2α  ARA (20:4 n-6)
#> 104               8,12-iso-iPF2α-VI  ARA (20:4 n-6)
#> 105             13,14-dihydro-PGF2α  ARA (20:4 n-6)
#> 106                           PGF1α DGLA (20:3 n-6)
#> 107                             PDX  DHA (22:6 n-3)
#> 108                       Maresin 1  DHA (22:6 n-3)
#> 109                       Maresin 2  DHA (22:6 n-3)
#> 110                            RvD5  DHA (22:6 n-3)
#> 111                    19,20-DiHDPA  DHA (22:6 n-3)
#> 112                 20-carboxy-LTB4  ARA (20:4 n-6)
#> 113                            TXB3  EPA (20:5 n-3)
#> 114                 20-hydroxy-PGE2  ARA (20:4 n-6)
#> 115                     6-keto-PGE1 DGLA (20:3 n-6)
#> 116                Δ17-6-keto-PGF1α DGLA (20:3 n-6)
#> 117                            TXB2  ARA (20:4 n-6)
#> 118                    6-keto-PGF1α DGLA (20:3 n-6)
#> 119 6,15-diketo-13,14-dihydro-PGF1α DGLA (20:3 n-6)
#> 120                            TXB1 DGLA (20:3 n-6)
#> 121                            RvD1  DHA (22:6 n-3)
#> 122                            RvD2  DHA (22:6 n-3)
#> 123                            RvD3  DHA (22:6 n-3)
#> 124             16,16-dimethyl-PGD2  ARA (20:4 n-6)
#> 125               1a,1b-dihomo-PGE2  AdA (22:4 n-6)

To adapt the database to specific experimental conditions, users can remove existing oxylipins or add custom features and precursor annotations using the edit_oxy_universe function.

Note: When adding custom oxylipins or precursor names containing Greek letters, Unicode escape sequences are strongly recommended to ensure consistent string matching across operating systems (see the Oxylipin Nomenclature section for details).

First, store the default universe in a new object:

my_universe <- get_oxy_universe()

To remove oxylipins not included in your analytical method (e.g., PGE2 and PGD2), use the remove argument:

my_universe <- edit_oxy_universe(my_universe,
                                 remove = c("PGE2", "PGD2"))
#> ✔ Removed 2 oxylipins from the universe.

Conversely, new oxylipin species can be added by specifying both the feature name and its precursor class using the add_name and add_precursor arguments. In the example below, a hypothetical alpha-oxylipin is added using the appropriate Unicode escape sequence:

my_universe <- edit_oxy_universe(univ = my_universe,
                             add_name = "New-Oxylipin\u03b1",
                             add_precursor = "ARA (20:4 n-6)")
#> ✔ Added 1 new oxylipin. Current universe size: 124.

Once the customized universe is defined, it can be supplied directly to the universe argument in oxy_ora. The function will automatically recalculate the total universe size (\(N\)) and precursor class sizes (\(K\)) according to the modified database, ensuring that enrichment statistics accurately reflect the experimental design:

oxy_ora(staRoxy_object,
        universe = my_universe)
#> 
#> ── Detailed Over-representation analysis (ORA) ─────────────────────────────────
#> ℹ No significant precursor ENRICHMENT detected.
#> ────────────────────────────────────────────────────────────────────────────────

Adaptation to Other Omics Data

Although originally developed for oxylipin profiling [1], staRoxy may also be applicable to other mass spectrometry–based datasets, including proteomics and metabolomics. With the exception of classify_oxy and oxy_ora, most functions are expected to be compatible with these data types.

R Package Dependencies

staRoxy depends on several widely accessible packages:

Session and Environment Information

#> R version 4.5.1 (2025-06-13 ucrt)
#> Platform: x86_64-w64-mingw32/x64
#> Running under: Windows 11 x64 (build 26200)
#> 
#> Matrix products: default
#>   LAPACK version 3.12.1
#> 
#> locale:
#> [1] LC_COLLATE=Portuguese_Brazil.utf8  LC_CTYPE=Portuguese_Brazil.utf8   
#> [3] LC_MONETARY=Portuguese_Brazil.utf8 LC_NUMERIC=C                      
#> [5] LC_TIME=Portuguese_Brazil.utf8    
#> 
#> time zone: America/Sao_Paulo
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] staRoxy_0.2.0
#> 
#> loaded via a namespace (and not attached):
#>  [1] tidyselect_1.2.1      viridisLite_0.4.3     dplyr_1.2.1          
#>  [4] farver_2.1.2          viridis_0.6.5         S7_0.2.2             
#>  [7] fastmap_1.2.0         digest_0.6.39         lifecycle_1.0.5      
#> [10] cluster_2.1.8.2       Cairo_1.7-0           statmod_1.5.1        
#> [13] magrittr_2.0.5        compiler_4.5.1        rlang_1.2.0          
#> [16] sass_0.4.10           tools_4.5.1           yaml_2.3.12          
#> [19] data.table_1.18.4     knitr_1.51            ggsignif_0.6.4       
#> [22] labeling_0.4.3        RColorBrewer_1.1-3    withr_3.0.2          
#> [25] purrr_1.2.2           BiocGenerics_0.56.0   grid_4.5.1           
#> [28] stats4_4.5.1          colorspace_2.1-2      ggplot2_4.0.3        
#> [31] scales_1.4.0          iterators_1.0.14      MASS_7.3-65          
#> [34] dichromat_2.0-1       cli_3.6.6             rmarkdown_2.31       
#> [37] vegan_2.7-3           crayon_1.5.3          generics_0.1.4       
#> [40] otel_0.2.0            rstudioapi_0.18.0     rjson_0.2.23         
#> [43] readxl_1.4.5          cachem_1.1.0          stringr_1.6.0        
#> [46] splines_4.5.1         parallel_4.5.1        cellranger_1.1.0     
#> [49] matrixStats_1.5.0     vctrs_0.7.3           Matrix_1.7-5         
#> [52] jsonlite_2.0.0        IRanges_2.44.0        GetoptLong_1.1.1     
#> [55] S4Vectors_0.48.1      ggrepel_0.9.8         clue_0.3-68          
#> [58] foreach_1.5.2         limma_3.66.0          tidyr_1.3.2          
#> [61] jquerylib_0.1.4       glue_1.8.1            codetools_0.2-20     
#> [64] cowplot_1.2.0         stringi_1.8.7         shape_1.4.6.1        
#> [67] gtable_0.3.6          ComplexHeatmap_2.26.1 tibble_3.3.1         
#> [70] pillar_1.11.1         htmltools_0.5.9       circlize_0.4.18      
#> [73] R6_2.6.1              doParallel_1.0.17     evaluate_1.0.5       
#> [76] lattice_0.22-9        png_0.1-9             bslib_0.10.0         
#> [79] Rcpp_1.1.1-1.1        gridExtra_2.3         nlme_3.1-169         
#> [82] permute_0.9-10        mgcv_1.9-4            xfun_0.57            
#> [85] forcats_1.0.1         prettydoc_0.4.1       pkgconfig_2.0.3      
#> [88] GlobalOptions_0.1.4

References

  1. (in press).
  2. Chaves-Filho AB, Diniz LS, Santos RS, Lima RS, Oreliana H, Pinto IFD, et al. Plasma oxylipin profiling by high-resolution mass spectrometry reveals signatures of inflammation and hypermetabolism in amyotrophic lateral sclerosis. Free Radic Biol Med. 2023;208:285–298.
  3. Bulmer MG. Principles of Statistics. Dover Publications; 1979.
  4. García V, Sánchez JS, Mollineda RA. On the effectiveness of preprocessing methods when dealing with different levels of class imbalance. Knowledge-Based Systems. 2011;24(1):13–21.
  5. Anderson MJ. A new method for non-parametric multivariate analysis of variance. Austral Ecology. 2001;26:32–46.
  6. Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, Smyth GK. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Research. 2015;43(7):e47.
  7. Lorenzo-Seva U, ten Berge JMF. Tucker’s congruence coefficient as a meaningful index of factor similarity. Methodology. 2006;2(2):57–64.
  8. Yu G, Wang LG, Han Y, He QY. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS. 2012;16(5):284–287.