Abstract
staRoxy is a dedicated R package designed to streamline
oxylipidomics abundance data analysis [1]. It provides functions for
data cleaning, normalization, statistical modeling, and integrated
visualization for exploratory and differential analyses, as well as
specialized tools for profile comparison and precursor identification
and enrichment. This vignette describes the recommended workflow and
core functionalities of the package.
Funding
The development of the staRoxy package has been
partially supported by the São Paulo Research Foundation (FAPESP; grant
number 2025/00242-1) and FAPESP/CEPID-Redoxoma (grant number
2013/07937-8).
Acknowledgments
We thank Nicholas A. Brown, Giuseppe G. F. Leite, Daniela Bizinelli, Vinícius A. Pereira, and other friends for their valuable feedback and discussions during the development of the package. Special thanks to Cynthia D. Diniz for the logo design.
Citation
If you use staRoxy in published research, please cite:
[1]
Several other R packages are available for lipidomics and metabolomics analyses, including MetaboAnalystR, xcms, MSnbase, LipidMS, and lipidr, among others.
Overview
Lipidomics experiments generate quantitative matrices in which lipids are measured across multiple biological conditions. Final concentration units are typically nmol/ml or nmol/mg, although they may vary depending on the analytical method or adaptations used. Proper analysis requires systematic preprocessing, normalization, quality control, and statistical modeling.
The staRoxy package provides a reproducible framework
for oxylipidomics abundance data analysis, originally developed for an
oxylipin profiling protocol [1]. It integrates:
- Structured data import and validation.
- Missing-value filtering and diagnostics.
- Data transformation.
- Quality control diagnostics.
- Exploratory data analysis.
- Oxylipin profile analysis.
- Differential abundance analysis using linear modeling.
- Correlation analysis.
- Loading pattern dissimilarity analysis.
- Precursor identification and enrichment analysis.
The workflow is designed to move from raw abundance matrices to biologically interpretable results in a standardized and transparent manner.
Setup, Initialization, and Data Import
First, install the package. staRoxy is currently not
available on the Comprehensive R Archive Network (CRAN). The development
version can be installed via BiocManager, which will also install the
required dependencies:
if (!require("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install("pefjordao/staRoxy")
Then, load the package:
library(staRoxy)
#> ── staRoxy v0.2.0 ───────────────────────────── University of São Paulo - USP ──
#> ℹ Oxylipidomics Abundance Data Analysis Pipeline
#> Developed by: Pedro Henrique F. Jordão & Eduardo M. Reis
#> Official Repository: <https://github.com/pefjordao/staRoxy>
#> ────────────────────────────────────────────────────────────────────────────────The staRoxy package requires two input files:
- A oxylipid abundance data frame
(
data_path). Supported formats include.xlsx,.xls,.csv(comma or semicolon separated),.tsv,.txt, or.dat. - A metadata data frame (
meta_path) describing sample annotations.
Input Data Structure
Abundance Data Frame File
- First column: Should be named
“oxylipin”, and contain the names of all identified oxylipin species. - Remaining columns: Should correspond to individual samples.
- Values: The first row should contain sample names, and the subsequent rows should contain the quantitative abundance values for each oxylipin.
Note: Missing (NA), non-detected (below
the limit of detection, LOD), or non-quantified values (below the limit
of quantification, LOQ) should be left empty. Zero values are
automatically converted to NA.
Metadata file
- First column: Should be named
“sample”and contain the name of each sample. Sample names must match exactly those provided in the abundance data frame file. - Second column: Should be named
“group”and contain the experimental group associated with each sample. - Additional optional columns may be added to specify covariates (e.g., batch, sex, age).
Oxylipin Nomenclature
Oxylipin nomenclature is notoriously inconsistent across the
literature. To ensure reproducibility, staRoxy adopts a
standardized naming convention based on the Cayman Chemical catalog and
LIPID MAPS guidelines [1]. Abbreviated names are used throughout the
package to improve readability in plots and tables.
Automatic Harmonization
For the 125 core oxylipins identified in the reference protocol [1],
staRoxy includes an automatic cleaning engine. During data
import, the package detects and fixes common encoding artifacts (such as
corrupted Greek letters \(\alpha\),
\(\beta\), \(\gamma\) and symbols like \(\Delta\)) that frequently occur when
exporting data from mass spectrometry software to Excel.
Manual Entry and Unicode Escape Sequences
While the automatic engine covers the standard protocol, using Unicode escape sequences is highly recommended for all users and becomes essential when adding new features or custom precursors to the database.
For the 125 Core Oxylipins: Even though
staRoxycan fix these names automatically, providing them in Unicode (or standardized format) simplifies the processing and avoids any ambiguity.For New/Custom Oxylipins: If you are expanding the database with your own features (see the Customizing the Database section below), you must use Unicode escape sequences. Since the automatic engine may not recognize custom naming patterns, Unicode is the only way to ensure a “perfect string match” and prevent errors when moving files between Windows (Latin-1) and Mac/Linux (UTF-8).
| Symbol | Meaning | Unicode Sequence | Example |
| \(\alpha\) | Alpha | \u03b1 |
"PGF2\u03b1" |
| \(\beta\) | Beta | \u03b2 |
"8-iso-15-keto-PGF2\u03b2" |
| \(\gamma\) | Gamma | \u03b3 |
"13-HOTrE-\u03b3" |
| \(\Delta\) | Delta | \u0394 |
"15-deoxy-\u039412,14-PGJ2" |
Example Data
The staRoxy package provides three example datasets for
testing and demonstration:
- Cellular Pellets: Intracellular oxylipin profiles
from RAW 264.7 cells, comparing basal conditions (Control) against
inflammatory activation (LPS-treated). (Data:
data_oxy_lps_pellet, Metadata:metadata_oxy_lps_pellet). - Culture Supernatant: Secreted oxylipin signatures
from the same experimental groups, reflecting the specialized
pro-resolving or pro-inflammatory mediators released into the
extracellular environment. (Data:
data_oxy_lps_supernatant, Metadata:metadata_oxy_lps_supernatant). - Blood Plasma: A biological study on an Amyotrophic
Lateral Sclerosis (ALS) mouse model (SOD1-G93A) at asymptomatic and
symptomatic stages, published by Chaves-Filho et al. (2023) [2] (Data:
data_oxy_plasma, Metadata:metadata_oxy_plasma).
Note: All datasets were analyzed using
staRoxy (v0.2.0) and was described in the original protocol
[1]. For the purposes of this vignette, we will focus on the cellular
pellet dataset.
You can load these datasets into your environment using the
data function:
Creating the staRoxy Object
The pipeline begins by importing the abundance data frame and the
metadata, ensuring proper matching between samples and their
annotations. If either file (abundance data frame or metadata) contains
samples not present in the other, these will be automatically removed
and reported to the user. In addition, samples containing 100% missing
values (NA) are automatically excluded to prevent issues in
downstream analyses. This process is performed using the
read_oxy function, which creates an object that stores the
data and is used throughout the downstream analysis pipeline:
staRoxy_object <- read_oxy(data_path = data_oxy_lps_pellet,
meta_path = metadata_oxy_lps_pellet)
#>
#> ── staRoxy Data Summary ────────────────────────────────────────────────────────
#> ✔ Successful upload of 16 features across 8 samples.
#>
#> ── Experimental Groups: ──
#>
#> • Control: n = 4
#> • LPS: n = 4
#> ────────────────────────────────────────────────────────────────────────────────Missing Value Filtering and Log2 Transformation
Before statistical modeling, the data must be filtered and
transformed to meet the assumptions of linear models. The
filter_oxy function provides three filtering
strategies:
- Type I: \(\ge\)
propvalid values in AT LEAST one group. - Type II: \(\ge\)
propvalid values across ALL samples. - Type III: strict mode. NO missing values allowed.
The proportion of valid values can be adjusted using the
prop argument (default = 0.5).
staRoxy_object <- filter_oxy(staRoxy_object,
type = "I",
prop = 0.5)
#> ℹ Type I: >50% valid values in AT LEAST one group.
#> ✔ Filtering complete: 15 oxylipins remaining.Note: Type I filtering favors the retention of oxylipins that are exclusive to one of the analyzed groups.
After filtering, transform_oxy performs:
- Log2 transformation of abundance values.
- Removal of lipids with zero variance or full missing values
(
NA).
Quality Control
Before conducting statistical tests, it is essential to evaluate the global structure and robustness of the data. This step ensures that the data meets the assumptions required for robust linear modeling.
Data Distribution
The function check_data_distribution provides a
diagnostic quality-control report by:
- Assessing sample size per group.
- Identifying critical group imbalance.
- Evaluating data scale compatibility with log2 transformation.
- Emitting warnings regarding potential statistical power limitations.
This function assesses whether the data have been appropriately transformed to the log2 scale using the Pearson’s skewness coefficient (\(\gamma_1\)) [3], which measures the asymmetry of the distribution relative to its mean:
\[\gamma_1 = E\left[\left(\frac{X-\mu}{\sigma}\right)^3\right] = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^3 / n}{s^3}\]
Where:
- \(x_i\): abundance of an individual oxylipin in sample \(i\).
- \(\bar{x}\): sample mean.
- \(n\): number of observations.
- \(s\): standard deviation.
- \((x_i - \bar{x})^3\): third central moment; preserves sign to indicate skew direction (positive or negative tail).
- \(|\gamma_1| < 1.2\): acceptable distribution (likely already log2-transformed).
- \(|\gamma_1| > 1.2\): strong skewness; log2 transformation is recommended if not already performed (to avoid double transformation).
To assess group imbalance, this function evaluates the ratio between the largest and smallest group sizes in order to identify potential biases in linear modeling. The imbalance ratio (\(R\)) [4] is defined as:
\[R = \frac{\max(n_{group})}{\min(n_{group})}\]
check_data_distribution(staRoxy_object)
#>
#> ── Model Advisor Report ────────────────────────────────────────────────────────
#> ℹ Testing distribution symmetry using Pearson's skewness coefficient (γ1).
#> ℹ Scale Check: Distribution is reasonably symmetric (γ1 = -0.55).
#> ✔ Data scale looks appropriate for linear modeling (likely log-transformed).
#> ℹ Checking sample sizes per group:
#> ✔ Group 'Control': n = 4. Sample size is adequate.
#> ✔ Group 'LPS': n = 4. Sample size is adequate.
#> ℹ Imbalance Ratio (R): 1x.
#> ✔ Experimental design is balanced (R <= 2.0). Assumptions are likely met.
#> ────────────────────────────────────────────────────────────────────────────────If \(R > 2.0\) (e.g., one group is at least twice the size of another), the function issues a warning indicating that caution is required when interpreting variance-related results, as unbalanced designs may affect the homogeneity of dispersion and impact downstream statistical analyses.
Missing Data Diagnostic
After initial filtering, abundance data may still contain missing
values (NA). Therefore, it is important to assess the
origin of these missing values, distinguishing between Missing
Not At Random (MNAR), typically associated with values below
the Limit of Detection (LOD) or Limit of
Quantification (LOQ), and Missing At Random
(MAR), which often arise from experimental or technical
variability.
The plot_na_diagnostic function provides two
visualization modes. With mode = "cor", it generates a
scatter plot relating Mean Log2 Abundance to
the Proportion of Missing Values. A negative
correlation, where oxylipins with a higher proportion of NA
are observed at lower abundances, suggests an MNAR pattern:
Alternatively, with mode = "dist", the function
generates plots relating Log2 Abudance to
either Density or Cumulative Fraction.
If the distribution of oxylipins with NA values
(TRUE) shows higher density or is shifted toward lower
abundances, this indicates that these features are primarily detected at
low concentrations and become missing below the LOD or LOQ, consistent
with an MNAR pattern:
Note: In highly sparse datasets where no oxylipins reach 100% detection, the function automatically partitions features based on median missingness. Features are then compared between “High Missingness” and “Low/Average Missingness” groups, ensuring that diagnostic analyses remain informative even in sparse datasets.
Based on this diagnostic, it is possible to determine how
missing values should be handled in downstream functions that require
explicit NA treatment, such as plot_pca_oxy,
plot_heatmap_oxy, profile_test,
limma_oxy, and loading_dissimilarity (see
below). In each function, the na_method argument defines
the imputation strategy to be used, either "minprob" (for
MNAR-type missingness) or "rf" (for MAR-type
missingness).
Exploratory Analysis
Exploratory analysis allows inspection of the overall structure of the data prior to hypothesis testing and may also help identify oxylipins with distinct patterns between experimental groups.
Analyzed and Exclusive Oxylipins
The function get_exclusives_oxy identifies oxylipins
that were detected in only one of the analyzed groups. Note that
data imputation is performed only in analyses that require it, and the
original data frame remains unchanged throughout.
get_exclusives_oxy(staRoxy_object)
#>
#> ── Exclusive Oxylipins Detection ───────────────────────────────────────────────
#> ℹ Group 'Control': No exclusive oxylipins found.
#> ✔ Group 'LPS': 7 exclusive oxylipins found.
#> - 13-oxo-ODE
#> - 9-oxo-ODE
#> - 9-HODE
#> - 11-HEPE
#> - 14(15)-EpETE
#> - 13-HDoHE
#> - PGE3
#> ────────────────────────────────────────────────────────────────────────────────Conversely, the function get_analyzed_oxy returns the
names of all oxylipins retained after filtering:
Descriptive Statistics
Descriptive statistics can be computed using the
descriptives_oxy function, which returns oxylipin
identifiers, experimental groups, number of valid values, number of
missing values (NA), mean, median, standard deviation,
standard error of the mean, and coefficient of variation. The
sort_by argument allows sorting the results in decreasing
order based on a selected metric:
descriptives_oxy(staRoxy_object,
sort_by = "mean")
#>
#> ── Descriptive Statistics ──────────────────────────────────────────────────────
#> ℹ Sorting by: "mean" | Groups: "LPS, Control"
#> Feature group n_valid n_na mean median sd sem cv_perc
#> 14(15)-EpETE LPS 4 0 10.15 10.09 0.36 0.18 3.58
#> PGE2 LPS 4 0 10.02 9.69 0.71 0.36 7.11
#> 13-HDoHE LPS 4 0 9.70 9.86 0.39 0.19 4.02
#> 12-HHTrE LPS 4 0 9.56 9.46 0.38 0.19 4.01
#> 11-HEPE LPS 4 0 9.40 9.37 0.20 0.10 2.14
#> 11-HETE LPS 4 0 9.29 9.31 0.10 0.05 1.08
#> PGD2 LPS 4 0 8.92 8.93 0.21 0.11 2.40
#> PGF2α LPS 4 0 8.67 8.53 0.38 0.19 4.42
#> PGE3 LPS 3 1 8.49 8.22 0.53 0.31 6.30
#> 15-keto-PGE2 LPS 4 0 8.26 7.90 0.82 0.41 9.99
#> 9-oxo-ODE LPS 2 2 7.75 7.75 0.08 0.05 0.99
#> 9-HODE LPS 3 1 7.34 7.34 0.19 0.11 2.61
#> 15-keto-PGE2 Control 3 1 7.12 7.09 0.19 0.11 2.65
#> 12-HHTrE Control 4 0 7.07 6.93 0.58 0.29 8.16
#> 13-oxo-ODE LPS 4 0 7.06 7.17 0.46 0.23 6.53
#> PGE2 Control 4 0 7.05 7.02 0.21 0.11 3.00
#> PGF2α Control 3 1 6.88 6.77 0.21 0.12 3.06
#> 15-HETrE LPS 4 0 6.53 6.53 0.11 0.06 1.70
#> 11-HETE Control 4 0 6.03 5.91 0.29 0.15 4.88
#> PGD2 Control 3 1 5.89 5.95 0.23 0.13 3.95
#> 15-deoxy-Δ12,14-PGD2 LPS 3 1 4.14 4.04 0.19 0.11 4.54
#> 15-HETrE Control 2 2 3.86 3.86 0.25 0.18 6.45
#> 15-deoxy-Δ12,14-PGD2 Control 3 1 3.53 3.51 0.21 0.12 5.84
#> 11-HEPE Control 0 4 NaN NA NA NA NA
#> 13-HDoHE Control 0 4 NaN NA NA NA NA
#> 13-oxo-ODE Control 0 4 NaN NA NA NA NA
#> 14(15)-EpETE Control 0 4 NaN NA NA NA NA
#> 9-HODE Control 0 4 NaN NA NA NA NA
#> 9-oxo-ODE Control 0 4 NaN NA NA NA NA
#> PGE3 Control 0 4 NaN NA NA NA NA
#> ────────────────────────────────────────────────────────────────────────────────The group argument can be used to restrict the output to
a specific experimental group:
descriptives_oxy(staRoxy_object,
group = "LPS")
#>
#> ── Descriptive Statistics ──────────────────────────────────────────────────────
#> ℹ Sorting by: "feature" | Groups: "LPS"
#> Feature group n_valid n_na mean median sd sem cv_perc
#> 11-HEPE LPS 4 0 9.40 9.37 0.20 0.10 2.14
#> 11-HETE LPS 4 0 9.29 9.31 0.10 0.05 1.08
#> 12-HHTrE LPS 4 0 9.56 9.46 0.38 0.19 4.01
#> 13-HDoHE LPS 4 0 9.70 9.86 0.39 0.19 4.02
#> 13-oxo-ODE LPS 4 0 7.06 7.17 0.46 0.23 6.53
#> 14(15)-EpETE LPS 4 0 10.15 10.09 0.36 0.18 3.58
#> 15-HETrE LPS 4 0 6.53 6.53 0.11 0.06 1.70
#> 15-deoxy-Δ12,14-PGD2 LPS 3 1 4.14 4.04 0.19 0.11 4.54
#> 15-keto-PGE2 LPS 4 0 8.26 7.90 0.82 0.41 9.99
#> 9-HODE LPS 3 1 7.34 7.34 0.19 0.11 2.61
#> 9-oxo-ODE LPS 2 2 7.75 7.75 0.08 0.05 0.99
#> PGD2 LPS 4 0 8.92 8.93 0.21 0.11 2.40
#> PGE2 LPS 4 0 10.02 9.69 0.71 0.36 7.11
#> PGE3 LPS 3 1 8.49 8.22 0.53 0.31 6.30
#> PGF2α LPS 4 0 8.67 8.53 0.38 0.19 4.42
#> ────────────────────────────────────────────────────────────────────────────────Principal Component Analysis
Principal Component Analysis (PCA) is a fundamental step in omics data analysis. It provides a global overview of:
- Sample clustering.
- Group separation.
- Outlier sample detection.
- Overall data quality.
The function plot_pca_oxy generates a publication-ready
PCA plot allowing customization of colors and graphical parameters:
# You can define group-specific colors for visualization
my_colors <- c(
"LPS" = "#EE7942",
"Control" = "#698B22"
)
plot_pca_oxy(staRoxy_object,
colors = my_colors)
#> ! Removed 7 oxylipins with zero variance/data in at least one group.
#> ✔ Filtering complete: 8/15 oxylipins retained.
#> ℹ Imputing via 'minprob': assuming missingness is due to low concentration (MNAR).Note: Missing values (NA) are handled
through a multi-step quality control pipeline. Oxylipins are retained
only if they have valid measurements in at least 50% of samples (≥50%)
in at least one experimental group (user-adjustable). Missingness is
assessed via the plot_na_diagnostic function,
distinguishing between Missing Not At Random (MNAR) and Missing At
Random (MAR). For MNAR-type oxylipins, missing values values are imputed
using the group-wise arithmetic mean (equivalent to the geometric mean
on the log2 scale). When an entire group lacks detection,
values are imputed as below the Limit of Detection (LOD) or failed to
meet the Limit of Quantification (LOQ) criteria, using the global
minimum for that oxylipin, with stochastic noise sampled from a normal
distribution (10% of the global standard deviation) to preserve variance
and ensure numerical stability. For MAR-type oxylipins, missing values
are imputed using a Random Forest approach (via the
missForest package), leveraging the global metabolic
profile. Finally, oxylipins completely absent from any experimental
group (when require_all_groups = TRUE) or features with
zero variance after imputation are automatically removed. This prevents
mathematical singularities in downstream multivariate analyses and
improves model robustness and biological interpretability.
Z-score Heatmap
The function plot_heatmap_oxy generates a
publication-ready heatmap, allowing customization of colors and
graphical parameters.
The function applies a \(Z\)-score transformation per oxylipin, placing all oxylipins on the same scale. In the resulting heatmap:
- A value of \(0\) represents the mean abundance of that oxylipin
- Positive values indicate values above the mean
- Negative values indicate values below the mean
plot_heatmap_oxy(staRoxy_object,
colors = my_colors)
#> ! Removed 7 oxylipins with zero variance/data in at least one group.
#> ✔ Filtering complete: 8/15 oxylipins retained.
#> ℹ Imputing via 'minprob': assuming missingness is due to low concentration (MNAR).Note: Missing values (NA) are handled
through a multi-step quality control pipeline. Oxylipins are retained
only if they have valid measurements in at least 50% of samples (≥50%)
in at least one experimental group (user-adjustable). Missingness is
assessed via the plot_na_diagnostic function,
distinguishing between Missing Not At Random (MNAR) and Missing At
Random (MAR). For MNAR-type oxylipins, missing values values are imputed
using the group-wise arithmetic mean (equivalent to the geometric mean
on the log2 scale). When an entire group lacks detection,
values are imputed as below the Limit of Detection (LOD) or failed to
meet the Limit of Quantification (LOQ) criteria, using the global
minimum for that oxylipin, with stochastic noise sampled from a normal
distribution (10% of the global standard deviation) to preserve variance
and ensure numerical stability. For MAR-type oxylipins, missing values
are imputed using a Random Forest approach (via the
missForest package), leveraging the global metabolic
profile. Finally, oxylipins completely absent from any experimental
group (when require_all_groups = TRUE) or features with
zero variance after imputation are automatically removed. This prevents
mathematical singularities in downstream multivariate analyses and
improves model robustness and biological interpretability.
For hierarchical clustering, a fully imputed matrix is used to
compute Euclidean distances. Original NA positions are then
restored for visualization and displayed in a distinct color (default:
"grey90"), distinguishing true non-detections from
low-abundance measurements. NA values can be hidden from
the plot using the hide_na = TRUE argument. However, note
that hierarchical clustering will be recomputed accordingly:
plot_heatmap_oxy(staRoxy_object,
colors = my_colors,
hide_na = TRUE)
#> ! Removed 7 oxylipins with zero variance/data in at least one group.
#> ✔ Filtering complete: 8/15 oxylipins retained.
#> ℹ Imputing via 'minprob': assuming missingness is due to low concentration (MNAR).Sample names can also be hidden using
show_sample_names = FALSE for cleaner visualization:
plot_heatmap_oxy(staRoxy_object,
colors = my_colors,
show_sample_names = FALSE)
#> ! Removed 7 oxylipins with zero variance/data in at least one group.
#> ✔ Filtering complete: 8/15 oxylipins retained.
#> ℹ Imputing via 'minprob': assuming missingness is due to low concentration (MNAR).Oxylipin Profile Analysis
The function profile_test allows direct comparison of
oxylipin abundance patterns across experimental groups, providing a
multivariate overview of oxylipidomic signatures.
The data are standardized so that each oxylipin has a mean of \(0\) and a variance of \(1\) (\(Z\)-score scaling). This step is essential because oxylipins present at very high concentrations could otherwise dominate the analysis, masking biologically relevant signals from lower-abundance molecules.
The function then performs a PERMANOVA analysis [5] to test whether the global oxylipin composition differs between experimental groups. This test provides the following metrics:
- \(R^2\) (explained variance): indicates the proportion of the total variation in the dataset that is explained by the group structure (e.g., an \(R^2\) of 0.40 means that 40% of the differences in oxylipin profiles are due to the experimental groups).
- \(P\)-value: represents the probability of observing a difference between groups as large as (or larger than) the one obtained, assuming the null hypothesis is true. This is calculated via non-parametric permutations (default = 10000), determining if the “clustering” of samples is statistically significant.
profile_test(staRoxy_object)
#> ! Removed 7 oxylipins with zero variance/data in at least one group.
#> ✔ Filtering complete: 8/15 oxylipins retained.
#> ℹ Imputing via 'minprob': assuming missingness is due to low concentration (MNAR).
#>
#> ── Multivariate Profile Analysis (PERMANOVA) ───────────────────────────────────
#> ℹ Groups: Control, LPS | Oxylipins: 8 | Permutations: 10000
#> ℹ Total Variance Explained (R²): 0.885 | P-value: 0.026
#> ────────────────────────────────────────────────────────────────────────────────When more than two groups are present and the global test is statistically significant, the function automatically performs pairwise PERMANOVA comparisons with Bonferroni correction, allowing identification of the specific groups responsible for the observed differences.
For visualization, if plot = TRUE, the function
generates a PCoA (Principal Coordinates Analysis) plot
based on Euclidean distance, which should produce
results equivalent to those obtained with PCA. This visualization
summarizes the behavior of oxylipin profiles across conditions and is
particularly useful for identifying coordinated changes in oxylipid
mediators and for comparing global oxylipidomic signatures between
biological groups:
profile_test(staRoxy_object,
colors = my_colors,
plot = TRUE)
#> ! Removed 7 oxylipins with zero variance/data in at least one group.
#> ✔ Filtering complete: 8/15 oxylipins retained.
#> ℹ Imputing via 'minprob': assuming missingness is due to low concentration (MNAR).
#>
#> ── Multivariate Profile Analysis (PERMANOVA) ───────────────────────────────────
#> ℹ Groups: Control, LPS | Oxylipins: 8 | Permutations: 10000
#> ℹ Total Variance Explained (R²): 0.885 | P-value: 0.026
#> ────────────────────────────────────────────────────────────────────────────────Note: Missing values (NA) are handled
through a multi-step quality control pipeline. Oxylipins are retained
only if they have valid measurements in at least 50% of samples (≥50%)
in at least one experimental group (user-adjustable). Missingness is
assessed via the plot_na_diagnostic function,
distinguishing between Missing Not At Random (MNAR) and Missing At
Random (MAR). For MNAR-type oxylipins, missing values values are imputed
using the group-wise arithmetic mean (equivalent to the geometric mean
on the log2 scale). When an entire group lacks detection,
values are imputed as below the Limit of Detection (LOD) or failed to
meet the Limit of Quantification (LOQ) criteria, using the global
minimum for that oxylipin, with stochastic noise sampled from a normal
distribution (10% of the global standard deviation) to preserve variance
and ensure numerical stability. For MAR-type oxylipins, missing values
are imputed using a Random Forest approach (via the
missForest package), leveraging the global metabolic
profile. Finally, oxylipins completely absent from any experimental
group (when require_all_groups = TRUE) or features with
zero variance after imputation are automatically removed. This prevents
mathematical singularities in downstream multivariate analyses and
improves model robustness and biological interpretability.
Differential Abundance Analysis
This section demonstrates how to perform differential abundance analysis of oxylipins between experimental groups in order to identify Differentially Abundant Oxylipins (DAO).
The function limma_oxy uses the limma
(Linear Models for Microarray Data) framework [6],
which is a gold standard for the analysis of experiments with relatively
small sample sizes. This framework applies an empirical
Bayes approach that, in simple terms, borrows information
across all oxylipins to stabilize variance estimates and improve the
robustness of statistical tests. More details can be found in the
original publication.
This function automatically fits a linear model for each oxylipin and generates all possible pairwise comparisons between experimental groups:
stats <- limma_oxy(staRoxy_object)
#> ! Removed 7 oxylipins with zero variance/data in at least one group.
#> ✔ Filtering complete: 8/15 oxylipins retained.
#> ℹ Imputing via 'minprob': assuming missingness is due to low concentration (MNAR).
#> ✔ limma differential analysis complete.To inspect the comparisons performed between experimental groups
(referred to as “contrasts”), the function list_contrasts
can be used:
list_contrasts(stats)
#>
#> ── Experimental Contrasts ──────────────────────────────────────────────────────
#> ℹ Found 1 comparison in this analysis:
#> • Contrast 1: Control vs. LPS
#> ────────────────────────────────────────────────────────────────────────────────Note: Missing values (NA) are handled
through a multi-step quality control pipeline. Oxylipins are retained
only if they have valid measurements in at least 50% of samples (≥50%)
in at least one experimental group (user-adjustable). Missingness is
assessed via the plot_na_diagnostic function,
distinguishing between Missing Not At Random (MNAR) and Missing At
Random (MAR). For MNAR-type oxylipins, missing values values are imputed
using the group-wise arithmetic mean (equivalent to the geometric mean
on the log2 scale). When an entire group lacks detection,
values are imputed as below the Limit of Detection (LOD) or failed to
meet the Limit of Quantification (LOQ) criteria, using the global
minimum for that oxylipin, with stochastic noise sampled from a normal
distribution (10% of the global standard deviation) to preserve variance
and ensure numerical stability. For MAR-type oxylipins, missing values
are imputed using a Random Forest approach (via the
missForest package), leveraging the global metabolic
profile. Finally, oxylipins completely absent from any experimental
group (when require_all_groups = TRUE) or features with
zero variance after imputation are automatically removed. This prevents
mathematical singularities in downstream multivariate analyses and
improves model robustness and biological interpretability.
Differential Abundance Analysis Visualization
Rank Plot
Next, it is possible to visualize the Log2
Fold-change (Log2FC), Standard Error of the Mean
(SEM), and adjusted \(P\)-values
(Benjamini–Hochberg correction) using the function
plot_rank_oxy:
By default, the function ranks oxylipins according to the absolute Log2FC, but they can also be ranked according to the adjusted \(P\)-value:
Violin and Box Plots
Another way to visualize oxylipin abundance across groups is through
boxplots and violin plots. The function plot_violin_oxy
allows visualization of the abundance of a selected oxylipin across all
experimental groups, typically displayed on the log2 scale,
with customizable colors and graphical parameters:
plot_violin_oxy(staRoxy_object,
stats,
oxylipin = "PGD2",
colors = my_colors,
show_sig = TRUE,
sig_y_nudge = 0.4)
#> ! 1 missing/infinite value(s) removed.Abundance can also be visualized on the original
(non-log2) scale using the log_scale = FALSE
argument, and to define the measurement unit prior to transformation
using the unit argument:
plot_violin_oxy(staRoxy_object,
stats,
oxylipin = "PGD2",
colors = my_colors,
show_sig = TRUE,
sig_y_nudge = 0.6,
log_scale = FALSE,
unit = "fmol/mg")
#> ! 1 missing/infinite value(s) removed.Multiple oxylipins may also be displayed simultaneously by providing
a vector of feature names to the oxylipin argument. In this
mode, significance bars are automatically disabled to avoid visual
clutter.
target_oxylipins <- c("11-HETE", "PGD2", "PGE2", "15-HETrE")
plot_violin_oxy(staRoxy_object,
stats,
oxylipin = target_oxylipins,
colors = my_colors)
#> ! 3 missing/infinite value(s) removed.
#> ℹ Significance bars are disabled when plotting multiple oxylipins simultaneously.Correlation Analysis
Certain oxylipin species may exhibit patterns of interest that
warrant further investigation. To support this, staRoxy
provides two approaches for correlation analysis: between two oxylipins
within the same group, or for a single oxylipin across two experimental
groups.
The function cor_within_oxy computes correlations
between two oxylipins within the same group using
log2-transformed abundance values:
cor_within_oxy(staRoxy_object,
oxylipin1 = "11-HETE",
oxylipin2 = "PGE2",
group = "LPS",
colors = my_colors)
#>
#> ── Intra-group Correlation Analysis ────────────────────────────────────────────
#> ℹ Comparison: '11-HETE' vs. 'PGE2' | Group: 'LPS'
#> ℹ Method: SPEARMAN (Normal=FALSE, Outliers=TRUE)
#> ℹ R = 1 | P = 0.0833 | Result: No significant correlation
#> ────────────────────────────────────────────────────────────────────────────────Alternatively, the function cor_between_oxy allows
comparison of the log2-transformed abundance of a single
oxylipin between two experimental groups:
cor_between_oxy(staRoxy_object,
oxylipin = "11-HETE",
group1 = "LPS",
group2 = "Control")
#>
#> ── Inter-group Correlation Analysis ────────────────────────────────────────────
#> ℹ Lipid: '11-HETE' | Comparison: 'LPS' vs. 'Control'
#> ℹ Method: SPEARMAN (Normal=TRUE, Outliers=TRUE)
#> ℹ R = -0.6 | P = 0.417 | Result: No significant correlation
#> ────────────────────────────────────────────────────────────────────────────────Note: Both functions assess normality and outliers
to automatically select the appropriate correlation method (Pearson or
Spearman) when method = "auto" (default). Alternatively,
the method can be specified manually ("pearson" or
"spearman").
Loading Pattern Dissimilarity Analysis
The staRoxy package provides a method to assess the
stability of biological architecture between two conditions by comparing
oxylipin contributions to principal component structure.
While conventional PCA focuses on sample clustering to identify
global differences between conditions, the
loading_dissimilarity function shifts the focus to the
biological architecture of the system. It evaluates whether the
underlying relationships between oxylipins, their relative contributions
to the system’s variance, remain stable or undergo reorganization
between two conditions.
In this analysis, we perform independent PCAs for each experimental
group. For each group \(g \in \{1,
2\}\), the loadings (\(L\)) of
principal component \(i\) (default:
PC1, specified by pc_index = 1) are
extracted as:
\[L_{g,i} = [w_{1,i}, w_{2,i}, \dots, w_{j,i}, \dots, w_{n,i}]^T\]
Here, \(L_{g,i}\) denotes the loading vector for group \(g\) and component \(i\), \(w_j\) represents the loading of the \(j\)-th oxylipin, \(n\) is the number of oxylipins shared between groups after filtering, and \(T\) denotes the transpose operator.
Similarity between groups is quantified by correlating the loading
vectors using Pearson or Spearman correlation
(method = "auto" by default). The resulting global
similarity metric (\(R^2\)) is defined
as:
\[R^2 = \text{cor}(L_{1,i}, L_{2,i})^2\]
This function performs PCA independently for each group. Because Eigenvector signs in PCA are arbitrary, the function automatically detects and corrects axis inversion by reflecting one component when necessary, ensuring meaningful comparisons. If \(\text{cor}(L_{1,i}, L_{2,i}) < 0\), the sign of \(L_{2,i}\) is internally inverted to ensure that comparisons reflect the magnitude of association.
Note: This approach is conceptually related to Tucker’s Congruence Coefficient (\(\phi\)), a standard metric of factor similarity [7].
High correlation indicates that oxylipin contributions are preserved between groups (stable biological structure), whereas low correlation suggests a reorganization of lipid relationships, where oxylipins may change their relative importance or interaction patterns.
Divergence is defined as:
\[\text{Divergence} = 1 - R^2\]
Statistical significance is assessed using a Fisher
\(Z\)-transformation to test whether
the observed correlation (\(r_{obs}\))
differs from a predefined stability threshold (r_stability,
default = 0.6):
\[z = \text{atanh}(r) = \frac{1}{2} \ln \left( \frac{1 + r}{1 - r} \right)\]
Where:
- \(z\): Fisher-transformed correlation coefficient (Fisher’s \(z\)-score).
- \(r\): Original correlation coefficient.
- \(\ln\): Natural logarithm.
\[Z = \frac{z_{obs} - z_{stability}}{SE}\]
Where:
- \(Z\): Test statistic (standard score).
- \(z_{obs}\): Transformed observed correlation between the sample profile and the group mean profile.
- \(z_{stability}\): Transformed stability threshold (default = 0.6, representing the lower bound of a moderate-to-strong correlation).
- \(SE\): Standard error of the Fisher transformation.
- \(n\): Number of oxylipins (data pairs) used in the correlation.
The \(P\)-value for divergence is then computed from the standard normal distribution:
\[P(\text{div}) = \Phi(Z)\] Where:
- \(P(\text{div})\): Probability of observing a profile correlation equal to or lower than \(r_{obs}\), under the null hypothesis that the sample maintains at least the expected stability (\(r \ge r_{stability}\)) relative to its biological group.
- \(\Phi\): Cumulative distribution function of the standard normal distribution.
In addition, loading_dissimilarity generates a scatter plot comparing
loadings between groups. Points are colored according to \(\Delta\)Loading, and oxylipins with \(\Delta\)Loading ≥ threshold
(default = 0.3) are labeled. Features in quadrants II
and IV represent oxylipins with the strongest
divergence in contribution between groups:
loading_dissimilarity(staRoxy_object,
group1 = "LPS",
group2 = "Control",
pc_index = 1)
#> ! Removed 7 oxylipins with zero variance/data in at least one group.
#> ✔ Filtering complete: 8/15 oxylipins retained.
#> ℹ Imputing via 'minprob': assuming missingness is due to low concentration (MNAR).
#>
#> ── Loading Pattern Dissimilarity Analysis ──────────────────────────────────────
#> ℹ Comparison: 'LPS' vs 'Control' (PC1)
#> ℹ Method: SPEARMAN (Normal=FALSE, Outliers=TRUE)
#> ! Reflection: Eigenvector sign was inverted to maximize congruence.
#> ℹ Correlation (R): 0.095 | Global Congruence (R²): 0.9%
#> ℹ Divergence (1-R²): 99.1% | Significance: P(div) = 0.091
#> ────────────────────────────────────────────────────────────────────────────────Note: Missing values (NA) are handled
through a multi-step quality control pipeline. Oxylipins are retained
only if they have valid measurements in at least 50% of samples (≥50%)
in at least one experimental group (user-adjustable). Missingness is
assessed via the plot_na_diagnostic function,
distinguishing between Missing Not At Random (MNAR) and Missing At
Random (MAR). For MNAR-type oxylipins, missing values values are imputed
using the group-wise arithmetic mean (equivalent to the geometric mean
on the log2 scale). When an entire group lacks detection,
values are imputed as below the Limit of Detection (LOD) or failed to
meet the Limit of Quantification (LOQ) criteria, using the global
minimum for that oxylipin, with stochastic noise sampled from a normal
distribution (10% of the global standard deviation) to preserve variance
and ensure numerical stability. For MAR-type oxylipins, missing values
are imputed using a Random Forest approach (via the
missForest package), leveraging the global metabolic
profile. Finally, oxylipins completely absent from any experimental
group (when require_all_groups = TRUE) or features with
zero variance after imputation are automatically removed. This prevents
mathematical singularities in downstream multivariate analyses and
improves model robustness and biological interpretability.
Precursor Identification
It is possible to classify and quantify the number of oxylipins
belonging to each precursor class using the classify_oxy
function:
classify_oxy(staRoxy_object,
return_mode = "summary")
#>
#> ── Oxylipin Classification Summary ─────────────────────────────────────────────
#> ℹ Counts represent unique species detected per precursor/group.
#> Precursor Control LPS Total_Unique_Detected
#> ARA (20:4 n-6) 7 7 7
#> DGLA (20:3 n-6) 1 1 1
#> DHA (22:6 n-3) 0 1 1
#> EPA (20:5 n-3) 0 3 3
#> LA (18:2 n-6) 0 3 3The return_mode = "summary" argument (default) can be
set to "species" to return a binary table, where \(1\) indicates Presence and
\(0\) indicates
Absence within each group:
classify_oxy(staRoxy_object,
return_mode = "species")
#>
#> ── Species Presence/Absence Matrix ─────────────────────────────────────────────
#> ℹ 1 = Presence (detected in group) | 0 = Absence
#> Oxylipin Precursor Control LPS
#> 11-HETE ARA (20:4 n-6) 1 1
#> 12-HHTrE ARA (20:4 n-6) 1 1
#> 15-deoxy-Δ12,14-PGD2 ARA (20:4 n-6) 1 1
#> 15-keto-PGE2 ARA (20:4 n-6) 1 1
#> PGD2 ARA (20:4 n-6) 1 1
#> PGE2 ARA (20:4 n-6) 1 1
#> PGF2α ARA (20:4 n-6) 1 1
#> 15-HETrE DGLA (20:3 n-6) 1 1
#> 13-HDoHE DHA (22:6 n-3) 0 1
#> 11-HEPE EPA (20:5 n-3) 0 1
#> 14(15)-EpETE EPA (20:5 n-3) 0 1
#> PGE3 EPA (20:5 n-3) 0 1
#> 13-oxo-ODE LA (18:2 n-6) 0 1
#> 9-HODE LA (18:2 n-6) 0 1
#> 9-oxo-ODE LA (18:2 n-6) 0 1Note: Oxylipin identifiers should match the
nomenclature adopted in the reference protocol [1], as naming
conventions often vary across the literature. The complete list of
supported identifiers can be accessed using the
show_database_oxy function. To improve readability,
abbreviated names are used throughout the package (see the
Oxylipin Nomenclature section for details).
Precursor Enrichment Analysis
The function oxy_ora performs an
Over-representation Analysis (ORA) to determine whether
the diversity of oxylipins detected in the analyzed groups deviates from
the expected biological distribution.
ORA is performed using a hypergeometric test (one-vs-all) following established approaches in the literature [8]. This analysis compares the observed diversity of each precursor class with its theoretical frequency in the reference universe (125 oxylipins by defalt) [1]. The enrichment magnitude is summarized by the Enrichment Score (\(ES\)), calculated as:
\[ES = \frac{k / n}{K / N}\]
Where:
- \(k\): Number of oxylipin species from precursor class \(i\) detected in the group.
- \(n\): Total number of distinct oxylipin species detected in the group.
- \(K\): Total number of oxylipin species from precursor class \(i\) present in the reference universe.
- \(N\): Total size of the reference universe.
An \(ES > 1\) indicates that a precursor class is over-represented in the analyzed group relative to the reference universe, whereas \(ES < 1\) indicates under-representation.
The statistical significance of the enrichment is determined using the hypergeometric test:
\[P(X \geq k) = \sum_{i=k}^{\min(n, K)} \frac{\binom{K}{i} \binom{N-K}{n-i}}{\binom{N}{n}}\]
Where:
- \(\sum_{i=k}^{\min(n, K)}\) represents the cumulative probability of observing \(k\), \(k+1\), up to the maximum possible number of oxylipins from the precursor class, corresponding to the upper tail probability used to compute the \(P\)-value.
- \(\binom{K}{i}\) represents the number of combinations for selecting \(i\) oxylipins from the precursor class within the reference universe.
- \(\binom{N-K}{n-i}\) represents the number of combinations for selecting the remaining oxylipins that do not belong to that precursor class.
- \(\binom{N}{n}\): represents the total number of possible combinations for selecting \(n\) oxylipins from the universe of size \(N\).
In addition, the function generates a Precursor Enrichment plot, which displays the \(ES\) for each precursor class across groups.
oxy_ora(staRoxy_object)
#>
#> ── Detailed Over-representation analysis (ORA) ─────────────────────────────────
#> ✔ Significant ENRICHMENT found:
#> ARA (20:4 n-6) (Control): ES = 1.99x | P = 0.0131
#> ────────────────────────────────────────────────────────────────────────────────Note: ORA can also be performed between two specific
groups when more than two experimental groups are present. This can be
achieved using the argument compare_two_groups = TRUE and
specifying the groups with the arguments group1 and
group2.
Customizing the Oxylipin Database
By default, ORA and precursor classification are performed using a reference universe composed of the 125 core oxylipins defined in the reference protocol [1]. However, different analytical platforms may target broader or more restricted oxylipin panels. To ensure accurate enrichment statistics and precursor classification, the reference universe should reflect the actual coverage of the experimental assay.
The current reference universe can be inspected using the
get_oxy_universe function, which returns a data frame
containing standardized oxylipin names and their corresponding precursor
classes:
get_oxy_universe()
#> Oxylipin Precursor
#> 1 tetranor-12-HETE ARA (20:4 n-6)
#> 2 12-HHTrE ARA (20:4 n-6)
#> 3 13-oxo-ODE LA (18:2 n-6)
#> 4 9-HOTrE ALA (18:3 n-3)
#> 5 9-oxo-ODE LA (18:2 n-6)
#> 6 13-HOTrE-γ GLA (18:3 n-6)
#> 7 13-HOTrE ALA (18:3 n-3)
#> 8 9-HODE LA (18:2 n-6)
#> 9 10-HODE LA (18:2 n-6)
#> 10 12-HODE LA (18:2 n-6)
#> 11 13-HODE LA (18:2 n-6)
#> 12 9(10)-EpOME LA (18:2 n-6)
#> 13 12(13)-EpOME LA (18:2 n-6)
#> 14 9-HODA OA (18:1 n-9)
#> 15 10-HODA OA (18:1 n-9)
#> 16 9,10-DiHOME LA (18:2 n-6)
#> 17 12,13-DiHOME LA (18:2 n-6)
#> 18 15-deoxy-Δ12,14-PGJ2 ARA (20:4 n-6)
#> 19 15-oxo-ETE ARA (20:4 n-6)
#> 20 5-HEPE EPA (20:5 n-3)
#> 21 9-HEPE EPA (20:5 n-3)
#> 22 8-HEPE EPA (20:5 n-3)
#> 23 11-HEPE EPA (20:5 n-3)
#> 24 12-HEPE EPA (20:5 n-3)
#> 25 15-HEPE EPA (20:5 n-3)
#> 26 18-HEPE EPA (20:5 n-3)
#> 27 14(15)-EpETE EPA (20:5 n-3)
#> 28 17(18)-EpETE EPA (20:5 n-3)
#> 29 5-HETE ARA (20:4 n-6)
#> 30 8-HETE ARA (20:4 n-6)
#> 31 9-HETE ARA (20:4 n-6)
#> 32 11-HETE ARA (20:4 n-6)
#> 33 12-HETE ARA (20:4 n-6)
#> 34 14 / 15-HETE ARA (20:4 n-6)
#> 35 16-HETE ARA (20:4 n-6)
#> 36 17-HETE ARA (20:4 n-6)
#> 37 18-HETE ARA (20:4 n-6)
#> 38 19-HETE ARA (20:4 n-6)
#> 39 20-HETE ARA (20:4 n-6)
#> 40 5(6)-EpETrE ARA (20:4 n-6)
#> 41 8(9)-EpETrE ARA (20:4 n-6)
#> 42 11(12)-EpETrE ARA (20:4 n-6)
#> 43 14(15)-EpETrE ARA (20:4 n-6)
#> 44 15-oxo-EDE EDA (20:2 n-6)
#> 45 5-HETrE MA (20:3 n-9)
#> 46 15-HETrE DGLA (20:3 n-6)
#> 47 2,3-dinor-8-iso-PGF2α ARA (20:4 n-6)
#> 48 10-Nitrooleate OA (18:1 n-9)
#> 49 9-Nitrooleate OA (18:1 n-9)
#> 50 tetranor-PGDM ARA (20:4 n-6)
#> 51 PGA2 ARA (20:4 n-6)
#> 52 PGB2 ARA (20:4 n-6)
#> 53 PGJ2 ARA (20:4 n-6)
#> 54 15-deoxy-Δ12,14-PGD2 ARA (20:4 n-6)
#> 55 12-oxo-LTB4 ARA (20:4 n-6)
#> 56 LTB4 ARA (20:4 n-6)
#> 57 6-trans-LTB4 ARA (20:4 n-6)
#> 58 6-trans-12-epi-LTB4 ARA (20:4 n-6)
#> 59 12-epi-LTB4 ARA (20:4 n-6)
#> 60 8,15-DiHETE ARA (20:4 n-6)
#> 61 14,15-DiHETE EPA (20:5 n-3)
#> 62 8,9-DiHETrE ARA (20:4 n-6)
#> 63 17-oxo-DHA DHA (22:6 n-3)
#> 64 2,3-dinor-TXB2 ARA (20:4 n-6)
#> 65 4-HDoHE DHA (22:6 n-3)
#> 66 7-HDoHE DHA (22:6 n-3)
#> 67 8-HDoHE DHA (22:6 n-3)
#> 68 10-HDoHE DHA (22:6 n-3)
#> 69 11-HDoHE DHA (22:6 n-3)
#> 70 13-HDoHE DHA (22:6 n-3)
#> 71 14-HDoHE DHA (22:6 n-3)
#> 72 16 / 17-HDoHE DHA (22:6 n-3)
#> 73 19-HDoHE DHA (22:6 n-3)
#> 74 20-HDoHE DHA (22:6 n-3)
#> 75 7(8)-EpDPE DHA (22:6 n-3)
#> 76 10(11)-EpDPE DHA (22:6 n-3)
#> 77 13(14)-EpDPE DHA (22:6 n-3)
#> 78 16(17)-EpDPE DHA (22:6 n-3)
#> 79 19(20)-EpDPE DHA (22:6 n-3)
#> 80 17-oxo-DPA DPA (22:5 n-3)
#> 81 PGE3 EPA (20:5 n-3)
#> 82 15-keto-PGE2 ARA (20:4 n-6)
#> 83 LXA5 EPA (20:5 n-3)
#> 84 RvE1 EPA (20:5 n-3)
#> 85 epi-LXA4 ARA (20:4 n-6)
#> 86 15-epi-LXA4 ARA (20:4 n-6)
#> 87 LXB4 ARA (20:4 n-6)
#> 88 20-hydroxy-LTB4 ARA (20:4 n-6)
#> 89 PGD2 ARA (20:4 n-6)
#> 90 13,14-dihydro-15-keto-PGD2 ARA (20:4 n-6)
#> 91 PGE2 ARA (20:4 n-6)
#> 92 13,14-dihydro-15-keto-PGE2 ARA (20:4 n-6)
#> 93 15-keto-PGF2α ARA (20:4 n-6)
#> 94 8-iso-15-keto-PGF2β ARA (20:4 n-6)
#> 95 PGF3α EPA (20:5 n-3)
#> 96 8-iso-PGF3α EPA (20:5 n-3)
#> 97 15-F2t-Isoprostane ARA (20:4 n-6)
#> 98 11β-13,14-dihydro-15-keto-PGF2α ARA (20:4 n-6)
#> 99 11β-PGF2α ARA (20:4 n-6)
#> 100 15-keto-PGF1α DGLA (20:3 n-6)
#> 101 8-iso-PGE1 DGLA (20:3 n-6)
#> 102 PGD1 ARA (20:4 n-6)
#> 103 PGF2α ARA (20:4 n-6)
#> 104 8,12-iso-iPF2α-VI ARA (20:4 n-6)
#> 105 13,14-dihydro-PGF2α ARA (20:4 n-6)
#> 106 PGF1α DGLA (20:3 n-6)
#> 107 PDX DHA (22:6 n-3)
#> 108 Maresin 1 DHA (22:6 n-3)
#> 109 Maresin 2 DHA (22:6 n-3)
#> 110 RvD5 DHA (22:6 n-3)
#> 111 19,20-DiHDPA DHA (22:6 n-3)
#> 112 20-carboxy-LTB4 ARA (20:4 n-6)
#> 113 TXB3 EPA (20:5 n-3)
#> 114 20-hydroxy-PGE2 ARA (20:4 n-6)
#> 115 6-keto-PGE1 DGLA (20:3 n-6)
#> 116 Δ17-6-keto-PGF1α DGLA (20:3 n-6)
#> 117 TXB2 ARA (20:4 n-6)
#> 118 6-keto-PGF1α DGLA (20:3 n-6)
#> 119 6,15-diketo-13,14-dihydro-PGF1α DGLA (20:3 n-6)
#> 120 TXB1 DGLA (20:3 n-6)
#> 121 RvD1 DHA (22:6 n-3)
#> 122 RvD2 DHA (22:6 n-3)
#> 123 RvD3 DHA (22:6 n-3)
#> 124 16,16-dimethyl-PGD2 ARA (20:4 n-6)
#> 125 1a,1b-dihomo-PGE2 AdA (22:4 n-6)To adapt the database to specific experimental conditions, users can
remove existing oxylipins or add custom features and precursor
annotations using the edit_oxy_universe function.
Note: When adding custom oxylipins or precursor names containing Greek letters, Unicode escape sequences are strongly recommended to ensure consistent string matching across operating systems (see the Oxylipin Nomenclature section for details).
First, store the default universe in a new object:
To remove oxylipins not included in your analytical method (e.g.,
PGE2 and PGD2), use the remove argument:
my_universe <- edit_oxy_universe(my_universe,
remove = c("PGE2", "PGD2"))
#> ✔ Removed 2 oxylipins from the universe.Conversely, new oxylipin species can be added by specifying both the
feature name and its precursor class using the add_name and
add_precursor arguments. In the example below, a
hypothetical alpha-oxylipin is added using the appropriate Unicode
escape sequence:
my_universe <- edit_oxy_universe(univ = my_universe,
add_name = "New-Oxylipin\u03b1",
add_precursor = "ARA (20:4 n-6)")
#> ✔ Added 1 new oxylipin. Current universe size: 124.Once the customized universe is defined, it can be supplied directly
to the universe argument in oxy_ora. The function will
automatically recalculate the total universe size (\(N\)) and precursor class sizes (\(K\)) according to the modified database,
ensuring that enrichment statistics accurately reflect the experimental
design:
oxy_ora(staRoxy_object,
universe = my_universe)
#>
#> ── Detailed Over-representation analysis (ORA) ─────────────────────────────────
#> ℹ No significant precursor ENRICHMENT detected.
#> ────────────────────────────────────────────────────────────────────────────────Adaptation to Other Omics Data
Although originally developed for oxylipin profiling [1],
staRoxy may also be applicable to other mass
spectrometry–based datasets, including proteomics and metabolomics. With
the exception of classify_oxy and oxy_ora,
most functions are expected to be compatible with these data types.
R Package Dependencies
staRoxy depends on several widely accessible
packages:
- Core data manipulation and infrastructure:
dplyr,tidyr,tibble,data.table,stringr,forcats,readxl,rlang. - Statistical modeling and analysis:
limma,mgcv,vegan. - Missing data imputation:
missForest. - Visualization:
ggplot2,ggrepel,ggsignif,ggiraphExtra,ComplexHeatmap,circlize,viridis,RColorBrewer,cowplot,scales. - Utilities and graphical system:
grid,cli.
Session and Environment Information
#> R version 4.5.1 (2025-06-13 ucrt)
#> Platform: x86_64-w64-mingw32/x64
#> Running under: Windows 11 x64 (build 26200)
#>
#> Matrix products: default
#> LAPACK version 3.12.1
#>
#> locale:
#> [1] LC_COLLATE=Portuguese_Brazil.utf8 LC_CTYPE=Portuguese_Brazil.utf8
#> [3] LC_MONETARY=Portuguese_Brazil.utf8 LC_NUMERIC=C
#> [5] LC_TIME=Portuguese_Brazil.utf8
#>
#> time zone: America/Sao_Paulo
#> tzcode source: internal
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] staRoxy_0.2.0
#>
#> loaded via a namespace (and not attached):
#> [1] tidyselect_1.2.1 viridisLite_0.4.3 dplyr_1.2.1
#> [4] farver_2.1.2 viridis_0.6.5 S7_0.2.2
#> [7] fastmap_1.2.0 digest_0.6.39 lifecycle_1.0.5
#> [10] cluster_2.1.8.2 Cairo_1.7-0 statmod_1.5.1
#> [13] magrittr_2.0.5 compiler_4.5.1 rlang_1.2.0
#> [16] sass_0.4.10 tools_4.5.1 yaml_2.3.12
#> [19] data.table_1.18.4 knitr_1.51 ggsignif_0.6.4
#> [22] labeling_0.4.3 RColorBrewer_1.1-3 withr_3.0.2
#> [25] purrr_1.2.2 BiocGenerics_0.56.0 grid_4.5.1
#> [28] stats4_4.5.1 colorspace_2.1-2 ggplot2_4.0.3
#> [31] scales_1.4.0 iterators_1.0.14 MASS_7.3-65
#> [34] dichromat_2.0-1 cli_3.6.6 rmarkdown_2.31
#> [37] vegan_2.7-3 crayon_1.5.3 generics_0.1.4
#> [40] otel_0.2.0 rstudioapi_0.18.0 rjson_0.2.23
#> [43] readxl_1.4.5 cachem_1.1.0 stringr_1.6.0
#> [46] splines_4.5.1 parallel_4.5.1 cellranger_1.1.0
#> [49] matrixStats_1.5.0 vctrs_0.7.3 Matrix_1.7-5
#> [52] jsonlite_2.0.0 IRanges_2.44.0 GetoptLong_1.1.1
#> [55] S4Vectors_0.48.1 ggrepel_0.9.8 clue_0.3-68
#> [58] foreach_1.5.2 limma_3.66.0 tidyr_1.3.2
#> [61] jquerylib_0.1.4 glue_1.8.1 codetools_0.2-20
#> [64] cowplot_1.2.0 stringi_1.8.7 shape_1.4.6.1
#> [67] gtable_0.3.6 ComplexHeatmap_2.26.1 tibble_3.3.1
#> [70] pillar_1.11.1 htmltools_0.5.9 circlize_0.4.18
#> [73] R6_2.6.1 doParallel_1.0.17 evaluate_1.0.5
#> [76] lattice_0.22-9 png_0.1-9 bslib_0.10.0
#> [79] Rcpp_1.1.1-1.1 gridExtra_2.3 nlme_3.1-169
#> [82] permute_0.9-10 mgcv_1.9-4 xfun_0.57
#> [85] forcats_1.0.1 prettydoc_0.4.1 pkgconfig_2.0.3
#> [88] GlobalOptions_0.1.4
References
- (in press).
- Chaves-Filho AB, Diniz LS, Santos RS, Lima RS, Oreliana H, Pinto IFD, et al. Plasma oxylipin profiling by high-resolution mass spectrometry reveals signatures of inflammation and hypermetabolism in amyotrophic lateral sclerosis. Free Radic Biol Med. 2023;208:285–298.
- Bulmer MG. Principles of Statistics. Dover Publications; 1979.
- García V, Sánchez JS, Mollineda RA. On the effectiveness of preprocessing methods when dealing with different levels of class imbalance. Knowledge-Based Systems. 2011;24(1):13–21.
- Anderson MJ. A new method for non-parametric multivariate analysis of variance. Austral Ecology. 2001;26:32–46.
- Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, Smyth GK. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Research. 2015;43(7):e47.
- Lorenzo-Seva U, ten Berge JMF. Tucker’s congruence coefficient as a meaningful index of factor similarity. Methodology. 2006;2(2):57–64.
- Yu G, Wang LG, Han Y, He QY. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS. 2012;16(5):284–287.