Geometry-Aware Longitudinal Flow Matching with Mixture of Experts
Many biological systems evolve through continuous dynamics while switching between latent regimes, yet are observed only as unpaired longitudinal snapshots. FLUX learns a data-dependent geometry to construct manifold-aware conditional paths between adjacent marginals and decomposes the resulting velocity field into sparse expert vector fields for joint transport modeling and unsupervised regime discovery.
Yale University · Wu Tsai Institute · Kavli Institute
Many biological systems evolve through continuous local dynamics while switching between latent regimes defined by learning, stimulus context, internal state, or developmental stage. These processes are often observed only as unpaired longitudinal snapshots: the same cells, neurons, or animals are not tracked as matched trajectories, even though population states are sampled across successive stages. This creates two coupled challenges. First, trajectories must respect curved low-dimensional manifolds embedded in high-dimensional biological measurements. Second, the model must identify when the transport mechanism itself changes.
We introduce FLUX (Flow matching for Unpaired longitudinal data with miXture-of-experts), a geometry-aware longitudinal flow-matching framework for joint transport modeling and unsupervised regime discovery. FLUX learns a data-dependent metric from pooled labeled and unlabeled observations, uses that metric to construct geometry-aware conditional paths between adjacent marginals, and decomposes the resulting velocity field into sparse expert vector fields selected by a Straight-Through Gumbel-Softmax router.
Across manifold controls, a regime-switching Lorenz system, widefield cortical calcium imaging during associative learning, and embryoid body single-cell differentiation, FLUX reconstructs longitudinal transport while recovering interpretable regime structure. Ablations show that mixture-of-experts routing alone is insufficient: without geometric learning, FLUX can still fit local transport, but regime discovery fails or weakens when regimes are encoded in local dynamics. These results suggest that geometry-aware velocity decomposition provides a general strategy for discovering latent biological state transitions from unpaired longitudinal snapshots.
FLUX combines three ideas: longitudinal marginal chaining, geometry-aware conditional paths, and a mixture-of-experts velocity field.
FLUX observes T ordered marginal distributions at discrete observation times. Each marginal contains unpaired samples—consecutive snapshots do not correspond to the same cell, neuron, or individual. FLUX trains on adjacent marginal pairs: for each pair, endpoints are sampled from an optimal-transport coupling, a local interpolation time is drawn, and the resulting point and tangent are mapped to global model time. This yields a single velocity field whose ODE is integrated across the full marginal chain.
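The local-to-global time mapping described above can be sketched in a few lines. This is a minimal, framework-free illustration and not the repository's implementation: the function name `chain_sample` is hypothetical, the endpoints are assumed to have already been paired by the coupling, a plain Euclidean interpolant stands in for the conditional path, and rescaling the tangent by the length of the observation interval is an assumption about how the tangent is expressed in global time.

```python
def chain_sample(x0, x1, t_start, t_end, s):
    """Build one training sample from an adjacent marginal pair.

    x0, x1   : endpoints, assumed already paired by a coupling (not shown here)
    t_start  : global observation time of the earlier marginal
    t_end    : global observation time of the later marginal
    s        : local interpolation time in [0, 1]
    """
    dt = t_end - t_start
    # Global model time at which the velocity field would be supervised.
    t = t_start + s * dt
    # Placeholder Euclidean interpolant between the two endpoints.
    x = [(1.0 - s) * a + s * b for a, b in zip(x0, x1)]
    # Chain rule: dx/dt = (dx/ds) / dt, with dx/ds = x1 - x0 for this interpolant.
    u = [(b - a) / dt for a, b in zip(x0, x1)]
    return t, x, u


t, x, u = chain_sample([0.0, 0.0], [2.0, 4.0], t_start=1.0, t_end=3.0, s=0.5)
```

Because every pair is expressed in the same global time, samples from all adjacent pairs can supervise one shared velocity field.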
Standard flow matching uses Euclidean linear interpolation between endpoints, which can pass through low-density regions when data concentrate near a curved manifold. FLUX instead learns a data-dependent metric G from pooled observations (including unlabeled samples when available), then trains a bend network B that parameterizes approximate geodesic paths. The geometry-aware path replaces the Euclidean interpolant everywhere: it changes the training locations, the target tangents, and where the velocity network is evaluated. The metric is not an input to the velocity network—it reshapes the conditional paths used for supervision.
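To make the idea of a bent conditional path concrete, the sketch below uses one common parameterization from the geometry-aware flow-matching literature: the Euclidean interpolant plus a correction that vanishes at both endpoints. The source does not state the exact form FLUX uses, so this form, the names `bent_path`, `path_tangent`, and `toy_bend`, and the finite-difference tangent are all assumptions for illustration. In practice the bend would be a trained network rather than a constant.

```python
def bent_path(x0, x1, s, bend):
    """Assumed form: x(s) = (1 - s) x0 + s x1 + s (1 - s) * bend(x0, x1, s)."""
    b = bend(x0, x1, s)
    return [(1.0 - s) * a + s * c + s * (1.0 - s) * bi
            for a, c, bi in zip(x0, x1, b)]


def path_tangent(x0, x1, s, bend, eps=1e-5):
    """Central finite-difference approximation of dx/ds along the bent path."""
    hi = bent_path(x0, x1, s + eps, bend)
    lo = bent_path(x0, x1, s - eps, bend)
    return [(h - l) / (2.0 * eps) for h, l in zip(hi, lo)]


def toy_bend(x0, x1, s):
    # Hypothetical stand-in for a trained bend network: push the path upward.
    return [0.0, 1.0]


x0, x1 = [0.0, 0.0], [2.0, 0.0]
midpoint = bent_path(x0, x1, 0.5, toy_bend)
tangent = path_tangent(x0, x1, 0.25, toy_bend)
```

The factor s(1 - s) guarantees that the path still starts at x0 and ends at x1 whatever the bend outputs, so only the interior of the path, and therefore the training locations and target tangents, is reshaped.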
FLUX decomposes the velocity field into M expert vector fields. A router maps each (time, state) pair to expert logits, and Straight-Through Gumbel-Softmax produces differentiable but near-discrete routing weights during training. At inference, the hard expert assignment provides an unsupervised candidate regime label—a decomposition of the learned transport dynamics rather than post-hoc clustering of static observations. Router regularizers encourage sparse, temporally coherent assignments and discourage expert collapse.
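The routing step can be illustrated with a small forward-pass sketch. It is written in plain Python so that it runs without a deep-learning framework, which means the straight-through gradient itself is not shown; in PyTorch, `torch.nn.functional.gumbel_softmax(logits, tau, hard=True)` provides hard forward samples with soft gradients. The experts, the router, the temperature, and all function names below are hypothetical stand-ins, not the repository's code.

```python
import math
import random


def gumbel_softmax(logits, tau, rng):
    """Return (soft weights, one-hot hard weights) for a single sample."""
    noisy = []
    for logit in logits:
        u = max(rng.random(), 1e-12)      # avoid log(0)
        g = -math.log(-math.log(u))       # Gumbel(0, 1) noise
        noisy.append((logit + g) / tau)
    m = max(noisy)                        # subtract max for numerical stability
    exps = [math.exp(v - m) for v in noisy]
    z = sum(exps)
    soft = [e / z for e in exps]
    k = max(range(len(soft)), key=soft.__getitem__)
    hard = [1.0 if i == k else 0.0 for i in range(len(soft))]
    return soft, hard


def moe_velocity(t, x, experts, router, tau=0.5, rng=None):
    """Combine expert outputs using the hard routing weights."""
    rng = rng or random.Random(0)
    soft, hard = gumbel_softmax(router(t, x), tau, rng)
    outs = [f(t, x) for f in experts]
    v = [sum(w * o[d] for w, o in zip(hard, outs)) for d in range(len(x))]
    return v, hard.index(1.0)             # velocity and candidate regime label


# Two toy expert vector fields: a rotation and a contraction.
experts = [lambda t, x: [-x[1], x[0]], lambda t, x: [-x[0], -x[1]]]
# Toy router that strongly prefers expert 0 early and expert 1 late.
router = lambda t, x: [100.0, 0.0] if t < 0.5 else [0.0, 100.0]

v_early, regime_early = moe_velocity(0.0, [1.0, 0.0], experts, router)
v_late, regime_late = moe_velocity(1.0, [1.0, 0.0], experts, router)
```

The index returned alongside the velocity plays the role of the hard expert assignment used as a candidate regime label at inference.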
Training proceeds in three stages, and the geometry and bend networks are frozen before velocity training begins. Evaluation uses compute_metrics.py.
| Stage | Script | Output |
|---|---|---|
| 1. Geometry | train_benchmark_rbf.py or train_benchmark_vae.py | rbf_network_best.pth or best_gaga_model.pth |
| 2. Bend | train_benchmark_bend.py | bend_network_best.pth |
| 3. Velocity | train_benchmark_velocity.py | velocity_network_best.pth |
See the repository README for full options, conda setup, and dataset-specific flags.
git clone https://github.com/josueortc/flux.git
cd flux
conda create -n flux python=3.10 -y && conda activate flux
# Install PyTorch for your platform, then:
pip install -r requirements.txt
Learn the Riemannian metric for your benchmark dataset.
python scripts/benchmark_data/train_benchmark_rbf.py \
--dataset lorenz --lorenz_mode day_marginals --num_marginals 8 \
--save_dir saved_models/lorenz
Train the bend network using the saved geometry checkpoint.
python scripts/benchmark_data/train_benchmark_bend.py \
--dataset lorenz --lorenz_mode day_marginals --num_marginals 8 \
--geo_model_path saved_models/lorenz/rbf_network_best.pth \
--save_dir saved_models/lorenz
Train the mixture-of-experts velocity network with Gumbel routing, then evaluate.
python scripts/benchmark_data/train_benchmark_velocity.py \
--dataset lorenz --lorenz_mode day_marginals --num_marginals 8 \
--geo_model_path saved_models/lorenz/rbf_network_best.pth \
--bend_model_path saved_models/lorenz/bend_network_best.pth \
--use_gumbel_routing --num_experts 2 --save_dir saved_models/lorenz
python scripts/benchmark_data/compute_metrics.py \
--dataset lorenz --model_dir saved_models/lorenz
If you use FLUX in your research, please cite:
@article{ortegacaro2026flux,
title={FLUX: Geometry-Aware Longitudinal Flow Matching with Mixture of Experts},
author={Ortega Caro, Josue and Zhang, Yongxu and Batchelor, Hannah M. and He, Sizhuang and Cardin, Jessica and Saxena, Shreya},
journal={arXiv preprint arXiv:2605.08648},
year={2026}
}