PyTorch RVAV Optimizer

A PyTorch implementation of the Relaxed Vector Auxiliary Variable (RVAV) algorithm — a novel optimizer for deep learning and other unconstrained optimization problems.

This implementation is based on:

A Relaxed Vector Auxiliary Variable Algorithm for Unconstrained Optimization Problems
S. Zhang, J. Zhang, J. Shen, and G. Lin


Motivation

The RVAV optimizer is designed to be unconditionally energy stable, enabling reliable convergence even on complex, non-convex loss landscapes where traditional optimizers may struggle. It achieves this by introducing an adaptive, element-wise learning rate controlled by an auxiliary variable, plus a relaxation step that enforces stability.
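For intuition, the sketch below shows a scalar auxiliary-variable (SAV) gradient step with a crude relaxation on a toy quadratic. This is only a schematic of the general auxiliary-variable idea, not the RVAV update from the paper, which uses an element-wise (vector) auxiliary variable and its own relaxation rule; the energy function, step size, and relaxation shown here are illustrative assumptions.

import torch

# Schematic only: a scalar auxiliary-variable (SAV) gradient step with a crude
# relaxation, applied to a toy quadratic. The actual RVAV scheme in the paper
# uses an element-wise (vector) auxiliary variable and a different relaxation.

def energy(theta):
    return 0.5 * (theta ** 2).sum()    # toy "loss", purely for illustration

theta = torch.tensor([3.0, -2.0])
C = 1.0                                # shift so that E + C stays positive
r = torch.sqrt(energy(theta) + C)      # auxiliary variable, r ~ sqrt(E + C)
lr = 0.5

for _ in range(50):
    theta.requires_grad_(True)
    E = energy(theta)
    (grad,) = torch.autograd.grad(E, theta)
    theta = theta.detach()
    g = grad / torch.sqrt(E.detach() + C)
    # semi-implicit update of r: the modified energy r^2 decays for any lr
    r = r / (1 + 0.5 * lr * (g ** 2).sum())
    theta = theta - lr * r * g
    # crude relaxation: snap r back to sqrt(E + C) whenever that cannot
    # increase the modified energy
    r_true = torch.sqrt(energy(theta) + C)
    if r_true <= r:
        r = r_true

print(theta, energy(theta).item())

The structure is what matters: the auxiliary variable scales the gradient step so that a modified energy cannot increase at any step size, and the relaxation step re-ties that variable to the true loss.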

Key Features

  • Robust Convergence: Unconditional energy dissipation prevents divergence.
  • Element-wise Adaptive Learning Rate: Each parameter dimension adapts automatically.
  • Strong Performance: Fast and accurate convergence on convex and non-convex problems.
  • Drop-in Usage: Works as a replacement for SGD/Adam in PyTorch.

Installation

Install directly from GitHub using pip:

pip install git+https://github.com/shzhang3/RVAV.git

🚀 Quickstart in 60 Seconds

Get up and running with a minimal example.

  1. Create a virtual environment and install dependencies:

    git clone https://github.com/shzhang3/RVAV.git
    cd RVAV
    python -m venv .venv
    source .venv/bin/activate   # On Windows: .venv\Scripts\activate
    pip install -r requirements.txt
  2. Run a minimal example:

    python examples/toy_example.py

    You should see output like this:

    Epoch [20/100], Loss: 3.7682, Weight: 1.9557, Bias: 0.6327
    Epoch [40/100], Loss: 3.5831, Weight: 1.9656, Bias: 0.7949
    Epoch [60/100], Loss: 3.5016, Weight: 1.9716, Bias: 0.9033
    Epoch [80/100], Loss: 3.4641, Weight: 1.9758, Bias: 0.9757
    Epoch [100/100], Loss: 3.4464, Weight: 1.9788, Bias: 1.0241
    ...
    

Basic Usage

Using RVAV is as simple as using any other PyTorch optimizer.

import torch
from torch import nn
from rvav import RVAV, RVAV_Momentum

# 1. Define your model
model = nn.Linear(10, 1)

# 2. Instantiate the optimizer
# Use the base version
optimizer = RVAV(model.parameters(), lr=0.01)

# Or the version with momentum
optimizer_momentum = RVAV_Momentum(model.parameters(), lr=0.01, momentum=0.9)

# 3. Use it in your training loop (a single step with a dummy batch shown here)
x = torch.randn(32, 10)
loss = model(x).mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()

Reference

If you use this optimizer in your research, please cite the original work:

  • Zhang, S., Zhang, J., Shen, J., & Lin, G. (2025). A Relaxed Vector Auxiliary Variable Algorithm for Unconstrained Optimization Problems. SIAM Journal on Scientific Computing, 47(1), C126–C150.

Contributing

Contributions are welcome! If you find a bug or have ideas for improvements:

  • Open an issue to discuss the change.
  • Submit a pull request with a clear description and a minimal reproducible example if applicable.

License

This project is licensed under the MIT License. See the LICENSE file for details.


RVAV Benchmark Results

This document presents the performance of the RVAV optimizer and its momentum variant compared to the standard Adam optimizer on a synthetic linear regression task. The goal is to demonstrate the convergence speed and final loss achieved by each optimizer under the same conditions.

All models were trained for 200 epochs on the same dataset with a learning rate of 0.01.
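A minimal sketch of how such a comparison could be reproduced is shown below. The synthetic dataset (y = 2x + 1 plus noise), the seed, and the training loop are assumptions made for illustration, not necessarily the exact setup behind the reported numbers, so the printed losses will differ from the table.

import torch
from torch import nn
from rvav import RVAV, RVAV_Momentum

# Illustrative benchmark sketch; the dataset and seed are assumptions, so the
# resulting numbers will not match the table below exactly.
torch.manual_seed(0)
X = torch.randn(200, 1)
y = 2 * X + 1 + 2.0 * torch.randn(200, 1)

def train(optimizer_cls, **kwargs):
    model = nn.Linear(1, 1)
    optimizer = optimizer_cls(model.parameters(), lr=0.01, **kwargs)
    criterion = nn.MSELoss()
    for epoch in range(200):
        optimizer.zero_grad()
        loss = criterion(model(X), y)
        loss.backward()
        optimizer.step()
    return loss.item()

print("RVAV            :", train(RVAV))
print("RVAV + Momentum :", train(RVAV_Momentum, momentum=0.9))
print("Adam (baseline) :", train(torch.optim.Adam))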


Convergence Plot

The following plot illustrates the training loss over 200 epochs. The RVAV + Momentum variant demonstrates the fastest initial convergence, quickly reaching a low loss value within the first 40 epochs.

[Figure: training loss over 200 epochs]


Performance Summary

The final loss values after 200 epochs are summarized in the table below. Both RVAV variants significantly outperform the Adam baseline on this task, achieving a much lower final loss.

Optimizer          Final Loss (after 200 epochs)
RVAV               4.2912
RVAV + Momentum    4.2908
Adam (Baseline)    5.6671

Conclusion

The results indicate that both the original RVAV and the RVAV + Momentum optimizers are effective on this regression problem. Adding momentum speeds up initial convergence and yields a marginally lower final loss (4.2908 vs. 4.2912). Both variants reach a substantially lower final loss than the Adam baseline under these settings, making them worth considering as alternatives for similar tasks.