PyTorch RVAV Optimizer

A PyTorch implementation of the Relaxed Vector Auxiliary Variable (RVAV) algorithm — a novel optimizer for deep learning and other unconstrained optimization problems.

This implementation is based on:

A Relaxed Vector Auxiliary Variable Algorithm for Unconstrained Optimization Problems
S. Zhang, J. Zhang, J. Shen, and G. Lin


Motivation

The RVAV optimizer is designed to be unconditionally energy stable, enabling reliable convergence even on complex, non-convex loss landscapes where traditional optimizers may struggle. It achieves this by introducing an adaptive, element-wise learning rate controlled by an auxiliary variable, plus a relaxation step that enforces stability.
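For intuition, the sketch below shows a scalar auxiliary-variable (SAV) gradient step with a crude relaxation on a toy quadratic. This is only a schematic of the general auxiliary-variable idea, not the RVAV update from the paper, which uses an element-wise (vector) auxiliary variable and its own relaxation rule; the energy function, step size, and relaxation shown here are illustrative assumptions.

import torch

# Schematic only: a scalar auxiliary-variable (SAV) gradient step with a crude
# relaxation, applied to a toy quadratic. The actual RVAV scheme in the paper
# uses an element-wise (vector) auxiliary variable and a different relaxation.

def energy(theta):
    return 0.5 * (theta ** 2).sum()    # toy "loss", purely for illustration

theta = torch.tensor([3.0, -2.0])
C = 1.0                                # shift so that E + C stays positive
r = torch.sqrt(energy(theta) + C)      # auxiliary variable, r ~ sqrt(E + C)
lr = 0.5

for _ in range(50):
    theta.requires_grad_(True)
    E = energy(theta)
    (grad,) = torch.autograd.grad(E, theta)
    theta = theta.detach()
    g = grad / torch.sqrt(E.detach() + C)
    # semi-implicit update of r: the modified energy r^2 decays for any lr
    r = r / (1 + 0.5 * lr * (g ** 2).sum())
    theta = theta - lr * r * g
    # crude relaxation: snap r back to sqrt(E + C) whenever that cannot
    # increase the modified energy
    r_true = torch.sqrt(energy(theta) + C)
    if r_true <= r:
        r = r_true

print(theta, energy(theta).item())

The structure is what matters: the auxiliary variable scales the gradient step so that a modified energy cannot increase at any step size, and the relaxation step re-ties that variable to the true loss.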

Key Features

  • Robust Convergence: Unconditional energy dissipation prevents divergence.
  • Element-wise Adaptive Learning Rate: Each parameter dimension adapts automatically.
  • Strong Performance: Fast and accurate convergence on convex and non-convex problems.
  • Drop-in Usage: Works as a replacement for SGD/Adam in PyTorch.

Installation

Install directly from GitHub using pip:

pip install git+https://github.com/shzhang3/RVAV.git

🚀 Quickstart in 60 Seconds

Get up and running with a minimal example.

  1. Create a virtual environment and install dependencies:

    git clone https://github.com/shzhang3/RVAV.git
    cd RVAV
    python -m venv .venv
    source .venv/bin/activate   # On Windows: .venv\Scripts\activate
    pip install -r requirements.txt
  2. Run a minimal example:

    python examples/toy_example.py

    You should see output like this:

    Epoch [20/100], Loss: 3.7682, Weight: 1.9557, Bias: 0.6327
    Epoch [40/100], Loss: 3.5831, Weight: 1.9656, Bias: 0.7949
    Epoch [60/100], Loss: 3.5016, Weight: 1.9716, Bias: 0.9033
    Epoch [80/100], Loss: 3.4641, Weight: 1.9758, Bias: 0.9757
    Epoch [100/100], Loss: 3.4464, Weight: 1.9788, Bias: 1.0241
    ...
    

Basic Usage

Using RVAV is as simple as using any other PyTorch optimizer.

import torch
from torch import nn
from rvav import RVAV, RVAV_Momentum

# 1. Define your model
model = nn.Linear(10, 1)

# 2. Instantiate the optimizer
# Use the base version
optimizer = RVAV(model.parameters(), lr=0.01)

# Or the version with momentum
optimizer_momentum = RVAV_Momentum(model.parameters(), lr=0.01, momentum=0.9)

# 3. Use it in your training loop (a single step with a dummy batch shown here)
x = torch.randn(32, 10)
loss = model(x).mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()

Reference

If you use this optimizer in your research, please cite the original work:

  • Zhang, S., Zhang, J., Shen, J., & Lin, G. (2025). A Relaxed Vector Auxiliary Variable Algorithm for Unconstrained Optimization Problems. SIAM Journal on Scientific Computing, 47(1), C126–C150.

Contributing

Contributions are welcome! If you find a bug or have ideas for improvements:

  • Open an issue to discuss the change.
  • Submit a pull request with a clear description and a minimal reproducible example if applicable.

License

This project is licensed under the MIT License. See the LICENSE file for details.


RVAV Benchmark Results

This document presents the performance of the RVAV optimizer and its momentum variant compared to the standard Adam optimizer on a synthetic linear regression task. The goal is to demonstrate the convergence speed and final loss achieved by each optimizer under the same conditions.

All models were trained for 200 epochs on the same dataset with a learning rate of 0.01.
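A minimal sketch of how such a comparison could be reproduced is shown below. The synthetic dataset (y = 2x + 1 plus noise), the seed, and the training loop are assumptions made for illustration, not necessarily the exact setup behind the reported numbers, so the printed losses will differ from the table.

import torch
from torch import nn
from rvav import RVAV, RVAV_Momentum

# Illustrative benchmark sketch; the dataset and seed are assumptions, so the
# resulting numbers will not match the table below exactly.
torch.manual_seed(0)
X = torch.randn(200, 1)
y = 2 * X + 1 + 2.0 * torch.randn(200, 1)

def train(optimizer_cls, **kwargs):
    model = nn.Linear(1, 1)
    optimizer = optimizer_cls(model.parameters(), lr=0.01, **kwargs)
    criterion = nn.MSELoss()
    for epoch in range(200):
        optimizer.zero_grad()
        loss = criterion(model(X), y)
        loss.backward()
        optimizer.step()
    return loss.item()

print("RVAV            :", train(RVAV))
print("RVAV + Momentum :", train(RVAV_Momentum, momentum=0.9))
print("Adam (baseline) :", train(torch.optim.Adam))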


Convergence Plot

The following plot illustrates the training loss over 200 epochs. The RVAV + Momentum variant demonstrates the fastest initial convergence, quickly reaching a low loss value within the first 40 epochs.

[Figure: training loss over 200 epochs]


Performance Summary

The final loss values after 200 epochs are summarized in the table below. Both RVAV variants significantly outperform the Adam baseline on this task, achieving a much lower final loss.

Optimizer          Final Loss (after 200 epochs)
RVAV               4.2912
RVAV + Momentum    4.2908
Adam (Baseline)    5.6671

Conclusion

The results indicate that both the original RVAV and the RVAV + Momentum optimizers are effective on this regression problem. Adding momentum speeds up initial convergence and yields a marginally lower final loss (4.2908 vs. 4.2912). Both variants reach a substantially lower final loss than the Adam baseline under these settings, making them worth considering as alternatives for similar tasks.