OMOLAgent

Description

OMOLAgent is a sandboxed environment for molecular property prediction on the OMol25 dataset. Agents train machine learning models on ~4 million molecular DFT structures to predict total energy (eV) and atomic forces (eV/A) for molecular systems. The evaluation set is derived from the OMol25 validation split with labels stripped; ground truth is held server-side. The environment evaluates using S2EF (Structure to Energy and Forces) metrics.

Capabilities

Molecular property prediction
Energy and force field modeling
Graph neural network development
Large-scale scientific ML training
DFT-level accuracy estimation

Compute Requirements

Agents are given a sandboxed environment with 4 CPUs and 8 GB RAM, with network access to install packages (torch, fairchem-core, torch-geometric).

License

CC-BY 4.0.

Tasks

There are two splits in this environment:

train: Single task (omol25_s2ef)
test: Single task (omol25_s2ef)

Each task provides access to the mounted OMol25 data: the Train 4M split for training and the label-stripped validation split for evaluation.

Reward Structure

This is a multi-turn environment. Agents develop training code, train models, and submit predictions via submit_predictions. Evaluation uses S2EF metrics:

Energy MAE (meV): Mean absolute error on energy predictions
Forces MAE (meV/A): Mean absolute error on atomic forces
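The two metrics can be sketched as follows. This is a minimal illustration, not the server's scoring code: the function names are hypothetical, and it assumes that inputs are in eV and eV/A, that energy MAE is averaged over structures, and that forces MAE is averaged over every force component of every atom.

```python
from typing import Sequence

EV_TO_MEV = 1000.0  # 1 eV = 1000 meV


def energy_mae_mev(pred_ev: Sequence[float], true_ev: Sequence[float]) -> float:
    """Mean absolute energy error over structures, converted from eV to meV."""
    if len(pred_ev) != len(true_ev) or len(pred_ev) == 0:
        raise ValueError("need equal-length, non-empty energy sequences")
    total = sum(abs(p - t) for p, t in zip(pred_ev, true_ev))
    return EV_TO_MEV * total / len(pred_ev)


def forces_mae_mev_per_a(pred, true) -> float:
    """Mean absolute force error over all components, converted to meV/A.

    `pred` and `true` are lists of structures; each structure is a list of
    per-atom [fx, fy, fz] vectors in eV/A. Averaging over individual
    components is an assumption about the scoring convention.
    """
    abs_sum, count = 0.0, 0
    for s_pred, s_true in zip(pred, true):
        for a_pred, a_true in zip(s_pred, s_true):
            for p, t in zip(a_pred, a_true):
                abs_sum += abs(p - t)
                count += 1
    if count == 0:
        raise ValueError("no force components to compare")
    return EV_TO_MEV * abs_sum / count
```

Both functions return values in the units the reward formula expects, so their outputs can be used directly as `energy_mae_mev` and `forces_mae_mev_per_a`.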

Reward is calculated as:

reward = 1.0 / (1.0 + 0.01 * energy_mae_mev + 0.01 * forces_mae_mev_per_a)

The reward is an inverse-error score bounded in (0, 1]. Perfect predictions (zero error) yield a reward of 1.0. As either energy or force error increases, the denominator grows and the reward decreases toward 0. With the 0.01 scaling factor, 100 meV of energy error or 100 meV/A of force error on its own adds 1.0 to the denominator and halves the reward from its maximum (to 0.5). Each further 100 meV or 100 meV/A adds another 1.0, so the decay is hyperbolic rather than a repeated halving: 200 meV of energy error alone gives 1/3, not 1/4. For example, a model achieving 10 meV energy MAE and 50 meV/A forces MAE would receive reward = 1/(1 + 0.1 + 0.5) = 0.625.
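The reward formula can be written as a small function for checking local validation numbers before submitting. This is a sketch: the function name `s2ef_reward` is hypothetical, and only the formula itself comes from this README.

```python
def s2ef_reward(energy_mae_mev: float, forces_mae_mev_per_a: float) -> float:
    """Inverse-error reward; inputs are MAEs in meV and meV/A."""
    if energy_mae_mev < 0 or forces_mae_mev_per_a < 0:
        raise ValueError("MAE values must be non-negative")
    return 1.0 / (1.0 + 0.01 * energy_mae_mev + 0.01 * forces_mae_mev_per_a)


print(s2ef_reward(0.0, 0.0))    # perfect predictions
print(s2ef_reward(10.0, 50.0))  # 10 meV energy MAE, 50 meV/A forces MAE
print(s2ef_reward(100.0, 0.0))  # 100 meV of energy error alone
```

Because both error terms share the same 0.01 weight, 1 meV of energy MAE costs exactly as much reward as 1 meV/A of forces MAE.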

Data

Data is mounted read-only at /orwd_data/:

| Split | Path | Structures | Labels | Format |
| --- | --- | --- | --- | --- |
| Train 4M | `/orwd_data/train_4M/` | ~3.9M | Energy + Forces | ASE DB LMDB |
| Test (val_stripped) | `/orwd_data/val_stripped/` | 2,762,021 | None (stripped) | ASE DB LMDB |

The test set is derived from the OMol25 validation split with energy/forces labels removed. Ground truth labels are held server-side for scoring.

Data is sourced from the HuggingFace dataset facebook/omol25. Each structure contains atomic positions, atomic numbers, and metadata. Training structures additionally contain total energy (eV) and per-atom forces (eV/A).

Tools

| Tool | Description |
| --- | --- |
| `bash` | Execute shell commands (install packages, run training) |
| `read` / `write` / `edit` | File operations |
| `glob` / `grep` / `ls` | File search and listing |
| `todo_write` | Track progress |
| `submit_predictions` | Submit NPZ file with energy and force predictions. Ends the episode. |

Time Horizon

Multi-turn. Agents explore data, install packages, develop and train models, validate, and submit predictions.

Environment Difficulty

[put env difficulty here]

Other Environment Requirements

None.

Safety

Agents in OMOLAgent work within sandboxed environments to develop molecular property prediction models. The environment does not present direct safety risks.

Citation

@misc{levine2025openmolecules2025omol25,
      title={The Open Molecules 2025 (OMol25) Dataset, Evaluations, and Models},
      author={Daniel S. Levine and Muhammed Shuaibi and Evan Walter Clark Spotte-Smith and Michael G. Taylor and Muhammad R. Hasyim and Kyle Michel and Ilyes Batatia and Gábor Csányi and Misko Dzamba and Peter Eastman and Nathan C. Frey and Xiang Fu and Vahe Gharakhanyan and Aditi S. Krishnapriyan and Joshua A. Rackers and Sanjeev Raja and Ammar Rizvi and Andrew S. Rosen and Zachary Ulissi and Santiago Vargas and C. Lawrence Zitnick and Samuel M. Blau and Brandon M. Wood},
      year={2025},
      eprint={2505.08762},
      archivePrefix={arXiv},
      primaryClass={physics.chem-ph},
      url={https://arxiv.org/abs/2505.08762},
}

Repository

Source repository: EnvCommons/OMOLAgent

Compute Configuration

Resource allocation for this environment.

| Component | Configuration |
| --- | --- |
| Environment Server | 1 vCPU / 4 GB RAM |
| Sandbox Machine | 4 vCPUs / 8 GB RAM |

Estimated Cost

Pay per second of active session usage. Billing starts when your session begins and stops when it ends.

| Component | Cost / second |
| --- | --- |
| Environment | $0.0000320 |
| Sandbox | $0.0000920 |
| Total | $0.0001240 |

Examples

5-minute session: $0.0372

1-hour session: $0.4464
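The example figures follow from multiplying the total per-second rate by the session length. A minimal sketch of that arithmetic, using the listed rates (the constant and function names are hypothetical):

```python
ENVIRONMENT_COST_PER_SECOND = 0.0000320  # USD, environment server
SANDBOX_COST_PER_SECOND = 0.0000920      # USD, sandbox machine


def session_cost_usd(duration_seconds: float) -> float:
    """Estimated cost of a session billed per second for both components."""
    if duration_seconds < 0:
        raise ValueError("duration must be non-negative")
    per_second = ENVIRONMENT_COST_PER_SECOND + SANDBOX_COST_PER_SECOND
    return per_second * duration_seconds


print(f"5-minute session: ${session_cost_usd(5 * 60):.4f}")
print(f"1-hour session: ${session_cost_usd(60 * 60):.4f}")
```

The same function can be used to budget longer training runs, for example by passing the planned wall-clock time in seconds.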
