OMOLAgent
Description
OMOLAgent is a sandboxed environment for molecular property prediction on the OMol25 dataset. Agents train machine learning models on ~4 million DFT-labeled molecular structures to predict total energy (eV) and per-atom forces (eV/A). The evaluation set is derived from the OMol25 validation split with labels stripped; ground truth is held server-side. Predictions are scored with S2EF (Structure to Energy and Forces) metrics.
Capabilities
- Molecular property prediction
- Energy and force field modeling
- Graph neural network development
- Large-scale scientific ML training
- DFT-level accuracy estimation
Compute Requirements
Agents are given a sandboxed environment with 4 CPUs and 8 GB RAM. Network access is available for installing packages such as torch, fairchem-core, and torch-geometric.
License
Tasks
There are two splits in this environment:
- train: Single task (omol25_s2ef)
- test: Single task (omol25_s2ef)
Both tasks provide access to the same OMol25 data: the labeled 4M training split for training and the label-stripped validation split for evaluation (see Data).
Reward Structure
This is a multi-turn environment. Agents develop training code, train models, and submit predictions via submit_predictions. Evaluation uses S2EF metrics:
- Energy MAE (meV): Mean absolute error on energy predictions
- Forces MAE (meV/A): Mean absolute error on atomic forces
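The two metrics above can be sketched in plain Python. This is a minimal illustration, not the server-side scorer: the function names are hypothetical, and it assumes the forces MAE is averaged over every force component of every atom in every structure, which the environment does not spell out.

```python
def energy_mae_mev(pred_ev, true_ev):
    """Mean absolute energy error in meV, given per-structure energies in eV."""
    if len(pred_ev) != len(true_ev) or len(pred_ev) == 0:
        raise ValueError("need two non-empty energy lists of equal length")
    total = sum(abs(p - t) for p, t in zip(pred_ev, true_ev))
    return 1000.0 * total / len(pred_ev)  # eV -> meV


def forces_mae_mev_per_a(pred, true):
    """Mean absolute force error in meV/A.

    pred and true are lists (one entry per structure) of lists (one entry
    per atom) of (fx, fy, fz) tuples in eV/A. Assumption: the mean runs over
    all components of all atoms, so larger structures carry more weight.
    """
    total, count = 0.0, 0
    for pred_atoms, true_atoms in zip(pred, true):
        if len(pred_atoms) != len(true_atoms):
            raise ValueError("atom count mismatch within a structure")
        for p_vec, t_vec in zip(pred_atoms, true_atoms):
            for p, t in zip(p_vec, t_vec):
                total += abs(p - t)
                count += 1
    if count == 0:
        raise ValueError("no force components to compare")
    return 1000.0 * total / count  # eV/A -> meV/A


# Toy example: two structures for energy, one two-atom structure for forces.
e_mae = energy_mae_mev([-1.000, -2.010], [-1.005, -2.000])
f_mae = forces_mae_mev_per_a(
    [[(0.1, 0.0, 0.0), (0.0, 0.0, 0.0)]],
    [[(0.0, 0.0, 0.0), (0.0, 0.02, 0.0)]],
)
```

Because predictions are in eV and eV/A while the metrics are reported in meV and meV/A, the factor of 1000 is applied once, at the end of each function.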
Reward is calculated as:
reward = 1.0 / (1.0 + 0.01 * energy_mae_mev + 0.01 * forces_mae_mev_per_a)
The reward is an inverse-error score bounded in (0, 1]. Perfect predictions (zero error) yield a reward of 1.0. As either energy or force error increases, the denominator grows and the reward decreases toward 0. With the 0.01 scaling factor, each 100 meV of energy error or 100 meV/A of force error adds 1.0 to the denominator, so 100 meV of error in one metric alone (with the other at zero) halves the reward from 1.0 to 0.5. For example, a model achieving 10 meV energy MAE and 50 meV/A forces MAE would receive reward = 1/(1 + 0.1 + 0.5) = 0.625.
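As a quick check of the formula, the reward can be written as a small Python function. The function name and the non-negativity check are illustrative additions; the arithmetic follows the formula stated above.

```python
def s2ef_reward(energy_mae_mev, forces_mae_mev_per_a):
    """Inverse-error reward in (0, 1] from the two S2EF MAE values."""
    if energy_mae_mev < 0 or forces_mae_mev_per_a < 0:
        raise ValueError("MAE values must be non-negative")
    return 1.0 / (1.0 + 0.01 * energy_mae_mev + 0.01 * forces_mae_mev_per_a)


perfect = s2ef_reward(0.0, 0.0)        # zero error on both metrics
example = s2ef_reward(10.0, 50.0)      # the worked example from the text
energy_only = s2ef_reward(100.0, 0.0)  # 100 meV energy error, perfect forces
```

Both terms use the same 0.01 weight, so 1 meV of energy MAE costs exactly as much as 1 meV/A of forces MAE.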
Data
Data is mounted read-only at /orwd_data/:
| Split | Path | Structures | Labels | Format |
|---|---|---|---|---|
| Train 4M | /orwd_data/train_4M/ | ~3.9M | Energy + Forces | ASE DB LMDB |
| Test (val_stripped) | /orwd_data/val_stripped/ | 2,762,021 | None (stripped) | ASE DB LMDB |
The test set is derived from the OMol25 validation split with energy/forces labels removed. Ground truth labels are held server-side for scoring.
Data is sourced from HuggingFace facebook/omol25. Each structure contains atomic positions, numbers, and metadata. Training structures additionally contain total energy (eV) and per-atom forces (eV/A).
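One way to inspect a labeled structure is sketched below. It assumes that fairchem-core is installed and that its `AseDBDataset` class reads these directories as described in the OMol25 dataset card; neither is confirmed by this environment's documentation. The helper name `load_structure` is hypothetical.

```python
import os

TRAIN_PATH = "/orwd_data/train_4M/"  # read-only mount listed in the table above


def load_structure(dataset_path, index=0):
    """Return (atoms, energy_ev, forces_ev_per_a) for one labeled structure.

    Assumption: fairchem-core exposes AseDBDataset with a {"src": ...} config
    and a get_atoms(index) method returning an ase.Atoms object whose stored
    energy and forces are reachable through the usual ASE getters.
    """
    # Imported lazily so this file still loads where fairchem is not installed.
    from fairchem.core.datasets import AseDBDataset

    dataset = AseDBDataset({"src": dataset_path})
    atoms = dataset.get_atoms(index)
    energy = atoms.get_potential_energy()  # total energy in eV
    forces = atoms.get_forces()            # array of shape (n_atoms, 3), eV/A
    return atoms, energy, forces


# Only touch the data when the mount is actually present.
if os.path.isdir(TRAIN_PATH):
    atoms, energy, forces = load_structure(TRAIN_PATH)
    print(atoms.get_chemical_formula(), energy, forces.shape)
```

Structures in /orwd_data/val_stripped/ carry no labels, so the energy and force getters should not be expected to work there; only positions, atomic numbers, and metadata are available.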
Tools
| Tool | Description |
|---|---|
| bash | Execute shell commands (install packages, run training) |
| read / write / edit | File operations |
| glob / grep / ls | File search and listing |
| todo_write | Track progress |
| submit_predictions | Submit NPZ file with energy and force predictions. Ends the episode. |
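The NPZ schema expected by submit_predictions is not specified here, so the following is only a sketch of how predictions might be packed. The key names `energy`, `forces`, and `natoms`, the concatenated force layout, and the helper name are all hypothetical; check the tool's actual requirements before submitting, since submission ends the episode.

```python
import os
import tempfile

import numpy as np


def write_predictions_npz(path, energies_ev, forces_ev_per_a):
    """Pack per-structure energies and ragged per-atom forces into one NPZ.

    Hypothetical layout: forces from all structures are concatenated into a
    single (total_atoms, 3) array, and natoms records how many rows belong
    to each structure so the blocks can be split apart again.
    """
    energy = np.asarray(energies_ev, dtype=np.float64)
    blocks = [np.asarray(f, dtype=np.float32).reshape(-1, 3) for f in forces_ev_per_a]
    if len(blocks) != energy.shape[0]:
        raise ValueError("need exactly one force array per energy value")
    natoms = np.array([b.shape[0] for b in blocks], dtype=np.int64)
    forces = np.concatenate(blocks, axis=0)
    np.savez_compressed(path, energy=energy, forces=forces, natoms=natoms)


# Toy usage with two structures of 3 and 5 atoms.
out_path = os.path.join(tempfile.mkdtemp(), "predictions.npz")
write_predictions_npz(out_path, [-76.4, -40.5], [np.zeros((3, 3)), np.zeros((5, 3))])
loaded = np.load(out_path)
```

Storing `natoms` alongside a flat force array avoids object arrays, which NumPy can only save with pickling enabled.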
Time Horizon
Multi-turn. Agents explore data, install packages, develop and train models, validate, and submit predictions.
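Because the test labels are held server-side, local validation has to come from the labeled training split. A minimal sketch of a reproducible holdout split over structure indices follows; the function name, the default 1% fraction, and the seed are illustrative choices, not part of the environment.

```python
import random


def split_indices(n_structures, val_fraction=0.01, seed=0):
    """Shuffle indices 0..n-1 and return (train_indices, val_indices)."""
    if n_structures < 2:
        raise ValueError("need at least two structures to split")
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be strictly between 0 and 1")
    indices = list(range(n_structures))
    random.Random(seed).shuffle(indices)  # seeded, so the split is repeatable
    n_val = max(1, int(n_structures * val_fraction))
    return indices[n_val:], indices[:n_val]


# Toy usage: hold out 10% of 1000 structures.
train_idx, val_idx = split_indices(1000, val_fraction=0.1, seed=0)
```

A uniformly random holdout of this kind estimates in-distribution error only; it may be optimistic if the stripped validation split differs in composition from the training data.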
Environment Difficulty
[put env difficulty here]
Other Environment Requirements
None.
Safety
Agents in OMOLAgent work within sandboxed environments to develop molecular property prediction models. The environment does not present direct safety risks.
Citation
@misc{levine2025openmolecules2025omol25,
title={The Open Molecules 2025 (OMol25) Dataset, Evaluations, and Models},
author={Daniel S. Levine and Muhammed Shuaibi and Evan Walter Clark Spotte-Smith and Michael G. Taylor and Muhammad R. Hasyim and Kyle Michel and Ilyes Batatia and Gábor Csányi and Misko Dzamba and Peter Eastman and Nathan C. Frey and Xiang Fu and Vahe Gharakhanyan and Aditi S. Krishnapriyan and Joshua A. Rackers and Sanjeev Raja and Ammar Rizvi and Andrew S. Rosen and Zachary Ulissi and Santiago Vargas and C. Lawrence Zitnick and Samuel M. Blau and Brandon M. Wood},
year={2025},
eprint={2505.08762},
archivePrefix={arXiv},
primaryClass={physics.chem-ph},
url={https://arxiv.org/abs/2505.08762},
}