IPLBench


⭐ OpenReward Environment

Description

IPLBench is an environment for building machine learning models of Indian Premier League (IPL) cricket and trading those models on historical betting markets. Agents develop ML strategies using historical match data (2010-2024), place bets on match outcomes, and manage bankroll across an entire IPL season.

Capabilities

  • Developing machine learning models for cricket match prediction
  • Backtesting models against historical betting odds
  • Bankroll management and bet execution
  • Iterating on model development over time

Compute Requirements

Agents are given a sandbox with file system access and scientific Python libraries (pandas, numpy).

Tasks

There is a single split (train) containing four tasks:

  • early-ipl: Bets for the 2013 season, starting bankroll of ₹100, training data from 2010-2012.
  • mid-ipl: Bets for the 2017 season, starting bankroll of ₹150, training data from 2010-2016.
  • covid-ipl: Bets for the 2021 season, starting bankroll of ₹200, training data from 2010-2020.
  • recent-ipl: Bets for the 2024 season, starting bankroll of ₹250, training data from 2010-2023.

Each task lasts for an entire IPL season and concludes after the final matchday of betting. Agents must place at least one bet per matchday before advancing to the next.
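Because the per-matchday reward is the change in log wealth (see Reward Structure below), the Kelly criterion is a natural baseline for sizing bets: it is the stake fraction that maximizes expected log-wealth growth. The environment does not prescribe any sizing rule, so this is only an illustrative sketch, assuming decimal odds.

```python
def kelly_fraction(p: float, decimal_odds: float) -> float:
    """Fraction of bankroll to stake on an outcome with estimated win
    probability `p` at decimal odds `decimal_odds` (total payout per unit).

    The Kelly fraction maximizes expected log-wealth growth, which matches
    IPLBench's per-matchday reward. Inputs here are illustrative.
    """
    b = decimal_odds - 1.0              # net odds: profit per unit staked
    if b <= 0:
        return 0.0
    f = (p * b - (1.0 - p)) / b         # classic Kelly formula
    return max(0.0, f)                  # never stake on a negative-edge bet

# Example: model estimates a 60% win probability at decimal odds of 2.0
stake_fraction = kelly_fraction(0.6, 2.0)   # 0.2 -> stake 20% of bankroll
```

In practice many bettors stake a fraction of the Kelly amount (e.g. half-Kelly) to hedge against model-estimation error, since full Kelly is highly sensitive to overestimated probabilities.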

Reward Structure

This is a dense, verifiable reward environment. Rewards occur after each matchday. The reward is calculated as the difference in log wealth before and after betting:

\log W_{t+1} - \log W_{t}

The requirement to place at least one bet per matchday prevents agents from learning a degenerate policy of never betting and exerting no effort on the task.

No LLM graders are used for this task; rewards are deterministic based on match outcomes and betting odds.
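To make the reward concrete, the following sketch settles one matchday and computes the change in log wealth. It assumes decimal odds and that stakes are deducted from the bankroll when a bet is placed; the exact settlement mechanics inside the environment may differ.

```python
import math

def settle_matchday(bankroll: float,
                    bets: list[tuple[float, float, bool]]) -> tuple[float, float]:
    """Settle one matchday and return (new_bankroll, reward).

    `bankroll` is the cash remaining after stakes were deducted.
    Each bet is (stake, decimal_odds, won); a winning bet returns
    stake * decimal_odds. The reward is log W_{t+1} - log W_t.
    """
    # Wealth at bet time includes the staked amounts
    wealth_before = bankroll + sum(stake for stake, _, _ in bets)
    new_bankroll = bankroll + sum(stake * odds
                                  for stake, odds, won in bets if won)
    reward = math.log(new_bankroll) - math.log(wealth_before)
    return new_bankroll, reward

# Bankroll ₹100: stake ₹20 at odds 2.5 (won) and ₹10 at odds 1.8 (lost),
# leaving ₹70 in cash before settlement.
new_bankroll, reward = settle_matchday(70.0, [(20.0, 2.5, True),
                                              (10.0, 1.8, False)])
# new_bankroll = 70 + 50 = 120; reward = ln(120) - ln(100) ≈ 0.182
```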

Data

Historical IPL match data consisting of 984 matches from 2010-2024, including team names, scores, wickets, betting odds, match type, and results. The data is sourced from public betting odds recorded around match time.

Training data is mounted at /tmp/gr-datasets for agents to build and refine their models. After each matchday, agents receive the complete match data for settled matches.
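A typical first step is to load the mounted data with pandas and derive simple features. The file names and column schema below are assumptions for illustration, so agents should inspect the actual contents of /tmp/gr-datasets first (e.g. with ls).

```python
import pandas as pd

# The real data lives under /tmp/gr-datasets; its file names and schema
# are assumptions here, e.g.:
# matches = pd.read_csv("/tmp/gr-datasets/matches.csv")

# A tiny stand-in frame with a hypothetical schema, to show one common feature:
matches = pd.DataFrame({
    "date":   pd.to_datetime(["2012-04-01", "2012-04-03", "2012-04-05"]),
    "team1":  ["CSK", "MI", "CSK"],
    "team2":  ["MI", "RCB", "RCB"],
    "winner": ["CSK", "MI", "CSK"],
})

def win_rate(team: str, df: pd.DataFrame) -> float:
    """Historical win rate of `team` over the matches it has played."""
    played = df[(df.team1 == team) | (df.team2 == team)]
    if played.empty:
        return 0.5  # uninformative prior for a team with no history
    return (played.winner == team).mean()

print(win_rate("CSK", matches))  # played 2, won 2 -> 1.0
```

Features like this can feed a simple classifier whose predicted probabilities are then compared against the implied probabilities of the betting odds to find positive-edge bets.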

Tools

Agents are given access to CLI tools for creating, viewing, and searching a filesystem (bash, read, write, edit, grep, glob, ls, todo_write). They are also given environment-specific tools:

Tool            Description
view_matches    View the current matchday's games with betting odds.
place_bet       Place a bet on a match outcome (team1 or team2) with a specified amount.
view_bankroll   View the current bankroll and active bets.
next_matchday   Settle bets, receive the reward, and advance to the next matchday.

Time Horizon

IPLBench is an open-ended, long-horizon environment where agents simulate an entire IPL season of model development and betting. The number of turns in each task corresponds to the number of unique matchdays in that season.

Environment Difficulty

[Put environment difficulty statistics here]

Other Environment Requirements

There are no further environment requirements; IPLBench works out of the box with the OpenReward endpoint without any external API keys.

Safety

Agents in IPLBench are told to maximize their long-run bankroll growth. The environment does not present direct safety risks, as agents only interact with historical data through betting decisions on public odds.

There may be indirect risks, however, in that an agent that is taught to maximize long-run wealth may blindly follow this objective when tested in other environments, leading it to pursue unethical objectives. Our advice is that multi-environment training runs involving IPLBench should include other environments that teach agents to respect ethical norms so that the agent understands a broader category of objectives than just maximizing wealth.

Citation

@dataset{GRIPLBench,
  author    = {General Reasoning Inc. Team},
  title     = {IPLBench},
  year      = {2026},
  publisher = {OpenReward},
  url       = {https://www.openreward.ai/GeneralReasoning/IPLBench}
}