FabScheduler
Description
FabScheduler is a semiconductor fabrication scheduling environment. Agents act as fab dispatchers, routing wafer lots through processing tools in a simulated 28nm logic fab over a 168-hour (1-week) planning horizon. The environment models re-entrant flow, batch processing, queue time constraints, sequence-dependent setup times, preventive maintenance, random equipment breakdowns, and contamination risks.
Note: this is a synthetic environment that is majority AI-generated; testing is recommended before use in any RL pipeline.
Capabilities
- Multi-step dispatching of wafer lots to processing tools across 7 tool groups
- Batch formation for furnace and wet clean tools with recipe compatibility constraints
- Managing queue time constraints (QTCs) to avoid contamination-induced wafer scrap
- Planning around preventive maintenance windows and random equipment breakdowns
- Balancing throughput, on-time delivery, and machine health over a long horizon
- Long-horizon multi-turn execution (100s of tool calls per task)
Compute Requirements
No further compute requirements.
License
MIT
Tasks
There are 30 training tasks and 10 test tasks. Tasks vary across six dimensions:
- Product mix: logic-heavy, balanced, or memory-heavy ratios of 3 product types
- Number of lots: 15-20 lots of 25 wafers each
- Hot lot fraction: 10-20% of lots designated as priority with tighter deadlines
- Maintenance intensity: light, normal, or heavy PM schedules
- Demand pressure: tight, normal, or relaxed due dates
- Random seed: determines breakdown timing, processing time variation, and lot arrivals
Each task simulates a 168-hour (1-week) planning horizon. The fab contains 17 tools across 7 groups (Lithography, Etch, CVD, Furnace, Ion Implant, CMP, Wet Clean) and 3 product types with 11-14 re-entrant process steps each.
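The six task dimensions can be pictured as a small configuration record. The sketch below is illustrative only: the `TaskSpec` class, its field names, and the validation are assumptions made for this example, not the environment's actual task schema.

```python
from dataclasses import dataclass

# Allowed values taken from the task dimensions listed above.
PRODUCT_MIXES = ("logic-heavy", "balanced", "memory-heavy")
MAINTENANCE_LEVELS = ("light", "normal", "heavy")
DEMAND_LEVELS = ("tight", "normal", "relaxed")
WAFERS_PER_LOT = 25


@dataclass(frozen=True)
class TaskSpec:
    """Hypothetical container for one task's parameters (not the real schema)."""
    product_mix: str
    num_lots: int
    hot_lot_fraction: float
    maintenance: str
    demand: str
    seed: int

    def __post_init__(self):
        # Reject values outside the ranges described in the Tasks section.
        if self.product_mix not in PRODUCT_MIXES:
            raise ValueError(f"unknown product mix: {self.product_mix}")
        if not 15 <= self.num_lots <= 20:
            raise ValueError("num_lots must be in 15..20")
        if not 0.10 <= self.hot_lot_fraction <= 0.20:
            raise ValueError("hot_lot_fraction must be in 0.10..0.20")
        if self.maintenance not in MAINTENANCE_LEVELS:
            raise ValueError(f"unknown maintenance level: {self.maintenance}")
        if self.demand not in DEMAND_LEVELS:
            raise ValueError(f"unknown demand level: {self.demand}")

    @property
    def total_wafers(self) -> int:
        # Every lot carries 25 wafers.
        return self.num_lots * WAFERS_PER_LOT


spec = TaskSpec("balanced", 18, 0.15, "normal", "tight", seed=7)
print(spec.total_wafers)  # 450
```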
Reward Structure
This is a sparse, verifiable reward environment. The final reward is computed at the end of the simulation (hour 168 or on early submission) as a weighted sum of five components, each normalized to [0, 1]:
| Component | Weight | Description |
|---|---|---|
| Throughput | 35% | Fraction of total process steps completed across all lots |
| Lateness | 25% | Penalty for lots completing after their due date |
| Scrap | 20% | Penalty for wafers lost to contamination from QTC violations |
| Machine Health | 10% | Penalty for tools with overdue preventive maintenance |
| Utilization | 10% | Average utilization of bottleneck tools (Lithography, Furnace) |
We do not use LLM graders for this task. All rewards are computed deterministically from the simulation state.
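As a rough illustration of the weighted sum, the sketch below combines five component scores using the weights from the table. It assumes each component has already been converted to a score in [0, 1] where higher is better (for example, 1.0 meaning no lateness or no scrap); how the environment derives those scores from the simulation state is not specified here, and the function and key names are hypothetical.

```python
# Weights from the reward table; they sum to 1.0.
WEIGHTS = {
    "throughput": 0.35,
    "lateness": 0.25,
    "scrap": 0.20,
    "machine_health": 0.10,
    "utilization": 0.10,
}


def final_reward(components: dict) -> float:
    """Weighted sum of component scores, each assumed to lie in [0, 1]."""
    if set(components) != set(WEIGHTS):
        raise ValueError("expected exactly the five reward components")
    for name, value in components.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


example = {
    "throughput": 0.8,
    "lateness": 0.6,
    "scrap": 1.0,
    "machine_health": 0.5,
    "utilization": 0.7,
}
# 0.35*0.8 + 0.25*0.6 + 0.20*1.0 + 0.10*0.5 + 0.10*0.7 = 0.75
print(round(final_reward(example), 4))
```

Because throughput carries the largest weight, a policy that completes more process steps can outscore one that is cautious about scrap, but only up to the 20% scrap weight it puts at risk.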
Data
No external data is required. The simulation is fully self-contained with all parameters (tool configurations, product routes, lot specifications) generated programmatically from the task specification.
Tools
Agents have access to 7 environment-specific tools:
- get_fab_status: Full snapshot of all tools, lots, queues, and alerts
- get_lot_info: Detailed info on a specific lot including its full route
- get_tool_info: Detailed info on a specific tool including PM schedule
- dispatch: Assign a waiting lot to an idle serial tool
- form_batch: Form a batch for furnace (up to 4 lots) or wet clean (up to 2 lots)
- advance: Advance the simulation clock by 1-8 hours, processing all events
- submit: End the simulation early and compute the final reward
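One way to combine these tools is a simple observe, dispatch, advance cycle. The sketch below plans a single cycle from a status snapshot. The snapshot layout (`idle_tools`, `waiting_lots`, `next_group`, `hot`) and the argument names (`lot_id`, `tool_id`, `hours`) are assumptions made for illustration; the real `get_fab_status` payload and tool signatures may differ. Batch tools are skipped because they go through `form_batch` rather than `dispatch`.

```python
# Tool groups that use form_batch instead of dispatch.
BATCH_GROUPS = {"Furnace", "Wet Clean"}


def plan_actions(status: dict, step_hours: int = 1) -> list:
    """Plan dispatch calls for idle serial tools, then one advance call.

    `status` is a hypothetical snapshot:
      {"idle_tools": {tool_id: group_name},
       "waiting_lots": [{"lot_id": str, "next_group": str, "hot": bool}]}
    """
    if not 1 <= step_hours <= 8:
        raise ValueError("advance accepts 1-8 hours")

    # Hot lots first; sorted() is stable, so ties keep their original order.
    waiting = sorted(status["waiting_lots"], key=lambda lot: not lot["hot"])
    assigned = set()
    actions = []

    for tool_id, group in status["idle_tools"].items():
        if group in BATCH_GROUPS:
            continue  # batch tools are handled separately via form_batch
        for lot in waiting:
            if lot["lot_id"] not in assigned and lot["next_group"] == group:
                actions.append(("dispatch", {"lot_id": lot["lot_id"], "tool_id": tool_id}))
                assigned.add(lot["lot_id"])
                break

    actions.append(("advance", {"hours": step_hours}))
    return actions


snapshot = {
    "idle_tools": {"LITHO-1": "Lithography", "FURN-1": "Furnace"},
    "waiting_lots": [
        {"lot_id": "L1", "next_group": "Lithography", "hot": False},
        {"lot_id": "L2", "next_group": "Lithography", "hot": True},
        {"lot_id": "L3", "next_group": "Furnace", "hot": False},
    ],
}
for action in plan_actions(snapshot):
    print(action)
```

Prioritizing hot lots is only one possible dispatching rule; it ignores setup times and due dates and is meant to show the call pattern rather than a competitive policy.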
Time Horizon
FabScheduler is a long-horizon, multi-turn environment. Each task spans a 168-hour simulated week, during which agents must repeatedly make dispatching decisions and advance time. The reward is sparse and heavily delayed: intermediate tool calls return reward=0.0, and the full reward is computed only at the end.
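To give a sense of episode length, the sketch below counts how many `advance` calls are needed to cover the full horizon at a fixed step size. It assumes the agent always advances by the same number of hours and never submits early; dispatch, batch, and status calls come on top of this count.

```python
import math

HORIZON_HOURS = 168          # one simulated week
MIN_STEP, MAX_STEP = 1, 8    # allowed range for a single advance call


def advance_calls(step_hours: int) -> int:
    """Number of advance calls needed to reach the end of the horizon."""
    if not MIN_STEP <= step_hours <= MAX_STEP:
        raise ValueError("advance accepts 1-8 hours")
    return math.ceil(HORIZON_HOURS / step_hours)


print(advance_calls(8))  # 21 calls at the coarsest step
print(advance_calls(1))  # 168 calls at the finest step
```

Coarser steps shorten the episode but leave fewer chances to react to breakdowns and queue time deadlines between decisions.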
Environment Difficulty
The core scheduling problem is NP-hard due to re-entrant flow, sequence-dependent setup times, batch processing constraints, and queue time limits. Key challenges include:
- Queue time constraints: Wafers must move from Wet Clean to Furnace within 2 hours (and from Etch to Wet Clean within 4 hours) or risk contamination and scrap
- Batch formation: Furnaces process up to 4 lots simultaneously but require recipe compatibility, creating a tension between waiting for full batches and respecting QTC deadlines
- Maintenance disruptions: Scheduled PM windows and random breakdowns can displace lots mid-processing
- Conflicting objectives: Maximizing throughput often conflicts with minimizing lateness and scrap
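The tension between waiting for a full furnace batch and respecting the Wet Clean to Furnace queue time limit can be sketched as a simple decision rule. This is a hypothetical heuristic, not the environment's logic: it assumes the queue time clock starts when a lot finishes Wet Clean and stops when the furnace batch starts, and that the agent has an estimate of when the next recipe-compatible lot will arrive.

```python
WET_CLEAN_TO_FURNACE_QTC_H = 2.0  # queue time limit from the list above
FURNACE_BATCH_MAX = 4             # furnace capacity in lots


def should_launch_batch(clean_done_times, next_arrival=None):
    """Decide whether to start a furnace batch now or keep waiting.

    clean_done_times: hours at which each waiting, recipe-compatible lot
                      finished Wet Clean (assumed start of the QTC clock).
    next_arrival:     estimated hour at which another compatible lot becomes
                      available, or None if no such lot is expected.
    """
    if not clean_done_times:
        return False  # nothing to batch
    if len(clean_done_times) >= FURNACE_BATCH_MAX:
        return True   # batch is already full
    if next_arrival is None:
        return True   # no reason to wait
    # The oldest lot sets the deadline for the whole batch.
    earliest_deadline = min(clean_done_times) + WET_CLEAN_TO_FURNACE_QTC_H
    # Wait only if the next lot arrives strictly before that deadline.
    return next_arrival >= earliest_deadline


# Oldest lot left Wet Clean at hour 9.0, so its deadline is hour 11.0.
print(should_launch_batch([9.0, 9.5], next_arrival=10.5))  # False: worth waiting
print(should_launch_batch([9.0, 9.5], next_arrival=11.5))  # True: waiting risks scrap
```

A fuller policy would also weigh furnace availability and processing time variation, since a lot that arrives just before the deadline still needs an idle furnace to start in time.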
Other Environment Requirements
There are no further environment requirements; FabScheduler works out of the box with the OpenReward endpoint without any secrets.
Safety
FabScheduler operates within a simulated fab environment. Agents interact only with a mathematical simulation and cannot affect any real-world systems.
Citations
@dataset{GRFabScheduler,
author = {General Reasoning Inc. Team},
title = {FabScheduler: Semiconductor Fabrication Scheduling Environment},
year = {2026},
publisher = {OpenReward},
url = {https://openreward.ai/GeneralReasoning/fabscheduler}
}