Nemotron-RL-Workplace-Assistant

Name: NVIDIA/Nemotron-RL-agent-workplace_assistant
Author: NVIDIA

Description

Nemotron-RL-Workplace-Assistant is an agentic environment that evaluates whether a model can correctly execute workplace tasks using simulated business tools. Each task presents a natural language request (e.g., "reply to Carlos's last email about the task update") and the agent must invoke the correct sequence of tool calls with the correct arguments to fulfill it. The environment covers five workplace domains: email, calendar, project management, customer relationship management (CRM), and web analytics.

The tools are backed by stateful simulated backends built on pandas DataFrames loaded from CSV data files. When the agent calls finish(), the resulting database state (across all five domain backends) is compared against the state produced by executing the ground truth tool calls. This reproduces NVIDIA's original state-based grading from NeMo Gym.

Capabilities

Multi-step agentic tool use across 27 workplace tools
Action planning: determining which tools to call and in what order
Argument accuracy: providing correct IDs, field names, values, and free-text content
Five workplace domains: email, calendar, project management, CRM, analytics

License

CC-BY-4.0.

Tasks

Split	Tasks
`train`	1,255
`validation`	545

Tasks are distributed across five categories:

Category	Description
`workplace_assistant_email`	Send, reply, forward, delete, search emails
`workplace_assistant_calendar`	Create, update, delete, search calendar events
`workplace_assistant_project_management`	Create, update, delete, search project tasks
`workplace_assistant_customer_relationship_manager`	Add, update, delete, search CRM customers
`workplace_assistant_analytics`	Query visit counts, session durations, create plots

Ground truth call counts per task range from 0 to 8, with the majority being single-call tasks.

Reward Structure

Reward is binary (0.0 or 1.0), determined by state-based comparison:

The agent's recorded action tool calls are executed against fresh tool backends (pandas DataFrames loaded from CSV).
The ground truth tool calls are executed against separate fresh tool backends.
The resulting DataFrame states (email, calendar, analytics plots, project tasks, CRM) are compared using DataFrame.equals() after case normalization.
If all five domain states match, reward = 1.0. Otherwise, reward = 0.0.
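The comparison above can be sketched in a few lines of pandas. This is a minimal illustration, not the environment's actual grader: the two toy backends, their column names, the tool signatures, and the helper names (`fresh_backends`, `run_calls`, `normalize_case`, `reward`) are all assumptions made for the example, and only two of the five domains are modeled.

```python
import pandas as pd

def fresh_backends():
    """Return new toy backends (hypothetical schema; the real ones load CSV files)."""
    return {
        "email": pd.DataFrame({
            "email_id": ["001", "002"],
            "subject": ["Task Update", "Lunch"],
            "body": ["Status attached.", "Noon?"],
        }),
        "calendar": pd.DataFrame({"event_id": ["e1"], "event_name": ["Standup"]}),
    }

# Hypothetical action tools: each one mutates the state dict in place.
def email_delete_email(state, email_id):
    df = state["email"]
    state["email"] = df[df["email_id"] != email_id].reset_index(drop=True)

def email_send_email(state, subject, body):
    df = state["email"]
    new_id = f"{len(df) + 1:03d}"
    row = pd.DataFrame({"email_id": [new_id], "subject": [subject], "body": [body]})
    state["email"] = pd.concat([df, row], ignore_index=True)

TOOLS = {
    "email_delete_email": email_delete_email,
    "email_send_email": email_send_email,
}

def run_calls(calls):
    """Execute a list of tool calls against fresh backends and return the final state."""
    state = fresh_backends()
    for call in calls:
        TOOLS[call["name"]](state, **call["arguments"])
    return state

def _lower(value):
    return value.lower() if isinstance(value, str) else value

def normalize_case(df):
    """Lowercase every string cell so the comparison ignores case."""
    return df.apply(lambda col: col.map(_lower))

def reward(agent_calls, truth_calls):
    """Binary reward: 1.0 only if every domain state matches after case normalization."""
    agent_state = run_calls(agent_calls)
    truth_state = run_calls(truth_calls)
    match = all(
        normalize_case(agent_state[name]).equals(normalize_case(truth_state[name]))
        for name in truth_state
    )
    return 1.0 if match else 0.0

truth = [{"name": "email_send_email",
          "arguments": {"subject": "Re: Task Update", "body": "Done."}}]
agent = [{"name": "email_send_email",
          "arguments": {"subject": "re: task update", "body": "done."}}]
print(reward(agent, truth))
```

Because both call lists start from separate fresh backends, the agent's run cannot leak state into the ground truth run, and a single mismatched cell in any domain yields a reward of 0.0.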

Read-only / information-gathering tool calls do not affect grading state, so the agent is free to explore before acting.
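One way to picture this is to split a trajectory into state-changing and read-only calls. The sketch below takes the action tool names from the Action rows of the Tools table; the trajectory format, the argument names (`query`, `email_id`, `body`), and the helper `state_changing_calls` are hypothetical and chosen only for illustration.

```python
# Tool names listed with type "Action" in the Tools table.
ACTION_TOOLS = {
    "email_send_email",
    "email_delete_email",
    "email_forward_email",
    "email_reply_email",
    "calendar_create_event",
    "calendar_delete_event",
    "calendar_update_event",
    "analytics_create_plot",
    "project_management_create_task",
    "project_management_delete_task",
    "project_management_update_task",
    "customer_relationship_manager_update_customer",
    "customer_relationship_manager_add_customer",
    "customer_relationship_manager_delete_customer",
}

def state_changing_calls(trajectory):
    """Keep only the calls that can alter backend state; reads and finish are dropped."""
    return [call for call in trajectory if call["name"] in ACTION_TOOLS]

# Hypothetical trajectory for "reply to Carlos's last email about the task update".
trajectory = [
    {"name": "email_search_emails", "arguments": {"query": "Carlos task update"}},
    {"name": "email_reply_email",
     "arguments": {"email_id": "00000057", "body": "Thanks for the update."}},
    {"name": "finish", "arguments": {}},
]

for call in state_changing_calls(trajectory):
    print(call["name"])
```

Here only `email_reply_email` survives the filter, so the search call has no bearing on the final state that is graded.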

Data

Data is sourced from nvidia/Nemotron-RL-agent-workplace_assistant on Hugging Face. The CSV data behind the tool backends comes from NVIDIA's NeMo Gym repository. The dataset is stored on the OpenReward platform.

Tools

Tool	Type	Description
`company_directory_find_email_address`	Read	Find email addresses by name
`email_get_email_information_by_id`	Read	Get email details by ID
`email_search_emails`	Read	Search emails by query, date range
`email_send_email`	Action	Send a new email
`email_delete_email`	Action	Delete an email
`email_forward_email`	Action	Forward an email
`email_reply_email`	Action	Reply to an email
`calendar_get_event_information_by_id`	Read	Get calendar event details
`calendar_search_events`	Read	Search calendar events
`calendar_create_event`	Action	Create a calendar event
`calendar_delete_event`	Action	Delete a calendar event
`calendar_update_event`	Action	Update a calendar event field
`analytics_get_visitor_information_by_id`	Read	Get visitor analytics info
`analytics_create_plot`	Action	Create an analytics plot
`analytics_total_visits_count`	Read	Get total visits for date range
`analytics_engaged_users_count`	Read	Get engaged users for date range
`analytics_traffic_source_count`	Read	Get traffic source counts
`analytics_get_average_session_duration`	Read	Get average session duration
`project_management_get_task_information_by_id`	Read	Get project task details
`project_management_search_tasks`	Read	Search project tasks
`project_management_create_task`	Action	Create a project task
`project_management_delete_task`	Action	Delete a project task
`project_management_update_task`	Action	Update a project task field
`customer_relationship_manager_search_customers`	Read	Search CRM customers
`customer_relationship_manager_update_customer`	Action	Update a CRM customer field
`customer_relationship_manager_add_customer`	Action	Add a new CRM customer
`customer_relationship_manager_delete_customer`	Action	Delete a CRM customer
`finish`	Control	Signal task completion and trigger grading

Time Horizon

Multi-turn agentic environment. The agent may call information-gathering tools before taking actions, then calls finish to end the episode.

Safety

This environment uses simulated workplace tools that do not connect to real services. There are no direct safety risks.

Citations

@misc{nvidia_nemotron_rl_workplace_assistant,
  title={Nemotron-RL-agent-workplace_assistant},
  author={NVIDIA},
  year={2026},
  url={https://huggingface.co/datasets/nvidia/Nemotron-RL-agent-workplace_assistant}
}

Repository

Source repository

EnvCommons/nemotron-rl-workplace-assistant

Compute Configuration

Resource allocation for this environment.

Component	Configuration
Environment Server	1 vCPU / 4 GB RAM
Sandbox Machine	Not configured

Estimated Cost

Pay per second of active session usage. Billing starts when your session begins and stops when it ends.

Component	Cost / second
Environment	$0.0000320
Sandbox	Not configured
Total	$0.0000320

Examples

Session length	Cost
5-minute session	$0.0096
1-hour session	$0.1152
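
The example figures follow directly from the per-second rate. A small sketch of the arithmetic, assuming the environment rate is the only charge (the sandbox is not configured); the function name `session_cost` is illustrative only:

```python
from decimal import Decimal

# Environment rate per second, taken from the cost table.
RATE_PER_SECOND = Decimal("0.0000320")

def session_cost(seconds: int) -> Decimal:
    """Cost in dollars of an active session lasting the given number of seconds."""
    return RATE_PER_SECOND * seconds

print(session_cost(5 * 60))   # 5-minute session
print(session_cost(60 * 60))  # 1-hour session
```

Decimal is used instead of float so the small per-second rate multiplies out exactly.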