SocialData


OpenReward Environment

Description

SocialData is a collection of data science competition environments sourced from DrivenData. It contains 5 multi-turn sandboxed tasks where agents develop machine learning models to solve real-world prediction problems spanning public health, infrastructure, natural language processing, and disaster response.

Capabilities

  • Data exploration and feature engineering
  • Machine learning model development
  • Time series prediction
  • Multi-target classification and regression
  • Document summarization with LLMs

Compute Requirements

Agents are given a sandboxed environment with 1 CPU and 4 GB RAM, with access to scientific Python libraries (pandas, scikit-learn, etc.).

Tasks

There are 5 environment variants, each with a train split:

Variant | Description | Metric
FluVaccinePrediction | Predict H1N1 and seasonal flu vaccination probabilities | Mean ROC AUC
PumpItUpPrediction | Classify water pump functionality in Tanzania | F1-micro
DocSumTask | Summarize social science research papers | ROUGE-2 F1
DengAIPrediction | Predict weekly dengue fever case counts | Mean Absolute Error
RichterPrediction | Predict earthquake building damage grades | F1-micro

Reward Structure

This is a multi-turn environment. Agents explore data, develop models, generate predictions, and submit via the submit_predictions tool. Each variant uses its specific evaluation metric:

  • FluVaccinePrediction: Mean ROC AUC across H1N1 and seasonal targets (0-1)
  • PumpItUpPrediction: Micro-averaged F1 across 3 classes (0-1)
  • DocSumTask: ROUGE-2 F1 score (0-1)
  • DengAIPrediction: Inverted MAE (lower error = higher reward)
  • RichterPrediction: Micro-averaged F1 across 3 damage grades (0-1)
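
For local validation before submitting, three of the metrics named above can be approximated in plain Python. This is a minimal sketch using the standard definitions of ROC AUC, micro-averaged F1, and MAE. It is not the environment's scorer, the function names are hypothetical, and how MAE is inverted into a reward is not specified here, so the sketch stops at the raw error.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """ROC AUC via the rank-sum (Mann-Whitney U) formulation, with tie handling."""
    pairs = sorted(zip(y_score, y_true))
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC AUC needs at least one positive and one negative label")
    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        j = i
        # Group tied scores so that they share the average rank.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(1 for k in range(i, j) if pairs[k][1] == 1)
        i = j
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def mean_roc_auc(targets: Sequence[tuple]) -> float:
    """Average ROC AUC over several (y_true, y_score) pairs, one per target."""
    return sum(roc_auc(t, s) for t, s in targets) / len(targets)


def micro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Micro-averaged F1 for single-label multiclass predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    # Each wrong prediction is one false positive (predicted class)
    # and one false negative (true class).
    fp = fn = len(y_true) - tp
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean of absolute differences between actual and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
```

When every sample has exactly one label, micro-averaged F1 equals plain accuracy, which makes it a quick sanity check on a held-out slice of the training data.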

Data

Competition data is mounted read-only at /orwd_data. Each competition includes:

  • Training features and labels
  • Test features (labels hidden)
  • Data dictionaries and descriptions

Data is sourced from DrivenData competitions and stored on the OpenReward platform.
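
A first step in the sandbox is usually to see which files are present under the mount point. The sketch below assumes only the /orwd_data path stated above. The helper name is hypothetical, and the file layout under the mount is not documented here, so the code lists whatever it finds rather than expecting particular filenames.

```python
from pathlib import Path


def list_data_files(root: str = "/orwd_data") -> list[str]:
    """Return sorted relative paths of all regular files under root."""
    base = Path(root)
    if not base.is_dir():
        # Mount point missing (for example, when run outside the sandbox).
        return []
    return sorted(p.relative_to(base).as_posix() for p in base.rglob("*") if p.is_file())


if __name__ == "__main__":
    for name in list_data_files():
        print(name)
```

Because the mount is read-only, any derived files (features, models, predictions) have to be written somewhere else in the sandbox.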

Tools

Each variant provides CLI tools plus a submission tool:

Tool | Description
bash | Execute shell commands in the sandbox
glob | Find files by pattern
grep | Search file contents
ls | List directory contents
read | Read file contents
write | Write file contents
edit | Edit existing files
multi_edit | Make multiple edits
todo_write | Track task progress
submit_predictions | Submit predictions CSV for evaluation. Ends the episode.

Time Horizon

Multi-turn. Agents explore data, develop and train models, generate predictions, save to submission.csv, and submit for evaluation.
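
The final step of that workflow, saving predictions to submission.csv, might look like the sketch below. The column names in the usage example are hypothetical placeholders: each competition defines its own submission format, which should be taken from the competition's data descriptions rather than from this example.

```python
import csv
from typing import Iterable, Mapping, Sequence


def write_submission(
    path: str,
    fieldnames: Sequence[str],
    rows: Iterable[Mapping[str, object]],
) -> int:
    """Write prediction rows to a CSV file with a header; return the row count."""
    count = 0
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        for row in rows:
            writer.writerow(row)
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical columns; replace with the competition's required format.
    example_rows = [
        {"id": 1, "prediction": 0.25},
        {"id": 2, "prediction": 0.75},
    ]
    write_submission("submission.csv", ["id", "prediction"], example_rows)
```

Since submit_predictions ends the episode, it is worth reading the file back and checking the row count and header against the test features before submitting.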

Environment Difficulty

[Put environment difficulty here]

Other Environment Requirements

None. All evaluation is deterministic using competition-specific metrics.

Safety

Agents in SocialData work within sandboxed environments to develop ML models. The environment does not present direct safety risks.

Citation

@software{socialdata_openreward,
  title={SocialData: DrivenData Competition Environments for OpenReward},
  author={GeneralReasoning},
  year={2025},
  url={https://openreward.ai/GeneralReasoning/SocialData}
}