๐Ÿช EDM-ARS  ยท  v1.0 Pilot

Educational Data Mining
Automated Research System

A multi-agent LLM pipeline that turns a dataset and a research prompt into a complete, reviewer-ready academic paper.

5
Specialized Agents
7/10
Critic Quality Score
67 min
Demo Runtime
$7.57
API Cost Per Paper

Overview

What It Does

Inspired by FARS, EDM-ARS is a domain-specific multi-agent LLM pipeline that automates the complete workflow of prediction-focused educational data mining research. Given the HSLS:09 dataset and a research prompt, it formulates a research question, engineers features, trains and compares multiple ML models, runs SHAP explainability and subgroup fairness analysis, retrieves real citations via the Semantic Scholar API, and produces a complete LaTeX paper in ACM sigconf format. A built-in Critic agent enforces methodological rigor through automated peer review and targeted revision loops.

🎓

Domain-Specific Design

Built around the HSLS:09 longitudinal dataset, with a hand-curated 95-variable Tier 1 registry that encodes substantive educational meaning, not just column names.

📄

Real Academic Output

The Writer fills prose into a fixed ACM sigconf LaTeX skeleton with %%PLACEHOLDER%% markers and never generates boilerplate from scratch, so the document structure is identical on every run.
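The placeholder approach can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual template code: the skeleton text, the marker names (TITLE, ABSTRACT, INTRODUCTION), and the `fill_template` helper are all assumptions made for the example.

```python
import re

# Hypothetical skeleton: the real ACM sigconf template is not shown in the source.
SKELETON = r"""\documentclass[sigconf]{acmart}
\begin{document}
\title{%%TITLE%%}
\begin{abstract}
%%ABSTRACT%%
\end{abstract}
\section{Introduction}
%%INTRODUCTION%%
\end{document}
"""

MARKER = re.compile(r"%%([A-Z_]+)%%")


def fill_template(skeleton: str, sections: dict[str, str]) -> str:
    """Replace every %%NAME%% marker; fail loudly if a marker has no content."""
    missing = sorted(set(MARKER.findall(skeleton)) - set(sections))
    if missing:
        raise KeyError(f"no content for placeholders: {missing}")
    # A function replacement keeps re.sub from interpreting LaTeX backslashes.
    return MARKER.sub(lambda m: sections[m.group(1)], skeleton)
```

Because the LaTeX structure lives only in the skeleton, the model can affect prose but not the document class, environments, or section order. Refusing to render when a marker is unfilled is one way to catch an incomplete Writer output before compilation.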

Critic-Gated Revision Loop

After analysis, the Critic reviews all prior agents' outputs and can route targeted revisions back to any earlier stage, for up to 2 cycles, before writing begins.

💾

Checkpoint & Resume

Pipeline state is serialized to checkpoint.json after every stage. Interrupted runs resume from the last completed stage, so finished stages are not rerun.
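A checkpoint-and-resume loop of this kind might look like the sketch below. The stage names, the JSON layout (`completed`, `outputs`), and the function names are assumptions for illustration; only the file name checkpoint.json and the save-after-every-stage behavior come from the description above.

```python
import json
import os
import tempfile
from pathlib import Path

# Illustrative stage names mirroring the five agents.
STAGES = ["problem_formulator", "data_engineer", "analyst", "critic", "writer"]


def save_checkpoint(path: Path, state: dict) -> None:
    """Write to a temp file, then rename, so a crash cannot leave a half-written file."""
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f, indent=2)
    os.replace(tmp, path)


def load_checkpoint(path: Path) -> dict:
    if path.exists():
        return json.loads(path.read_text())
    return {"completed": [], "outputs": {}}


def run_pipeline(path: Path, handlers: dict) -> dict:
    """Run each stage once, skipping stages already recorded in the checkpoint."""
    state = load_checkpoint(path)
    for stage in STAGES:
        if stage in state["completed"]:
            continue  # finished in an earlier run
        state["outputs"][stage] = handlers[stage](state["outputs"])
        state["completed"].append(stage)
        save_checkpoint(path, state)
    return state
```

Calling `run_pipeline(Path("checkpoint.json"), handlers)` a second time after a failure re-enters at the first stage missing from `completed`. This sketch assumes stage outputs are JSON-serializable; larger artifacts such as trained models would need to be stored separately and referenced by path.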


Five-Agent Pipeline

๐Ÿ” 1
ProblemFormulator
Searches Semantic Scholar, scopes the research question & hypothesis
๐Ÿ›  2
DataEngineer
Cleans features, outputs subgroup_labels & column_mapping for fairness analysis
๐Ÿ“Š 3
Analyst
Trains model battery (LR, RF, XGBoost, ElasticNet, MLP, Stacking), runs SHAP & subgroup analysis in phased execution
โœ๏ธ 5
Writer
Fills structured results into ACM sigconf LaTeX template โ€” template-based, never free-form generation
โš–๏ธ  Agent 4 ยท Gatekeeper
Critic
Reviews all prior agents' outputs for methodological soundness.
Issues PASS / REVISE / ABORT verdicts.
claude-opus โ€” highest-tier model
🔄

Revision loop: on REVISE, targeted instructions are routed back selectively to ProblemFormulator, DataEngineer, or Analyst. The loop runs for at most 2 cycles, after which the Writer is unblocked regardless.
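The gating logic can be sketched as a small control loop. This is an assumption-laden illustration: the `Review` dataclass, the agent call signature, and the choice to rerun only the agents named in the Critic's instructions (without cascading to downstream stages) are not confirmed by the source.

```python
from dataclasses import dataclass, field
from typing import Callable

MAX_REVISION_CYCLES = 2  # from the description: at most 2 cycles
UPSTREAM = ["ProblemFormulator", "DataEngineer", "Analyst"]


@dataclass
class Review:
    verdict: str  # "PASS", "REVISE", or "ABORT"
    # Hypothetical shape: agent name -> targeted revision instruction.
    instructions: dict[str, str] = field(default_factory=dict)


def run_with_critic(agents: dict[str, Callable], critic: Callable, writer: Callable):
    """Run upstream agents, loop through Critic reviews, then hand off to the Writer."""
    outputs = {}
    for name in UPSTREAM:
        outputs[name] = agents[name](outputs, None)  # first pass, no instruction

    for cycle in range(MAX_REVISION_CYCLES + 1):
        review = critic(outputs)
        if review.verdict == "ABORT":
            raise RuntimeError("Critic aborted the run")
        if review.verdict == "PASS" or cycle == MAX_REVISION_CYCLES:
            break  # passed, or revision budget spent: unblock the Writer
        for name in UPSTREAM:
            if name in review.instructions:
                outputs[name] = agents[name](outputs, review.instructions[name])

    return writer(outputs)
```

With a Critic that always answers REVISE for the Analyst, this sketch runs the Analyst three times (one initial pass plus two revisions) and then proceeds to the Writer. Whether a final ABORT should still block the Writer after the budget is spent is a design decision the description leaves open; the sketch lets ABORT win.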


Features

Key Capabilities

🤖

5 Specialized Agents

Coordinated by a state-machine orchestrator. Each agent has its own system prompt, temperature, and model tier (Opus for Critic, Sonnet for all others).

📜

End-to-End Automation

From a raw CSV and a research prompt to a compiled ACM LaTeX paper, with real citations, methodology validation, and SHAP explainability figures.

🛡

Self-Healing Pipeline

Contract validation at every stage boundary. Auto-patching for classifiable errors (SHAP failure, dtype mismatch, missing column) before falling back to LLM repair.
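One way to picture the two mechanisms is the sketch below: a contract check at a stage boundary, and a small error taxonomy that decides between a deterministic patch and LLM repair. The contract keys, the regex patterns, and the function names are hypothetical; only the three error classes come from the description above.

```python
import re

# Hypothetical contract: keys and types the Analyst stage must hand downstream.
ANALYST_CONTRACT = {"metrics": dict, "shap_values_path": str, "subgroup_results": dict}


def validate_contract(output: dict, contract: dict) -> list[str]:
    """Return a list of violations; an empty list means the boundary check passed."""
    problems = []
    for key, expected in contract.items():
        if key not in output:
            problems.append(f"missing key: {key}")
        elif not isinstance(output[key], expected):
            actual = type(output[key]).__name__
            problems.append(f"{key}: expected {expected.__name__}, got {actual}")
    return problems


# Illustrative taxonomy: pattern on the traceback text -> error class.
ERROR_PATTERNS = [
    (re.compile(r"KeyError: '\w+'"), "missing_column"),
    (re.compile(r"could not convert string to float|dtype"), "dtype_mismatch"),
    (re.compile(r"shap", re.IGNORECASE), "shap_failure"),
]


def classify_error(traceback_text: str) -> str:
    for pattern, label in ERROR_PATTERNS:
        if pattern.search(traceback_text):
            return label
    return "unclassified"


def choose_repair(traceback_text: str) -> str:
    """Prefer a deterministic patch; fall back to LLM repair for unknown errors."""
    label = classify_error(traceback_text)
    return f"auto_patch:{label}" if label != "unclassified" else "llm_repair"
```

For example, `choose_repair("KeyError: 'X1SES'")` selects the missing-column patch, while an unrecognized traceback is routed to LLM repair. Trying the cheap, deterministic path first keeps API cost down and makes common failures reproducible.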

📚

Live Academic Citations

The ProblemFormulator queries the Semantic Scholar API with exponential-backoff retry logic to retrieve and validate real, current citations.
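Exponential backoff can be illustrated with a generic retry helper. This is a sketch under stated assumptions: the helper name, the default delays, the jitter amount, and the set of retryable exceptions are illustrative, and the actual HTTP request is left to the caller.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    retry_on: tuple = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `call()` until it succeeds, doubling the wait after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # budget exhausted: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Up to 10% jitter so parallel clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise AssertionError("unreachable")
```

In use, `call` would wrap the citation search request and raise one of the `retry_on` exceptions on a rate limit or timeout. With the defaults, the waits are roughly 1, 2, 4, and 8 seconds before the fifth and final attempt. Passing `sleep` as a parameter makes the helper testable without real delays.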

⚗️

6-Model Battery

Logistic Regression, Random Forest, XGBoost, ElasticNet, MLP, and a Stacking Ensemble are trained, compared, and reported with SHAP explainability where applicable.

๐Ÿณ

Docker Sandboxing

LLM-generated analysis code executes inside a Docker sandbox (network-disabled). Gracefully falls back to subprocess when Docker is unavailable.
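A sandbox runner with a subprocess fallback might be structured as follows. The function names, the `python:3.11-slim` image, the `/work` mount point, and the timeout are placeholders chosen for this sketch; a real analysis image would also need the ML libraries installed.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def build_command(script: Path, use_docker: bool, image: str = "python:3.11-slim") -> list[str]:
    """Build the argv for running `script` in Docker or in a local subprocess."""
    if use_docker:
        workdir = script.resolve().parent
        return [
            "docker", "run", "--rm",
            "--network", "none",          # no network access inside the sandbox
            "-v", f"{workdir}:/work",     # mount the script's directory
            "-w", "/work",
            image, "python", script.name,
        ]
    return [sys.executable, str(script)]  # fallback: plain subprocess, no isolation


def docker_available() -> bool:
    """True only if the docker CLI exists and the daemon answers."""
    if shutil.which("docker") is None:
        return False
    try:
        probe = subprocess.run(["docker", "info"], capture_output=True, timeout=10)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return probe.returncode == 0


def run_sandboxed(script: Path, timeout: int = 600) -> subprocess.CompletedProcess:
    cmd = build_command(script, use_docker=docker_available())
    return subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
```

Note that the fallback path runs generated code with the privileges of the calling process, so it trades isolation for availability. Keeping command construction in its own function makes both paths easy to inspect and test without a Docker daemon.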

Demo Run Stats

Numbers from the first end-to-end pipeline run, which produced a complete ACM sigconf paper on HSLS:09 college-enrollment prediction.

7/10
Critic Quality Score
2
Revision Cycles
67 min
Total Pipeline Runtime
$7.57
API Cost (vs $5 target)

Built With

Tech Stack

Core Pipeline
Python 3.11, Anthropic API, Claude Sonnet, Claude Opus (Critic), Docker, PyYAML
ML & Analysis
scikit-learn, XGBoost, SHAP, pandas, matplotlib, seaborn
Data & Literature
HSLS:09 Dataset, Semantic Scholar API, 95-var Tier 1 Registry
Output
ACM sigconf LaTeX, Template-Based Generation, SHAP Figures

Pilot v1.0 & Beyond

The current release targets prediction tasks on HSLS:09 only. Future task modules will expand EDM-ARS into a full research automation platform for educational data science.

✅  Pilot v1.0 (Completed)
Prediction Pipeline: end-to-end ML prediction workflow on HSLS:09
Self-Healing Architecture: contract validation, phased Analyst execution, error taxonomy, and auto-patching
Docker Sandbox: isolated code execution with subprocess fallback
LaTeX Template System: fixed ACM skeleton with %%PLACEHOLDER%% markers
Checkpoint & Resume: pipeline state persisted after every stage

Beyond v1.0 (Planned)
Multi-Dataset Support: expanding beyond HSLS:09 to broader educational datasets
Causal Inference Module: uncovering causal relationships in educational data
Transfer Learning Module: generalizing findings across datasets and populations
Psychometrics Module: measuring cognitive abilities and learning outcomes
Synthetic Data Generation Module: privacy-preserving training data augmentation