MMU-RAG


NeurIPS 2025 Competition

Welcome to the official website of MMU-RAG: the Massive Multi-Modal User-Centric Retrieval-Augmented Generation Benchmark. This competition invites researchers and developers to build RAG systems that perform in real-world conditions.

Participants will tackle real-user queries, retrieve from web-scale corpora, and generate high-quality responses in text and/or video form.

MMU-RAG features two tracks:

  1. Text-to-Text
  2. Text-to-Video

Submissions are evaluated using a blend of:

  • Automatic metrics
  • LLM-as-a-judge evaluations
  • Real-time human feedback through our interactive RAG-Arena platform

Evaluation Methods and Metrics

Our static evaluation methods and their corresponding metrics are summarized below.

Text-to-Text
  • Automatic: ROUGE-L, BERTScore
  • LLM-as-a-Judge: Semantic Similarity, Coverage, Factuality, Citation Quality
  • Human: Likert Ratings

Text-to-Video
  • Automatic: Subject Consistency, Background Consistency, Motion Smoothness, Dynamic Degree, Aesthetic Quality, Imaging Quality (from VBench)
  • LLM-as-a-Judge: Relevance, Precision, Recall, Usefulness
  • Human: Likert Ratings
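
For a rough local check of the automatic text-to-text metrics, both can be computed with standard open-source packages. The sketch below uses the rouge_score and bert_score Python libraries; the organizers' exact scoring configuration is not specified here, so treat this as an approximation rather than the official pipeline.

    # Minimal sketch: computing ROUGE-L and BERTScore locally.
    # Assumes `pip install rouge-score bert-score`; the official scoring
    # setup may differ, so use this only as a rough sanity check.
    from rouge_score import rouge_scorer
    from bert_score import score as bert_score

    reference = "Peel the banana from the stem end, pulling each strip down."
    candidate = "Start at the stem and pull the peel down in strips."

    # ROUGE-L: longest-common-subsequence overlap with the reference.
    scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
    rouge_l = scorer.score(reference, candidate)["rougeL"].fmeasure

    # BERTScore: semantic similarity from contextual token embeddings.
    precision, recall, f1 = bert_score([candidate], [reference], lang="en")

    print(f"ROUGE-L F1:  {rouge_l:.3f}")
    print(f"BERTScore F1: {f1.item():.3f}")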

Whether you’re advancing retrieval strategies, generation quality, or multimodal outputs, this is your opportunity to benchmark your system in a setting that reflects actual user needs.


Timeline

Aug 1: Competition launch & dataset release

Both tracks come with provided corpora, APIs, and starter code. You may also use external resources or APIs for retrieval, as long as they are clearly documented in your submission.

Text-to-Text (details)
  Standard text-to-text RAG: build systems that retrieve from a text corpus and generate text responses to text queries. Deep research systems are welcome, e.g., multi-hop retrieval, structured reasoning, or integration with external tools and knowledge bases. (A minimal retrieval sketch follows below.)

Text-to-Video (details)
  A more novel task: given text queries that benefit from video outputs (e.g., "how to peel a banana"), retrieve from a text corpus and generate video responses.
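
To make the text-to-text setup concrete, here is the minimal retrieval-plus-prompt sketch referenced above. It uses the open-source rank_bm25 package over a toy corpus; the corpus, query, and prompt format are illustrative placeholders, not the competition's provided corpora or APIs.

    # Minimal text-to-text RAG sketch (not the official starter code).
    # Assumes `pip install rank-bm25`; the corpus and query are toy data.
    from rank_bm25 import BM25Okapi

    corpus = [
        "Bananas are easiest to peel from the stem end.",
        "RAG systems retrieve evidence before generating an answer.",
        "VBench measures video quality along several axes.",
    ]
    query = "How do RAG systems produce answers?"

    # Lexical retrieval: rank documents by BM25 score against the query.
    tokenized = [doc.lower().split() for doc in corpus]
    bm25 = BM25Okapi(tokenized)
    top_docs = bm25.get_top_n(query.lower().split(), corpus, n=2)

    # Assemble a grounded prompt; a real system would pass this to its
    # generator (an LLM) to produce the final text response.
    prompt = "Answer using the evidence below.\n"
    prompt += "\n".join(f"- {doc}" for doc in top_docs)
    prompt += f"\nQuestion: {query}\nAnswer:"
    print(prompt)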

Aug 1 - Oct 15: ACTION REQUIRED: Register to get the necessary resources

  • Go to the Getting Started page to see:
    • Our competition rules
    • Instructions on registration (required)
    • Detailed instructions for the two tracks

Aug 1 - Oct 15: ACTION REQUIRED: Competition Submission

  • Step-by-step instructions for the text-to-text and text-to-video tracks.
  • Submission options preview (applicable to both tracks):

Static Evaluation (Non-Cash Prizes)
  • Run your system on the public validation set
  • Submit outputs (a .jsonl file or a video folder) via Google Drive; a sketch of the .jsonl format follows this list
  • Eligible for honorable mentions and website features

Full System Submission (Cash Prizes)
  • Package your RAG system as a Docker image
  • Submit via AWS ECR for live + static evaluation
  • Eligible for leaderboard rankings and cash prizes
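
For the Static Evaluation option, a .jsonl submission contains one prediction per line. The sketch below shows one way to write such a file; the field names (query_id, response) are hypothetical, so consult the track instructions for the exact schema.

    # Sketch of writing a .jsonl predictions file for static evaluation.
    # Field names ("query_id", "response") are hypothetical; check the
    # track instructions on the Getting Started page for the real schema.
    import json

    predictions = [
        {"query_id": "val-0001", "response": "Peel from the stem end ..."},
        {"query_id": "val-0002", "response": "RAG systems first retrieve ..."},
    ]

    # JSON Lines: one JSON object per line, newline-terminated.
    with open("predictions.jsonl", "w", encoding="utf-8") as f:
        for record in predictions:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")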

Oct 15–21: Organizers run evaluations

  • Submissions will be evaluated using a blend of:
    • Automatic metrics
    • LLM-as-a-judge evaluations
    • Real-time user feedback from our RAG-Arena

Action required: All participants are required to submit a report detailing their system, methods, and results. Top-performing and innovative teams will be invited to present their work at our associated NeurIPS 2025 workshop. Further details on the report format and submission deadlines will be announced soon.

Dec 6-7: MMU-RAG Workshop at NeurIPS 2025

  • Presentations by selected teams
  • Winners and runners-up announced

Prizes

We’re excited to offer both monetary prizes and academic exposure opportunities to recognize outstanding submissions.

💰 Prize Pool

Thanks to the support of Amazon, MMU-RAG offers a $10,000 prize pool in AWS credits. Prizes will be awarded to top-performing teams across both tracks.

🎤 Present at NeurIPS

Top teams will also be invited to present their systems during the MMU-RAG competition session at NeurIPS 2025. This is a unique opportunity to share your work with the community.

🥇 Eligibility

Prize eligibility requires full system reproducibility and clear documentation of all components. Only participants in the Full System Submission option are eligible for cash prizes.


Contact Us

For any questions or clarifications, email the organizers directly at: mmu-rag@andrew.cmu.edu

Organizers

  • Luo Qi Chan, DSO National Laboratories / Carnegie Mellon University
  • Tevin Wang, Carnegie Mellon University
  • Shuting Wang, Renmin University of China / Carnegie Mellon University
  • Zhihan Zhang, Carnegie Mellon University
  • Alfredo Gomez, Carnegie Mellon University
  • Prahaladh Chandrahasan, Carnegie Mellon University
  • Lan Yan, Carnegie Mellon University
  • Andy Tang, Carnegie Mellon University
  • Zimeng (Chris) Qiu, Amazon AGI
  • Morteza Ziyadi, Amazon AGI
  • Sherry Wu, Carnegie Mellon University
  • Mona Diab, Carnegie Mellon University
  • Akari Asai, University of Washington
  • Chenyan Xiong, Carnegie Mellon University