Welcome to the official website of MMU-RAG: the Massive Multi-Modal User-Centric Retrieval-Augmented Generation Benchmark. This competition invites researchers and developers to build RAG systems that perform in real-world conditions.
Participants will tackle real-user queries, retrieve from web-scale corpora, and generate high-quality responses in text and/or video formats.
MMU-RAG features two tracks:
- Text-to-Text
- Text-to-Video
Submissions are evaluated using a blend of:
- Automatic metrics
- LLM-as-a-judge evaluations
- Real-time human feedback through our interactive RAG-Arena platform
Evaluation Methods and Metrics
Illustration of our static evaluation methods and their corresponding metrics.
| Track | Evaluation Method | Evaluation Metric |
|---|---|---|
| Text-to-Text | Automatic | ROUGE-L, BERTScore |
| Text-to-Text | LLM-as-a-Judge | Semantic Similarity, Coverage, Factuality, Citation Quality |
| Text-to-Text | Human | Likert Ratings |
| Text-to-Video | Automatic | Subject Consistency, Background Consistency, Motion Smoothness, Dynamic Degree, Aesthetic Quality, Imaging Quality (from VBench) |
| Text-to-Video | LLM-as-a-Judge | Relevance, Precision, Recall, Usefulness |
| Text-to-Video | Human | Likert Ratings |
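As a rough illustration of the automatic text metrics: ROUGE-L scores a response by the longest common subsequence (LCS) of tokens it shares with a reference answer. A minimal sketch using whitespace tokenization (the official evaluation may tokenize and aggregate differently):

```python
def lcs_length(a: list[str], b: list[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[len(a)][len(b)]

def rouge_l(hypothesis: str, reference: str) -> float:
    """ROUGE-L F1 over whitespace tokens."""
    hyp, ref = hypothesis.split(), reference.split()
    lcs = lcs_length(hyp, ref)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(hyp), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)

print(round(rouge_l("the cat on the mat", "the cat sat on the mat"), 3))  # → 0.909
```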
Whether you’re advancing retrieval strategies, generation quality, or multimodal outputs, this is your opportunity to benchmark your system in a setting that reflects actual user needs.
Timeline
Aug 1: Competition launch & dataset release
Two exciting tracks, both with provided corpora, APIs, and starter code. You may also use external resources or APIs for retrieval, as long as they are clearly documented in your submission.
| Text-to-Text (details) | Text-to-Video (details) |
|---|---|
| Standard text-to-text RAG: build systems that retrieve from a text corpus and generate text responses to text queries. Deep Research systems are welcome, e.g. multi-hop retrieval, structured reasoning, and integration with external tools or knowledge bases. | A more novel task: given text queries that benefit from video outputs (e.g. "how to peel a banana"), retrieve from a text corpus and generate video responses. |
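The text-to-text track follows the standard retrieve-then-generate loop. A toy sketch of that loop, using keyword-overlap retrieval over an in-memory corpus (the corpus, scoring function, and `generate` stub are all placeholders; real systems would use the provided corpora, APIs, or documented external retrievers):

```python
# Toy retrieve-then-generate loop; all data and scoring here are placeholders.
CORPUS = {
    "doc1": "Bananas are easiest to peel from the bottom end.",
    "doc2": "RAG systems combine retrieval with text generation.",
}

def retrieve(query: str, k: int = 1) -> list[str]:
    """Rank documents by word overlap with the query (stand-in for a real retriever)."""
    q = set(query.lower().split())
    scored = sorted(CORPUS, key=lambda d: len(q & set(CORPUS[d].lower().split())), reverse=True)
    return scored[:k]

def generate(query: str, doc_ids: list[str]) -> str:
    """Stub generator: a real system would call an LLM with the retrieved context."""
    context = " ".join(CORPUS[d] for d in doc_ids)
    return f"Q: {query} | Context: {context}"

print(generate("how to peel a banana", retrieve("how to peel a banana")))
```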
Aug 1 - Oct 15: ACTION REQUIRED: Register to get necessary resources
- Go to Getting Started page to see:
- Our competition rules
- Instructions on registration (required)
- Detailed instructions for the two tracks
Aug 1 - Oct 15: ACTION REQUIRED: Competition Submission
- Step-by-step instructions for the text-to-text and text-to-video tracks.
- Submission options preview (applicable for both tracks):
| Static Evaluation (Non-Cash Prizes) | Full System Submission (Cash Prizes) |
|---|---|
| Run your system on the public validation set. Submit outputs (.jsonl or a video folder) via Google Drive. Eligible for honorable mentions and website features. | Package your RAG system as a Docker image. Submit via AWS ECR for live and static evaluation. Eligible for leaderboard rankings and cash prizes. |
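For the static-evaluation option, outputs are submitted as a `.jsonl` file: one JSON object per line. A minimal sketch of writing such a file; the field names here (`query_id`, `response`, `citations`) are hypothetical, so use the exact schema given in the track instructions:

```python
import json

# Hypothetical output records; the actual schema is defined in the track instructions.
outputs = [
    {"query_id": "q001", "response": "Peel the banana from the bottom end.", "citations": ["doc1"]},
    {"query_id": "q002", "response": "...", "citations": []},
]

with open("submission.jsonl", "w", encoding="utf-8") as f:
    for record in outputs:
        f.write(json.dumps(record) + "\n")  # one JSON object per line
```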
Oct 15–21: Organizers Running Evaluations
- Submissions will be evaluated using a blend of:
- Automatic metrics
- LLM-as-a-judge evaluations
- Real-time user feedback from our RAG-Arena
Action required: All participants are required to submit a report detailing their system, methods, and results. Top-performing and innovative teams will be invited to present their work at our associated NeurIPS 2025 workshop. Further details on the report format and submission deadlines will be announced soon.
Dec 6-7: MMU-RAG Workshop at NeurIPS 2025
- Presentations by selected teams
- Winners and runners-up announced
Prizes
We’re excited to offer both monetary prizes and academic exposure opportunities to recognize outstanding submissions.
💰 Prize Pool
Thanks to the support of Amazon, MMU-RAG offers a $10,000 prize pool in AWS credits. Prizes will be awarded to top-performing teams across both tracks.
🎤 Present at NeurIPS
Top teams will also be invited to present their systems during the MMU-RAG competition session at NeurIPS 2025. This is a unique opportunity to share your work with the community.
🥇 Eligibility
Prize eligibility requires full system reproducibility and clear documentation of all components. Only participants in the Full System Submission option are eligible for cash prizes.
Contact Us
For any questions or clarifications, email the organizers directly at: mmu-rag@andrew.cmu.edu
Organizers
- Luo Qi Chan, DSO National Laboratories / Carnegie Mellon University
- Tevin Wang, Carnegie Mellon University
- Shuting Wang, Renmin University of China / Carnegie Mellon University
- Zhihan Zhang, Carnegie Mellon University
- Alfredo Gomez, Carnegie Mellon University
- Prahaladh Chandrahasan, Carnegie Mellon University
- Lan Yan, Carnegie Mellon University
- Andy Tang, Carnegie Mellon University
- Zimeng (Chris) Qiu, Amazon AGI
- Morteza Ziyadi, Amazon AGI
- Sherry Wu, Carnegie Mellon University
- Mona Diab, Carnegie Mellon University
- Akari Asai, University of Washington
- Chenyan Xiong, Carnegie Mellon University