Computer vision in healthcare

How CrunchDAO is Advancing Autoimmune Disease Research at the Broad Institute of MIT and Harvard

Background

Autoimmune diseases occur when the immune system mistakenly attacks healthy cells, and together they affect an estimated 50 million people in the United States. Inflammatory Bowel Disease (IBD) is one of the most prevalent forms, with cases rising globally. Patients with IBD face up to twice the risk of developing colorectal cancer. This cancer is highly treatable when detected early, but current diagnostic methods rely on expensive, time-intensive spatial genomics measurements, which limits their adoption in patient care.

Side-by-side comparison of the same colon tissue sample: the registered H&E image (left), aligned to spatial transcriptomics coordinates, and the original H&E image (right), in native pixel coordinates.

The Eric and Wendy Schmidt Center at the Broad Institute of MIT and Harvard set out to explore whether machine learning could predict, directly from standard pathology images, the gene expression patterns that signal pre-cancerous tissue.

While the center's in-house teams had deep domain knowledge, connecting routine H&E pathology images to spatial transcriptomics data is a complex problem that could benefit from diverse methodological approaches. The institute sought to complement its existing capabilities with fresh perspectives from the broader ML research community.

The Challenge

The Broad Institute partnered with CrunchDAO to draw on the collective intelligence of our community of more than 10,000 ML practitioners. The Autoimmune Disease Machine Learning Challenge launched in October 2024 and attracted hundreds of researchers from prestigious institutions worldwide.

The competition structured the problem into three progressive phases:

Crunch 1: Participants predicted gene expression in spatial transcriptomics data from matched pathology images, effectively translating visual tissue patterns into molecular insights.
Crunch 2: Participants predicted 2,000 unmeasured genes by combining the spatial data from Crunch 1 with single-cell RNA sequencing data, inferring comprehensive gene activity from limited spatial measurements.
Crunch 3: Participants ranked genes by their ability to distinguish dysplasia (pre-cancerous regions) from healthy tissue, the basis for a potential diagnostic tool for early cancer detection.
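
To make the Crunch 3 task concrete, here is a minimal sketch of one way to rank genes by how well they separate dysplasia from healthy tissue. The article does not say how the challenge scored rankings, so the metric used here (area under the ROC curve, computed per gene) is an assumption, and the gene names and expression values are made up for illustration.

```python
def auroc(pos, neg):
    """Probability that a random positive value exceeds a random negative one.

    Ties count as half a win. This is the pairwise definition of AUROC.
    """
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rank_genes(expression, labels):
    """Rank genes by how well expression separates dysplasia (1) from healthy (0).

    expression: dict mapping gene name -> list of per-region expression values
    labels:     list of 0/1 region labels, in the same order as those lists
    """
    scores = {}
    for gene, values in expression.items():
        pos = [v for v, y in zip(values, labels) if y == 1]
        neg = [v for v, y in zip(values, labels) if y == 0]
        auc = auroc(pos, neg)
        # A gene that is strongly lower in dysplasia is as informative
        # as one that is strongly higher, so score both directions alike.
        scores[gene] = max(auc, 1.0 - auc)
    return sorted(scores, key=scores.get, reverse=True)


# Toy data: hypothetical gene names and invented values, illustration only.
labels = [1, 1, 1, 0, 0, 0]
expression = {
    "GENE_A": [5.0, 6.0, 7.0, 1.0, 2.0, 3.0],  # clearly higher in dysplasia
    "GENE_B": [1.0, 5.0, 3.0, 2.0, 4.0, 6.0],  # weak separation
    "GENE_C": [1.0, 2.0, 1.5, 4.0, 5.0, 6.0],  # clearly lower in dysplasia
}
ranking = rank_genes(expression, labels)
print(ranking)
```

In this toy example the two genes that separate the groups cleanly rank ahead of the weakly separating one. A real entry would work from predicted expression across many regions and would likely need to handle class imbalance and multiple testing.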

Solution

Our decentralized approach drew on this collective intelligence to explore diverse methodologies that would be unlikely to emerge from any single research team in the same amount of time. By bringing together hundreds of researchers from different institutions and backgrounds, the challenge surfaced multiple novel approaches to the same complex problem in parallel.

Some of the approaches that stood out:

DeepSpot Enhancement: Kalin Nonchev from ETH Zurich extended the DeepSpot methodology to support 10x Genomics Xenium data, dramatically improving single-cell gene expression predictions.
Contrastive Learning: Alexis Gassmann applied contrastive learning to build shared embedding spaces between pathology images, gene expression data, and spatial coordinates.
Creative Proxy Supervision: Team Cellmates addressed the fundamental problem of predicting genes with no training data by using FAISS, a similarity-search library, to find similar single-cell samples, turning unlabeled data into a supervised learning task.
Vision-Attention Architecture: The team from IIT Ropar combined vision transformers with multi-head attention networks, demonstrating how modern AI architectures could tackle complex biological problems.
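
The proxy-supervision idea in the list above can be sketched roughly as follows. This is not Team Cellmates' code: it swaps FAISS for a brute-force nearest-neighbour search from the Python standard library so the example stays self-contained, and every function name, parameter, and number in it is hypothetical. The assumption is that each spatial spot and each single cell share a set of measured genes, and that the single-cell data also covers genes the spatial assay did not measure.

```python
import math


def nearest_cells(spot, cells, k):
    """Indices of the k single-cell profiles closest to `spot`.

    Distance is Euclidean, computed over the shared genes only.
    """
    order = sorted(range(len(cells)), key=lambda i: math.dist(spot, cells[i]))
    return order[:k]


def proxy_labels(spots_shared, cells_shared, cells_unmeasured, k=2):
    """Build proxy training targets for genes missing from the spatial data.

    spots_shared:     per-spot expression of genes measured in both datasets
    cells_shared:     per-cell expression of those same genes
    cells_unmeasured: per-cell expression of genes absent from the spatial data
    """
    n_genes = len(cells_unmeasured[0])
    targets = []
    for spot in spots_shared:
        idx = nearest_cells(spot, cells_shared, k)
        # Average the neighbours' values gene by gene: one proxy target per gene.
        targets.append(
            [sum(cells_unmeasured[i][g] for i in idx) / k for g in range(n_genes)]
        )
    return targets


# Toy data, invented for illustration: two clusters of cells, one unmeasured gene.
cells_shared = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
cells_unmeasured = [[1.0], [3.0], [20.0], [22.0]]
spots_shared = [[0.0, 0.4], [10.0, 10.6]]

targets = proxy_labels(spots_shared, cells_shared, cells_unmeasured, k=2)
print(targets)
```

The resulting targets could then serve as labels for an ordinary supervised model that maps image features to the unmeasured genes. At realistic scale, an approximate nearest-neighbour index such as FAISS would replace the brute-force search.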

Kalin Nonchev from ETH Zurich's Department of Computer Science secured first place with his DeepSpot methodology.

DeepSpot's application generated 1,792 spatial transcriptomics samples from The Cancer Genome Atlas (TCGA) cohorts, analyzing 37 million spots across melanoma and renal cell carcinoma datasets. The methodology demonstrated multi-cancer validation across metastatic melanoma, kidney, lung, and colon cancers, achieving significant improvement in gene correlation compared to existing methods.
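
The "gene correlation" mentioned above is not defined in this article. One common way to evaluate predicted expression, assumed here purely for illustration, is to compute the Pearson correlation between predicted and measured values for each gene across spots and then average over genes. The function names and numbers below are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # convention chosen here for constant vectors
    return cov / math.sqrt(vx * vy)


def gene_correlations(predicted, measured):
    """Per-gene Pearson correlation and its mean.

    Both inputs are lists of rows, one row per spot and one column per gene.
    """
    n_genes = len(measured[0])
    per_gene = []
    for g in range(n_genes):
        pred_col = [row[g] for row in predicted]
        true_col = [row[g] for row in measured]
        per_gene.append(pearson(pred_col, true_col))
    return per_gene, sum(per_gene) / n_genes


# Toy data, invented for illustration: four spots, two genes.
measured = [[1.0, 4.0], [2.0, 3.0], [3.0, 2.0], [4.0, 1.0]]
predicted = [[2.0, 1.0], [4.0, 2.0], [6.0, 3.0], [8.0, 4.0]]

per_gene, mean_corr = gene_correlations(predicted, measured)
print(per_gene, mean_corr)
```

Here the first gene is predicted perfectly up to scale and the second is predicted with the wrong sign, so the per-gene view is more informative than the mean alone.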

Following the conclusion of the challenge in March 2025, Schmidt Center scientists analyzed the top-performing models and ordered a custom dysplasia gene panel based on AI predictions for cancer detection. Custom gene panels are now in manufacturing, with wet-lab validation experiments launching soon. Winners will be announced in December 2025.

Impact

The challenge engaged top researchers from institutions including ETH Zurich, University of Cambridge, IIT Ropar, Technology Innovation Institute UAE, and University of Miami, giving the Broad Institute access to a global talent pool and helping it maximize the return on its investment.

“This challenge demonstrates how introducing the global machine learning community to a biological problem can accelerate scientific and clinical discoveries,” said Ramnik Xavier, director of the Klarman Cell Observatory (KCO), gastroenterologist at Massachusetts General Hospital, and Kurt J. Isselbacher Professor of Medicine at Harvard Medical School. “Looking beyond the boundaries of one domain reveals opportunities we wouldn’t find alone.”

Conclusion

The Broad Institute's decision to manufacture gene panels based on competition results showcases institutional confidence in community-led scientific discovery.

This collaboration establishes a new model for accelerating medical research breakthroughs.
