Turbine’s Post

Turbine · 5,760 followers

Current single-cell foundational models, including scGPT, can underperform simpler approaches in predicting cellular responses to perturbations, highlighting significant limitations in existing benchmark datasets:

🌲 Turbine's benchmarking showed scGPT lagging behind simpler approaches such as averaging training samples and a Random Forest (a minimal sketch of the averaging baseline follows below).

🐾 Models built on biologically relevant features can significantly outperform scGPT.

🔻 The Perturb-Seq datasets used for benchmarking are limited by low perturbation counts and a lack of intra-perturbation variance.

⛓ Pseudo-bulked expression profiles outperform foundational models, suggesting little-to-no advantage to single-cell-level modeling, especially in low-heterogeneity cell lines.

These findings suggest revisiting benchmarking practices for a more effective evaluation of post-perturbation gene expression prediction. Read more in the preprint: https://lnkd.in/dKBGpYxr
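For context, the "averaging training samples" baseline can be written in a few lines. The sketch below uses random toy data and hypothetical variable names (X_train, X_test), not the preprint's actual datasets: it predicts the mean training profile for every held-out perturbation and scores it with Pearson correlation.

import numpy as np

# Toy data: rows = perturbation expression profiles, columns = genes.
# X_train and X_test are random placeholders, not real Perturb-Seq data.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(80, 2000))   # 80 training perturbations
X_test = rng.normal(size=(20, 2000))    # 20 held-out perturbations

# Baseline: predict the same mean training profile for every held-out
# perturbation, ignoring perturbation identity entirely.
mean_profile = X_train.mean(axis=0)

# Score: Pearson correlation between predicted and observed profiles,
# averaged over the held-out perturbations.
scores = [np.corrcoef(mean_profile, obs)[0, 1] for obs in X_test]
print(f"mean Pearson r: {np.mean(scores):.3f}")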

Bence Szalai

computational systems biologist | principal bioinformatics scientist & research team lead at Turbine

Accurately predicting #cellular responses to #perturbations is crucial for understanding cell behavior in both healthy and diseased states. Recently, several large language model (LLM)-based single-cell #foundational models have been proposed for this task. But how well do they actually perform? At Turbine, we ran #benchmarks on one of these models, scGPT, and uncovered some surprising results; see our new preprint: https://lnkd.in/dmHKbYQj

Even simple models, like averaging training samples, outperformed scGPT. A straightforward machine learning model, a Random Forest incorporating biologically meaningful features, outperformed it by a large margin.

We also found that current Perturb-Seq benchmark datasets generally contain a low number of perturbations and lack intra-perturbation variance, limiting their usefulness for robust benchmarking. Additionally, models using pseudo-bulked expression profiles outperformed foundation models, suggesting that single-cell-level modeling may offer little advantage, especially in low-heterogeneity cell lines (a toy sketch of pseudo-bulking and the Random Forest setup follows below).

Our findings reveal important limitations in current benchmarking practices and offer new insights for a more effective evaluation of post-perturbation gene expression prediction models. Thanks to the coauthors, Kristóf Szalay and especially Gerold Csendes, who led these efforts.
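To make the pseudo-bulking and Random Forest ideas concrete, here is a minimal, self-contained sketch. All data are random placeholders and all names (perturbation labels, feature columns) are hypothetical; the preprint's actual features are biologically meaningful ones, which the random vectors below only stand in for.

import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
n_cells, n_genes = 600, 50

# Toy single-cell counts: rows = cells, columns = genes, plus a label
# recording which perturbation each cell received.
cells = pd.DataFrame(rng.poisson(2.0, size=(n_cells, n_genes)),
                     columns=[f"gene_{i}" for i in range(n_genes)])
cells["perturbation"] = rng.choice([f"KO_{i}" for i in range(30)], size=n_cells)

# Pseudo-bulking: collapse single cells into one averaged expression
# profile per perturbation.
pseudobulk = cells.groupby("perturbation").mean()

# Placeholder per-perturbation features; in practice these would be
# biologically meaningful (e.g. properties of the targeted gene).
features = pd.DataFrame(rng.normal(size=(len(pseudobulk), 10)),
                        index=pseudobulk.index)

# Fit a multi-output Random Forest on some perturbations and predict
# the pseudo-bulk profiles of the held-out ones.
train_ids = pseudobulk.index[:20]
test_ids = pseudobulk.index[20:]
rf = RandomForestRegressor(n_estimators=200, random_state=0)
rf.fit(features.loc[train_ids], pseudobulk.loc[train_ids])
pred = rf.predict(features.loc[test_ids])
print(pred.shape)  # (held-out perturbations, genes)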


The billion-dollar question is how long this lag will last before the GPTs catch up and overtake. Experience shows it's typically not very long: http://www.incompleteideas.net/IncIdeas/BitterLesson.html

