Using microscopy data to create powerful foundation models of cellular biology.
◾ In a new paper, researchers from Recursion and Valence Labs reveal the next level of scaling microscopy images for use in biological research, offering a significant improvement over weakly supervised learning (WSL). They are presenting their findings in a keynote and poster presentation at the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) on June 18 & 19.
◾ Images have been a powerful means of exploring the chemical and genetic changes that happen in cells, and massive datasets are now at our disposal, such as RxRx3, a publicly available database of 2.2 million images that represents less than 1% of Recursion’s total dataset. Even so, developing robust feature extraction pipelines using open source software packages has been challenging.
◾ One approach has been to use WSL to train models that predict the perturbations used to treat cells in an image, but these models have limitations on larger datasets, including forgetting known biological relationships. In part, this is because perturbations are noisy labels: many cluster into groups with similar effects, and many have no significant effect.
◾ In the preprint, the researchers demonstrate a new framework, based on self-supervised learning, for learning representations of high content screening datasets. They hypothesized that more compute, more data, and more parameters would improve performance (a.k.a. the "scaling hypothesis").
◾ Using the BioHive supercomputer (more compute), Recursion’s public and private datasets (more data), and vision transformer (ViT) masked autoencoders (more parameters), the team found that they could scale microscopy-based representations of cellular biology that accurately infer known biological relationships without losing recall, achieving as much as an 11.5% relative improvement when recalling known biological relationships curated from public databases.
◾ What’s more, they developed a new channel-agnostic masked autoencoder architecture (CA-MAE) that accepts images with different numbers and orders of channels.
Learn more: https://lnkd.in/einj4_5N
Oren Kraus Kian Kenyon-Dean Berton Earnshaw Saber Saberian Maryam Fallah Peter McLean Jess Leung Vasudev Sharma Ayla Khan Safiye C. Dominique Beaini Maciej Sypetkowski Maureen Makes Kristen Morse Ben Mabey
#ai #ml #tech #techbio #wsl #vit #mae #data #cvpr
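For readers unfamiliar with masked autoencoders, the idea of a channel-agnostic input can be sketched roughly as follows. This is only an illustration and not the CA-MAE implementation from the paper: it assumes that each channel is split into its own patch tokens, so images with 5 or 6 channels simply yield more or fewer tokens, and that a random subset of tokens is hidden for the model to reconstruct. The function names, the toy image, and the 0.75 mask ratio (a common choice for masked autoencoders) are all assumptions made for this example.

```python
import random


def patchify_per_channel(image, patch):
    """Split a multi-channel image into per-channel patch tokens.

    image: list of channels, each a list of rows (H x W pixels);
           H and W are assumed to be divisible by `patch`.
    Returns a list of (channel, patch_row, patch_col, flat_pixels) tokens.
    """
    tokens = []
    for c, channel in enumerate(image):
        height, width = len(channel), len(channel[0])
        for top in range(0, height, patch):
            for left in range(0, width, patch):
                pixels = [channel[top + dy][left + dx]
                          for dy in range(patch) for dx in range(patch)]
                tokens.append((c, top // patch, left // patch, pixels))
    return tokens


def random_mask(tokens, mask_ratio, seed=0):
    """Randomly choose which token indices are hidden from the encoder."""
    rng = random.Random(seed)
    order = list(range(len(tokens)))
    rng.shuffle(order)
    n_masked = int(len(tokens) * mask_ratio)
    return sorted(order[n_masked:]), sorted(order[:n_masked])


def make_image(n_channels, size):
    """Hypothetical toy image with distinct pixel values per channel."""
    return [[[c * 100 + y * size + x for x in range(size)]
             for y in range(size)] for c in range(n_channels)]


# The same code handles 5-channel and 6-channel inputs; only the token count changes.
five = patchify_per_channel(make_image(5, 4), patch=2)
six = patchify_per_channel(make_image(6, 4), patch=2)
visible, masked = random_mask(six, mask_ratio=0.75)
print(len(five), len(six), len(visible), len(masked))
```

In a real model the visible tokens would be embedded and passed to a ViT encoder, and a decoder would be trained to reconstruct the masked ones; those parts are omitted here.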
Recursion’s Post
More Relevant Posts
-
We are thrilled to announce our latest publication in Nature Chemistry, “Exploring the frontiers of condensed phase chemistry with a general reactive machine learning potential”. This groundbreaking work is the result of a collaborative effort between Los Alamos National Laboratory, NVIDIA, Carnegie Mellon University, and Southern Methodist University. Our research introduces a nanoreactor active learning data generation approach and the resulting general machine learning interatomic potential (ANI-1xnr) for organic condensed-phase reactive molecular dynamics. ANI-1xnr is trained on a large dataset obtained using an active learning workflow employing a new MD-based sampling algorithm to discover diverse and relevant condensed-phase reactive atomistic configurations. We’ve validated the accuracy and applicability of the ANI-1xnr potential on five distinct condensed-phase reactive studies, showing that ANI-1xnr reproduces experimental results well and produces results consistent with alternative modeling approaches. We’re excited to provide the resulting ANI-1xnr potential and dataset to the community for further application and analysis. We look forward to seeing how this work will contribute to future advancements in the field. Read the full paper to learn more about our methodology, findings, and the potential avenues for future research. #machinelearning #computationalchemistry Shout out to our top-notch team! Shuhao Zhang, Malgorzata Z. Makos, Ryan B. Jadrich, Elfi Kraka, Kipton Barros, Benjamin Nebgen, Sergei Tretiak, Olexandr Isayev, Nicholas Lubbers, and Richard Messerly!
-
This journey began in 2020, amidst the COVID-19 pandemic, when I set out to teach myself feature engineering, machine learning, and neural networks. Driver drowsiness leads to road accidents and the irreversible loss of life. This study stands out for its approach: it leverages advanced feature engineering algorithms to ensure the classifier is not misled by irrelevant features, and it addresses the challenge of working with limited data. As the electric vehicle (EV) industry undergoes rapid transformation, implementing these findings can enhance safety not only in EVs but also in traditional automobiles, thereby contributing to safer roads. Despite the challenge of acquiring brain data within vehicles, this research demonstrates the potential of using vital signs, such as a decrease in pulse rate, to infer the driver's state. In essence, this research opens new frontiers and underscores the importance of prioritizing road safety. It has been published by ACM in Proceedings of the 2023 10th International Conference on Biomedical and Bioinformatics Engineering. https://lnkd.in/gChFEvpa
Comprehensive Analysis on Feature Selection, Machine Learning and Deep Learning Algorithms to Detect Driver Drowsiness-An EEG Study | Proceedings of the 2023 10th International Conference on Biomedical and Bioinformatics Engineering
dl.acm.org
-
Ph.D. student at Computational Sciences Coordination, Instituto Nacional de Astrofísica, Óptica y Electrónica (INAOE)
🚀 Exciting News! 🚀 We are thrilled to announce that our paper titled "Progressive Self-supervised Multi-Objective NAS for Image Classification" has been accepted for presentation at the leading European event on Bio-Inspired Computation (Evostar), taking place in Aberystwyth, Wales, United Kingdom, from April 3rd to April 5th, 2024. Our paper will be featured in the Evolutionary Machine Learning Accepted Papers track, and we are honored to have been selected for a Long Talks presentation. https://lnkd.in/e6zSV9RY Here's a brief summary of our research: We introduce a novel progressive self-supervised framework for neural architecture search. Our goal is to discover competitive yet significantly less complex CNN architectures that can serve multiple tasks, such as acting as a pretrained model. Leveraging Cartesian genetic programming (CGP) for neural architecture search (NAS), our approach integrates self-supervised learning with a progressive architecture search process within the continuous domain, tackled via multi-objective evolutionary algorithms (MOEAs). To validate our proposal, we conducted rigorous evaluations using the non-dominated sorting genetic algorithm II (NSGA-II) across datasets including CIFAR-100, CIFAR-10, SVHN, and CINIC-10. The experimental results demonstrate the competitiveness of our approach in terms of both classification performance and model complexity. Furthermore, our method exhibits strong generalization capabilities. 🌟 #Evostar2024 #BioInspiredComputation #NeuralArchitectureSearch #MachineLearning #ResearchConference
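To make the multi-objective part concrete, here is a minimal sketch of the Pareto-dominance test that underlies the first non-dominated front in NSGA-II. It is not the authors' code: the candidate list is invented, and the two objectives (classification error and parameter count in millions, both minimized) are assumed stand-ins for "classification performance" and "model complexity".

```python
def dominates(a, b):
    """True if `a` is no worse than `b` on every objective and strictly
    better on at least one (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated_front(points):
    """Return the points that no other point dominates (the Pareto front)."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]


# Hypothetical architectures as (error rate, parameters in millions).
candidates = [(0.10, 5.0), (0.08, 9.0), (0.12, 2.0), (0.11, 6.0), (0.08, 12.0)]
front = non_dominated_front(candidates)
print(front)  # [(0.1, 5.0), (0.08, 9.0), (0.12, 2.0)]
```

NSGA-II repeats this step to rank the remaining candidates into successive fronts and uses crowding distance to keep the front diverse; both are left out of this sketch.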
-
🐺 Unveiling the Potential of Grey Wolf Optimizer (GWO) in Tech & Science! Our comprehensive review dives into GWO's inspiration, applications, and performance in various domains like engineering, bioinformatics, and image processing. 🚀 Highlighting its adaptability and efficiency in complex problem-solving, we explore future directions for enhancing GWO - from large-scale problem handling to hybridization and dynamic optimization. A promising path for algorithmic evolution! 🌐 #Optimization #Bioinformatics #Engineering #Bioinformatics_applications #Bioinformatic_tools This work presents the first comprehensive literature review of the Grey Wolf Optimizer (GWO) algorithm, discussing its inspiration, mathematical model, performance, and applications across various fields such as feature selection, training artificial neural networks, classification, clustering, and more. The paper critiques current works on parameter tuning, operator design, encoding schemes, and hybrid models involving GWO, while highlighting its applications in engineering, bioinformatics, environmental modeling, and image processing. The review underscores GWO's potential for further development, particularly in handling large-scale problems, designing memetic algorithms through hybridization, exploring dynamic optimization, and investigating its theoretical underpinnings. The conclusion emphasizes GWO's capacity to address complex problems and suggests future research directions to enhance its performance and application breadth. https://lnkd.in/de6_bZsx
-
Chief Data Scientist - Applied Math, Computing, and Data | Generative AI and Technology Leader at Pacific Northwest National Laboratory
Accelerating the Frontiers of AI and Chemistry with CACTUS 🌵 : The Power of Open-Source LLMs and Scientific Tools at Pacific Northwest National Laboratory Excited to share our research on CACTUS (Chemistry Agent Connecting Tool-Usage to Science), an LLM-based agent that integrates cheminformatics tools to enable advanced reasoning and problem-solving in chemistry and molecular discovery. By harnessing the cognitive capabilities of open-source LLMs like Gemma-7b, Falcon-7b, MPT-7b, Llama2-7b, and Mistral-7b, and combining them with domain-specific tools, CACTUS significantly outperforms baseline LLMs on a benchmark data set. Key findings: ✅ Gemma-7b and Mistral-7b models achieve the highest accuracy, regardless of prompting strategy ✅ Domain-specific prompting and hardware configurations play a crucial role in model performance ✅ Smaller models can be deployed on consumer-grade hardware without significant loss in accuracy CACTUS opens up new possibilities for researchers in tasks such as molecular property prediction, similarity searching, and drug-likeness assessment, accelerating scientific advancement and unlocking new frontiers of novel, effective, and safe drug candidates, catalysts, and materials. By leveraging the strengths of open-source LLMs and domain-specific tools, CACTUS has the potential to revolutionize the way we approach scientific discovery. CACTUS's ability to be integrated with automated experimentation platforms and make data-driven decisions in real-time paves the way for autonomous discovery. The agent can design and prioritize experiments, analyze results, and iteratively refine its hypotheses, leading to more efficient and targeted exploration of chemical space. Kudos to the cactus team: Andrew McNaughton Carter Knutson Agustin Kruel Rohith Anand Varikoti, Ph.D. 
and Gautham Krishna Link to Preprint: https://lnkd.in/gtmuFSrW Link to Github : https://lnkd.in/g-R6-Rkf #CACTUS #AI #ScientificDiscovery #ChemicalResearch #OpenScience #Cheminformatics #MachineLearning #OpenLLM #MolecularDiscovery #AutonomousDiscovery #Gemma7b #Mistrel7b #Llama7b #Falcon7b
-
🌐 Introducing SPACEL: A New Deep Learning Method for Spatial Transcriptomics 🌐 Hello #Biologists, #Bioinformaticians, and #PharmaLeaders! I'm excited to introduce SPACEL, a cutting-edge deep-learning-based toolkit revolutionizing spatial transcriptomics data analysis. This innovative tool is set to transform our understanding of tissue architecture and disease microenvironments. 🧬 🔍 Key Features of SPACEL: 🟢 Spoint Module: Predicts cell type composition in seq-based ST technologies like 10X Visium, using a multiple-layer perceptron (MLP) with a probabilistic model. 🟡 Splane Module: Employs a graph convolutional network (GCN) and adversarial learning for joint analysis of multiple ST slices, identifying spatial domains based on cell type and spatial coordinates. 🟠 Scube Module: Aligns consecutive slices to construct a stacked 3D tissue architecture, utilizing mutual nearest neighbor (MNN) graphs and differential evolution algorithms. 📈 Application and Performance: Applied to 11 ST datasets comprising 156 slices from various technologies. Outperformed other methods in cell type deconvolution, spatial domain identification, and 3D architecture construction. 🎯 Robustness and Accuracy: Demonstrated superior performance in accurately deconvoluting cell-type composition and identifying spatial domains, even with hyperparameter variations. 🏥 Clinical Relevance: SPACEL's ability to identify functional spatial domains and reconstruct 3D tissue architecture holds immense potential for biological discoveries and practical applications in healthcare. Discover more about SPACEL 📚 Nature Communications paper: https://buff.ly/3N3TFBZ 💻 Github: https://buff.ly/3v0tWEy 📢 Join the Conversation📢 Share your thoughts, methods, and tools for this exciting field in the comments!👇 💬 #SpatialTranscriptomics #DeepLearning #Bioinformatics #DataAnalysis #InnovationInHealthcare #Biotechnology #ClinicalResearch
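As a rough illustration of the mutual nearest neighbor (MNN) idea mentioned for the Scube module, the sketch below pairs points from two slices only when each is among the other's nearest neighbors. It is not SPACEL's implementation: the function names are hypothetical, and plain 2D coordinates are used as an assumed stand-in for whatever features the toolkit actually compares.

```python
import math


def nearest(src, dst, k):
    """For each point in `src`, the set of indices of its k nearest
    points in `dst` under Euclidean distance."""
    result = []
    for p in src:
        ranked = sorted(range(len(dst)), key=lambda j: math.dist(p, dst[j]))
        result.append(set(ranked[:k]))
    return result


def mutual_nearest_neighbors(a, b, k=1):
    """Pairs (i, j) where b[j] is among a[i]'s k nearest neighbors
    and a[i] is among b[j]'s k nearest neighbors."""
    a_to_b = nearest(a, b, k)
    b_to_a = nearest(b, a, k)
    return [(i, j) for i in range(len(a))
            for j in sorted(a_to_b[i]) if i in b_to_a[j]]


# Hypothetical spot coordinates from two consecutive slices.
slice_a = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
slice_b = [(0.1, 0.0), (5.0, 5.2), (9.0, 9.0)]
pairs = mutual_nearest_neighbors(slice_a, slice_b, k=1)
print(pairs)  # [(0, 0), (2, 1)]
```

Requiring the match to hold in both directions discards one-sided pairs, such as the isolated point at (9.0, 9.0), which gives an alignment step cleaner anchor pairs to work with.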
-
https://lnkd.in/geyUwiyZ #Predicting Nanoparticle Behavior in Biological System-A Machine Learning Approach
Predicting Nanoparticle Behavior in Biological Systems: A Machine Learning Approach
https://www.jnanoworld.com
-
Center for Computational Biology, Chemoinformatics, and Artificial Intelligence: This is the era of data science in the life sciences, of experts in computational biology, chemoinformatics, machine learning, artificial intelligence ... http://dlvr.it/TCFH3X
-
📖 I am pleased to announce the release of the second part of my image processing series for biology and biomedicine today! In this practical course, you will learn easy and powerful methods to analyze complex images using machine learning and deep learning. No prior knowledge is required. Through three practical activities, you will use Cellpose and Labkit to detect and segment cells, automate cell counting in a counting chamber, and detect bacteria-infected cells. All processing steps are discussed, and the code and data are provided. With these advanced methods, you will be able to process the majority of your microscopy images, thanks to the robustness of machine learning. ✨ You can reach the course through the Symbiophysics website: https://lnkd.in/eu5Zq4ze If you are interested in getting an overview of my approach before diving into machine learning, the first part, which covers the basic methods of biological image processing, is available for free and without registration: https://lnkd.in/eBhceCjz #ImageAnalysis #Symbiophysics #BioImaging #MachineLearning
Practical lesson - Cell Counting 2 - Symbiophysics
https://symbiophysics.fr
Innovative Leadership in Life Sciences & Technology Talent Solutions: Founder & CEO of Recruits Lab & BioJobs Lab, Driving Organizational Success through Cutting-Edge Recruitment Strategies
3mo
Recursion, your advancements in using microscopy data to build foundational models of cellular biology are truly groundbreaking! The strides made in scaling microscopy images for biological research, especially with the introduction of your new framework based on self-supervised learning, represent a significant leap forward in the field. It's exciting to see how your team's innovative approaches, including the BioHive supercomputer and vision transformer masked autoencoders, are enhancing accuracy and preserving biological relationships in datasets. Looking forward to hearing more about your keynote and poster presentations at the IEEE/CVF Conference on Computer Vision and Pattern Recognition! #ai #ml #tech #techbio #wsl #vit #mae #data #cvpr Recruits Lab (We're hiring)