Genophore Inc.

Software Development

Newark, Delaware 3,093 followers

Organisms are Algorithms

About us

Genophore Inc. is accelerating life sciences discoveries through the integration of biological databases, data visualization, bioinformatics, and collaboration tools. We bring a deep understanding of the industry's workflow bottlenecks and use advances in data visualization, AI, and machine learning to help bring new discoveries to market faster.

Website: https://genophore.com/
Industry: Software Development
Company size: 11-50 employees
Headquarters: Newark, Delaware
Type: Privately Held
Founded: 2019
Specialties: Biotechnology, Bioinformatics, Software Services, Biological Data Visualization, Pharmaceuticals, Computer-Aided Drug Discovery, Protein Design, Antibody Design, Structural Enablement, LIMS, and Scientific Notebook

Updates


    AlphaFind: Machine Learning and Clustering Enable Proteome-Wide Fast 3D Structure Similarity Search

    Procházka et al. recently reported AlphaFind, which employs a machine learning model to find the tertiary structures most similar to a given protein in the AlphaFold 2 (AF2) database. AlphaFind attempts to overcome the limitations of existing protein search tools such as Foldseek, 3D-SURFER, and the Dali server. The Dali server and 3D-SURFER do not scale well to large protein structural data. Foldseek does not support the entire AF database, as it uses a pre-clustered 52-million subset of the >200-million AF database. In addition, Foldseek focuses on local interactions between residues and their neighbors, limiting its use for similarity search.

    The Protein Data Bank has accumulated more than 200,000 experimentally determined protein structures over seven decades. These data were used to train the AF2 model, which was in turn used to predict, with high accuracy, the more than 200 million protein structures housed in the AF database. This massive amount of structural data requires fast methods to organize, explore, and use it efficiently.

    AlphaFind is a protein structure search tool that extracts 3D features from proteins and represents the structures using a previously reported compact data embedding method, combined with data clustering and a machine learning model, to identify the structures most similar to a given query. The input to AlphaFind is the UniProt ID, PDB ID, or relevant gene ID for a given protein, while the output is a set of proteins similar to the query.

    When given a query, the sequence of events implemented by AlphaFind includes:
    1️⃣ Converting the input into a UniProt ID
    2️⃣ Identifying the associated candidate proteins
    3️⃣ Calculating global and local similarity
    4️⃣ Retrieving metadata for the query and results from the AF database
    5️⃣ Superposing and visualizing pairs of input and output using the NGL viewer, with results also linked to Mol*
    6️⃣ Optionally expanding the search results
    7️⃣ Downloading the search results

    While AlphaFind is an incredible resource, it does have some limitations. AlphaFind was developed on top of the relatively older version 3 of the AF2 database, prior to the release of version 4. Because it trades precision for a lower computational load, the results returned by AlphaFind for a given query are approximate and may not always contain all of the most similar structures. Also, AlphaFind treats all segments of an AF2 structure equally and does not distinguish between structured and unstructured regions (i.e., high- and low-confidence regions), potentially biasing search results.

    Paper: https://lnkd.in/g-9EVeRZ
    GitHub: https://lnkd.in/gvbqYtNV
    Web app: https://lnkd.in/g2SF3CbZ
    Manual: https://lnkd.in/g_nxww4V
    #structuralbiology #drugdiscovery #bioinformatics
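
    To illustrate why a clustered search can be fast but approximate, here is a minimal Python sketch. It is not AlphaFind's implementation: the 2-D embeddings, the protein IDs, the fixed centroids, and the `n_probe` parameter are all made up for the example, and a simple nearest-centroid rule stands in for the machine learning model that picks candidate clusters.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_index(embeddings, centroids):
    """Assign each protein ID to the bucket of its nearest centroid."""
    buckets = {i: [] for i in range(len(centroids))}
    for pid, vec in embeddings.items():
        nearest = min(range(len(centroids)), key=lambda i: dist(vec, centroids[i]))
        buckets[nearest].append(pid)
    return buckets

def search(query, embeddings, centroids, buckets, k=3, n_probe=1):
    """Return the k closest IDs found in the n_probe buckets nearest the query."""
    order = sorted(range(len(centroids)), key=lambda i: dist(query, centroids[i]))
    candidates = [pid for i in order[:n_probe] for pid in buckets[i]]
    return sorted(candidates, key=lambda pid: dist(query, embeddings[pid]))[:k]

# Toy 2-D embeddings with placeholder IDs (not real UniProt accessions).
embeddings = {
    "P1": (0.0, 0.0), "P2": (0.2, 0.1), "P3": (0.9, 0.0),
    "P4": (2.0, 0.0), "P5": (2.1, 0.3), "P6": (1.2, 0.0),
}
centroids = [(0.3, 0.0), (1.8, 0.1)]  # hypothetical cluster centers
buckets = build_index(embeddings, centroids)

query = (1.0, 0.0)
# Probing one bucket is cheap but can miss a true neighbor in another bucket.
fast = search(query, embeddings, centroids, buckets, k=2, n_probe=1)
# Probing every bucket is equivalent to an exhaustive scan.
exact = search(query, embeddings, centroids, buckets, k=2, n_probe=len(centroids))
print(fast, exact)
```

    In this toy data the single-bucket search never sees P6, which sits just across a cluster boundary, so it returns a more distant protein instead. That is the same kind of trade-off the post describes, and expanding the search (more buckets) recovers the missing hit at a higher cost.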


    FABind+: How Enhanced Pocket Prediction and Pose Generation Can Transform Your Molecular Docking

    FABind+ is an improved version of the FABind docking method. FABind is a deep learning (DL) regression-based method that showed performance comparable to DiffDock but no clear advantage. FABind+ is an iteration of FABind with enhanced performance based on two key modifications: improved pocket prediction and high-quality pose generation. FABind+ finds a middle ground between traditional sampling and generative approaches by modifying the regression framework of FABind to devise a sampling strategy that is coupled with a confidence model.

    Conventional physics-based molecular docking methods often rely on extensive sampling and simulation, which makes them slow and resource-intensive. Many DL-based methods approach docking as a regression problem and predict the protein-ligand binding pose in a single shot. Although fast, this approach has left DL methods struggling to match the accuracy of conventional methods.

    Pocket prediction was identified as a crucial factor where inaccuracies limit the success of the FABind docking process. To overcome this, the authors devised an approach that dynamically predicts the pocket radius instead of using a fixed-size sphere. Inspired by molecular conformer generators, FABind+ incorporates a permutation loss function to improve ligand conformation prediction. To accurately capture multiple binding sites and conformations, the regression-based FABind+ was directly transformed into a sampling-based model by implementing a clustering method to identify all potential pocket candidates.

    Using the PDBbind v2020 dataset, FABind+ was benchmarked against existing conventional and DL methods, including QVINA-W, GNINA, SMINA, GLIDE, VINA, DiffDock, EquiBind, TankBind, E3Bind, and FABind.

    🔥 FABind+ not only outperformed the original FABind, but also surpassed other DL-based methods, including DiffDock. It was superior in both accuracy and speed, with the exception of EquiBind, which is faster but much less accurate.
    🔢 In blind docking tests, the regression-based FABind+ achieved a success rate of 43.5% (ligand RMSD < 2Å), outperforming DiffDock by 5%. FABind+ performed better across the board, both in mean RMSD and in the percentage of predictions below 2Å and 5Å.
    🚀 The sampling-based FABind+ performed even better, with significantly improved accuracy for more difficult targets. With an increased sampling size, FABind+ achieved a success rate of 51.2%.

    Paper: https://lnkd.in/gpjhi-Xm
    Code: https://lnkd.in/g-zMq3vd
    FABind paper: https://lnkd.in/gcRcT-ZS
    #drugdiscovery #structuralbiology #pharmaceuticals #ML #DL #bioinformatics #moleculardocking
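
    For readers less familiar with the metric, the success rate quoted above is the fraction of test complexes whose predicted ligand pose falls within an RMSD cutoff of the reference pose. The Python sketch below shows that calculation under simplifying assumptions: atoms are already matched in order, no symmetry correction is applied, and all coordinates and RMSD values are invented for illustration (they are not results from the paper).

```python
import math

def ligand_rmsd(pred, ref):
    """RMSD in angstroms between matched atom coordinates (no realignment)."""
    if len(pred) != len(ref) or not pred:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum((p - r) ** 2 for a, b in zip(pred, ref) for p, r in zip(a, b))
    return math.sqrt(squared / len(pred))

def success_rate(rmsds, threshold=2.0):
    """Fraction of complexes whose ligand RMSD is below the threshold."""
    return sum(r < threshold for r in rmsds) / len(rmsds)

# Made-up two-atom ligand: the predicted pose is shifted 1 angstrom along z.
pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
ref = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
rmsd = ligand_rmsd(pred, ref)

# Hypothetical per-complex RMSDs for a four-complex test set.
rmsds = [0.8, 1.9, 2.4, 6.1]
print(rmsd, success_rate(rmsds, 2.0), success_rate(rmsds, 5.0))
```

    Reporting the rate at both 2Å and 5Å, as the post does, separates near-native poses from poses that are only roughly in the right pocket.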


    🎉 MolX is live now! 🎉

    If you have used the RCSB (PDB) structure viewer (Mol*) before, you will quickly find MolX familiar but more intuitive. While Mol* offers powerful visualization capabilities, it is not the most user-friendly tool, especially for basic visualization and rendering operations. This is where MolX shines, making molecular visualization much more accessible and straightforward.

    🔥 MolX is more than just a viewer. It borrows user-experience elements from the most popular desktop applications, which are known for their robust features but are limited by their desktop nature, making cloud integration and regular updates challenging. MolX overcomes these limitations by providing a web-based platform that ensures seamless connectivity to the cloud and facilitates continuous improvements.

    🎯 MolX has been designed to integrate major sequence, structure, drug, and ligand repositories, including the entire PDB (with experimental data), UniProt, ChEMBL, and more.

    Key Features of MolX:
    1️⃣ Enhanced Usability: MolX's intuitive interface makes it easier than ever to visualize complex biomolecular structures.
    2️⃣ High-Performance Visualization: MolX supports the simultaneous visualization of hundreds of protein structures, molecular dynamics trajectories, and cell-level models at atomic detail. Handle models with tens of millions of atoms with impressive efficiency.
    3️⃣ Seamless Data Integration: Connect MolX to a plethora of databases and data management tools within the Genophore platform or with your own. Effortlessly access and manage your data, enhancing your workflow and productivity.
    4️⃣ Computationally Empowered: Experience the power of modern AI and ML algorithms integrated into MolX. From predictive modeling to generative design, MolX provides a comprehensive solution for your molecular visualization needs (coming soon, contact us for early access!).

    Stay tuned for more updates and demonstrations of MolX's capabilities.

    MolX BETA: https://lnkd.in/gfw8nEYT
    MolX example: https://lnkd.in/gG9TXH8D
    Mol*: https://lnkd.in/gqud8Mni
    Animation paper: https://lnkd.in/gSfhadEC
    #MolX #Genophore #structuralbiology #drugdiscovery #cryoEM #crystallography #pharmaceuticals #biologics


    The convergence of infotech and biotech requires the seamless embedding of cloud computing and data processes in laboratory workflows. Modern drug discovery operations require a robust, scalable, and highly reliable infrastructure capable of handling a myriad of services efficiently. To achieve this, a combination of cloud services such as Amazon Elastic Container Service (ECS), CloudWatch, and Lambda, together with powerful open-source frameworks like Celery and Django, can provide an auto-scaling multi-service infrastructure.

    Amazon Elastic Container Service (ECS)

    Amazon ECS is a fully managed container orchestration service that simplifies the deployment, management, and scaling of containerized applications. At Genophore, ECS plays a pivotal role in our infrastructure for several reasons:

    1. Containerization and Microservices
    ECS allows us to containerize our applications, enabling a microservices architecture. Each service runs in its own container, isolated from the others, so that different components of our infrastructure can be developed, deployed, and scaled independently. This modularity lets us update individual services without disrupting the entire system.

    2. Auto-Scaling
    ECS supports auto-scaling, which is crucial for managing the fluctuating workloads inherent in our operations. By automatically adjusting the number of running container instances based on demand, ECS ensures optimal resource utilization and cost efficiency. This elasticity allows us to maintain performance during peak loads and scale down during off-peak times, reducing operational costs while ensuring an excellent user experience at all times.

    3. Integration with AWS Ecosystem
    As part of the AWS ecosystem, ECS integrates seamlessly with other AWS services such as Elastic Load Balancing (ELB), CloudWatch, and Identity and Access Management (IAM). This integration provides comprehensive monitoring, load balancing, and security features, enhancing the reliability and security of our infrastructure.

    Read the full article in the link below.

    How to enable drug discovery operations using auto-scaling multi-service cloud infrastructure on AWS


    AF2BIND: Automatic ligand binding site prediction using deep neural network

    AF2BIND is a deep learning method for predicting ligand binding sites on proteins. AF2BIND builds on the capacity of AlphaFold2 (AF2) by adapting the pairwise representation of the AF2 model. AF2BIND was trained to accurately identify the amino acid residues in a query protein that would bind small-molecule ligands.

    Traditionally, ligand binding sites in a target protein are predicted by superposing its structure on that of a related homologue with a known bound ligand. Essentially, the binding site is inferred from the known complex based on structural similarity. However, this approach fails when there is no known homologous structure with a bound ligand to serve as a template for the query protein.

    Unlike other methods for the same task, AF2BIND is a logistic regression model trained directly on the pairwise representation of AF2, and it does not rely on multiple sequence alignments, homology models, or prior knowledge of the true ligand for a target protein. The model was trained on labeled, non-redundant protein-ligand complex structures from the PDB. The authors compared the suitability of single-representation features from various models, including AF2, ESM2, and ESM-IF1. This entails a feature representation of either the sequence or the structure of the target protein without bait amino acids. The pairwise representation of AF2 was found to be the most suitable for the binding site prediction task.

    AF2BIND takes as input the amino acid sequence of the target protein, its backbone structure, and the 20 individual canonical amino acids as baits that function as surrogates for a small-molecule ligand. The output of AF2BIND is a probability score, P(bind), for each residue of the target protein, indicating the likelihood that the residue is a ligand-binding residue.

    In benchmarking tests involving GPCRs and bromodomains, AF2BIND accurately predicted the binding site residues of the test proteins. Notably, AF2BIND assigns varying probability scores to the predicted residues, offering an inherent ranking of which residues might engage ligands more strongly. AF2BIND offers a way to identify potential ligand binding sites in target proteins with or without ligand-bound homologous structures. Could it also facilitate the identification of cryptic pockets in drug targets?

    Paper: https://lnkd.in/gm-r3EN6
    Code: https://lnkd.in/g3ViG6p6
    Notebook: https://lnkd.in/gTHJwWTj
    #bioinformatics #SBDD #drugdiscovery #pharmacuiticals #ML #DL #structuralbiology #genophore
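
    As a rough illustration of how a logistic regression head turns per-residue features into a P(bind) score, and how those scores give a ranking, here is a small Python sketch. The feature vectors, weights, bias, residue labels, and the 0.5 cutoff are all hypothetical; the real model operates on AF2 pairwise features, which are not reproduced here.

```python
import math

def p_bind(features, weights, bias):
    """Logistic regression: sigmoid of the weighted feature sum."""
    z = sum(f * w for f, w in zip(features, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def rank_binding_residues(scores, threshold=0.5):
    """Return (residue, score) pairs at or above threshold, highest first."""
    hits = [(res, p) for res, p in scores.items() if p >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)

# Hypothetical two-feature residues and made-up model parameters.
weights, bias = [2.0, -1.0], 0.0
residue_features = {
    "ASP113": [1.5, 0.2],
    "SER203": [0.9, 0.3],
    "LEU45": [-1.0, 0.5],
    "PHE290": [0.4, 0.1],
}
scores = {res: p_bind(f, weights, bias) for res, f in residue_features.items()}
ranked = rank_binding_residues(scores)
print([(res, round(p, 2)) for res, p in ranked])
```

    Because every residue gets its own probability, the output can be read either as a yes/no call at a chosen cutoff or as an ordered shortlist of likely contact residues.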


    AFsample2: Sampling the Diverse Conformational Landscape of Proteins By Generative Models

    AFsample2 is a generative model based on AlphaFold2 (AF2), capable of predicting multiple conformations of protein structures from sequence. AFsample2 achieves this by introducing more noise into the inference step of the AF2 neural network. The ability to predict biologically relevant ensembles of protein structures would not only facilitate a broader understanding of biological processes but also enable deeper insights into disease mechanisms, opening up new opportunities for targeted drug development.

    To increase the diversity of structures predicted by AF2, AFsample2 randomly masks columns in the MSA supplied to AF2, debiasing the model from its reliance on co-evolutionary information. In other words, AFsample2 reduces the constraints imposed by co-evolutionary signals in the input MSA. This favors the prediction of alternative structural states of query sequences, as the break in covariance signals forces the network to arrive at varying solutions.

    The column masking approach employed in AFsample2 shares some similarity with that used in SPEACH_AF. However, SPEACH_AF introduces a sliding window of alanines (i.e., alanine scanning) at specific columns, informed by prior knowledge of interacting residues based on existing structural information or contacts in generated models. AFsample2 does not need such prior knowledge.

    In a benchmark on open-closed conformation data sets, AFsample2 enabled the prediction of alternative states in 17 out of 23 cases, without loss of preference for the dominant end state. For membrane protein transporters, AFsample2 achieved improved alternate-state predictions for 12 of 16 test cases. The improved sampling by AFsample2 also raised the TM-score of predicted end-state conformations relative to experimental structures, improving previous predictions with scores of 0.58 to 0.98. Further, AFsample2 improved the prediction and diversity of intermediate states by 70% compared to AF2. Compared to other methods, including AFcluster, the quality of models generated by AFsample2 was significantly better.

    💪 While AFsample2 predicts protein ensembles, the model also offers a novel way to select single alternate end-state structures from the generated conformations. This approach does not need an experimental reference structure and follows a three-stage process:
    1️⃣ Calculating the similarity of each conformation in the ensemble to the best model,
    2️⃣ Confidence screening to filter out models below a certain threshold, and
    3️⃣ Extremity selection to identify the model (alternative state) that is furthest from the most confident model.

    📜 Paper by Kalakoti and Wallner: https://lnkd.in/gYNhCv5W
    💻 GitHub: https://lnkd.in/gYbSfNSt
    #structuralbiology #drugdiscovery #ML #AF2 #genophore
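
    The three-stage selection described above can be sketched in a few lines of Python. This is only an illustration of the logic as summarized in the post, not the authors' code: the model names, confidence values, pairwise similarity scores, and the 0.7 confidence cutoff are all invented.

```python
def select_alternate_state(confidence, similarity, min_confidence=0.7):
    """Pick the most confident model and a confident model least similar to it.

    confidence: dict mapping model name -> confidence score
    similarity: function (model_a, model_b) -> similarity score (higher = closer)
    """
    # The "best" model is simply the one with the highest confidence.
    best = max(confidence, key=confidence.get)
    # Stage 1: similarity of every other conformation to the best model.
    sim_to_best = {m: similarity(m, best) for m in confidence if m != best}
    # Stage 2: confidence screening drops models below the threshold.
    kept = {m: s for m, s in sim_to_best.items() if confidence[m] >= min_confidence}
    if not kept:
        return best, None
    # Stage 3: extremity selection keeps the model furthest from the best one.
    alternate = min(kept, key=kept.get)
    return best, alternate

# Hypothetical ensemble of four models with made-up scores.
confidence = {"m1": 0.92, "m2": 0.88, "m3": 0.55, "m4": 0.81}
pairwise = {("m2", "m1"): 0.95, ("m3", "m1"): 0.40, ("m4", "m1"): 0.62}

def lookup(a, b):
    """Toy similarity lookup backed by the table above."""
    return pairwise[(a, b)]

best, alternate = select_alternate_state(confidence, lookup)
print(best, alternate)
```

    In this toy ensemble, m3 is the most different from the best model but is discarded by the confidence screen, so m4 is returned as the alternate state. The screen exists for exactly this reason: the furthest model is only interesting if it is also a plausible structure.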


    ABodyBuilder3: Language model embeddings enable scalable and precise antibody structure prediction

    Kenlay et al. recently reported ABodyBuilder3, which introduces major upgrades to the previous version, ABodyBuilder2. ABodyBuilder2 is part of the ImmuneBuilder tools, a set of specialized deep learning models for predicting immune protein structures, with ABodyBuilder2 handling antibodies (Abs). The other tools in the suite are NanoBodyBuilder2 (for nanobodies) and TCRBuilder2 (for T-cell receptors). ABodyBuilder2 diverged from its predecessor, ABodyBuilder, which was a homology modeling pipeline.

    Predicting the structures of Abs provides crucial indicators for accurately modeling their biophysical behavior and enables rational therapeutic Ab design pipelines. Although several methods have been developed specifically to predict Ab variable regions, including IgFold, DeepAb, ABlooper, and many more, the accuracy of predicted Ab models still has room for improvement.

    The upgrades introduced in ABodyBuilder3 encompass model implementation, data curation, sequence representation, and refinement of predicted models. These changes improve the scalability and accuracy of ABodyBuilder3 over ABodyBuilder2. ABodyBuilder3 incorporates protein language models (PLMs) for residue representation, replacing the one-hot encoding used in ABodyBuilder2 and leading to improved accuracy in predicting Ab complementarity-determining regions (CDRs). ABodyBuilder3 uses a ProtT5-powered sequence embedding of the Ab variable region, which is fed into a series of 8 sequential structure modules to generate the all-atom structure of the Ab as well as uncertainty estimates.

    In addition to improved accuracy over ABodyBuilder2, ABodyBuilder3 introduces a pLDDT-based per-residue error estimate for its predictions, replacing the ensemble-based error estimate used in ABodyBuilder2. The accuracy of models predicted by ABodyBuilder3 can be further improved with additional physics-based structure relaxation and refinement steps. To implement these refinement steps, the authors evaluated OpenMM and YASARA for geometry and stereochemical corrections. The YASARA2 force field in an explicit water model was found to give better results.

    Paper: https://lnkd.in/gBsJc3W7
    GitHub: https://lnkd.in/gC76sc9z
    Example (Notebook): https://lnkd.in/gTJQzf9q
    ImmuneBuilder/ABodyBuilder2 Paper: https://lnkd.in/eSdmrXBb
    #biopharmaceuticals #structuralbiology #drugdiscovery #biologics #ML


    ProSTAGE: How Can Graph Convolutional Networks Reveal the Effect of Mutations on Protein Stability

    ProSTAGE is a deep learning method that fuses structure and sequence embeddings to predict protein stability changes upon single point mutations. The model leverages graph-based techniques and language models, combining the strengths of both to achieve superior predictive accuracy. ProSTAGE is designed to address the limitations of traditional methods by using a larger dataset, nearly twice the size of the commonly used S2648 dataset. As benchmarked on several independent datasets, ProSTAGE consistently outperforms existing state-of-the-art methods on mutation-affected problems.

    Protein thermodynamic stability is crucial for understanding the relationships between protein structure, function, and interaction. It plays a significant role in various biotechnological applications, including protein-based therapeutics, biocatalysts, and diagnostics. The impact of mutations on protein stability (ΔΔG) is particularly important, as mutations can lead to misfolding, genetic disorders, cancers, and neurodegenerative diseases.

    Key Features of ProSTAGE:
    Graph Convolutional Networks (GCN): ProSTAGE employs a GCN to capture short-range residue interactions around mutation sites. The spatial adjacency matrix (SAM) captures the geometric relationships between amino acids, enhancing the model's ability to predict stability changes.
    Protein Sequence Embeddings: By using embeddings from the ProtT5-XL-Uniref50 pretrained model, ProSTAGE effectively models long-range sequence information without requiring domain-specific knowledge. This approach leverages the rich context provided by protein language models.
    Extensive Data Training: ProSTAGE is trained on a curated dataset of 11,304 mutations across 318 proteins, making it the largest dataset used for protein stability prediction to date. This extensive training ensures robust performance and minimizes overfitting.

    Highlights:
    🔥 S669 Dataset: ProSTAGE achieved a PCC of 0.70, RMSE of 1.37, and MAE of 0.97 kcal/mol, outperforming all other predictors on this balanced and strict blind dataset.
    🚀 Tm262 and Tm108 Datasets: ProSTAGE excelled in identifying stabilizing and destabilizing mutations, achieving AUC values of 0.80 and 0.71, respectively, with high accuracy and precision.
    💪 Deep Mutational Scanning (DMS) Data: On the CAGI5 challenge datasets (PTEN and TPMT), ProSTAGE achieved PCC values of 0.56 and 0.53, respectively, significantly outperforming other methods.

    Paper: https://lnkd.in/gQc38sMS
    Data: https://lnkd.in/gMqV4qNv
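
    The three regression metrics quoted for the S669 benchmark (PCC, RMSE, MAE) are standard and easy to compute. The Python sketch below shows one way to do so for predicted versus experimental ΔΔG values; the numbers are invented for illustration and are not taken from the paper.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient (PCC) between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def rmse(x, y):
    """Root-mean-square error; penalizes large misses more than MAE does."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

def mae(x, y):
    """Mean absolute error, in the same units as the inputs (kcal/mol here)."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)

# Hypothetical ddG values in kcal/mol for four mutations.
experimental = [-1.0, 0.0, 1.0, 2.0]
predicted = [-0.5, 0.5, 0.5, 1.5]

print(round(pearson(experimental, predicted), 3),
      rmse(experimental, predicted),
      mae(experimental, predicted))
```

    PCC measures how well the predictions track the ordering and trend of the experimental values, while RMSE and MAE measure how far off they are in absolute terms, which is why benchmarks such as the one above report all three.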
