TransWikia.com

Datasets for building an ML-based model that predicts whether a PCR primer will match a mutated template

Bioinformatics Asked by Jantek Mikulski on February 1, 2021

I’m working on a neural-network-based model that predicts whether a primer will successfully hybridize with a mutated DNA template during PCR. Such a model could be useful for any research that relies on PCR-based methods, including COVID-19 research.

As with any machine-learning model, what I need is a really big training dataset. Each training example requires only three pieces of information: the template sequence, the primer sequence, and whether the hybridization (and therefore the PCR) succeeded.
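To make the required record format concrete, here is a minimal sketch of one training example and a simple one-hot encoding of the two sequences. The class name `PrimerExample`, its field names, the 5'->3' orientation convention, the all-zero encoding for ambiguous bases, and the sequences themselves are all assumptions made up for illustration; they do not come from any existing database.

```python
from dataclasses import dataclass

# Assumed channel order for the one-hot encoding. Any other character
# (for example the ambiguity code N) becomes an all-zero vector.
BASES = "ACGT"


@dataclass(frozen=True)
class PrimerExample:
    """One training record (hypothetical schema, for illustration only)."""
    template: str    # template region at the binding site, written 5'->3' (assumed convention)
    primer: str      # primer sequence, written 5'->3' (assumed convention)
    amplified: bool  # label: did hybridization, and therefore the PCR, succeed?


def one_hot(seq: str) -> list[list[int]]:
    """Encode a DNA string as one 4-element 0/1 vector per base."""
    return [[int(base == b) for b in BASES] for base in seq.upper()]


def encode(example: PrimerExample) -> tuple[list[list[int]], list[list[int]], int]:
    """Turn a record into (template features, primer features, 0/1 label)."""
    return one_hot(example.template), one_hot(example.primer), int(example.amplified)


# Made-up sequences and label, not real measurements.
example = PrimerExample(template="ACGTTGCA", primer="ACGTAGCA", amplified=False)
template_x, primer_x, label = encode(example)
```

Any dataset that can be reduced to rows of this shape (two sequences plus a success flag) would fit the training setup described above, whatever encoding the network ends up using.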

I have managed to find some datasets attached to published papers, mostly ones only loosely connected to bioinformatics itself, but they are usually quite small.

So here is my question: do you know of any big databases or libraries containing this information that are available to the public or to university staff? That would help me a lot and would probably make the model much better.

None of the models currently in use are based on neural networks, and their maximum precision is around 85-90%. That leaves a lot of room for improvement, which could help make primers more robust to template mutations.
