FASTA and PDB: How to specify chain?

Bioinformatics Asked by lazer-guided-lazerbeam on March 19, 2021

For proteins that have multiple chains (e.g., 1EMS), is there an easy way to specify which chain I want to use for blastp?

I cannot imagine that I am the first person to have this problem, but so far my only solution has been to write my own script that extracts the chain I want from the .pdb file and converts it to a .fasta file.

2 Answers

WEB VERSION OF BLASTP

When you run a blastp query using a FASTA file with multiple chains, you can choose which chain's results to examine using a drop-down box that appears after the search has run.

TERMINAL BLASTP

The results for each chain will be listed separately. For ease of searching, you can write the output as CSV with the columns you want (see blastp -help and look up the -outfmt option for more information).
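As a rough sketch of how that CSV output could be split per chain afterwards: the Python below assumes the search was run with -outfmt 10 (comma-separated values) and no custom column list, so each row has the 12 default columns. The function name hits_by_query and the sample rows are made up for illustration and are not real BLAST results.

```python
import csv
import io
from collections import defaultdict

# Default column order for blastp -outfmt 10 (and -outfmt 6) when no
# custom column list is given. Adjust this if you pass your own columns.
DEFAULT_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def hits_by_query(csv_text):
    """Group BLAST hits by query ID, i.e. one group per chain record."""
    grouped = defaultdict(list)
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        hit = dict(zip(DEFAULT_COLUMNS, row))
        grouped[hit["qseqid"]].append(hit)
    return dict(grouped)


# Made-up rows for illustration only.
example = (
    "chainA,subj1,98.5,200,3,0,1,200,1,200,1e-100,400\n"
    "chainA,subj2,45.0,180,99,2,5,184,10,189,2e-40,150\n"
    "chainB,subj3,60.0,120,48,1,1,120,3,122,5e-50,180\n"
)
result = hits_by_query(example)
for query_id, hits in result.items():
    print(query_id, [h["sseqid"] for h in hits])
```

All values are kept as strings here; convert evalue or bitscore to float if you want to sort or filter on them.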

Note that for some proteins, such as the 1EMS you chose as an example, the multiple chains are identical. When you get the FASTA file for that specific protein, it will only have one entry. However, if you really need a FASTA file formatted so that each chain has its own sequence, you can download the FASTA file covering all PDB entries and extract the specific protein_chain record you want.
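A minimal sketch of that extraction step, in plain Python with no external libraries. It assumes each header starts with an ID of the form entry_chain (for example ">1abc_A mol:protein ..."), so check the header layout of the file you actually downloaded. The names normalize_id, read_fasta and chain_fasta, the 1abc entry and the sequences are all made up for illustration.

```python
def normalize_id(record_id):
    """Lowercase the entry code but keep the chain ID's case,
    since chain IDs are case-sensitive (chain 'A' is not chain 'a')."""
    entry, sep, chain = record_id.partition("_")
    return entry.lower() + sep + chain


def read_fasta(text):
    """Parse FASTA text into {record_id: sequence}.
    The record ID is the first whitespace-delimited token of the header."""
    records = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            current = normalize_id(fields[0]) if fields else ""
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {rid: "".join(parts) for rid, parts in records.items()}


def chain_fasta(text, entry, chain):
    """Return a single-record FASTA string for entry_chain.
    Raises KeyError if that record is not present."""
    key = normalize_id(f"{entry}_{chain}")
    sequence = read_fasta(text)[key]
    return f">{key}\n{sequence}\n"


# Made-up two-chain example; not a real PDB entry.
example = (
    ">1abc_A mol:protein length:10  EXAMPLE PROTEIN\n"
    "ACDEF\n"
    "GHIKL\n"
    ">1abc_B mol:protein length:5  EXAMPLE PROTEIN\n"
    "MNPQR\n"
)
single_chain = chain_fasta(example, "1ABC", "B")
print(single_chain, end="")
```

The resulting string can be written to a file and passed to blastp as the query. For a file covering every PDB entry, reading it all into a dictionary costs memory; streaming through the file and stopping at the first matching header would be lighter.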

Correct answer by lazer-guided-lazerbeam on March 19, 2021

NCBI BLAST can be run against the PDB database, of which NCBI hosts its own copy. Entries are stored as the four-character PDB code, an underscore, and the chain ID, e.g. 1GFL_B. The catch is segment identifiers, but these generally refer to the same peptide, so they shouldn't be an issue.

To search NCBI's copy of the PDB specifically (not the RCSB PDB), set the database to PDB.

Answered by Matteo Ferla on March 19, 2021
