Bioinformatics Asked on May 7, 2021
Hi I am trying to learnt python3 and Biopython,
I am trying to check imported fasta file before processing so far using:
sequences = SeqIO.parse("fake_fasta_file", "fasta") if not any(sequa): print('n SOMETHING IS WRONG WITH YOUR INPUT FILE n') exit()
works but as soon as I enter a ‘>’ in my fake_fasta_file , as an example
nihvivicòvo-l ,mnjobùJN , Békibpç , mnkn nbigp ù mlkn mlnù > ftfyfiuuogpigèi
any(sequences) returns True even if:
for seq_record in sequences: print(seq_record)
doesnt print anything
Any better way to check for real fasta file in Biopython ?
Or maybe it my fault using any() on the iterator instead of creating
a list of seq_record and checking on that ?
any() loops over the iterator until the first truthy item is found or the end is reached. The iterator is not reset to the start after this. Using a list will change that behaviour since a list is not exhausted like a iterator. So, remove your
any() call and you will see a sequence record printed. Your
'fake_fasta_file.fa' is actually recognised by Biopython as valid fasta:
>>> SeqIO.read('fake_fasta_file.fa', 'fasta') SeqRecord(seq=Seq(''), id='ftfyfiuuogpigèi', name='ftfyfiuuogpigèi', description=' ftfyfiuuogpigèi', dbxrefs=)
Biopython ignores lines before the first line beginning with
'>' (as possible comment lines) and then starts collecting records, in this case an empty sequence record is still a record. This behaviour is deliberate from Biopython, although you could argue if it's best or not, it's unlikely to change for historical reasons.
What is real fasta? (philosophical pause) If you want to validate a particular definition of FASTA format then you can write your own parser depending on the exact specification and rules you want.
from Bio import SeqIO records = SeqIO.parse("myfile.fa", "fasta") first = next(records, None) if first is None: print("No fasta records") elif first.seq == "": print("Empty sequence") elif all(c not in "ATCGN" for c in first.seq.upper()): print("Non ATCGN characters")
Correct answer by Chris_Rands on May 7, 2021
0 Asked on July 11, 2021
1 Asked on July 10, 2021 by i_dont_know
3 Asked on July 9, 2021 by igor-filippov
1 Asked on July 5, 2021
2 Asked on July 5, 2021
0 Asked on July 4, 2021
0 Asked on July 4, 2021
1 Asked on July 3, 2021 by ehsan-salehabadi
2 Asked on July 1, 2021 by yuval
8 Asked on June 26, 2021 by conchoecia
Get help from others!