TransWikia.com

PSET 6 DNA: How to count number of runs of consecutive STRs

Stack Overflow Asked by Leoness on September 14, 2020

I’ve written most of the code for PSET6 and I get the gist of it, but I am stuck on the logic for iterating through the DNA sequence and computing the longest run of consecutive STRs. For anyone not taking the CS50 class: I have to implement a program that identifies a person based on their DNA. To solve that problem, I have to iterate through a DNA sequence, count how many times a substring of DNA repeats, and find the longest run of consecutive repeats of that substring. Here is my code:

def count_substring(sequence_dna, substring):
        strcounter = 0
        run = []
        substring = strlist
        for i in range(len(sequence_dna) - len(substring)):
            while i < len(sequence_dna):
                if sequence_dna[i:i+len(substring)] == substring:
                    strcounter += 1
                    i += i + len(substring)
                    run += 1
                    run.append()
                else:
                    run = 0
                    i += i + len(substring)
        return max(run)

So, I created a variable to keep track of the total number of repeats (strcounter) and an array (run) to keep track of the consecutive runs of STR repeats. I opened the CSV file and set the first row of STR names equal to substring. I then looped through the DNA text file and started counting how many times the STR appears, but this is where I am stuck, which is why my code is incomplete. I am unsure how to count each run and add it to the array I created. I am also not sure whether returning the max will take the max of all the values or the max of each STR’s runs. Can anyone help me? Any advice (on anything you see wrong) is greatly appreciated!

One Answer

Leoness, you have declared run as a list with run = [], but later you use it as an int with run += 1 (and run = 0), and then as a list again with run.append(), which is also called without an argument. Decide which of the two you want run to be and program that part of the logic accordingly.
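One way to resolve this is to keep run as a plain integer and track the best value seen so far in a separate integer, so no list is needed. The following is only a minimal sketch under that assumption, not the course's reference solution; the names longest_run, best, step and pos are made up for illustration. Because the function is called once per STR, the value it returns is the maximum for that one STR only.

```python
def longest_run(sequence_dna, substring):
    """Return the largest number of back-to-back copies of substring."""
    step = len(substring)
    if step == 0:
        return 0  # an empty STR cannot repeat meaningfully

    best = 0  # longest run found so far (int, not a list)
    # Try every position as the possible start of a run.
    for start in range(len(sequence_dna) - step + 1):
        run = 0      # length of the run beginning at `start`
        pos = start
        # Keep stepping forward one whole STR at a time while it matches.
        while sequence_dna[pos:pos + step] == substring:
            run += 1
            pos += step
        best = max(best, run)
    return best


if __name__ == "__main__":
    # AGAT appears three times in a row, then once more on its own.
    print(longest_run("AGATAGATAGATTTAGAT", "AGAT"))
```

Note that the position advances by len(substring), not by i + len(substring) as in the question's i += i + len(substring), which would skip far ahead in the sequence.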

An alternative approach to calculating the runs and their maximum length would be to use regular expressions, for example with the findall method.
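As a rough illustration of the regular-expression idea, the sketch below uses re.findall to collect every stretch of repeated copies of the STR and then takes the longest one. The function name longest_run_regex is hypothetical, and wrapping the pattern in a lookahead is a choice made here so that runs starting inside an earlier match are not missed; a simpler non-lookahead pattern would usually give the same result for typical STRs.

```python
import re


def longest_run_regex(sequence_dna, substring):
    """Return the largest number of back-to-back copies of substring."""
    if not substring:
        return 0  # avoid dividing by zero below

    # (?:...)+ matches one or more consecutive copies of the STR.
    # The surrounding lookahead (?=(...)) lets findall report a run
    # from every starting position, including overlapping ones.
    pattern = f"(?=((?:{re.escape(substring)})+))"
    runs = re.findall(pattern, sequence_dna)

    # Each match is the full text of a run; divide by the STR length
    # to turn it into a repeat count. default=0 covers "no matches".
    return max((len(run) // len(substring) for run in runs), default=0)


if __name__ == "__main__":
    print(longest_run_regex("AGATAGATAGATTTAGAT", "AGAT"))
```

Calling this once per STR name from the CSV header gives one maximum per STR, which can then be compared against each person's row.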

Answered by Anil George on September 14, 2020
