PSET 6 DNA: How to count number of runs of consecutive STRs

Stack Overflow Asked by Leoness on September 14, 2020

I’ve written most of the code for PSET6 and I get the gist of it, but I am stuck on the logic for iterating through the DNA sequence and computing the longest run of consecutive STRs. For anyone not taking the CS50 class, I basically have to implement a program that identifies a person based on their DNA. To solve that problem, I have to iterate through a DNA sequence, count how many times a substring of DNA repeats, and find the longest run of consecutive repeats of that substring. Here is my code:

def count_substring(sequence_dna, substring):
    strcounter = 0
    run = []
    substring = strlist
    for i in range(len(sequence_dna) - len(substring)):
        while i < len(sequence_dna):
            if sequence_dna[i:i+len(substring)] == substring:
                strcounter += 1
                i += i + len(substring)
                run += 1
                run = 0
                i += i + len(substring)
    return max(run)

So, I created a variable to keep track of the total number of repeats (strcounter) and an array (run) to keep track of the longest consecutive runs of STR repeats. I opened the CSV file and set the first row of STR names equal to substring. I then looped through the DNA text file and started counting how many times the STR appears, but this is where I am stuck, hence my code being incomplete. I am unsure how to compute the number of runs and add it to the array I created. I am also not sure whether returning the max will take the max of all the values or the max of each STR’s number of runs. Can anyone help me? Any advice (on anything you see wrong) is greatly appreciated!

One Answer

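Leoness, you seem to have declared run as a list with run = [], but later you use it as an int with run += 1 and run = 0, and then as a list again with max(run). Adding 1 to a list raises a TypeError, and calling max() on an int does too, so decide whether run is a counter or a collection of counts and use it consistently. Also note that i += i + len(substring) roughly doubles i rather than stepping forward by one repeat, and the inner while loop never ends when the slice does not match, because nothing advances i in that case.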
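
To make that concrete, here is a minimal sketch of one way to do it with a plain int counter. It is not the official CS50 solution, and the name longest_run is just something I made up for illustration. It tries every starting position, counts how many back-to-back copies of the STR begin there, and keeps the largest count seen.

```python
def longest_run(sequence_dna, substring):
    """Return the largest number of back-to-back repeats of substring."""
    step = len(substring)
    if step == 0:
        return 0  # an empty STR has no meaningful run

    best = 0
    for start in range(len(sequence_dna) - step + 1):
        run = 0      # int counter for the run beginning at `start`
        i = start
        # Keep stepping one full repeat at a time while the slice matches.
        while sequence_dna[i:i + step] == substring:
            run += 1
            i += step
        best = max(best, run)
    return best


# Call it once per STR; each call returns that STR's own longest run.
# The sequence and STR names below are made-up sample data.
counts = {s: longest_run("AGATAGATAGATTTAGAT", s) for s in ["AGAT", "AATG"]}
```

On your last question: the function only ever sees one STR at a time, so the max it returns is for that STR alone. To get a value for every STR, call it once per STR name, as in the last line of the sketch.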

An alternative approach to calculating the longest run would be to use regular expressions, for example with the findall method of Python's re module.
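
A rough sketch of the regex idea follows, assuming the STR is a plain string of bases. The function name longest_run_regex is hypothetical. The pattern is wrapped in a lookahead so that a run is tried at every position; a plain (?:STR)+ pattern with findall scans without overlapping and could miss a longer run for an STR that overlaps itself.

```python
import re


def longest_run_regex(sequence_dna, substring):
    """Return the largest number of back-to-back repeats of substring."""
    if not substring:
        return 0

    # (?:...)+ matches one or more consecutive copies of the STR.
    # The outer lookahead (?=(...)) captures the run found at each position
    # without consuming characters, so overlapping candidates are all seen.
    pattern = f"(?=((?:{re.escape(substring)})+))"
    runs = re.findall(pattern, sequence_dna)  # list of captured run strings
    if not runs:
        return 0

    # The longest captured run, measured in repeats rather than characters.
    return max(len(run) for run in runs) // len(substring)
```

For example, longest_run_regex("AGATAGATAGATTTAGAT", "AGAT") gives 3, since the longest stretch is three copies of AGAT in a row.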

Answered by Anil George on September 14, 2020
