TransWikia.com

Find the most frequent words that appear in the dataset

Stack Overflow Asked by pythonnew on November 29, 2021

I wrote a function that takes a list as input and returns the most common item in the list.

# Write the function
def most_frequent(List): 
    dict = {} 
    count, itm = 0, '' 
    for item in reversed(List): 
        dict[item] = dict.get(item, 0) + 1
        if dict[item] >= count : 
            count, itm = dict[item], item 
    return(item) 
  
    return num 

# verify the code

list = [5,42,34,6,7,4,2,5]
print(most_frequent(list)) 

Then I download two text files to get the most frequent words.

# Download the files restaurants.txt and restaurant-names.txt from Github
!curl https://raw.githubusercontent.com/ipeirotis/introduction-to-python/master/data/restaurant-names.txt -o restaurant-names.txt
!curl https://raw.githubusercontent.com/ipeirotis/introduction-to-python/master/data/restaurants.txt -o restaurants.txt



# create the list from restaurants.txt
List = open("restaurants.txt").readlines()

# get the most frequent restaurant name
print("The most frequent restaurant name is", most_frequent(List))

print(most_common(List))

But when I try to find the most frequent words that appear in the restaurant names, I get the same result. Could you help me check whether this is correct? Thanks.

# create the list from restaurants.txt
List = open("restaurants.txt").readlines()

# get the most frequent restaurant name
print("The most frequent restaurant name is", most_frequent(List))

3 Answers

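The function should return itm (the most common item) rather than item. item is the loop variable, so after the loop it holds the last element of the reversed list, which is the first element of the original list.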
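A minimal sketch of the function with that one change applied. The names items and counts are illustrative replacements for List and dict, chosen only so that the built-in dict is not shadowed; the tie-breaking behaviour of the original loop is left as it was.

```python
def most_frequent(items):
    """Return the most common element of items ('' if items is empty)."""
    counts = {}
    count, itm = 0, ''
    for item in reversed(items):
        counts[item] = counts.get(item, 0) + 1
        # With >=, a later item in the reversed scan replaces an
        # earlier one that has the same count.
        if counts[item] >= count:
            count, itm = counts[item], item
    return itm  # the tracked best item, not the loop variable


print(most_frequent([42, 5, 34, 6, 5, 7, 4, 2]))  # 5
```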

Answered by Blacksad on November 29, 2021

The function itself might be wrong. What if you try the same test data in a different order, for example list = [42,5,34,6,5,7,4,2] instead of list = [5,42,34,6,7,4,2,5]? Is the output still 5?
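To make the experiment concrete, the sketch below reproduces the function as posted in the question and runs it on both lists. The name buggy_most_frequent is made up here, and the unreachable second return is omitted. Because the function returns the loop variable, the result is always the first element of the input list.

```python
def buggy_most_frequent(items):
    counts = {}
    count, itm = 0, ''
    for item in reversed(items):
        counts[item] = counts.get(item, 0) + 1
        if counts[item] >= count:
            count, itm = counts[item], item
    return item  # bug kept on purpose: returns the loop variable, not itm


# 5, which is right only because 5 happens to be the first element
print(buggy_most_frequent([5, 42, 34, 6, 7, 4, 2, 5]))
# 42, although 5 is the most frequent value
print(buggy_most_frequent([42, 5, 34, 6, 5, 7, 4, 2]))
```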

Answered by MrPostduif on November 29, 2021

It seems you might be using the wrong filename for the restaurant names file. Judging from your curl command:

!curl https://raw.githubusercontent.com/ipeirotis/introduction-to-python/master/data/restaurant-names.txt -o restaurant-names.txt

The filename you should be using is restaurant-names.txt, so your code should be:

# create the list from restaurant-names.txt
List = open("restaurant-names.txt").readlines()

# get the most frequent restaurant name
print("The most frequent restaurant name is", most_frequent(List))
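Note that readlines() yields whole lines, so the call above finds the most frequent name rather than the most frequent word. A hedged sketch of counting words instead, using collections.Counter from the standard library. The helper name most_frequent_word, the lowercasing, and the whitespace splitting are assumptions rather than anything stated in the question, and the sample lines are invented for illustration.

```python
from collections import Counter


def most_frequent_word(lines):
    """Return (word, count) for the most common whitespace-separated word."""
    words = Counter()
    for line in lines:
        # split() with no arguments also discards the trailing newline
        words.update(line.lower().split())
    # most_common(1) returns a list holding one (word, count) pair
    return words.most_common(1)[0] if words else None


sample = ["Joe's Pizza\n", "Pizza Palace\n", "Sushi Bar\n"]  # made-up names
print(most_frequent_word(sample))  # ('pizza', 2)
```

For the real data, the lines read from restaurant-names.txt could be passed in place of sample.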

Answered by Spencer Bard on November 29, 2021
