TransWikia.com

How do I check if a word falls in one or the other array?

Stack Overflow Asked by BHA on February 9, 2021

I’m trying to read through a text file word by word and compare each word against a set of categorized arrays (joy, sadness, anger, etc.). If the word is in one of those arrays, I want to append that array’s identifier (a unique number assigned to it) to an initially empty list.

This is what I have:

anger = ['abandoned', 'abandonment', 'abhor', 'abhorrent', 'abolish',....]
joy = ['absolution', 'abundance', .....]
sadness = ['abandoned', ....]
trust = ['example word']

# Python program to read
# file word by word
DNA = []
# opening the text file
with open('ch1.txt', 'r', encoding="utf8") as file:

    # reading each line
    for line in file:

        # reading each word
        for word in line.split():
            if word in joy:
                DNA.append('6')

    # reading each line
    for line in file:

        # reading each word
        for word in line.split():
            if word in anger:
                DNA.append('2')

    # reading each line
    for line in file:

        # reading each word
        for word in line.split():
            if word in fear:
                DNA.append('5')

    # reading each line
    for line in file:

        # reading each word
        for word in line.split():
            if word in sadness:
                DNA.append('7')

    # reading each line
    for line in file:

        # reading each word
        for word in line.split():
            if word in trust:
                DNA.append('9')

    # reading each line
    for line in file:

        # reading each word
        for word in line.split():
            if word in surprise:
                DNA.append('8')

    # reading each line
    for line in file:

        # reading each word
        for word in line.split():
            if word in disgust:
                DNA.append('4')

print("Emotional DNA: ", DNA)

What I want the code to do is something like this:

if word is in anger:
   add anger to the list;

or if word is in sadness:
   add sadness to the list;

The word can also appear in multiple categories and in that case both the category identifiers should be appended to the list.

This is the output I expect:

2, 2, 2, 4, 6, 4, 2, 3, 3, 4, 8, 4, 5

In summary, all I want is to match the word being read in to see if it falls in one (or multiple) categories, and then add that category’s identifier to the empty list. This is to map the progression of these emotion categories in the text file, one by one.

What is happening with the current code is that it is just reading through the first if statement and giving me this:

6, 6, 6, 6, 6, 6, 6

When I tried replacing the if statements after the first one with elif, only the first two statements were read and the others were ignored. I don’t know how to make this if-else-if ladder work; any help would be appreciated.

2 Answers

If I understand correctly, the problem is that you are looping through the file several times, only looking for matches with one particular list each time.

You can put all the if statements into one for loop:

with open('ch1.txt', 'r', encoding="utf8") as file:

    # reading each line
    for line in file:

        # reading each word
        for word in line.split():
            if word in joy:
                DNA.append('6')

            if word in anger:
                DNA.append('2')

            if word in fear:
                DNA.append('5')

            if word in sadness:
                DNA.append('7')

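This way the identifiers are appended word by word, in the order the words occur in the file, rather than category by category. Because these are separate if statements rather than elif, a word that appears in several lists adds every matching identifier.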
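The same single-pass idea can also be written as one loop over a table of categories instead of a chain of if statements. This is only a minimal sketch: the word lists are shortened to words shown in the question, the identifiers '6', '2' and '7' are the ones the question uses, and the names CATEGORIES and emotional_dna are made up for illustration.

```python
# Shortened word lists, using only words that appear in the question.
anger = ['abandoned', 'abhor']
joy = ['absolution', 'abundance']
sadness = ['abandoned']

# (identifier, word list) pairs, checked in this order for every word.
# CATEGORIES is a hypothetical name, not part of the original code.
CATEGORIES = [('6', joy), ('2', anger), ('7', sadness)]


def emotional_dna(lines, categories=CATEGORIES):
    """Return category identifiers in the order matching words occur."""
    dna = []
    for line in lines:
        for word in line.split():
            # Independent checks (no elif), so a word that belongs to
            # several categories contributes every matching identifier.
            for identifier, words in categories:
                if word in words:
                    dna.append(identifier)
    return dna


sample = ["abundance and absolution", "abandoned again"]
print(emotional_dna(sample))  # ['6', '6', '2', '7']
```

Iterating over an open file yields its lines, so the file object from with open('ch1.txt', 'r', encoding="utf8") as file could be passed in place of sample.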

Answered by PotatoesFall on February 9, 2021

You should only read through the lines and words once. Then test the words against each list inside that loop.

for line in file:
    for word in line.split():
        if word in joy:
            DNA.append('6')
        if word in anger:
            DNA.append('2')
        if word in fear:
            DNA.append('5')
        # and so on

You're only getting one set of numbers because the first loop reads the file all the way to the end, so there's nothing left for the next "for line in file" loop to read. You could rewind with file.seek(0) before each loop, but then you'd get all the 6s, then all the 2s, and so on, and you wouldn't know what order they actually occurred in the file.
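The exhaustion effect is easy to reproduce in isolation. In this small sketch, io.StringIO stands in for the opened text file (an assumption made so the example runs without ch1.txt), and the variable names are illustrative.

```python
import io

# io.StringIO behaves like an open text file for iteration and seek().
file = io.StringIO("first line\nsecond line\n")

first_pass = [line for line in file]   # reads to the end of the stream
second_pass = [line for line in file]  # nothing left: position is at the end

file.seek(0)                           # rewind to the start
third_pass = [line for line in file]   # the lines are available again

print(len(first_pass), len(second_pass), len(third_pass))  # 2 0 2
```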

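BTW, if the word lists are large, you should use sets rather than lists: an "in" test on a list scans it item by item, whereas a set lookup takes roughly constant time on average.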
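As a rough illustration, each list can be converted once with set() before the loop, and the membership tests stay exactly the same. The word lists are shortened here, and the names anger_set, joy_set and identifiers_for are hypothetical, not from the original code.

```python
# Shortened word lists taken from the question.
anger = ['abandoned', 'abandonment', 'abhor', 'abhorrent', 'abolish']
joy = ['absolution', 'abundance']

# Convert once, before looping over the file.
anger_set = set(anger)
joy_set = set(joy)


def identifiers_for(word):
    """Return the identifiers of every category that contains word."""
    found = []
    if word in joy_set:      # hash lookup instead of a scan through a list
        found.append('6')
    if word in anger_set:
        found.append('2')
    return found


print(identifiers_for('abhor'))      # ['2']
print(identifiers_for('abundance'))  # ['6']
print(identifiers_for('table'))      # []
```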

Answered by Barmar on February 9, 2021
