Stack Overflow Asked by AndrewLittle1 on January 3, 2022
I have a list_3, with one element, a string:
[['\n\n\n Headquarters or Regional Office\n\n\n\n\n\t\t\t\t\t\t\t\t\tMain Headquarters\t\t\t\t\t\t\t\n\n', '\n\n\n Founders\n\n\n\n\n\t\t\t\t\t\t\t\t\tThomas Lon Van\t\t\t\t\t\t\t\n\n', '\n\n\n Founder Diversity\n\n\n\n\n\t\t\t\t\t\t\t\t\tN/A\t\t\t\t\t\t\t\n\n', '\n\n\n Year Founded\n\n\n\n\n\t\t\t\t\t\t\t\t\t2016\t\t\t\t\t\t\t\n\n', '\n\n\n # of Employees\n\n\n\n\n\t\t\t\t\t\t\t\t\t1-10\t\t\t\t\t\t\t\n\n', '\n\n\n Seeking Funding?\n\n\n\n\n\t\t\t\t\t\t\t\t\tNo \t\t\t\t\t\t\t\n\n', '\n\n\n Funding Phase\n\n\n\n\n\t\t\t\t\t\t\t\t\tN/A\t\t\t\t\t\t\t\n\n'], ['\n\n\n Headquarters or Regional Office\n\n\n\n\n\t\t\t\t\t\t\t\t\tMain Headquarters\t\t\t\t\t\t\t\n\n', '\n\n\n Founders\n\n\n\n\n\t\t\t\t\t\t\t\t\tMacKenzie T Stout,\t\t\t\t\t\t\t\n\n', '\n\n\n Founder Diversity\n\n\n\n\n\t\t\t\t\t\t\t\t\tN/A\t\t\t\t\t\t\t\n\n', '\n\n\n Year Founded\n\n\n\n\n\t\t\t\t\t\t\t\t\t2020\t\t\t\t\t\t\t\n\n', '\n\n\n # of Employees\n\n\n\n\n\t\t\t\t\t\t\t\t\t1-10\t\t\t\t\t\t\t\n\n', '\n\n\n Seeking Funding?\n\n\n\n\n\t\t\t\t\t\t\t\t\tYes\t\t\t\t\t\t\t\n\n', '\n\n\n Funding Phase\n\n\n\n\n\t\t\t\t\t\t\t\t\tPre-Seed\t\t\t\t\t\t\t\n\n']]
I want to use regex to strip \n, \t, and \r from the output and return the text in an easy-to-read format.
This is what I have tried:
list_33 = []
for i in list_3:
    string = ''.join(list_3)
    list_33.append(re.sub(r'\s+', '', string))
print(list_33)
output:
['HeadquartersorRegionalOfficeMainHeadquarters', 'FoundersThomasLonVan', 'FounderDiversityN/A', 'YearFounded2016', '#ofEmployees1-10', 'SeekingFunding?No', 'FundingPhaseN/A']
This is almost what I need, but I would like one space between each word and a colon after the first text block of each string in list_3, i.e.:
['Headquarters or Regional Office: Main Headquarters', 'Founders: Thomas Lon Van', 'Founder Diversity: N/A', 'Year Founded: 2016', '# of Employees: 1-10', 'Seeking Funding?: No', 'Funding Phase: N/A']
Any ideas of how I can incorporate both regex functions into one?
Thanks
P.S. I know that I don't need a for loop for a list with just one element, but the list will have more elements in the future, so I am trying to generalize the code structure while using just one input for now.
You can iterate over each string in the list and use re.sub
to replace each run of two or more whitespace characters with ': ', then strip the leftover ': ' from both ends:
>>> import re
>>> lst = ['\n\n\n Headquarters or Regional Office\n\n\n\n\n\t\t\t\t\t\t\t\t\tMain Headquarters\t\t\t\t\t\t\t\n\n', '\n\n\n Founders\n\n\n\n\n\t\t\t\t\t\t\t\t\tThomas Lon Van\t\t\t\t\t\t\t\n\n', '\n\n\n Founder Diversity\n\n\n\n\n\t\t\t\t\t\t\t\t\tN/A\t\t\t\t\t\t\t\n\n', '\n\n\n Year Founded\n\n\n\n\n\t\t\t\t\t\t\t\t\t2016\t\t\t\t\t\t\t\n\n', '\n\n\n # of Employees\n\n\n\n\n\t\t\t\t\t\t\t\t\t1-10\t\t\t\t\t\t\t\n\n', '\n\n\n Seeking Funding?\n\n\n\n\n\t\t\t\t\t\t\t\t\tNo \t\t\t\t\t\t\t\n\n', '\n\n\n Funding Phase\n\n\n\n\n\t\t\t\t\t\t\t\t\tN/A\t\t\t\t\t\t\t\n\n']
>>> [re.sub(r'\s\s+', ': ', word).strip(': ') for word in lst]
['Headquarters or Regional Office: Main Headquarters', 'Founders: Thomas Lon Van', 'Founder Diversity: N/A', 'Year Founded: 2016', '# of Employees: 1-10', 'Seeking Funding?: No', 'Funding Phase: N/A']
Answered by Prem Anand on January 3, 2022
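The same idea can be extended to the nested list from the question. What follows is only a sketch, not part of the answer above: the helper names clean_field and clean_records are made up for illustration, the sample data is shortened, and it assumes every label/value boundary is a run of at least two whitespace characters.

```python
import re

def clean_field(raw):
    # Runs of two or more whitespace characters mark the padding at both ends
    # and the gap between label and value; single spaces inside the text stay.
    collapsed = re.sub(r'\s{2,}', ': ', raw)
    # The leading and trailing padding also became ': ', so strip those
    # characters from both ends.
    return collapsed.strip(': ')

def clean_records(records):
    # Apply the cleanup to every string of every inner list.
    return [[clean_field(field) for field in record] for record in records]

# Shortened sample in the shape of list_3 (fewer fields and fewer tabs).
list_3 = [
    ['\n\n\n Founders\n\n\n\n\n\t\t\tThomas Lon Van\t\t\n\n',
     '\n\n\n Seeking Funding?\n\n\n\n\n\t\t\tNo \t\t\n\n'],
    ['\n\n\n Founders\n\n\n\n\n\t\t\tMacKenzie T Stout,\t\t\n\n',
     '\n\n\n Funding Phase\n\n\n\n\n\t\t\tPre-Seed\t\t\n\n'],
]

for record in clean_records(list_3):
    print(record)
```

Note that strip(': ') removes any colons and spaces at the ends, so a value that legitimately ends in a colon would lose it; other trailing punctuation, such as the comma after "Stout", is kept.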