TransWikia.com

Separate strings with regex and pandas

Stack Overflow Asked by Sara Daniel on January 11, 2021

I have the content below and need to extract the third part of each string, as shown, using pandas in Python:

My string:

FA0003 -BL- FA0005-BL
FA0004-BL-FA0008-BL

Expected output:

FA0005
FA0008

Suppose the string is in a column named A. The regex below retrieves FA0003 from it, but I don't know how to retrieve FA0005.

FA0003 -BL- FA0005-BL
df['A'].str.extract(r'(\w+\s*)', expand=False)
FA0003

One Answer

You can use

^(?:[^-]*-){2}\s*([^-]+)

See the regex demo

In Pandas, use it with your current code:

df['A'].str.extract(r'^(?:[^-]*-){2}\s*([^-]+)', expand=False)
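
As a quick check, here is a minimal sketch that runs the pattern on the two sample strings from the question. The DataFrame construction and the variable names `pattern`, `third` and `result` are assumptions made for illustration; only the column name A and the sample values come from the question.

```python
import pandas as pd

# Sample rows copied from the question; the DataFrame itself is illustrative.
df = pd.DataFrame({"A": ["FA0003 -BL- FA0005-BL", "FA0004-BL-FA0008-BL"]})

# Skip two hyphen-terminated chunks, drop leading whitespace, capture the third chunk.
pattern = r"^(?:[^-]*-){2}\s*([^-]+)"

# With a single capture group and expand=False, str.extract returns a Series.
third = df["A"].str.extract(pattern, expand=False)

result = third.tolist()
print(result)  # ['FA0005', 'FA0008']
```

Rows that do not match the pattern come back as NaN rather than raising an error, so it may be worth checking for missing values afterwards.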

Details

  • ^ - start of string
  • (?:[^-]*-){2} - two occurrences of any chars other than - and then a -
  • \s* - zero or more whitespaces (this is used to trim the output)
  • ([^-]+) - Capturing group 1 (the return value): one or more chars other than -.
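
The breakdown above can also be tried outside pandas with the standard `re` module. This is only a sketch: the names `samples`, `extracted` and `no_match` are hypothetical, and the short input `"FA0003-BL"` is an invented example to show what happens when there is no third part.

```python
import re

# Same pattern as in the answer, compiled once for reuse.
pattern = re.compile(r"^(?:[^-]*-){2}\s*([^-]+)")

samples = ["FA0003 -BL- FA0005-BL", "FA0004-BL-FA0008-BL"]

# group(1) is the capturing group: the text after the second hyphen,
# with leading whitespace skipped, up to the next hyphen.
extracted = [pattern.search(s).group(1) for s in samples]
print(extracted)  # ['FA0005', 'FA0008']

# With only one hyphen, "(?:[^-]*-){2}" cannot match twice, so search returns None.
no_match = pattern.search("FA0003-BL")
print(no_match)  # None
```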

Correct answer by Wiktor Stribiżew on January 11, 2021
