Pandas lookup and return Boolean

Stack Overflow Asked by daveskis on December 2, 2021

I have the dataframe below, and my objective is to find whether a stock was held from one period to the next. To do this, I created two lookup codes, str_previous_period_code and str_current_period_code, by concatenating the ticker with the date converted to a string.

I need a new column that holds 1 if the stock was held in the prior period and 0 otherwise. So the logic is:

  • look up str_previous_period_code
  • if it is found in the dataframe (among the str_current_period_code values), then df['value'] = 1, else df['value'] = 0

I’ve tried .lookup() to start the logic, as per below:

df['value'] = df.lookup(df['str_previous_period_code'], df['str_current_period_code'])

However, I get the following KeyError:

KeyError: 'CXP2001-04-27'

The dataframe:

    ticker  date    close   next_period_close   NATR    score   return  str_previous_period_code    str_current_period_code
0   CXP 2001-04-27  4.615000    4.585000    3.700552    9   -0.006501   CXP2001-04-20   CXP2001-04-27
1   TOL 2001-04-27  1.851068    1.862219    3.174988    9   0.006024    TOL2001-04-20   TOL2001-04-27
2   WOW 2001-04-27  8.832543    8.941464    2.560720    9   0.012332    WOW2001-04-20   WOW2001-04-27
3   WES 2001-04-27  13.205642   12.771989   2.448139    9   -0.032839   WES2001-04-20   WES2001-04-27
4   PPT 2001-04-27  40.000000   40.400000   2.364224    9   0.010000    PPT2001-04-20   PPT2001-04-27
5   FLT 2001-04-27  23.398888   23.309237   2.281367    9   -0.003831   FLT2001-04-20   FLT2001-04-27
6   MIM 2001-04-27  1.260000    1.380000    5.696656    8   0.095238    MIM2001-04-20   MIM2001-04-27
7   ALL 2001-04-27  6.386961    6.113234    5.476623    8   -0.042857   ALL2001-04-20   ALL2001-04-27
8   CXP 2001-05-04  4.585000    4.650000    3.685788    9   0.014177    CXP2001-04-27   CXP2001-05-04
9   TOL 2001-05-04  1.862219    1.866679    3.139378    9   0.002395    TOL2001-04-27   TOL2001-05-04
10  WES 2001-05-04  12.771989   13.321481   2.572519    9   0.043023    WES2001-04-27   WES2001-05-04
11  WOW 2001-05-04  8.941464    9.456366    2.552963    9   0.057586    WOW2001-04-27   WOW2001-05-04
12  PPT 2001-05-04  40.400000   39.991000   2.313191    9   -0.010124   PPT2001-04-27   PPT2001-05-04
13  FLT 2001-05-04  23.309237   23.194881   2.262463    9   -0.004906   FLT2001-04-27   FLT2001-05-04
14  ALL 2001-05-04  6.113234    6.200552    5.699601    8   0.014283    ALL2001-04-27   ALL2001-05-04
15  MIM 2001-05-04  1.380000    1.340000    5.289190    8   -0.028986   MIM2001-04-27   MIM2001-05-04

One Answer

You can do the lookup using either:

  • the apply method of the Series you get with df['str_previous_period_code']:
# build the set of current-period codes once, so each membership test is cheap:
current_period_codes = set(df['str_current_period_code'])

# fill the 'value' column according to your logic:
df['value'] = df['str_previous_period_code'].apply(
    lambda x: 1 if x in current_period_codes else 0)
  • the isin method of the df['str_previous_period_code'] Series, which returns True/False; appending .astype(int) converts the result to 1 and 0 as you ask (this vectorized method is usually faster than the first one):
df['value'] = df['str_previous_period_code'].isin(df['str_current_period_code']).astype(int)
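To make the membership check concrete, here is a minimal, self-contained sketch. It assumes a cut-down four-row frame modelled on the question's data (the rows are illustrative, not the asker's full dataset) and follows the question's stated logic: each row's str_previous_period_code is looked up among all str_current_period_code values.

```python
import pandas as pd

# Illustrative sample only: a reduced version of the question's dataframe.
df = pd.DataFrame({
    "ticker": ["CXP", "TOL", "CXP", "WES"],
    "date": ["2001-04-27", "2001-04-27", "2001-05-04", "2001-05-04"],
    "str_previous_period_code": [
        "CXP2001-04-20", "TOL2001-04-20", "CXP2001-04-27", "WES2001-04-27",
    ],
    "str_current_period_code": [
        "CXP2001-04-27", "TOL2001-04-27", "CXP2001-05-04", "WES2001-05-04",
    ],
})

# A row was "held in the prior period" if its previous-period code
# appears somewhere in the current-period code column.
held = df["str_previous_period_code"].isin(df["str_current_period_code"])

# Convert True/False to 1/0, as requested in the question.
df["value"] = held.astype(int)

print(df[["ticker", "date", "value"]])
```

In this sample only the CXP row dated 2001-05-04 gets 1, because CXP2001-04-27 is the only previous-period code that also exists as a current-period code; WES has no 2001-04-27 row here, so it gets 0.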

Answered by mgc on December 2, 2021
