AnswerBun.com

Get data from Multi-index dataframe based on numpy array

Stack Overflow Asked by Neyls on October 27, 2020

From the following dataframe:

dim_0 dim_1                                             
0     0       40.54  23.40  6.70  1.70  1.82  0.96  1.62
      1      175.89  20.24  7.78  1.55  1.45  0.80  1.44
      2        0.00   0.00  0.00  0.00  0.00  0.00  0.00
1     0       21.38  24.00  5.90  1.60  2.55  1.50  2.36
      1      130.29  18.40  8.49  1.52  1.45  0.80  1.47
      2        0.00   0.00  0.00  0.00  0.00  0.00  0.00
2     0        6.30  25.70  5.60  1.70  2.16  1.16  1.87    
      1       73.45  21.49  6.88  1.61  1.61  0.94  1.63
      2        0.00   0.00  0.00  0.00  0.00  0.00  0.00
3     0       16.64  25.70  5.70  1.60  2.17  1.12  1.76
      1      125.89  19.10  7.52  1.43  1.44  0.78  1.40
      2        0.00   0.00  0.00  0.00  0.00  0.00  0.00
4     0       41.38  24.70  5.60  1.50  2.08  1.16  1.85
      1        0.00   0.00  0.00  0.00  0.00  0.00  0.00
      2        0.00   0.00  0.00  0.00  0.00  0.00  0.00
5     0      180.59  16.40  3.80  1.10  4.63  3.86  5.71
      1        0.00   0.00  0.00  0.00  0.00  0.00  0.00
      2        0.00   0.00  0.00  0.00  0.00  0.00  0.00
6     0       13.59  24.40  6.10  1.70  2.62  1.51  2.36
      1      103.19  19.02  8.70  1.53  1.48  0.76  1.38
      2        0.00   0.00  0.00  0.00  0.00  0.00  0.00
7     0        3.15  24.70  5.60  1.50  2.14  1.22  2.00
      1       55.90  23.10  6.07  1.50  1.86  1.12  1.87
      2      208.04  20.39  6.82  1.35  1.47  0.95  1.67

How can I get, for each value of dim_0, only the row whose dim_1 value matches the corresponding entry of the array [1 0 0 1 2 0 1 2]?

Desired result is:

 0      175.89  20.24  7.78  1.55  1.45  0.80  1.44
 1       21.38  24.00  5.90  1.60  2.55  1.50  2.36
 2        6.30  25.70  5.60  1.70  2.16  1.16  1.87
 3      125.89  19.10  7.52  1.43  1.44  0.78  1.40
 4        0.00   0.00  0.00  0.00  0.00  0.00  0.00
 5      180.59  16.40  3.80  1.10  4.63  3.86  5.71
 6      103.19  19.02  8.70  1.53  1.48  0.76  1.38
 7      208.04  20.39  6.82  1.35  1.47  0.95  1.67

I’ve tried slicing, cross-sections, etc., but without success.

Thanks in advance for the help.

3 Answers

Use MultiIndex.from_arrays and select by DataFrame.loc:

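import numpy as np
import pandas as pd

arr = np.array([1, 0, 0, 1, 2, 0, 1, 2])

# Pair each dim_0 label with the wanted dim_1 label, then select those rows
df = df.loc[pd.MultiIndex.from_arrays([df.index.levels[0], arr])]
print(df)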
          2      3     4     5     6     7     8
0                                               
0 1  175.89  20.24  7.78  1.55  1.45  0.80  1.44
1 0   21.38  24.00  5.90  1.60  2.55  1.50  2.36
2 0    6.30  25.70  5.60  1.70  2.16  1.16  1.87
3 1  125.89  19.10  7.52  1.43  1.44  0.78  1.40
4 2    0.00   0.00  0.00  0.00  0.00  0.00  0.00
5 0  180.59  16.40  3.80  1.10  4.63  3.86  5.71
6 1  103.19  19.02  8.70  1.53  1.48  0.76  1.38
7 2  208.04  20.39  6.82  1.35  1.47  0.95  1.67

To remove the second index level from the result as well, add droplevel(1):

arr = np.array([1, 0, 0, 1, 2, 0, 1, 2])
df = df.loc[pd.MultiIndex.from_arrays([df.index.levels[0], arr])].droplevel(1)
print(df)
        2      3     4     5     6     7     8
0                                             
0  175.89  20.24  7.78  1.55  1.45  0.80  1.44
1   21.38  24.00  5.90  1.60  2.55  1.50  2.36
2    6.30  25.70  5.60  1.70  2.16  1.16  1.87
3  125.89  19.10  7.52  1.43  1.44  0.78  1.40
4    0.00   0.00  0.00  0.00  0.00  0.00  0.00
5  180.59  16.40  3.80  1.10  4.63  3.86  5.71
6  103.19  19.02  8.70  1.53  1.48  0.76  1.38
7  208.04  20.39  6.82  1.35  1.47  0.95  1.67
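
For anyone who wants to run the idea end to end, here is a minimal self-contained sketch. It assumes a cut-down version of the question's frame (only the first three dim_0 groups and the first two value columns); the column names "a" and "b" and the variable names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Cut-down copy of the question's data: dim_0 in 0..2, dim_1 in 0..2.
index = pd.MultiIndex.from_product([range(3), range(3)], names=["dim_0", "dim_1"])
df = pd.DataFrame(
    [
        [40.54, 23.40], [175.89, 20.24], [0.00, 0.00],
        [21.38, 24.00], [130.29, 18.40], [0.00, 0.00],
        [6.30, 25.70], [73.45, 21.49], [0.00, 0.00],
    ],
    index=index,
    columns=["a", "b"],  # hypothetical column names
)

# One wanted dim_1 label per dim_0 group (first three entries of the question's array).
arr = np.array([1, 0, 0])

# Build the (dim_0, dim_1) pairs to select: (0, 1), (1, 0), (2, 0).
wanted = pd.MultiIndex.from_arrays([df.index.levels[0], arr], names=df.index.names)

# Select those rows, then drop the second index level by position.
selected = df.loc[wanted].droplevel(1)
print(selected)
```

Note that df.index.levels[0] only lines up with the array if every dim_0 label stored in the level is actually present in the frame, so the array needs exactly one entry per level value.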

Correct answer by jezrael on October 27, 2020

Try the following code:

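mask_array = [1, 0, 0, 1, 2, 0, 1, 2]

# df_first is your first DataFrame, indexed by (dim_0, dim_1)
# A plain isin on dim_1 alone would match every row whose dim_1 value appears
# anywhere in mask_array, so pair each dim_0 label with its wanted dim_1 label.
dim_0 = df_first.index.get_level_values('dim_0').unique()

new_array = df_first[df_first.index.isin(list(zip(dim_0, mask_array)))]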
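
A boolean mask only gives the desired result if it compares each row's dim_1 value with the array entry belonging to that row's dim_0 group. The sketch below illustrates the difference on a cut-down version of the question's frame; the column names "a" and "b" and the names naive, wanted and paired are assumptions made for the example, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Cut-down copy of the question's data: dim_0 in 0..2, dim_1 in 0..2.
index = pd.MultiIndex.from_product([range(3), range(3)], names=["dim_0", "dim_1"])
df = pd.DataFrame(
    [
        [40.54, 23.40], [175.89, 20.24], [0.00, 0.00],
        [21.38, 24.00], [130.29, 18.40], [0.00, 0.00],
        [6.30, 25.70], [73.45, 21.49], [0.00, 0.00],
    ],
    index=index,
    columns=["a", "b"],  # hypothetical column names
)
arr = np.array([1, 0, 0])  # wanted dim_1 label for dim_0 = 0, 1, 2

d0 = df.index.get_level_values("dim_0")
d1 = df.index.get_level_values("dim_1")

# Membership test only: keeps every row whose dim_1 is 0 or 1, in all groups.
naive = df[d1.isin(arr)]

# Per-group test: look up the wanted dim_1 for each row's dim_0, then compare.
wanted = pd.Series(arr, index=df.index.levels[0])
mask = d1.to_numpy() == d0.map(wanted).to_numpy()
paired = df[mask]

print(len(naive), len(paired))
print(paired)
```

Because the mask is computed row by row from the index labels, this variant should not depend on the rows being sorted or on every (dim_0, dim_1) combination being present.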

Answered by Roman_N on October 27, 2020

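I'd go with advanced indexing using NumPy (this assumes the rows are sorted and every dim_0 group contains every dim_1 value):

l = [1, 0, 0, 1, 2, 0, 1, 2]

i, j = df.index.levels
# Flat row position of (dim_0, dim_1) is dim_0 * (number of dim_1 values) + dim_1
ix = np.array(l) + np.arange(i.max() + 1) * (j.max() + 1)
pd.DataFrame(df.to_numpy()[ix])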

       0      1     2     3     4     5     6
0  175.89  20.24  7.78  1.55  1.45  0.80  1.44
1   21.38  24.00  5.90  1.60  2.55  1.50  2.36
2    6.30  25.70  5.60  1.70  2.16  1.16  1.87
3  125.89  19.10  7.52  1.43  1.44  0.78  1.40
4    0.00   0.00  0.00  0.00  0.00  0.00  0.00
5  180.59  16.40  3.80  1.10  4.63  3.86  5.71
6  103.19  19.02  8.70  1.53  1.48  0.76  1.38
7  208.04  20.39  6.82  1.35  1.47  0.95  1.67
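
The index arithmetic above can be checked on a small example. This is only a sketch under the same assumptions (rows sorted, every dim_0 group holding every dim_1 value), using a cut-down version of the question's frame with made-up column names "a" and "b". It takes the group sizes from the index levels rather than from max() + 1, so the array entries are treated as positions within the dim_1 level.

```python
import numpy as np
import pandas as pd

# Cut-down copy of the question's data: dim_0 in 0..2, dim_1 in 0..2.
index = pd.MultiIndex.from_product([range(3), range(3)], names=["dim_0", "dim_1"])
df = pd.DataFrame(
    [
        [40.54, 23.40], [175.89, 20.24], [0.00, 0.00],
        [21.38, 24.00], [130.29, 18.40], [0.00, 0.00],
        [6.30, 25.70], [73.45, 21.49], [0.00, 0.00],
    ],
    index=index,
    columns=["a", "b"],  # hypothetical column names
)
arr = np.array([1, 0, 0])  # wanted dim_1 position for dim_0 = 0, 1, 2

n_outer = df.index.levels[0].size  # number of dim_0 groups
n_inner = df.index.levels[1].size  # rows per dim_0 group

# The flat-index trick is only valid for a fully populated, sorted frame.
assert len(df) == n_outer * n_inner

# Row position of (i, j) is i * n_inner + j.
ix = np.arange(n_outer) * n_inner + arr

result = pd.DataFrame(df.to_numpy()[ix], columns=df.columns)
print(ix)
print(result)
```

Going through to_numpy() discards the index, which is why the result comes back with a fresh 0..n-1 row index.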

Answered by yatu on October 27, 2020
