Data Science Asked by Donum on January 14, 2021
(I’ve edited the first column name in the labels_df for clarity)
I have two DataFrames, train_df and labels_df. train_df holds integers that map to attribute names in labels_df. For each number in a given train_df cell, I would like to look up the corresponding attribute name in labels_df and return it in an adjacent cell.
I’ve tried variations of the function below but fear I am wayyy off:
def my_mapping(df1, df2):
    tags = df1['attribute_ids']
    for i in tags.iteritems():
        df1['new_col'] = df2.iloc[i]
    return df1
The data are originally from two csv files:
train.csv
labels.csv
I tried this from @Danny :
sample_train_df['attribute_ids'].apply(lambda x: [sample_labels_df[sample_labels_df['attribute_name'] == i]['attribute_id_num'] for i in x])
*please note – I am running the above code on samples of each DF due to run times on the original DFs.
which returned:
I created my own data.
train.csv
id,attrib
1,1 2 3
2,3 4 5
3,2 3 5
4,1 1 1
labels.csv
attrib_id,attrib_name
1,a
2,b
3,c
4,d
5,e
Read the csv files to create df1 (from train.csv) and df2 (from labels.csv).
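For anyone following along, loading the two frames might look like the sketch below. It assumes pandas is installed and, so that it runs without any files on disk, reads the sample data from in-memory strings; with real files you would pass the paths to pd.read_csv instead.

```python
import io

import pandas as pd

# Inline copies of the sample data above (an assumption made so the sketch is
# self-contained); with real files use pd.read_csv("train.csv") and so on.
train_csv = io.StringIO("id,attrib\n1,1 2 3\n2,3 4 5\n3,2 3 5\n4,1 1 1\n")
labels_csv = io.StringIO("attrib_id,attrib_name\n1,a\n2,b\n3,c\n4,d\n5,e\n")

df1 = pd.read_csv(train_csv)   # 'attrib' is read as strings such as "1 2 3"
df2 = pd.read_csv(labels_csv)  # 'attrib_id' is read as integers
```

Note that the attrib column comes in as strings, because each cell contains several space-separated numbers.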
After that, use:
def get_name(x):
    result = []
    for t in x.split(' '):
        result.append(df2[df2['attrib_id'] == int(t)]['attrib_name'].values[0])
    return result

df1['attrib'] = df1['attrib'].apply(get_name)
This will result in df1 looking like:
   id     attrib
0   1  [a, b, c]
1   2  [c, d, e]
2   3  [b, c, e]
3   4  [a, a, a]
I guess you were doing the same thing when you tried the suggestion from @Danny. The important part is to convert each string to an integer before comparing it with attrib_id.
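Since run time on the full DataFrames was a concern, one possible variation is to build the id-to-name lookup once as a dict rather than filtering df2 for every number. This is only a sketch on the sample data; id_to_name, get_names and the attrib_names column are names made up for illustration.

```python
import pandas as pd

# Sample data from this answer, rebuilt inline so the sketch is self-contained.
df1 = pd.DataFrame({"id": [1, 2, 3, 4],
                    "attrib": ["1 2 3", "3 4 5", "2 3 5", "1 1 1"]})
df2 = pd.DataFrame({"attrib_id": [1, 2, 3, 4, 5],
                    "attrib_name": ["a", "b", "c", "d", "e"]})

# Build the lookup once: {1: 'a', 2: 'b', ...}
id_to_name = dict(zip(df2["attrib_id"], df2["attrib_name"]))

def get_names(cell):
    # Split the cell on whitespace and convert each token to int before
    # the lookup; an id missing from df2 would raise KeyError.
    return [id_to_name[int(t)] for t in cell.split()]

# Hypothetical new column, so the original 'attrib' strings are kept.
df1["attrib_names"] = df1["attrib"].apply(get_names)
```

Each number then costs a single dict lookup instead of a scan over df2, which should matter more as labels.csv grows.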
Answered by shivam shah on January 14, 2021