Stack Overflow Asked by DoubleDouble on July 22, 2020
I have two arrays of different sizes, and I am trying to overwrite some values in the first array with values from the second array wherever the "keys" match. My actual problem may have many, many rows, and I have already determined that this step is currently the bottleneck in my program.
edit: I failed to recognize that there may be duplicate values in a1, which should stay duplicated. I added one such example to the np.array examples.
example:
import numpy as np
# first two columns are 'keys', overwrite the 3rd column in a1 with the 3rd column from a2
# some values may be missing from a2. Those should keep the value in a1
a1 = np.array([[ 0.0, 2.0, 10.0 ],
[ 0.0, 2.0, 10.0 ],
[ 0.0, 3.0, 10.0 ],
[ 1.0, 3.0, 10.0 ],
[ 1.0, 13.0, 10.0 ],
[ 2.0, 2.0, 10.0 ],
[ 2.0, 5.0, 10.0 ]])
a2 = np.array([[ 0.0, 2.0, 0.0 ],
[ 0.0, 3.0, 0.713 ],
[ 1.0, 3.0, 0.713 ],
[ 1.0, 13.0, 1.0 ],
[ 2.0, 2.0, 0.0 ]])
# wanted result:
np.array([[ 0.0, 2.0, 0.0 ],
[ 0.0, 2.0, 0.0 ],
[ 0.0, 3.0, 0.713 ],
[ 1.0, 3.0, 0.713 ],
[ 1.0, 13.0, 1.0 ],
[ 2.0, 2.0, 0.0 ],
[ 2.0, 5.0, 10.0 ]])
When I do this by brute force, I simply take each row in a2 and loop through each row in a1 to replace values on matches, but is there a way to do this that runs more efficiently? Some way to vectorize the operation on at least one of the loops? My actual case involves many rows in both arrays and this takes a looooong time.
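For reference, the brute-force approach described above could look roughly like the sketch below. The function name `overwrite_brute_force` and the small arrays are made up for illustration; the point is only to show the double loop that the answers try to avoid.

```python
import numpy as np

def overwrite_brute_force(a1, a2):
    """Baseline double loop: len(a1) * len(a2) key comparisons."""
    out = a1.copy()
    for key0, key1, value in a2:
        for row in out:  # each row is a view into out, so the assignment writes through
            if row[0] == key0 and row[1] == key1:
                row[2] = value
    return out

# Small illustrative arrays (not the full example from the question)
a1 = np.array([[0.0, 2.0, 10.0], [0.0, 2.0, 10.0], [2.0, 5.0, 10.0]])
a2 = np.array([[0.0, 2.0, 0.5]])
result = overwrite_brute_force(a1, a2)
```

Both duplicated rows of `a1` with key (0, 2) receive the value from `a2`, and the row with no match keeps its original value.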
If column three is getting updated and you want to use pandas:
import numpy as np
import pandas as pd
a1 = np.array([[ 0.0, 2.0, 10.0 ],
[ 0.0, 2.0, 10.0 ],
[ 0.0, 3.0, 10.0 ],
[ 1.0, 3.0, 10.0 ],
[ 1.0, 13.0, 10.0 ],
[ 2.0, 2.0, 10.0 ],
[ 2.0, 5.0, 10.0 ]])
a2 = np.array([[ 0.0, 2.0, 0.0 ],
[ 0.0, 3.0, 0.713 ],
[ 1.0, 3.0, 0.713 ],
[ 1.0, 13.0, 1.0 ],
[ 2.0, 2.0, 0.0 ]])
d1 = pd.DataFrame(a1)
d2 = pd.DataFrame(a2)
d3 = d2.set_index([0,1])[[2]].combine_first(d1.set_index([0,1])[[2]]).reset_index().to_numpy()
d3
Output:
array([[ 0. , 2. , 0. ],
[ 0. , 2. , 0. ],
[ 0. , 3. , 0.713],
[ 1. , 3. , 0.713],
[ 1. , 13. , 1. ],
[ 2. , 2. , 0. ],
[ 2. , 5. , 10. ]])
Answered by Scott Boston on July 22, 2020
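Concatenate a2 and a1, then keep only the first row for each unique pair of values in the first 2 columns. Because a2 comes first, its rows win over the rows of a1 that have the same key.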
a_all = np.r_[a2, a1]
a_all = a_all[np.unique(a_all[:, :2], axis=0, return_index=True)[1]]
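A quick check of this approach against the arrays from the question may be useful, since the question's edit asks for duplicated keys in a1 to stay duplicated. The sketch below only reruns the two lines above and looks at the shape; the name `first_idx` is introduced here for readability.

```python
import numpy as np

a1 = np.array([[0.0, 2.0, 10.0], [0.0, 2.0, 10.0], [0.0, 3.0, 10.0],
               [1.0, 3.0, 10.0], [1.0, 13.0, 10.0], [2.0, 2.0, 10.0],
               [2.0, 5.0, 10.0]])
a2 = np.array([[0.0, 2.0, 0.0], [0.0, 3.0, 0.713], [1.0, 3.0, 0.713],
               [1.0, 13.0, 1.0], [2.0, 2.0, 0.0]])

a_all = np.r_[a2, a1]
# return_index gives the position of the first occurrence of each unique key
_, first_idx = np.unique(a_all[:, :2], axis=0, return_index=True)
result = a_all[first_idx]
print(result.shape)  # (6, 3): one row per unique key
```

The result has six rows rather than the seven in the wanted output, because the repeated key (0, 2) in a1 is collapsed to a single row. The approach fits when keys in a1 are unique.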
Answered by V. Ayrat on July 22, 2020
The solution has 2 parts. First, you need to identify which keys in a1 aren't in a2, and then you need to figure out which row of a2 each row of a1 is associated with.
Here's my solution:
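# equiv[i, j] is True when row i of a1 and row j of a2 have the same two keys
equiv = np.all(np.equal(a1[:,None,:2],a2[None,:,:2]),-1)
# mask[i] is True when row i of a1 has a match somewhere in a2
mask = np.any(equiv,-1)
# ind[i] is the first matching row of a2 for row i of a1 (0 if there is none)
ind = np.argmax(equiv,-1)
a1[mask,2] = a2[ind[mask],2]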
I start by broadcasting both arrays to conforming dimensions and computing the equivalence matrix, which tells me for each pair of rows from a1 and a2 whether both key elements are equal.
Then it's easy to build a boolean mask of the rows of a1 that have a match in a2 (the remaining rows are the ones not included in a2). We can also find, for each row of a1, the index of the matching row of a2.
Finally, you assign to the last column of every matched row of a1 the value from the associated row of a2.
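The steps described above can be sketched as a self-contained function. The name `overwrite_by_key` is hypothetical, and this version takes the argmax along the a2 axis so that there is one index per row of a1. It works on a copy rather than modifying a1 in place.

```python
import numpy as np

def overwrite_by_key(a1, a2):
    # equiv[i, j] is True when row i of a1 and row j of a2 share both key columns
    equiv = (a1[:, None, :2] == a2[None, :, :2]).all(axis=-1)
    has_match = equiv.any(axis=1)       # shape (len(a1),)
    first_match = equiv.argmax(axis=1)  # shape (len(a1),), 0 where there is no match
    out = a1.copy()
    out[has_match, 2] = a2[first_match[has_match], 2]
    return out

a1 = np.array([[0.0, 2.0, 10.0], [0.0, 2.0, 10.0], [0.0, 3.0, 10.0],
               [1.0, 3.0, 10.0], [1.0, 13.0, 10.0], [2.0, 2.0, 10.0],
               [2.0, 5.0, 10.0]])
a2 = np.array([[0.0, 2.0, 0.0], [0.0, 3.0, 0.713], [1.0, 3.0, 0.713],
               [1.0, 13.0, 1.0], [2.0, 2.0, 0.0]])
result = overwrite_by_key(a1, a2)
```

Note that the comparison builds an intermediate boolean array of shape (len(a1), len(a2), 2), so memory use grows with the product of the two row counts. For very large inputs that may become the limiting factor.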
Answered by asimoneau on July 22, 2020
Would you consider other packages like Pandas?
import pandas as pd
d2 = pd.DataFrame(a2).set_index([0,1])
d1 = pd.DataFrame(a1).set_index([0,1])
d1.update(d2)
d1.reset_index().values
Output:
array([[ 0. , 2. , 0. ],
[ 0. , 2. , 0. ],
[ 0. , 3. , 0.713],
[ 1. , 3. , 0.713],
[ 1. , 13. , 1. ],
[ 2. , 2. , 0. ],
[ 2. , 5. , 10. ]])
Answered by Quang Hoang on July 22, 2020
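Another pandas route, not taken from any of the answers above, is a left merge on the key columns. The sketch below assumes the keys in a2 are unique and that a2 contains no NaN values (a NaN in a2 would be treated as "missing" and the a1 value kept). The column names `k0`, `k1`, `old` and `new` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Small illustrative arrays (not the full example from the question)
a1 = np.array([[0.0, 2.0, 10.0], [0.0, 2.0, 10.0], [2.0, 5.0, 10.0]])
a2 = np.array([[0.0, 2.0, 0.5]])

d1 = pd.DataFrame(a1, columns=["k0", "k1", "old"])
d2 = pd.DataFrame(a2, columns=["k0", "k1", "new"])

# A left merge keeps every row of d1, duplicates included, in its original order
merged = d1.merge(d2, on=["k0", "k1"], how="left")
# Take the a2 value where a match exists, otherwise fall back to the a1 value
merged["old"] = merged["new"].fillna(merged["old"])
result = merged[["k0", "k1", "old"]].to_numpy()
```

Because the merge is keyed rather than pairwise, it avoids building a len(a1) by len(a2) comparison matrix.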