# Efficiently copying values from one ndarray to another on unequal sized arrays

Stack Overflow Asked by DoubleDouble on July 22, 2020

I have two arrays of different sizes, and I want to overwrite some values in the first array with values from the second array wherever the "keys" match. My real data has many, many rows, and I have already determined that this step is the bottleneck in my program.

Edit: I initially overlooked that a1 may contain duplicate keys, which should stay duplicated. I added one such row to the np.array examples.

example:

import numpy as np

# first two columns are 'keys', overwrite the 3rd column in a1 with the 3rd column from a2
# some values may be missing from a2. Those should keep the value in a1

a1 = np.array([[ 0.0,  2.0,  10.0 ],
               [ 0.0,  2.0,  10.0 ],
               [ 0.0,  3.0,  10.0 ],
               [ 1.0,  3.0,  10.0 ],
               [ 1.0, 13.0,  10.0 ],
               [ 2.0,  2.0,  10.0 ],
               [ 2.0,  5.0,  10.0 ]])

a2 = np.array([[ 0.0,  2.0,  0.0   ],
               [ 0.0,  3.0,  0.713 ],
               [ 1.0,  3.0,  0.713 ],
               [ 1.0, 13.0,  1.0   ],
               [ 2.0,  2.0,  0.0   ]])

# wanted result:
np.array([[ 0.0,  2.0,  0.0   ],
          [ 0.0,  2.0,  0.0   ],
          [ 0.0,  3.0,  0.713 ],
          [ 1.0,  3.0,  0.713 ],
          [ 1.0, 13.0,  1.0   ],
          [ 2.0,  2.0,  0.0   ],
          [ 2.0,  5.0,  10.0  ]])


When I do this by brute force, I take each row in a2 and loop through every row in a1, replacing values on matches. Is there a way to do this more efficiently, perhaps by vectorizing at least one of the loops? My actual case involves many rows in both arrays, and this takes a very long time.
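
For reference, the brute-force approach described above might look roughly like the following sketch. The function name overwrite_brute_force is made up for illustration, and working on a copy of a1 is an assumption rather than part of the original description. It can serve as a baseline for checking faster approaches against.

```python
import numpy as np

def overwrite_brute_force(a1, a2):
    """Baseline: for every row of a2, scan all rows of a1 and overwrite on a key match."""
    out = a1.copy()                      # assumption: leave the caller's array untouched
    for row2 in a2:
        for row1 in out:                 # each row1 is a view into out
            if row1[0] == row2[0] and row1[1] == row2[1]:
                row1[2] = row2[2]        # writes through the view into out
    return out

a1 = np.array([[0.0,  2.0, 10.0],
               [0.0,  2.0, 10.0],
               [0.0,  3.0, 10.0],
               [1.0,  3.0, 10.0],
               [1.0, 13.0, 10.0],
               [2.0,  2.0, 10.0],
               [2.0,  5.0, 10.0]])

a2 = np.array([[0.0,  2.0, 0.0],
               [0.0,  3.0, 0.713],
               [1.0,  3.0, 0.713],
               [1.0, 13.0, 1.0],
               [2.0,  2.0, 0.0]])

result = overwrite_brute_force(a1, a2)
```

The cost grows with len(a1) * len(a2) Python-level iterations, which is why it becomes slow for large inputs.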

If only the third column is being updated and you are willing to use pandas:

import numpy as np
import pandas as pd

a1 = np.array([[ 0.0,  2.0,  10.0 ],
               [ 0.0,  2.0,  10.0 ],
               [ 0.0,  3.0,  10.0 ],
               [ 1.0,  3.0,  10.0 ],
               [ 1.0, 13.0,  10.0 ],
               [ 2.0,  2.0,  10.0 ],
               [ 2.0,  5.0,  10.0 ]])

a2 = np.array([[ 0.0,  2.0,  0.0   ],
               [ 0.0,  3.0,  0.713 ],
               [ 1.0,  3.0,  0.713 ],
               [ 1.0, 13.0,  1.0   ],
               [ 2.0,  2.0,  0.0   ]])

d1 = pd.DataFrame(a1)

d2 = pd.DataFrame(a2)

d3 = d2.set_index([0,1])[[2]].combine_first(d1.set_index([0,1])[[2]]).reset_index().to_numpy()
d3


Output:

array([[ 0.   ,  2.   ,  0.   ],
       [ 0.   ,  2.   ,  0.   ],
       [ 0.   ,  3.   ,  0.713],
       [ 1.   ,  3.   ,  0.713],
       [ 1.   , 13.   ,  1.   ],
       [ 2.   ,  2.   ,  0.   ],
       [ 2.   ,  5.   , 10.   ]])


Answered by Scott Boston on July 22, 2020

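Concatenate a2 and a1, then keep only the first row for each unique key (the first two columns). Because a2 comes first, its values take precedence. Note that the result is sorted by key and contains each key only once, so keys that are duplicated in a1 are collapsed into a single row.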

a_all = np.r_[a2, a1]
a_all = a_all[np.unique(a_all[:, :2], axis=0, return_index=True)[1]]
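
As a quick check, here is a self-contained run of this approach on the arrays from the question (the variable names first and result are just illustrative). It relies on np.unique with return_index=True reporting the index of the first occurrence of each unique row, which is why stacking a2 ahead of a1 lets the a2 values win.

```python
import numpy as np

a1 = np.array([[0.0,  2.0, 10.0],
               [0.0,  2.0, 10.0],
               [0.0,  3.0, 10.0],
               [1.0,  3.0, 10.0],
               [1.0, 13.0, 10.0],
               [2.0,  2.0, 10.0],
               [2.0,  5.0, 10.0]])

a2 = np.array([[0.0,  2.0, 0.0],
               [0.0,  3.0, 0.713],
               [1.0,  3.0, 0.713],
               [1.0, 13.0, 1.0],
               [2.0,  2.0, 0.0]])

a_all = np.r_[a2, a1]                    # a2 rows first, so they are seen first
# Index of the first row carrying each unique (col0, col1) key
_, first = np.unique(a_all[:, :2], axis=0, return_index=True)
result = a_all[first]

print(result.shape)                      # one row per unique key
```

On this data the result has six rows rather than seven: the key (0.0, 2.0), which appears twice in a1, shows up only once.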


Answered by V. Ayrat on July 22, 2020

The solution has 2 parts. First, you need to identify which keys in a1 aren't in a2, and then you need to figure out which row of a2 each row of a1 is associated with.

Here's my solution:

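equiv = np.all(np.equal(a1[:,None,:2],a2[None,:,:2]),-1)  # equiv[i, j]: keys of a1[i] and a2[j] match
mask = np.any(equiv,1)           # rows of a1 whose key appears in a2
ind = np.argmax(equiv,1)[mask]   # index of the matching a2 row for each of those rows
a1[mask,-1] = a2[ind,-1]         # overwrite the last column of a1 in place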



I start by broadcasting both arrays to compatible shapes and computing an equivalence matrix: entry (i, j) is True when row i of a1 and row j of a2 agree on both key columns.

From that matrix it is easy to build a boolean mask marking which rows of a1 have a match in a2 (the remaining rows keep their original value). The index of the matching row of a2 can be found for each row of a1 as well.

Finally, every value in the last column of a1 that has a counterpart in a2 is overwritten with the corresponding value from a2.
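
The steps above can be put together as a self-contained sketch. The function name overwrite_by_key and the chunk parameter are assumptions added for illustration, not part of the answer: since the full equivalence matrix has len(a1) * len(a2) entries, this version processes a1 in blocks of rows to keep memory bounded. It also works on a copy and assumes a2 has at least one row.

```python
import numpy as np

def overwrite_by_key(a1, a2, chunk=1024):
    """Copy a1, replacing its last column with a2's wherever the first two columns match."""
    out = a1.copy()
    for start in range(0, len(out), chunk):
        block = out[start:start + chunk]          # basic slice: a view into out
        # equiv[i, j] is True when block row i and a2 row j share both key columns
        equiv = np.all(block[:, None, :2] == a2[None, :, :2], axis=-1)
        mask = np.any(equiv, axis=1)              # block rows that have a match in a2
        ind = np.argmax(equiv, axis=1)[mask]      # first matching a2 row for those rows
        block[mask, -1] = a2[ind, -1]             # writes through the view into out
    return out

a1 = np.array([[0.0,  2.0, 10.0],
               [0.0,  2.0, 10.0],
               [0.0,  3.0, 10.0],
               [1.0,  3.0, 10.0],
               [1.0, 13.0, 10.0],
               [2.0,  2.0, 10.0],
               [2.0,  5.0, 10.0]])

a2 = np.array([[0.0,  2.0, 0.0],
               [0.0,  3.0, 0.713],
               [1.0,  3.0, 0.713],
               [1.0, 13.0, 1.0],
               [2.0,  2.0, 0.0]])

result = overwrite_by_key(a1, a2)
```

Because every row of a1 is looked up independently, duplicate keys in a1 stay duplicated and the original row order is preserved. The keys are compared with exact floating-point equality, which is fine when they are copied values as in the example but may need care if they are computed.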

Answered by asimoneau on July 22, 2020

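Would you consider using another package such as pandas? DataFrame.update modifies d1 in place, overwriting its values with the non-NA values from d2 wherever the index matches: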

import pandas as pd

d2 = pd.DataFrame(a2).set_index([0,1])
d1 = pd.DataFrame(a1).set_index([0,1])

d1.update(d2)
d1.reset_index().values


Output:

array([[ 0.   ,  2.   ,  0.   ],
       [ 0.   ,  2.   ,  0.   ],
       [ 0.   ,  3.   ,  0.713],
       [ 1.   ,  3.   ,  0.713],
       [ 1.   , 13.   ,  1.   ],
       [ 2.   ,  2.   ,  0.   ],
       [ 2.   ,  5.   , 10.   ]])


Answered by Quang Hoang on July 22, 2020
