# The fastest way to find one unique value when all other values are the same

Code Review Asked by Jacques on January 4, 2022

I have completed this question and I am wondering what is the fastest way to solve it.

The question is "There is an array with some numbers. All numbers are equal except for one. Try to find it!"

Example:

find_uniq([ 1, 1, 1, 2, 1, 1 ]) == 2
find_uniq([ 0, 0, 0.55, 0, 0 ]) == 0.55


I came up with the solution:

from collections import Counter

def find_uniq(arr):
nums = list(Counter(arr).items())
data = [i for i in nums if i == 1]
return data


I decided on using Counter because I felt comfortable using it but when looking at others answers some use sets and others use counter as well.

I am wondering is my code sufficient and which method to solving this question would lead to the fastest time complexity?

Why do n comparisons when you only need ~n/2? We can compare every pair of elements until we find a non-matching pair, then "short-circuit" and return whichever element is unique.

def find_uniq(arr):
common = arr
if common not in arr[1:3]:
return common
for a, b in zip(arr[1::2], arr[2::2]):
if a != b:
if a == common:
return b
else:
return a
return arr[-1]


A further improvement would be to use iter to avoid copies of arr being made in the zip statement.

def find_uniq(arr):
iterator = iter(arr)
common = next(iterator)
if common not in arr[1:3]:
return common
for a, b in zip(iterator, iterator):
if a != b:
if a == common:
return b
else:
return a
return arr[-1]


Answered by Kyle G on January 4, 2022

First, check that there are, at least, 3 elements otherwise this is undefined!

Personally, I would check the first and second elements:

1. If different: one of them is the one you are looking for. Compare with third element.
2. If equal: iterate over all elements until you find it.

This seems to be the most optimal solution:

from collections.abc import Iterable

def unique_different(iterable):
assert isinstance(iterable, Iterable)
assert len(iterable) > 2
if iterable != iterable:
return iterable if iterable == iterable else iterable
else
for element in iterable[2:]:
if element != iterable:
return element
$$$$


Answered by Oussama Ennafii on January 4, 2022

This is my first time posting here, so please let me know if there are any conventions I'm missing.

Here is my solution, which doesn't need to traverse the entire array except by using the built-in sum() function:

def find_uniq(listToSearch):
if len(listToSearch) < 3:
return 'Cannot have one unique value unless there are at least three values.'

#out of three values, minimum of two must be the same
if listToSearch == listToSearch:
commonValue = listToSearch
elif listToSearch == listToSearch:
commonValue = listToSearch
elif listToSearch == listToSearch:
commonValue = listToSearch
else:
return 'Array has more than one unique value'

numberOfCommonItems = len(listToSearch) - 1;
uniqueValue = sum(listToSearch) - numberOfCommonItems * commonValue
return uniqueValue


These are the test cases I've tried:

find_uniq([ 1, 1, 1, 2, 1, 1 ])
find_uniq([ 0, 0, 0.55, 0, 0 ])
find_uniq([ 0, 0, -0.55, 0, 0 ])
find_uniq[ 1, 1.0, 1, 2, 1, 1 ])



And these are the outputs:

2
0.55
-0.55
2.0



This solution is O(n) as it only has to perform one extra addition per extra element of the array. Besides that, assuming the data format is valid, there are a maximum of four if statements, one multiplication operation and one subtraction operation.

Answered by Mehmet on January 4, 2022

Benchmarks!

Benchmarks for lists with a thousand or a million elements, with the unique element in the middle of the array to reflect the "typical"/"average" case. The results are times, so lower=faster.

n=1000
0.90 find_uniq_Jacques
1.18 find_uniq_tinstaafl_1
0.59 find_uniq_tinstaafl_2
0.88 find_uniq_GZ0_1
0.14 find_uniq_GZ0_2
0.88 find_uniq_Peilonrayz
0.22 find_uniq_RootTwo
0.26 find_uniq_HeapOverflow_1
0.28 find_uniq_HeapOverflow_2
0.26 find_uniq_HeapOverflow_3
0.09 find_uniq_HeapOverFlow_Codewars
0.06 find_uniq_HeapOverflow_GZ0
0.57 unique_different_ethiy
0.28 find_uniq_KyleG_1
0.25 find_uniq_KyleG_2

n=1000000
0.94 find_uniq_Jacques
1.36 find_uniq_tinstaafl_1
0.68 find_uniq_tinstaafl_2
0.99 find_uniq_GZ0_1
0.19 find_uniq_GZ0_2
0.98 find_uniq_Peilonrayz
0.19 find_uniq_RootTwo
0.23 find_uniq_HeapOverflow_1
0.26 find_uniq_HeapOverflow_2
0.25 find_uniq_HeapOverflow_3
0.09 find_uniq_HeapOverFlow_Codewars
0.04 find_uniq_HeapOverflow_GZ0
0.57 unique_different_ethiy
0.28 find_uniq_KyleG_1
0.22 find_uniq_KyleG_2


Done with Python 3.8.1 32 bit on Windows 10 64 bit.

Benchmark code:

from timeit import timeit
from collections import Counter
from itertools import groupby

solutions = []
def register(solution):
solutions.append(solution)
return solution

@register
def find_uniq_Jacques(arr):
nums = list(Counter(arr).items())
data = [i for i in nums if i == 1]
return data

@register
def find_uniq_tinstaafl_1(arr):
for i in range(len(arr)-1):
if arr[i] != arr[i+1]:
if i == 0 and arr[i] != arr[i + 2]:
return arr[i]
return arr[i + 1]

@register
def find_uniq_tinstaafl_2(arr):
for i in range(0,len(arr) - 1, 2):
if arr[i] != arr[i+1]:
if i == 0:
if arr[i] != arr[i + 2]:
return arr[i]
return arr[i + 1]
else:
if arr[i] != arr[i-1]:
return arr[i]
return arr[i + 1]
return arr[-1]

@register
def find_uniq_GZ0_1(arr):
return next(k for k, freq in Counter(arr).items() if freq == 1)

@register
def find_uniq_GZ0_2(arr):
group_iter = groupby(arr)
k1, g1 = next(group_iter)
c1 = len(list(g1))
k2, g2 = next(group_iter)
if c1 > 1:
# Group g1 has more than one element
return k2
try:
# Group g2 has more than one element
next(g2)
next(g2)
return k1
except StopIteration:
# Both g1 and g2 has one element
return k2 if next(group_iter) == k1 else k1

@register
def find_uniq_Peilonrayz(arr):
return Counter(arr).most_common()[-1]

@register
def find_uniq_RootTwo(arr):
a, b = set(arr)
return a if arr[:3].count(a) < 2 else b

@register
def find_uniq_HeapOverflow_1(arr):
a = arr
if a not in arr[1:3]:
return a
for b in arr:
if b != a:
return b

@register
def find_uniq_HeapOverflow_2(arr):
dupe = sorted(arr[:3])
for x in arr:
if x != dupe:
return x

@register
def find_uniq_HeapOverflow_3(arr):
a = arr
for b in arr:
if b != a:
return b if a in arr[1:3] else a

@register
def find_uniq_HeapOverFlow_Codewars(arr):
arr.sort()
return arr[-(arr == arr)]

@register
def find_uniq_HeapOverflow_GZ0(arr):
group_iter = groupby(arr)
k1, _ = next(group_iter)
k2, g2 = next(group_iter)
next(g2)
return k1 if k2 in g2 else k2

@register
def unique_different_ethiy(iterable):
# assert isinstance(iterable, Iterable)
# assert len(iterable) > 2
if iterable != iterable:
return iterable if iterable == iterable else iterable
else:
for element in iterable[2:]:
if element != iterable:
return element

@register
def find_uniq_KyleG_1(arr):
common = arr
if common not in arr[1:3]:
return common
for a, b in zip(arr[1::2], arr[2::2]):
if a != b:
if a == common:
return b
else:
return a
return arr[-1]

@register
def find_uniq_KyleG_2(arr):
iterator = iter(arr)
common = next(iterator)
if common not in arr[1:3]:
return common
for a, b in zip(iterator, iterator):
if a != b:
if a == common:
return b
else:
return a
return arr[-1]

# Run the benchmarks
for e in 3, 6:
n = 10**e
number = 10**(7 - e)  # fewer number of runs for larger n
print(f'{n=}')
arr =  * n
arr[n // 2] = 1

# Repeat round-robin to reduce effects of CPU speed changes etc
timeses = [[] for _ in solutions]
for i in range(20):
for solution, times in zip(solutions, timeses):
arrs = iter([arr[:] for _ in range(number)])
t = timeit(lambda: solution(next(arrs)), number=number)
times.append(t)
print(i, end=' ')
print()
for solution, times in zip(solutions, timeses):
print('%.2f' % min(times), solution.__name__)
print()


Answered by Kelly Bundy on January 4, 2022

Another one going only as far as necessary, with O(1) to check whether the first value is the outlier and otherwise simple O(n) to search for the outlier.

def find_uniq(arr):
a = arr
if a not in arr[1:3]:
return a
for b in arr:
if b != a:
return b


Slight variation, getting the duplicate value from the first three and then searching the non-dupe:

def find_uniq(arr):
dupe = sorted(arr[:3])
for x in arr:
if x != dupe:
return x


Another variation, finding a difference pair first:

def find_uniq(arr):
a = arr
for b in arr:
if b != a:
return b if a in arr[1:3] else a


Optimized version of this, also O(n) because, you know, Timsort:

def find_uniq(arr):
arr.sort()
return arr[-(arr == arr)]


Optimized version of GZ0's groupby solution, faster and taking only O(1) space:

def find_uniq(arr):
group_iter = groupby(arr)
k1, _ = next(group_iter)
k2, g2 = next(group_iter)
next(g2)
return k1 if k2 in g2 else k2


Answered by Kelly Bundy on January 4, 2022

No matter how the array is traversed, the distinguished element can occur at the end of the traversal. Therefore, it is necessary to go through the entire array in the worst case and there does not exist an algorithm that can have a better worst-case time complexity than $$n$$. However, in practise, the actual runtime of your implementation can be improved, as well as the average-case time complexity.

Firstly, your solution converts the key-value pairs of Counter(arr) into a list. Assuming the input is well-formed, this conversion is unnecessary since it is sufficient to return the first key that has a corresponding frequency value of 1. The improved implementation is as follows:

def find_uniq(arr):
return next(k for k, freq in Counter(arr).items() if freq == 1)


Secondly, creating a Counter requires going through the entire input array. In most cases, this can be avoided by returning the distinguished element once it is found, as mentioned in the previous answer. This approach improves the average-case time complexity by a constant factor of 2. Note that if the time complexity is described using the $$O(cdot)$$ and $$Theta(cdot)$$ notations there is no difference, since these notations only characterize the asymptotic order of growth of runtime given the input size. More explanations can be found here.

A Python-specific efficient implementation of this improved approach is to use the itertools.groupby function, as shown in the following. It avoids an explicit for-loop in Python, which is typically slower than an implicit-loop-based implementation, such as Counter(arr).

from itertools import groupby

def find_uniq(arr):
group_iter = groupby(arr)
k1, g1 = next(group_iter)
c1 = len(list(g1))
k2, g2 = next(group_iter)
if c1 > 1:
# Group g1 has more than one element
return k2
try:
# Group g2 has more than one element
next(g2)
next(g2)
return k1
except StopIteration:
# Both g1 and g2 has one element
return k2 if next(group_iter) == k1 else k1


Update: @HeapOverflow provides an improved version of this implementation in his answer.

Answered by GZ0 on January 4, 2022

One of things about the solutions presented so far, is they all require iterating over all the elements at least once.

Using an iterative approach allows you to short circuit the loop when the unique item is found. something like this would work:

def find_uniq(arr):
for i in range(len(arr)-1):
if arr[i] != arr[i+1]:
if i == 0 and arr[i] != arr[i + 2]:
return arr[i]
return arr[i + 1]]


Did some thinking and came up with an optimization which improves the time considerably:

def find_uniq(arr):
for i in range(0,len(arr) - 1, 2):
if arr[i] != arr[i+1]:
if i == 0:
if arr[i] != arr[i + 2]:
return arr[i]
return arr[i + 1]
else:
if arr[i] != arr[i-1]:
return arr[i]
return arr[i + 1]
return arr[-1]


The complexity of this in the worst case is O(n) the lengthy of the array - 1.

Answered by user33306 on January 4, 2022

A Counter is basically a "multiset". The question doesn't ask for a count of the numbers, so counting them may be extra overhead. Here's an possible set implementation:

def find_uniq(arr):
a, b = set(arr)
return a if arr[:3].count(a) < 2 else b


Both implementations pass through the list once, so they are O(n) time complexity. Your list comprehension, my .count(a), and @Peilonrays' .most_common() are insignificant for large n.

Answered by RootTwo on January 4, 2022

You can use .most_common to remove the need for the list comprehension. This makes the code easier to read. You will still need to use  as it will return a tuple of the key and the value.

def find_uniq(arr):
return Counter(arr).most_common()[-1]
`

Answered by Peilonrayz on January 4, 2022

## Related Questions

### std::array and std::vector Type Arbitrary Nested Iterable Generator Functions Implementation in C++

1  Asked on February 2, 2021

### 10 Kinds of People

1  Asked on January 29, 2021 by martin-york

### Generating product variants in Ruby

1  Asked on January 29, 2021 by dcangulo

### Structuring code logic for events on laravel controller

2  Asked on January 28, 2021 by grey

### Handling Circular References Without Recursion

2  Asked on January 27, 2021 by kittoes0124

### vector<tuple > to map

1  Asked on January 25, 2021 by valerij

### naming convention for atomic design

0  Asked on January 23, 2021 by irohitbhatia

### An Implementation of Two Dimensional Plane as Monochromic Image Container with std::unique_ptr in C++

1  Asked on January 21, 2021 by jimmyhu

### Cave explorer: using stack

0  Asked on January 21, 2021 by sherloock

### Calculate the free electron concentration and drift speed in a segment of wire

2  Asked on January 18, 2021 by mode77

### List of Happy Numbers in scala

3  Asked on January 18, 2021 by vikrant

### Calculator Program improvements

2  Asked on January 16, 2021 by the

### JavaScript async/await function performance improvement

0  Asked on January 14, 2021 by nmsdvid

### Is this good c++ code for a pin/socket for a node editor?

1  Asked on January 12, 2021 by apoqlite

### Arduino-based darkroom timer

0  Asked on January 12, 2021 by marcellothearcane

### Finding the possible number of equal pairs in an array

1  Asked on January 11, 2021 by alaa-jabre

### Python : adding a columns with count of missing values by row

1  Asked on January 10, 2021 by lcrmorin