# Compare 2 columns in 2 different tables with non-unique values

I want to check whether the values in columnA of table A exist in columnA of table B. However, columnA contains duplicate values in both tables.

You can get the desired result with a LEFT OUTER JOIN as follows:

SELECT DISTINCT
*
FROM
(
SELECT
A.COLUMN_A,
CASE
WHEN B.COLUMN_A IS NOT NULL THEN 'YES'
ELSE 'NO'
END AS "EXISTS IN TABLE_B"
FROM
TABLE_A A
LEFT JOIN TABLE_B B ON ( A.COLUMN_A = B.COLUMN_A )
);
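
To see how the join behaves with duplicates, here is a minimal sketch that runs the same query against an in-memory SQLite database from Python. The sample rows are made up for illustration, and SQLite stands in for Oracle here, so treat it as a sanity check rather than an Oracle test. The `EXISTS` variant is an alternative I am adding for comparison (it is not part of the answer above); it avoids multiplying rows before the `DISTINCT`.

```python
import sqlite3

# The LEFT JOIN query from the answer (an alias is added to the inline view).
LEFT_JOIN_SQL = """
SELECT DISTINCT *
FROM (
    SELECT a.COLUMN_A,
           CASE WHEN b.COLUMN_A IS NOT NULL THEN 'YES' ELSE 'NO' END
               AS "EXISTS IN TABLE_B"
    FROM TABLE_A a
    LEFT JOIN TABLE_B b ON a.COLUMN_A = b.COLUMN_A
) AS t
ORDER BY 1
"""

# Alternative formulation (an assumption, not from the answer): probe TABLE_B
# once per row of TABLE_A instead of joining every matching pair.
EXISTS_SQL = """
SELECT DISTINCT a.COLUMN_A,
       CASE WHEN EXISTS (SELECT 1 FROM TABLE_B b
                         WHERE b.COLUMN_A = a.COLUMN_A)
            THEN 'YES' ELSE 'NO' END AS "EXISTS IN TABLE_B"
FROM TABLE_A a
ORDER BY 1
"""


def make_db():
    """Build two tables with duplicate values (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE TABLE_A (COLUMN_A TEXT)")
    conn.execute("CREATE TABLE TABLE_B (COLUMN_A TEXT)")
    conn.executemany("INSERT INTO TABLE_A VALUES (?)", [(v,) for v in "abbccce"])
    conn.executemany("INSERT INTO TABLE_B VALUES (?)", [(v,) for v in "bccdddde"])
    return conn


def run(conn, sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    conn = make_db()
    for row in run(conn, LEFT_JOIN_SQL):
        print(row)
```

Both queries return one row per distinct value of TABLE_A. Values that exist only in TABLE_B (here `d`) never appear, because the comparison is driven from TABLE_A.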


Cheers!!

Answered by ORA_Dentist on August 18, 2020

I reworked this answer a little and came up with the following (depending on the OP's exact requirements, which aren't entirely clear). A fiddle is available here.

The data is the same (with the addition of a few records). The first query I ran was this:

SELECT
UNIQUE columnA AS col_a,
COUNT(columnA) AS count_a,
'A' AS tab
FROM A
GROUP BY columnA
ORDER BY columnA;


Result:

COL_A   COUNT_A TAB
a         6 A   -- extra records (5) in my fiddle, + 1 = 6
b         2 A
c         3 A
e         1 A


Do the same for table B.

Result:

COL_A   COUNT_A TAB
b         1   B
c         2   B
d         4   B
e         1   B


Perform a UNION ALL on these two queries. It is more efficient than UNION because no duplicate elimination is required. We know that the records are all distinct anyway because of the "artificial" tab field, but AFAIK UNION still has to perform the check.
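
As a rough check of that claim, the sketch below stacks the two per-table counts with both operators and compares the output. It runs on SQLite through Python rather than on Oracle, and the rows are reconstructed from the result tables shown in this answer, not copied from the fiddle, so both are assumptions.

```python
import sqlite3

# Per-table counts stacked on top of each other; {op} is UNION ALL or UNION.
COUNTS_SQL = """
SELECT columnA AS col_a, COUNT(columnA) AS count_a, 'A' AS tab
FROM A GROUP BY columnA
{op}
SELECT columnA AS col_a, COUNT(columnA) AS count_a, 'B' AS tab
FROM B GROUP BY columnA
ORDER BY col_a, tab
"""


def make_db():
    """Rows reconstructed from the counts shown above (an assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE A (columnA TEXT)")
    conn.execute("CREATE TABLE B (columnA TEXT)")
    conn.executemany("INSERT INTO A VALUES (?)", [(v,) for v in "aaaaaabbccce"])
    conn.executemany("INSERT INTO B VALUES (?)", [(v,) for v in "bccdddde"])
    return conn


def stacked_counts(conn, op):
    """Return (col_a, count_a, tab) rows using the given set operator."""
    if op not in ("UNION ALL", "UNION"):
        raise ValueError("op must be 'UNION ALL' or 'UNION'")
    return conn.execute(COUNTS_SQL.format(op=op)).fetchall()


if __name__ == "__main__":
    conn = make_db()
    same = stacked_counts(conn, "UNION ALL") == stacked_counts(conn, "UNION")
    print("UNION ALL and UNION agree:", same)
```

Because the `tab` literal differs between the two branches, no row can be a duplicate of another, so both operators return the same rows and UNION ALL simply skips the de-duplication step.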

Result of UNION ALL - see fiddle for full query:

COL_A   COUNT_A TAB
a   6   A
b   1   B
b   2   A
... results snipped for brevity


Then, using COMMON TABLE EXPRESSIONs (aka CTEs - or the WITH clause), we combine the above result (cte1) with another query cte2 derived from cte1 as follows:

WITH cte1 AS (
SELECT UNIQUE columnA AS col_a, COUNT(columnA) as count_a, 'A' AS tab
FROM A
GROUP BY columnA
-- ORDER BY columnA
UNION ALL
SELECT UNIQUE columnA AS col_a, COUNT(columnA) as count_a, 'B' AS tab
FROM B
GROUP BY columnA
ORDER BY col_a
),
cte2 AS
(
SELECT col_a, COUNT(col_a) AS cnt
FROM cte1
GROUP BY col_a
HAVING COUNT(col_a) = 1   -- only records with columnA unique to either table
ORDER BY col_a
)
SELECT * FROM cte2;


Result:

COL_A   CNT
a     1  <<-- The CNT of 1 does not refer to the number of records with that
d     1  <<-- value in the table - it's merely a reflection of the
<<-- HAVING COUNT(col_a) = 1: only records with columnA unique to
<<-- either table and is shown for clarity


So this gives us a and d, which are indeed the values not shared by tableA and tableB, but it does NOT tell us which table each one is in.

We then join this back to cte1 as follows:

... Rest of query snipped for brevity - see above (or fiddle)

HAVING COUNT(col_a) = 1   --<<< only records with columnA unique to either table
ORDER BY col_a
)
SELECT t1.tab, t1.col_a, t1.count_a, t2.cnt -- cnt not required - equal to 1 anyway.
FROM cte1 t1
JOIN cte2 t2 ON t1.col_a = t2.col_a
ORDER BY tab, col_a;


Which gives the final result:

TAB COL_A   COUNT_A CNT
A     a         6   1
B     d         4   1


which tells us that the value a occurs 6 times in tableA and not at all in tableB, and that the value d occurs 4 times in tableB and not at all in tableA. Again, the CNT field is only there to show how the result was derived. If you want all the values (and their counts) that are shared by both tables, use HAVING... = 2 instead (see the end of the fiddle).

The result for this would be:

TAB COL_A   COUNT_A
A   b   2
B   b   1
A   c   3
B   c   2
A   e   1
B   e   1
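
Since parts of the query are snipped above, here is a sketch of the whole thing end to end, wrapped in Python with an in-memory SQLite database so it can be run as is. The table contents are reconstructed from the result tables in this answer rather than taken from the fiddle, the `shared` flag is a hypothetical helper parameter, and the redundant `UNIQUE` and inner `ORDER BY` clauses are left out. It has not been run on Oracle.

```python
import sqlite3

# cte1: per-table counts; cte2: values seen in exactly N of the two tables.
COMPARE_SQL = """
WITH cte1 AS (
    SELECT columnA AS col_a, COUNT(columnA) AS count_a, 'A' AS tab
    FROM A GROUP BY columnA
    UNION ALL
    SELECT columnA AS col_a, COUNT(columnA) AS count_a, 'B' AS tab
    FROM B GROUP BY columnA
),
cte2 AS (
    SELECT col_a, COUNT(col_a) AS cnt
    FROM cte1
    GROUP BY col_a
    HAVING COUNT(col_a) = ?
)
SELECT t1.tab, t1.col_a, t1.count_a
FROM cte1 t1
JOIN cte2 t2 ON t1.col_a = t2.col_a
ORDER BY t1.col_a, t1.tab
"""


def make_db():
    """Rows reconstructed from the counts shown above (an assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE A (columnA TEXT)")
    conn.execute("CREATE TABLE B (columnA TEXT)")
    conn.executemany("INSERT INTO A VALUES (?)", [(v,) for v in "aaaaaabbccce"])
    conn.executemany("INSERT INTO B VALUES (?)", [(v,) for v in "bccdddde"])
    return conn


def compare_counts(conn, shared):
    """shared=False: values in only one table; shared=True: values in both."""
    tables_required = 2 if shared else 1
    return conn.execute(COMPARE_SQL, (tables_required,)).fetchall()


if __name__ == "__main__":
    conn = make_db()
    print("only in one table:", compare_counts(conn, shared=False))
    print("in both tables:   ", compare_counts(conn, shared=True))
```

With `shared=False` this returns the a and d rows, and with `shared=True` it returns the b, c and e rows for both tables, matching the two result tables above.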


In future, I would strongly recommend that you use a fiddle to give sample data and then indicate the result that you wish to obtain and the logic you used to obtain that result. Since your question isn't very clear, my answer may or may not be "overkill" - if this doesn't respond to your requirements, please let me know. My profile contains some articles on how to ask questions on this forum, you might want to take a look? p.s. welcome to the forum! :-)

Answered by Vérace on August 18, 2020

Maybe this will help: use analytic functions to count the (duplicate) values in each table, and use the results of these operations in a full outer join. Example (see dbfiddle):

Tables

create table A ( columnA )
as
select 'a' from dual union all
select 'b' from dual union all
select 'b' from dual union all
select 'c' from dual union all
select 'c' from dual union all
select 'c' from dual union all
select 'e' from dual -- one 'e' in each table
;

-- one 'a', 'b', 'c' (respectively) missing here
-- 4 x 'd' (does not exist in the other table)
create table B ( columnA)
as
select 'b' from dual union all
select 'c' from dual union all
select 'c' from dual union all
select 'd' from dual union all
select 'd' from dual union all
select 'd' from dual union all
select 'd' from dual union all
select 'e' from dual -- one 'e' in each table
;


Write two queries, one for each table, that look something like ...

-- count the instances of each value
select unique
'table A ->' as tbl, columnA
, count(*) over ( partition by columnA ) as instances_a
from A

-- result
TBL         COLUMNA  INSTANCES_A
table A ->  a        1
table A ->  c        3
table A ->  b        2
table A ->  e        1


If this works, then use the initial queries as "inline views", and write a full outer join, which will allow you to compare the value counts, e.g.

select
*
from (
select unique
'table A ->' as tbl, columnA
, count(*) over ( partition by columnA ) as instances_a
from A
) A_
full join
(
select unique
'table B ->' as tbl, columnA
, count(*) over ( partition by columnA ) as instances_b
from B
) B_
on A_.columnA = B_.columnA
order by A_.columnA, B_.columnA
;

-- result
TBL         COLUMNA  INSTANCES_A  TBL         COLUMNA  INSTANCES_B
table A ->  a        1            NULL        NULL     NULL
table A ->  b        2            table B ->  b        1
table A ->  c        3            table B ->  c        2
table A ->  e        1            table B ->  e        1
NULL        NULL     NULL         table B ->  d        4


You can then modify this query in order to find out whether values are missing from one of the tables, etc. E.g.

select
A_.tbl, A_.columnA, count_a
, B_.tbl, B_.columnA, count_b
, case
when count_a = count_b then 'yes'
else 'no'
end as "column contents equal?"
from (
select unique
'table A ->' as tbl, columnA
, count(*) over ( partition by columnA ) as count_a
from A
) A_
full join
(
select unique
'table B ->' as tbl, columnA
, count(*) over ( partition by columnA ) as count_b
from B
) B_
on A_.columnA = B_.columnA
order by A_.columnA, B_.columnA
;

TBL         COLUMNA  COUNT_A  TBL         COLUMNA  COUNT_B  column contents equal?
table A ->  a        1        NULL        NULL     NULL     no
table A ->  b        2        table B ->  b        1        no
table A ->  c        3        table B ->  c        2        no
table A ->  e        1        table B ->  e        1        yes
NULL        NULL     NULL     table B ->  d        4        no


(Tested w/ Oracle 12c and 18c)
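
If a full outer join is not available, the same per-value comparison can be approximated by stacking both tables and summing marker columns. This is a different technique from the analytic-function approach above, and the sketch below runs on SQLite through Python (it was not part of the Oracle test); the `in_a`, `in_b` and `contents_equal` names are made up for illustration.

```python
import sqlite3

# Tag each row with the table it came from, then count per value.
COMPARE_SQL = """
SELECT columnA,
       SUM(in_a) AS count_a,
       SUM(in_b) AS count_b,
       CASE WHEN SUM(in_a) = SUM(in_b) THEN 'yes' ELSE 'no' END
           AS contents_equal
FROM (
    SELECT columnA, 1 AS in_a, 0 AS in_b FROM A
    UNION ALL
    SELECT columnA, 0 AS in_a, 1 AS in_b FROM B
) AS u
GROUP BY columnA
ORDER BY columnA
"""


def make_db():
    """Same rows as the CREATE TABLE statements in this answer."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE A (columnA TEXT)")
    conn.execute("CREATE TABLE B (columnA TEXT)")
    conn.executemany("INSERT INTO A VALUES (?)", [(v,) for v in "abbccce"])
    conn.executemany("INSERT INTO B VALUES (?)", [(v,) for v in "bccdddde"])
    return conn


def compare(conn):
    """Return (value, count_a, count_b, contents_equal) per distinct value."""
    return conn.execute(COMPARE_SQL).fetchall()


if __name__ == "__main__":
    for row in compare(make_db()):
        print(row)
```

One difference from the full join output: a value missing from a table shows up with a count of 0 instead of NULL, which makes the equality test simpler but hides the distinction between "absent" and "counted as zero".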

Answered by stefan on August 18, 2020
