TransWikia.com

Select max for a tuple in table

Stack Overflow Asked by Maxsteel on January 1, 2022

I have a table that looks like this:

host, job, folder, file, mtime

Folder names are not unique and can be same for a job spread across different hosts. I need to pick folder where the max(mtime for a file) is the max across all the folders by the same name across different hosts. Roughly I need something like this:

Select (host, folder) pair where tuple (host, job, folder) max(max (file mtime))

Example:

1, j1, f1, e1, 2
2, j1, f1, e2, 0
2, j1, f1, e9, 3
3, j1, f1, e3, 2
1, j2, f2, e4, 3
2, j2, f2, e5, 4
3, j2, f2, e6, 5
1, j3, f3, e7, 6
2, j3, f3, e8, 7

result would be:

2, j1, f1, e9, 3
3, j2, f2, e6, 5
2, j3, f3, e8, 7

The table is huge, so I trying to find out best possible way to do this. Thanks

2 Answers

A window function like ROW_NUMBER() should provide the best performance:

SELECT host, job, folder, file, mtime
FROM (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY folder, job ORDER BY mtime DESC) rn
  FROM tablename
) 
WHERE rn = 1

See the demo.
Results:

| host | job | folder | file | mtime |
| ---- | --- | ------ | ---- | ----- |
| 2    | j1  | f1     | e9   | 3     |
| 3    | j2  | f2     | e6   | 5     |
| 2    | j3  | f3     | e8   | 7     |

Answered by forpas on January 1, 2022

You can filter with a subquery:

select t.*
from mytable t
where t.mtime = (
    select max(t1.mtime) from mytable t1 where t1.folder = t.folder and t1.job = t.job
)

For performance, consider an index on (folder, job, mtime).

You did not specify how you want to handle potential top ties (rows that relate to the same folder and job with the maximum mtime): this query does return them.

Answered by GMB on January 1, 2022

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP