
Why does coalescing make this query quicker?

Database Administrators Asked on December 15, 2021

I have the following query:

SELECT SQL_NO_CACHE
    `table1`.*
FROM 
    `table1`
LEFT JOIN `table2` ON table2.id = table1.table2_id
WHERE 
    (`table2`.`date_assigned` >= '2015-06-21')
    AND (
        `attempt_1_date` IS NOT NULL
        OR `attempt_2_date` IS NOT NULL
        OR `attempt_3_date` IS NOT NULL
        )
    AND (
        `attempt_1_date` >= '2015-08-16'
        OR `attempt_2_date` >= '2015-08-16'
        OR `attempt_3_date` >= '2015-08-16'
        )
    AND (
        `callback_date` IS NULL
        AND `callback_by_account_id` IS NULL
        AND `callback_result` IS NULL
        )
    AND (
        (
            `attempt_1_result` NOT IN ('complete,incorrect,decline,prospecting')
            OR `attempt_1_result` IS NULL
            )
        AND (
            `attempt_2_result` NOT IN ('complete,incorrect,decline,prospecting')
            OR `attempt_2_result` IS NULL
            )
        AND (
            `attempt_3_result` NOT IN ('complete,incorrect,decline,prospecting')
            OR `attempt_3_result` IS NULL
            )
        )

    #AND (`table2`.`date_completed` IS NULL)
    AND (COALESCE(`table2`.`date_completed`, '') = '')

If I wrap date_completed in COALESCE, as above, my database client reports that the query returns in 0.000 seconds. If I use the commented-out line instead and just check IS NULL, it takes just over 10 seconds. Both versions return the same 5 rows.

table1 has 24 columns and ~171,000 rows; table2 has 172 columns* and 1.7 million rows. In 1.4 million of those rows, date_completed is NULL.

If you need any more information, just let me know.

A few more details I was advised to include:

MySQL version: 5.5.41-0ubuntu0.14.04.1-log

Explain outputs:

Using COALESCE

`id`, `select_type`, `table`, `type`, `possible_keys`, `key`, `key_len`, `ref`, `rows`, `Extra`
1, 'SIMPLE', 'table1', 'index_merge', 'table2_id,attempt_1_date,attempt_1_result,attempt_2_date,attempt_2_result,attempt_3_date,attempt_3_result', 'attempt_1_date,attempt_2_date,attempt_3_date', '9,9,9', NULL, 8, 'Using sort_union(attempt_1_date,attempt_2_date,attempt_3_date); Using where'
1, 'SIMPLE', 'table2', 'eq_ref', 'PRIMARY,ind_table2_date_assigned', 'PRIMARY', '4', 'database.table1.table2_id', 1, 'Using where'

Using IS NULL

`id`, `select_type`, `table`, `type`, `possible_keys`, `key`, `key_len`, `ref`, `rows`, `Extra`
1, 'SIMPLE', 'table2', 'ref', 'PRIMARY,ind_table2_date_completed,ind_table2_date_assigned', 'ind_table2_date_completed', '9', 'const', 10, 'Using where'
1, 'SIMPLE', 'table1', 'ref', 'table2_id,attempt_1_date,attempt_1_result,attempt_2_date,attempt_2_result,attempt_3_date,attempt_3_result', 'table2_id', '5', 'database.table2.id', 1, 'Using where'
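
The two plans differ in which table drives the query. With IS NULL the optimizer starts from table2 through ind_table2_date_completed and estimates 10 rows, although about 1.4 million rows have a NULL date_completed. With COALESCE that index is not used, and the plan starts from table1 with an index merge over the three attempt date indexes. One way to test whether that index choice accounts for the timing difference is to keep IS NULL but hide the index with a hint. This is only a diagnostic sketch, not a confirmed fix; the index name is taken from the EXPLAIN output and the remaining predicates are elided.

```sql
-- Diagnostic sketch: keep the IS NULL test, but stop the optimizer
-- from choosing ind_table2_date_completed for table2.
EXPLAIN
SELECT SQL_NO_CACHE table1.*
FROM table1
LEFT JOIN table2 IGNORE INDEX (ind_table2_date_completed)
       ON table2.id = table1.table2_id
WHERE table2.date_assigned >= '2015-06-21'
  AND table2.date_completed IS NULL
  -- ... plus the attempt_* and callback_* predicates from the query above
;
```

If the resulting plan matches the COALESCE plan and the query is fast again, the slow run is probably down to the optimizer's choice of ind_table2_date_completed rather than to COALESCE itself.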

A few more notes:

  • date_completed is type DATETIME
  • There are no entries where date_completed = ''

*120 of those columns relate to 60 questions, in the format question_x_score and question_x_value

One Answer

The timings could be bogus because of caching. You probably have the Query cache on. Try again, but with SELECT SQL_NO_CACHE ...;
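
The query as shown in the question already carries SQL_NO_CACHE. To confirm whether the query cache is involved at all, the server settings and counters can be inspected. This is a minimal sketch for MySQL 5.5, the version given in the question:

```sql
-- Is the query cache enabled, and how much memory is assigned to it?
SHOW VARIABLES LIKE 'query_cache%';

-- Cache counters: if Qcache_hits rises between two runs of the same
-- statement, the second run was probably served from the cache.
SHOW STATUS LIKE 'Qcache%';
```

If query_cache_size is 0 or query_cache_type is OFF, the cache cannot explain the difference between the two timings.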

I recommend adding a new value to the ENUM behind this:

`attempt_1_result` NOT IN ('complete,incorrect,decline,prospecting')
        OR `attempt_1_result` IS NULL

That way you could avoid the NULL test and the OR.
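
A sketch of what that change could look like for one column. The question does not show the column definition, so the ENUM value list and the new value 'none' are assumptions. Note also that the quoted predicate compares against a single comma-joined string; the sketch assumes four separate values were intended.

```sql
-- ASSUMPTION: attempt_1_result is an ENUM with exactly these four values.
-- Check the real definition first (SHOW CREATE TABLE table1); MODIFY
-- replaces the whole value list, so every existing value must be repeated.

-- 1. Add the new value ('none' is a hypothetical name); keep the column nullable.
ALTER TABLE table1
  MODIFY attempt_1_result
    ENUM('complete', 'incorrect', 'decline', 'prospecting', 'none') NULL;

-- 2. Replace the existing NULLs with the new value.
UPDATE table1
   SET attempt_1_result = 'none'
 WHERE attempt_1_result IS NULL;

-- 3. Forbid NULL from now on and make the new value the default.
ALTER TABLE table1
  MODIFY attempt_1_result
    ENUM('complete', 'incorrect', 'decline', 'prospecting', 'none')
    NOT NULL DEFAULT 'none';

-- The predicate then needs neither the NULL test nor the OR:
--   attempt_1_result NOT IN ('complete', 'incorrect', 'decline', 'prospecting')
```

The same three steps would apply to attempt_2_result and attempt_3_result.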

AND `table2`.`date_assigned` >= '2015-06-21'
AND `table2`.`date_completed` IS NULL

would benefit from INDEX(date_assigned, date_completed) (in either order).
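
A minimal sketch of adding that index. The index name is hypothetical and follows the naming seen in the EXPLAIN output. The answer allows either column order; this sketch puts date_completed first, on the assumption that the IS NULL test is handled like an equality lookup (the EXPLAIN output shows it as a `ref` access with `const`), which leaves the range column date_assigned last.

```sql
-- Hypothetical index name; column order is one of the two the answer allows.
ALTER TABLE table2
  ADD INDEX ind_table2_completed_assigned (date_completed, date_assigned);

-- Afterwards, re-run EXPLAIN on the query to see whether the optimizer
-- picks the new index for table2.
```

On a table with 1.7 million rows and 172 columns, adding the index may take a while and may block writes on MySQL 5.5, so it is worth trying on a copy first.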

Why do you have LEFT JOIN? It seems like a plain JOIN would give you the same result, since the WHERE clause filters on table2.date_assigned.
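
The reasoning: for a table1 row with no match, LEFT JOIN fills every table2 column with NULL, and `table2.date_assigned >= '2015-06-21'` then rejects that row. The unmatched rows never survive the WHERE clause, so the inner-join form below should return the same rows. The remaining predicates are elided in this sketch.

```sql
SELECT SQL_NO_CACHE table1.*
FROM table1
JOIN table2 ON table2.id = table1.table2_id
WHERE table2.date_assigned >= '2015-06-21'
  AND table2.date_completed IS NULL
  -- ... remaining attempt_* and callback_* predicates unchanged
;
```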

Why do you even mention table2? You are not fetching anything from it.

Answered by Rick James on December 15, 2021
