Shortcut CASE evaluation (lazy scalar subquery evaluation) #16022

hlcianfagna · 2024-05-23T15:28:54Z

Problem Statement

I have a query where an expensive lookup needs to take place in only a limited number of cases.
At the moment CASE, COALESCE, and IF all result in the lookup taking place for all rows.

Example:

CREATE FUNCTION sleep (ms int)
RETURNS int language javascript
as
$$ function sleep(ms){
var end = new Date().getTime() + ms;
var a = 1;
while(new Date().getTime() < end) {a++;}
return 1;}
$$;
	
SELECT COALESCE(0,sleep(5000));
--> SELECT 1 row in set (5.006 sec)	
SELECT CASE WHEN 0 IS NULL THEN sleep(5000) ELSE 0 END;
--> SELECT 1 row in set (5.007 sec)

Possible Solutions

Only evaluate "the arguments that are needed to determine the result" as in https://www.postgresql.org/docs/current/functions-conditional.html
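To make the requested semantics concrete, here is a minimal JavaScript sketch of "only evaluate the arguments that are needed". It is an illustration only, not CrateDB code: `lazyCoalesce`, `lazyCase`, and `expensiveLookup` are hypothetical names, and each argument is wrapped in a function so that it runs only when it is actually reached.

```javascript
// Each argument is a zero-argument function (a thunk), so it only runs when called.
function lazyCoalesce(...thunks) {
  for (const thunk of thunks) {
    const value = thunk();
    if (value !== null && value !== undefined) {
      return value; // later thunks are never invoked
    }
  }
  return null;
}

// CASE WHEN ... THEN ... ELSE ... END with lazily evaluated branches.
function lazyCase(branches, elseThunk) {
  for (const [condition, result] of branches) {
    if (condition() === true) {
      return result(); // only the matching branch is evaluated
    }
  }
  return elseThunk ? elseThunk() : null;
}

// Hypothetical stand-in for sleep(5000); it counts calls instead of waiting.
let expensiveCalls = 0;
function expensiveLookup() {
  expensiveCalls += 1;
  return 1;
}

// COALESCE(0, sleep(5000))
const coalesceResult = lazyCoalesce(() => 0, expensiveLookup);

// CASE WHEN 0 IS NULL THEN sleep(5000) ELSE 0 END
const caseResult = lazyCase([[() => 0 === null, expensiveLookup]], () => 0);
```

In this model neither expression ever calls `expensiveLookup`, which is the behaviour the two example queries above would ideally show.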

Considered Alternatives

Store intermediate results in working tables and then use a query with a UNION that covers the different cases.


mfussenegger · 2024-05-27T06:34:18Z

Only evaluate "the arguments that are needed to determine the result" as in

This is actually kinda the case, except that we also have a constant-folding/normalization step to reduce operations like 1 + 1 to 2, to avoid having to repeat the step per row.

In your example the sleep parameter is a literal, so this logic kicks in.
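The interaction between constant folding and short-circuit evaluation can be pictured with a toy model. The sketch below is not CrateDB's implementation: the node shapes and the `fold` and `evaluate` functions are assumptions made for illustration, and `sleep` only counts calls instead of waiting.

```javascript
// Hypothetical expression nodes:
//   {type: "literal", value}, {type: "column", name},
//   {type: "call", fn, args}, {type: "coalesce", args}
let sleepCalls = 0;
function sleep(ms) {
  sleepCalls += 1; // stand-in for the UDF; no real delay
  return 1;
}

// Planning-time pass: a function call whose arguments are all literals
// is evaluated once and replaced by its result.
function fold(node) {
  if (node.type === "call" || node.type === "coalesce") {
    const args = node.args.map(fold);
    if (node.type === "call" && args.every((a) => a.type === "literal")) {
      return { type: "literal", value: node.fn(...args.map((a) => a.value)) };
    }
    return { ...node, args };
  }
  return node;
}

// Per-row evaluation: COALESCE stops at the first non-null argument.
function evaluate(node, row) {
  switch (node.type) {
    case "literal":
      return node.value;
    case "column":
      return row[node.name];
    case "call":
      return node.fn(...node.args.map((a) => evaluate(a, row)));
    case "coalesce":
      for (const arg of node.args) {
        const value = evaluate(arg, row);
        if (value !== null && value !== undefined) return value;
      }
      return null;
    default:
      throw new Error("unknown node type: " + node.type);
  }
}

const lit = (value) => ({ type: "literal", value });

// COALESCE(0, sleep(5000)): literal argument, so sleep runs during folding.
const foldedLiteral = fold({
  type: "coalesce",
  args: [lit(0), { type: "call", fn: sleep, args: [lit(5000)] }],
});
const callsAfterFoldingLiteral = sleepCalls;
const literalResult = evaluate(foldedLiteral, {});

// COALESCE(0, sleep(ms)): column argument, so the call is left for per-row evaluation.
const foldedColumn = fold({
  type: "coalesce",
  args: [lit(0), { type: "call", fn: sleep, args: [{ type: "column", name: "ms" }] }],
});
const columnResult = evaluate(foldedColumn, { ms: 5000 });
```

In this model `sleep` runs exactly once, during folding of the literal variant, even though the per-row evaluation of COALESCE never needs its result. The column variant never calls it at all.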

What does the expensive lookup in your actual use-case look like?

hlcianfagna · 2024-06-19T12:58:32Z

What does the expensive lookup in your actual use-case look like?

It is along the lines of something like this:

CREATE TABLE invoices (
	some_data TEXT
	,customer_id INT
	,invoice_specific_payment_terms TEXT
);

INSERT INTO invoices
SELECT 'abc',1,'60 days from receipt of invoice';

REFRESH TABLE invoices;

--> NB in this example invoice_specific_payment_terms is never NULL

CREATE TABLE customers (
	customer_id INT
	,customer_default_payment_terms TEXT
);

INSERT INTO customers
SELECT 1,'30 days from receipt of invoice';

REFRESH TABLE customers;

CREATE TABLE distributed_summits AS
SELECT * FROM sys.summits;

ANALYZE;

SELECT some_data,
	CASE
		WHEN invoice_specific_payment_terms IS NOT NULL
			THEN invoice_specific_payment_terms
		ELSE (	SELECT customer_default_payment_terms
				FROM customers
				CROSS JOIN distributed_summits s1				
				CROSS JOIN distributed_summits s2
				CROSS JOIN distributed_summits s3
				WHERE customers.customer_id=invoices.customer_id
				ORDER BY s3.height DESC
				LIMIT 1
			)
	END 
FROM invoices;

mfussenegger · 2024-06-24T13:00:44Z

That's a bigger topic then, as subqueries are generally treated in a special way, independent of the evaluation logic of an expression.
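As a rough illustration of why this matters for the query above, the following JavaScript sketch contrasts a scalar subquery that is resolved before the CASE expression is evaluated with one that is resolved only when the ELSE branch is reached. It does not describe how CrateDB plans or executes subqueries. All function names are hypothetical, the expensive cross joins are reduced to a simple lookup, and the second invoice row (`'def'` with NULL terms) is made up so that both branches are exercised.

```javascript
const customers = [
  { customer_id: 1, customer_default_payment_terms: "30 days from receipt of invoice" },
];

const invoices = [
  { some_data: "abc", customer_id: 1, invoice_specific_payment_terms: "60 days from receipt of invoice" },
  { some_data: "def", customer_id: 1, invoice_specific_payment_terms: null }, // hypothetical extra row
];

// Stand-in for the correlated scalar subquery; counts how often it runs.
let subqueryRuns = 0;
function defaultTermsSubquery(customerId) {
  subqueryRuns += 1;
  const match = customers.find((c) => c.customer_id === customerId);
  return match ? match.customer_default_payment_terms : null;
}

// Eager model: the subquery value is computed first, then handed to the CASE.
function paymentTermsEager(invoice) {
  const subqueryValue = defaultTermsSubquery(invoice.customer_id);
  return invoice.invoice_specific_payment_terms !== null
    ? invoice.invoice_specific_payment_terms
    : subqueryValue;
}

// Lazy model: the subquery runs only if the ELSE branch is taken.
function paymentTermsLazy(invoice) {
  return invoice.invoice_specific_payment_terms !== null
    ? invoice.invoice_specific_payment_terms
    : defaultTermsSubquery(invoice.customer_id);
}

const eagerResults = invoices.map((invoice) => paymentTermsEager(invoice));
const eagerRuns = subqueryRuns;

subqueryRuns = 0;
const lazyResults = invoices.map((invoice) => paymentTermsLazy(invoice));
const lazyRuns = subqueryRuns;
```

Both models return the same payment terms, but the eager one runs the subquery once per invoice while the lazy one runs it only for the invoice without specific terms.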

hlcianfagna added feature: performance breaking change labels May 23, 2024

mfussenegger added complexity: no estimate and removed breaking change labels May 27, 2024

mfussenegger added the needs info or feedback label May 27, 2024

mfussenegger added feature: sql: relations and removed needs info or feedback labels Jun 24, 2024

mfussenegger changed the title ~~Shortcut CASE evaluation~~ Shortcut CASE evaluation (lazy scalar subquery evaluation) Sep 18, 2024
