SQL实战经验 - 运算导致索引失效
Copyright © 2024 PawSQL
Problem
Let's say we have a b-tree index c_acctbal_idx
defined on table tpch.customer
to accelarate queries with a condition on column c_acctbal
. Let's check a query as follows:
select * from tpch.customer where c_acctbal + 100 = 10000.0
Use explain
to get the query plan:
-> Filter: ((customer.C_ACCTBAL + 100) = 10000.0) (cost=102301.50 rows=900955)
-> Table scan on customer (cost=102301.50 rows=900955)
From the query plan, we can see the database optimizer uses a table scan instead of the index lookup as we expected. The query cost is very high because of the full table scan and the arithmatic compution on each of the row.
Diagnosis
Let's replace the condition with an equivalent one by moving the calculation from left to right,
select * from tpch.customer where c_acctbal = 10000.0 - 100;
Use explain
to get the query plan:
-> Index lookup on customer using c_acctbal_idx (C_ACCTBAL=(10000.0 - 100)) (cost=5.50 rows=5)
From the query plan , we can see the database optimizer chooses index lookup
using index c_acctbal_idx
and the cost is less than 1%% of the previous SQL query.
Diagnosis: From above experiment, we can conclude that if there is some calculation on indexed column, the index usage will be disabled. This conclusion is not only appliable to arthematic but also functions, and all other calculations on indexed column.
select * from tpch.orders where ADDDATE(o_orderdate, INTERVAL 31 DAY) =date '2019-10-10'
While following equivalent query can take advantage of index on o_orderdate
select * from tpch.orders where o_orderdate = subdate(date '2019-10-10' , INTERVAL 31 DAY);
Solution
There are two solutions to solve this problem,
-
one solution is to rewrite the original query to an equivalent one by moving the calculation to other side of the condition, as the examples in Diagnosis paragraph, so that the query can take advantage of existing indexes on the column.
-
the other solution is to add a functional index to match the condition.
create index phone_func_idx on customer(left(c_phone, 3))
The functional index is not applicable to other conditions if the conditions don't have the same signature of the index definition. For example, the above index will not applicable to a condition
left(c_phone, 4) = '1399'
.
PawSQL
PawSQL implements the first solution by automatically detecting the problematic predicates, automatically rewriting them into equivalent ones in rewrite optimization rule FuncWithColumnInPredicateRewrite
.
Built-in Operators
PawSQL support following arithmatic operators and the combination of them.
* | Multiplication operator |
---|---|
+ | Addition operator |
- | Minus operator |
/ | Division operator |
- | Change the sign of the argument |