6 posts tagged with "audit"
Avoid Using STRAIGHT_JOIN
Copyright © 2023 PawSQL
Problem Definition
STRAIGHT_JOIN is a MySQL join type that forces tables to be joined in the order in which they are written; its result is the same as that of an inner join. It gives developers some control over how the database executes a query. However, it also gives up the optimizer's join-order optimization, so it should be used with care and only where the scenario calls for it. PawSQL raises a risk warning for statements that use STRAIGHT_JOIN to help avoid the performance problems it can cause.
Taking the lineitem and orders tables in the tpch database as an example, the following query joins lineitem first and orders second. This tells the database to use lineitem as the driving table and orders as the driven table for the join, without optimizing the table order.
SELECT *
FROM lineitem
STRAIGHT_JOIN orders ON lineitem.l_orderkey = orders.o_orderkey;
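As a rough sketch of how the risk can be checked, assuming the same TPC-H tables are loaded, the forced-order query can be compared with a plain INNER JOIN, which leaves the join order to the optimizer. The plans that EXPLAIN reports depend on the data volume, the available indexes, and the MySQL version, so no particular output is assumed here.

```sql
-- Forced order: lineitem is always read first as the driving table.
EXPLAIN
SELECT *
FROM lineitem
STRAIGHT_JOIN orders ON lineitem.l_orderkey = orders.o_orderkey;

-- Same result set, but the optimizer is free to pick the driving table.
EXPLAIN
SELECT *
FROM lineitem
INNER JOIN orders ON lineitem.l_orderkey = orders.o_orderkey;
```

If the two plans list the tables in a different order, the STRAIGHT_JOIN version is overriding the optimizer's choice and is worth reviewing.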
Avoid Using Natural Join
Definition
Natural join is a special type of equijoin that can be used with inner joins and with outer joins (left, right, and full). It automatically finds all columns that have the same name in both tables and performs an equijoin on those columns. A natural join can shorten the statement, but the implicit join conditions make the code harder to read, hide the relationship between the tables, and are prone to mistaken joins, for example when two tables happen to share a column name that is not part of the intended relationship. PawSQL raises a risk warning for statements that use natural join to help avoid the correctness problems it can cause.
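The mistaken-join risk can be illustrated with a minimal sketch. The tables demo_customer and demo_order below are hypothetical and are not part of TPC-H; they are assumed only to show two tables that share column names unrelated to the intended relationship.

```sql
-- Hypothetical demo tables (assumed for illustration only).
CREATE TABLE demo_customer (
    id         INT PRIMARY KEY,
    name       VARCHAR(50),
    created_at DATE
);

CREATE TABLE demo_order (
    id          INT PRIMARY KEY,
    customer_id INT,
    created_at  DATE
);

-- Implicit condition: the join uses every shared column name,
-- here id AND created_at, which is not the intended relationship.
SELECT *
FROM demo_order
NATURAL JOIN demo_customer;

-- Explicit condition: the relationship is stated in the query and
-- does not change if a same-named column is added to either table later.
SELECT o.id AS order_id, c.name, o.created_at
FROM demo_order AS o
INNER JOIN demo_customer AS c ON o.customer_id = c.id;
```

Writing the ON clause explicitly keeps the join stable under schema changes and makes the intended relationship visible to reviewers.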
Mixed Sort Directions Disable Index Usage
Problem Definition
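For an ORDER BY clause to be satisfied by an index, all of its expressions must be sorted in the same direction, either all ASC or all DESC. If an ORDER BY clause mixes sort directions across columns, an ordinary index whose columns are all ascending cannot supply the requested order, and the database has to sort the rows itself.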
For example, create an index on the lineitem table in TPCH:
create index l_partkey_suppkey_idx on lineitem(l_partkey, l_suppkey);
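The queries below are a sketch of how this index might be exercised; they are not taken from the original post. The queries select only the indexed columns so that the index alone can answer them. The descending-index statement assumes MySQL 8.0 or later, and the index name l_partkey_suppkey_mixed_idx is made up for the example. Actual EXPLAIN output depends on version and statistics.

```sql
-- Same direction on both columns: the index order matches the ORDER BY,
-- so the optimizer can read rows in index order instead of sorting.
EXPLAIN
SELECT l_partkey, l_suppkey
FROM lineitem
ORDER BY l_partkey ASC, l_suppkey ASC;

-- Mixed directions: the all-ascending index no longer matches,
-- so the plan typically shows an extra sort step (filesort).
EXPLAIN
SELECT l_partkey, l_suppkey
FROM lineitem
ORDER BY l_partkey ASC, l_suppkey DESC;

-- Possible remedy (assumes MySQL 8.0+): an index whose column
-- directions match the mixed ORDER BY. Hypothetical index name.
CREATE INDEX l_partkey_suppkey_mixed_idx
    ON lineitem (l_partkey ASC, l_suppkey DESC);
```

Where changing the index is not an option, aligning the sort directions in the query is the other way to let the existing index provide the order.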
Avoid COUNT DISTINCT on Nullable Multi-Columns
Problem Definition
When COUNT(DISTINCT) is used on multiple columns, the result may differ from what you expect. COUNT(DISTINCT col1) counts the distinct non-NULL values in that column, while COUNT(DISTINCT col1, col2) skips every row in which either column is NULL.
For example, the following query counts the distinct values in column a and in the column combination (a, b):
select count(distinct t.a) a_cnt, count(distinct t.a,t.b) a_b_cnt from (values row(1,2),row(3,null)) as t(a,b);
The result is 2 distinct values for column a, but only 1 distinct value for the column combination (a, b), because the row (3, NULL) is skipped. This may not match your expectation.
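If combinations that contain NULL should be counted as well, one possible rewrite, sketched here under the assumption that NULLs in the same position should be treated as equal, is to count the rows of a SELECT DISTINCT subquery. It reuses the VALUES ROW(...) syntax from the query above; the alias d is introduced only for this example.

```sql
-- SELECT DISTINCT keeps rows that contain NULL and treats NULLs as equal,
-- so (1, 2) and (3, NULL) both survive and COUNT(*) returns 2.
SELECT COUNT(*) AS a_b_cnt
FROM (
    SELECT DISTINCT t.a, t.b
    FROM (VALUES ROW(1, 2), ROW(3, NULL)) AS t(a, b)
) AS d;
```

Whether NULL-containing combinations belong in the count is a business decision; the point of the rewrite is to make that choice explicit rather than leaving it to the behavior of COUNT(DISTINCT a, b).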