Implicit Data Type Conversion
Copyright © 2024 PawSQL
Problem
Let's say the O_ORDERDATE
column in TPCH.ORDERS
table is defined as VARCHAR(16)
, and there is an index ORDDATE_IDX
on ORDERS(O_ORDERDATE)
to accelerate the queries against this table if there is a condition on O_ORDERDATE
. If we want to count the number of orders of today using following SQL query.
select count(*) from ORDERS
where O_ORDERDATE = current_date();
Use explain
to get the query plan:
-> Filter: (orders.O_ORDERDATE = <cache>(curdate())) (cost=2621.05 rows=19938)
-> Table scan on ORDERS (cost=2621.05 rows=199378)
From the query plan, we can see the database optimizer uses a table scan instead of the index lookup as we expected. The query cost is high.
Counting Optimization
Copyright © 2024 PawSQL
DDL Definition
-- tpch.nation definition
CREATE TABLE `nation` (
`N_NATIONKEY` int NOT NULL,
`N_NAME` char(25) NOT NULL,
`N_REGIONKEY` int NOT NULL,
`N_COMMENT` varchar(152) DEFAULT NULL,
KEY `PAW_IDX1831397382` (`N_NATIONKEY`,`N_NAME`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
Plan Visualization for MySQL
Copyright © 2024 PawSQL
Overview
The execution plan is an internal execution strategy generated by the database when executing SQL queries. It describes how the SQL is executed step-by-step internally in the database and is an important basis for diagnosing SQL performance issues and tuning. As the complexity of database applications increases, SQL execution plans become more and more complicated along with the SQL. Relying on plaintext execution plans makes it very challenging to grasp the overall execution plan, locate key paths and bottlenecks, bringing great difficulties to DBAs' performance tuning work.
The visual execution plan tool intuitively presents the execution plan in a flowchart, clearly showing the plan structure. It uses visual elements to highlight key operations and paths, while associating SQL statement source code for integrated analysis. This can greatly improve the efficiency for DBAs to understand and analyze execution plans, quickly locate performance bottlenecks, and perform precise optimization on SQL or indexes.
Optimize TPCH Queries with PawSQL
Overview
TPCH is a database test benchmark for decision support systems,developed by TPC (Transaction Processing Performance Council) in 1994. TPCH tests consist of a set of 22 complex business queries and some data maintenance queries to measure the performance of relational database management systems under complex analytical queries. These queries simulate real queries in a data warehouse of a wholesale supplier, covering aspects such as order analysis, supply chain management, and partial sales analysis.
Database vendors use TPCH tests to evaluate their products, assessing the impact of different servers, storage configurations, parallel processing, and resource scheduling on database performance.
In this article, we will evaluate the effectiveness of PawSQL's optimizations on the TPCH queries by only rewriting queries and recommending indexes without changing the hardware configuration and database parameters.