Skip to main content

31 posts tagged with "pawsql"

View All Tags

Implicit Data Type Conversion

· 6 min read
PawSQL Team
Optimize your SQL Queries by Clicks!

Copyright © 2024 PawSQL

Problem

Let's say the O_ORDERDATE column in TPCH.ORDERS table is defined as VARCHAR(16), and there is an index ORDDATE_IDX on ORDERS(O_ORDERDATE) to accelerate the queries against this table if there is a condition on O_ORDERDATE. If we want to count the number of orders of today using following SQL query.

select count(*) from ORDERS 
where O_ORDERDATE = current_date();

Use explain to get the query plan:

-> Filter: (orders.O_ORDERDATE = <cache>(curdate()))  (cost=2621.05 rows=19938)
-> Table scan on ORDERS (cost=2621.05 rows=199378)

From the query plan, we can see the database optimizer uses a table scan instead of the index lookup as we expected. The query cost is high.

Counting Optimization

· 2 min read
PawSQL Team
Optimize your SQL Queries by Clicks!

Copyright © 2024 PawSQL

DDL Definition

-- tpch.nation definition
CREATE TABLE `nation` (
`N_NATIONKEY` int NOT NULL,
`N_NAME` char(25) NOT NULL,
`N_REGIONKEY` int NOT NULL,
`N_COMMENT` varchar(152) DEFAULT NULL,
KEY `PAW_IDX1831397382` (`N_NATIONKEY`,`N_NAME`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

Plan Visualization for MySQL

· 4 min read
PawSQL Team
Optimize your SQL Queries by Clicks!

Copyright © 2024 PawSQL

Overview

The execution plan is an internal execution strategy generated by the database when executing SQL queries. It describes how the SQL is executed step-by-step internally in the database and is an important basis for diagnosing SQL performance issues and tuning. As the complexity of database applications increases, SQL execution plans become more and more complicated along with the SQL. Relying on plaintext execution plans makes it very challenging to grasp the overall execution plan, locate key paths and bottlenecks, bringing great difficulties to DBAs' performance tuning work.

The visual execution plan tool intuitively presents the execution plan in a flowchart, clearly showing the plan structure. It uses visual elements to highlight key operations and paths, while associating SQL statement source code for integrated analysis. This can greatly improve the efficiency for DBAs to understand and analyze execution plans, quickly locate performance bottlenecks, and perform precise optimization on SQL or indexes.

Optimize TPCH Queries with PawSQL

· 8 min read
PawSQL Team
Optimize your SQL Queries by Clicks!

Overview

TPCH is a database test benchmark for decision support systems,developed by TPC (Transaction Processing Performance Council) in 1994. TPCH tests consist of a set of 22 complex business queries and some data maintenance queries to measure the performance of relational database management systems under complex analytical queries. These queries simulate real queries in a data warehouse of a wholesale supplier, covering aspects such as order analysis, supply chain management, and partial sales analysis.

Database vendors use TPCH tests to evaluate their products, assessing the impact of different servers, storage configurations, parallel processing, and resource scheduling on database performance.

In this article, we will evaluate the effectiveness of PawSQL's optimizations on the TPCH queries by only rewriting queries and recommending indexes without changing the hardware configuration and database parameters.

Order by Random Optimization

· 5 min read
PawSQL Team
Optimize your SQL Queries by Clicks!

Problem

Functionrand(MySQL) or random(PostgreSQL) returns a random floating-point value in the range 0 to 1.0. It is useful to get a random sample of dataset using following query.

select * from orders order by rand() limit 1; 

This works fine if customer table is less than 10,000 rows. But when you have 1,000,000 rows, the overhead for sorting the rows becomes unacceptable. The reason is quiet clear: we sort all rows, but only keep one.