Skip to main content

2024/PawSQL4VSCode

Β· 2 min read
PawSQL Team
Optimize your SQL Queries by Clicks!


PawSQL , Optimize your SQL Queries by Clicks!

Version Ratings



Optimize a SQL Query​

  1. Optimize in Default Workspace:

    • Click the "Optimize" button in the SQL statement prompt
  2. Optimize in Specific Workspace:

    • Click the "Optimize..." button in the SQL statement prompt
    • Select a workspace from the dropdown menu

Features​

  • Index Advise: Recommend optimal indexes based on input SQL syntax, database objects and statistics information
  • Query Rewrite: Recommend semantically equivalent, but more efficient SQLs
  • Performance Validation: Ensure better performance after SQL rewrite and index recommendation

Supported Databases​

MySQL, PostgreSQL, MariaDB, Oracle, openGauss, KingbaseES, GaussDB, Dameng, RDS for MySQL and etc.


Resources​


🌐 About PawSQL​

PawSQL is dedicated to automatic and intelligent database performance optimization. The products provided by PawSQL include:

  • PawSQL Cloud, an online automated SQL optimization tool that supports SQL auditing, intelligent query rewriting, cost-based index recommendations, suitable for database administrators and data application developers.
  • PawSQL Advisor, an IntelliJ plugin that is suitable for data application developers and can be installed via the IDEA/DataGrip marketplace by searching for "PawSQL Advisor" by name.

Contact Us​

Email: service@pawsql.com

Website: https://www.pawsql.com

PawSQL for DM/KES: Comprehensive Performance Optimization Solution for Emerging Databases

Β· 3 min read
PawSQL Team
Optimize your SQL Queries by Clicks!

0. Overview​

Against the backdrop of technological innovation in database systems, emerging databases like Kingbase Database (KES) and Dameng Database (DM) are increasingly playing crucial roles in key sectors such as government, finance, and telecommunications. With the exponential growth of data volume and increasing business complexity, SQL queries in modern, specialized databases have become critical to ensuring smooth business operations. PawSQL, a professional database performance optimization tool, covers the entire process of SQL development, testing, and operations, providing comprehensive optimization support for specialized database systems, helping users fully unlock their database performance potential across diverse technological ecosystems.

1. Database Management​

The workspace is the working environment for SQL optimization. PawSQL supports two methods for creating SQL optimization task workspaces for Kingbase KES/Dameng databases:

Offline - DDL Parsing: By parsing the database's DDL files to construct the workspace. Online - Metadata Retrieval: Directly obtaining metadata from KingbaseES databases to quickly establish the workspace.

2. Configurable SQL Review​

PawSQL provides intelligent SQL review functionality for Kingbase KES/Dameng database application developers, ensuring code accuracy, efficiency, readability, maintainability, and security.

3. Comprehensive Query Rewriting Optimization​

PawSQL offers SQL rewriting optimization algorithms based on heuristic rules and cost-based approaches, recommending semantically equivalent but higher-performance SQL for KingbaseES. It also provides SQL text comparisons before and after rewriting, with highlighted rewritten sections.

4. Intelligent Index Recommendation​

PawSQL's intelligent index recommendation engine provides optimal indexing strategies for application queries on KingbaseES, adapting to various SQL syntax requirements and significantly improving query efficiency.

5. Execution Plan Visualization (PPV)​

PawSQL's Execution Plan Visualization tool (PawSQL Plan Visualizer) supports graphical display and analysis of KingbaseES execution plans, helping users easily understand query execution processes and quickly identify performance bottlenecks.

6. Performance Verification​

PawSQL automatically collects execution plans from KingbaseES databases before and after SQL optimization, obtaining optimization costs to ensure that recommended optimization suggestions can genuinely improve database performance.

7. Automatic Slow Query Collection and Management​

PawSQL's performance inspection platform supports collection and management of slow queries from KingbaseES, supporting both crontab-based scheduled collection and manual collection.

8. Database Object Inspection​

PawSQL's performance inspection platform supports periodic or manual inspection of database objects in KingbaseES V8, covering tables, columns, character sets, indexes, constraints, and other database objects to identify potential security and performance issues, and provide warnings.

🌟 Summary​

PawSQL for KingbaseES, a comprehensive SQL optimization solution for Kingbase databases, provides professional and efficient solutions to enhance database performance, whether for daily queries or complex data processing.

🌐 About PawSQL​

PawSQL is dedicated to automatic and intelligent database performance optimization. The products provided by PawSQL include:

  • PawSQL Cloud, an online automated SQL optimization tool that supports SQL auditing, intelligent query rewriting, cost-based index recommendations, suitable for database administrators and data application developers.
  • PawSQL Advisor, an IntelliJ plugin that is suitable for data application developers and can be installed via the IDEA/DataGrip marketplace by searching for "PawSQL Advisor" by name.

Contact Us​

Email: service@pawsql.com

Website: https://www.pawsql.com

PawSQL Optimization Case: TPC-H Query 9 Performance Improved by 1195.14%

Β· 4 min read
PawSQL Team
Optimize your SQL Queries by Clicks!

Introduction​

After automatically optimizing TPC-H benchmark Query 9, PawSQL improved its performance by 1195.14%. This article will explore PawSQL's effective strategies and actual results for SQL performance optimization based on its optimization process. By comparing the query structure, index design, and execution plans before and after optimization, we will explain how to achieve significant performance improvements.

This case can be viewed online at: https://pawsql.com/statement/1837384704875499522

TPC-H Query 9 Introduction: The 9th query of TPC-H calculates the total profit of all ordered parts by country and year. Its query characteristics include:

  • Grouping, sorting, and aggregation operations
  • Multiple table joins and subqueries
  • Uses a full-match LIKE operator

1️⃣ Query Rewriting Optimization​

Original Query:

SELECT profit.nation, profit.o_year, SUM(profit.amount) AS sum_profit
FROM (
SELECT nation.n_name AS nation, EXTRACT(YEAR FROM orders.o_orderdate) AS o_year,
lineitem.l_extendedprice * (1 - lineitem.l_discount) - partsupp.ps_supplycost * lineitem.l_quantity AS amount
FROM part, supplier, lineitem, partsupp, orders, nation
WHERE supplier.s_suppkey = lineitem.l_suppkey
AND partsupp.ps_suppkey = lineitem.l_suppkey
AND partsupp.ps_partkey = lineitem.l_partkey
AND part.p_partkey = lineitem.l_partkey
AND orders.o_orderkey = lineitem.l_orderkey
AND supplier.s_nationkey = nation.n_nationkey
AND part.p_name LIKE '%dim%'
) AS profit
GROUP BY profit.nation, profit.o_year
ORDER BY profit.nation, profit.o_year DESC;

Optimized Query:

SELECT profit_nation.n_name AS nation, EXTRACT(YEAR FROM profit_orders.o_orderdate) AS o_year,
SUM(profit_lineitem.l_extendedprice * (1 - profit_lineitem.l_discount) - profit_partsupp.ps_supplycost * profit_lineitem.l_quantity) AS sum_profit
FROM part AS profit_part, supplier AS profit_supplier,
lineitem AS profit_lineitem, partsupp AS profit_partsupp,
orders AS profit_orders, nation AS profit_nation
WHERE profit_supplier.s_suppkey = profit_lineitem.l_suppkey
AND profit_partsupp.ps_suppkey = profit_lineitem.l_suppkey
AND profit_partsupp.ps_partkey = profit_lineitem.l_partkey
AND profit_part.p_partkey = profit_lineitem.l_partkey
AND profit_orders.o_orderkey = profit_lineitem.l_orderkey
AND profit_supplier.s_nationkey = profit_nation.n_nationkey
AND profit_part.p_name LIKE '%dim%'
GROUP BY profit_nation.n_name, o_year
ORDER BY profit_nation.n_name, o_year DESC;

Query Rewriting Key Points:

  • Eliminate subquery: Lift the subquery logic to the main query, simplifying the structure.
  • Table alias renaming: Improve query readability and maintainability.
  • Preserve calculation logic: Ensure that the business logic of the query remains unchanged.

2️⃣ πŸ” Index Optimization Strategy​

PawSQL's proposed index optimization scheme:

CREATE INDEX PAWSQL_IDX0485218972 ON tpch.lineitem(L_PARTKEY, L_SUPPKEY, L_SHIPDATE);
CREATE INDEX PAWSQL_IDX0214365528 ON tpch.supplier(S_NATIONKEY, S_SUPPKEY, S_NAME, S_ADDRESS);
CREATE INDEX PAWSQL_IDX0327029402 ON tpch.part(P_NAME);

Index Optimization Analysis:

  • lineitem table index: Targets join conditions and date filtering fields to reduce I/O operations.
  • supplier table index: Covers join and query fields, avoiding table lookups and improving retrieval efficiency.
  • part table index: Optimizes fuzzy matching of p_name using the LIKE operator.

3️⃣ Execution Plan Comparison​

Execution Plan Before Optimization:

  • Nested Loop Joins: Extensively used Nested loop joins, with low efficiency.
  • Full Table Scan: Performed full table scan on lineitem table, involving 60,175 rows.
  • Reliance on Primary Key Index: Primarily used primary key index for table joins.

Execution Plan After Optimization:

  • Join Order Optimization: Still uses nested loop, but with optimized join order.
  • Index Scan: Utilizes the newly created PAWSQL_IDX0327029402 index, reducing part table scan from 60,175 to 114 rows.
  • Precise lineitem Table Lookup: Avoids full table scan, uses PAWSQL_IDX0485218972 for precise matching, further reducing processed rows.

Ultimately, the total number of processed rows decreased from 60,175 to 3,513.

Image

4️⃣ Performance Improvement​

  • Execution Time: Reduced from 260 milliseconds to 20 milliseconds.
  • Performance Improvement: Achieved a 1195.14% improvement.
  • Main Factors: Index optimization significantly improved data access methods, reducing the number of processed rows.

🌐 About PawSQL​

PawSQL is dedicated to automatic and intelligent database performance optimization. The products provided by PawSQL include:

  • PawSQL Cloud, an online automated SQL optimization tool that supports SQL auditing, intelligent query rewriting, cost-based index recommendations, suitable for database administrators and data application developers.
  • PawSQL Advisor, an IntelliJ plugin that is suitable for data application developers and can be installed via the IDEA/DataGrip marketplace by searching for "PawSQL Advisor" by name.

Contact Us​

Email: service@pawsql.com

Website: https://www.pawsql.com

How Does PawSQL Optimize Correlated Scalar Subqueries?

Β· 4 min read
PawSQL Team
Optimize your SQL Queries by Clicks!

In the field of data analysis, correlated scalar subqueries are undoubtedly a double-edged sword: they are powerful, capable of solving many complex problems, while simultaneously presenting significant challenges to database optimizers. Currently, only commercial database giants like Oracle have achieved relatively outstanding performance in this area. Among domestic databases, only PolarDB provides some support for correlated subqueries. This article will not only introduce how PawSQL performs cost-based rewriting optimization for correlated scalar subqueries but also demonstrate how PawSQL identifies and merges multiple similar scalar subqueries in a query to further improve scalar subquery optimization performance. With PawSQL, you can experience Oracle-like rewriting optimization capabilities on databases like MySQL and PostgreSQL.

*## 🌟 Introduction to Correlated Scalar Subqueries

In the world of SQL, correlated scalar subqueries are a powerful tool that allows subqueries to depend on column values from the outer query. This is in stark contrast to non-correlated scalar subqueries that are independent of the outer query. Correlated scalar subqueries calculate results for each row by referencing columns from the outer query.

Example:

SELECT employee_name
FROM employees e
WHERE salary > (SELECT AVG(salary)
FROM employees
WHERE department_id = e.department_id);

In this example, the subquery calculates the average salary for each department and compares it with the salary in the main query, demonstrating the powerful functionality of correlated scalar subqueries.

🏎️ Challenges and Opportunities: From an Optimizer's Perspective​

Although correlated scalar subqueries are powerful, they present significant challenges to database optimizers:

  • Repeated Calculations: On large datasets, subqueries may be recalculated repeatedly, impacting performance.
  • High Computational Overhead: Complex calculations, such as aggregate functions, can lead to query performance degradation.
  • Query Rewriting Difficulties: Transforming scalar subqueries into join operations or other forms is not always easy.
  • Data Dependency: Optimization effectiveness depends on data distribution and table structure, requiring flexible optimizer responses.

For correlated scalar subqueries, while de-correlation does not always perform better than correlated subqueries, cost-based rewriting optimization strategies offer a new perspective. Currently, only a few databases like Oracle and PolarDB have implemented these advanced optimization techniques.

πŸš€ PawSQL: A New Realm of Correlated Scalar Subquery Optimization​

PawSQL optimizes correlated scalar subqueries through:

  1. Cost-Based Rewriting: Supporting de-correlation of scalar subqueries in conditions and select columns.
  2. Merge Rewriting: Optimizing multiple structurally similar scalar subqueries.

🎯 Case Study​

Original Query: The original query used two correlated subqueries, calculating the total order price and order count for each customer on a specific date. This structure is typically inefficient, as it requires repeated execution of two subqueries for each customer.

SELECT c_custkey,
(SELECT SUM(o_totalprice)
FROM ORDERS
WHERE o_custkey = c_custkey AND o_orderdate = '2020-04-16') AS total,
(SELECT COUNT(*)
FROM ORDERS
WHERE o_custkey = c_custkey AND o_orderdate = '2020-04-16') AS cnt
FROM CUSTOMER

Rewritten Query: PawSQL's optimization engine merges the two correlated subqueries into a derived table, then associates it with the main query through a left outer join.

SELECT c_custkey, SUM_ AS total, count_ AS cnt
FROM CUSTOMER LEFT OUTER JOIN (
SELECT o_custkey, SUM(o_totalprice) AS SUM_, COUNT(*) AS count_
FROM ORDERS
WHERE o_orderdate = '2020-04-16'
GROUP BY o_custkey) AS SQ ON o_custkey = c_custkey

Performance Improvement: After optimization, performance is expected to improve by 1,131.26%, a significant enhancement!

Scalar Correlated Subquery Execution Plan Comparison

Execution Plan Improvements:

  • Preliminary aggregation of orders table data significantly reduces the volume of data to be processed
  • Eliminates repeated subquery executions by merging two subqueries into one
  • Uses hash join to efficiently associate customer and aggregated orders data

This optimization case demonstrates the effectiveness of PawSQL's correlated scalar subquery rewriting technique. By merging multiple correlated subqueries into a single derived table and using an outer join, the optimizer can significantly reduce redundant calculations and data access.

🌐 About PawSQL​

PawSQL is dedicated to automatic and intelligent database performance optimization. The products provided by PawSQL include:

  • PawSQL Cloud, an online automated SQL optimization tool that supports SQL auditing, intelligent query rewriting, cost-based index recommendations, suitable for database administrators and data application developers.
  • PawSQL Advisor, an IntelliJ plugin that is suitable for data application developers and can be installed via the IDEA/DataGrip marketplace by searching for "PawSQL Advisor" by name.

Contact Us​

Email: service@pawsql.com

Website: https://www.pawsql.com

PawSQL's Smart Index Recommendation Helps Improve Window Function Performance by 50 Times

Β· 3 min read
PawSQL Team
Optimize your SQL Queries by Clicks!

🌟 Introduction​

In the data-driven modern world, SQL query speed is key to rapid application response. Especially for complex queries involving window functions, performance bottlenecks can become an obstacle without appropriate index support. This article will explore how PawSQL significantly improves SQL query performance for window functions through intelligent index recommendations.

πŸ” Case Analysis​

Through a practical case, we will demonstrate how PawSQL optimizes a query containing window functions.

πŸ“ Original Query​

SELECT *
FROM (
SELECT o.o_custkey, o.o_totalprice,
RANK() OVER (PARTITION BY o.o_custkey ORDER BY o.o_totalprice) AS rn
FROM orders AS o
WHERE o.o_orderdate = '1996-06-20'
) AS A
WHERE A.rn = 1

This query aims to find the lowest order amount for each customer on June 20, 1996.

🎩 PawSQL's Optimization Techniques​

PawSQL conducted an in-depth analysis of the query and proposed the following optimization recommendations: Recommending Indexes for Window Functions

🌈 Secrets of Performance Improvement​

The optimization details page shows that the indexes recommended by PawSQL can improve query performance by approximately 5181.55%. How is this achieved?

Window Function Optimization Performance Verification

1. Precise Index Matching​

The new index PAWSQL_IDX1878194728 perfectly matches the query requirements:

  • o_orderdate as the first column supports rapid data filtering.
  • The combination of o_custkey and o_totalprice provides support for window function partitioning and sorting.

2. Avoiding Sorting Operations​

Since the index is already sorted by o_custkey and o_totalprice, the database can directly utilize the index order, eliminating additional sorting steps.

3. The Power of Covering Indexes​

The new index contains all columns required by the query, achieving a "covering index". This means the database can retrieve all data directly from the index without accessing actual data pages, significantly reducing I/O operations.

4. Changes in Execution Plan​

Comparison of Window Function Optimization Execution Plans

Before Optimization:

  • Uses Bitmap index scan and heap scan.
  • Requires additional sorting operations.
  • Execution time: 22.926 ms

After Optimization:

  • Uses index-only scan.
  • No additional sorting needed.
  • Execution time reduced to 0.452 ms

πŸ“š Best Practices​

  1. Regular Analysis: Use PawSQL to periodically analyze your queries, especially complex queries containing window functions.
  2. Balance Trade-offs: While the new index improves query performance, it also increases storage overhead and affects write performance. Real-world applications require careful consideration.
  3. Remove Redundancy: Promptly clean up old indexes covered by the new index, such as odtidx in this example.
  4. Holistic Perspective: Consider the query patterns of the entire application, and don't optimize individual queries at the expense of overall performance.

πŸ“ˆ Conclusion​

PawSQL demonstrates how to significantly improve SQL query performance for window functions through intelligent index recommendations. By creating indexes that precisely match query requirements, we can substantially reduce execution time and improve application response speed. In the big data era, such optimization not only enhances efficiency but also saves valuable computational resources.

Remember, database optimization is an ongoing process. Regularly using tools like PawSQL for analysis and optimization will help keep your application in peak performance condition.

🌐 About PawSQL​

PawSQL is dedicated to automatic and intelligent database performance optimization. The products provided by PawSQL include:

  • PawSQL Cloud, an online automated SQL optimization tool that supports SQL auditing, intelligent query rewriting, cost-based index recommendations, suitable for database administrators and data application developers.
  • PawSQL Advisor, an IntelliJ plugin that is suitable for data application developers and can be installed via the IDEA/DataGrip marketplace by searching for "PawSQL Advisor" by name.

Contact Us​

Email: service@pawsql.com

Website: https://www.pawsql.com