Skip to main content

4 posts tagged with "SQL optimization"

View All Tags

PawSQL for OceanBase Database

· 3 min read
PawSQL Team
Optimize your SQL Queries by Clicks!

OceanBase database has been widely adopted in finance, e-commerce, and government sectors due to its high availability, strong consistency, and high performance. As business scale expands and data volume surges, optimizing OceanBase database query performance becomes increasingly important. PawSQL provides comprehensive performance optimization support for OceanBase database, helping users fully unleash OceanBase's performance potential.

I. Core Optimization Technologies

PawSQL introduces two specialized capabilities for OceanBase database:

1. Deep OceanBase SQL Syntax Support

  • Dual-mode compatibility: Full support for MySQL/Oracle syntax systems
  • Complete parsing of OceanBase-specific DDL syntax:
    CREATE TABLE nation_d (
    n_nationkey INTEGER NOT NULL PRIMARY KEY,
    n_name CHAR(25) NOT NULL,
    n_regionkey INTEGER NOT NULL,
    n_comment VARCHAR(152)
    ) duplicate_scope = cluster/none;

    CREATE TABLE part (
    p_partkey int NOT NULL PRIMARY KEY,
    p_name VARCHAR(55) NOT NULL,
    p_mfgr CHAR(25) NOT NULL,
    p_brand CHAR(10) NOT NULL,
    p_type VARCHAR(25) NOT NULL,
    p_size NUMBER NOT NULL,
    p_container CHAR(10) NOT NULL,
    p_retailprice NUMBER(15,2) NOT NULL,
    p_comment VARCHAR(23)
    ) PARTITION BY HASH(p_partkey) PARTITIONS 16;

2. Rule System Upgrade

  • Three new distributed design standards:
    • Avoid table join fields that are not distribution keys
    • DML in distributed databases should avoid table joins
    • DML operations in distributed databases lack sharding field equality conditions
  • Three new distribution key design standards:
    • Local tables are not recommended in distributed databases
    • Multiple fields are not recommended for distribution keys
    • Distribution keys should use highly distinctive fields
  • Two new distribution strategy design standards:
    • Replicated distribution is not recommended for large tables
    • Hash distribution is recommended for distribution method

II. Product Matrix of SQL Lifecycle

2.1 Development and Testing Phase: Intelligent SQL Optimization

PawSQL Optimization Platform is a one-stop online SQL optimization tool for application developers and testers. It integrates industry best practices for relational database query optimization, helping application developers and database administrators solve SQL performance issues through query rewriting optimization and intelligent index recommendations. PawSQL Optimization Platform has completed integration with common IDEs, allowing developers to perform SQL optimization without leaving their development environment.

The PawSQL Optimization Platform is an online SQL optimization tool for developers and DBAs, incorporating industry-leading query optimization technologies, including:

  • Intelligent Query Rewriting: Automatically optimizes inefficient SQL statements
  • Index Recommendation Engine: Precisely recommends optimal index combinations
  • Distributed Optimization Strategies: Provides specialized optimization suggestions for OceanBase's distributed characteristics

2.2 Integration Phase: Intelligent SQL Review

PawSQL Review Platform, with its leading core technologies such as self-developed SQL parser, syntax tree-based rule matching, and context information updates, provides comprehensive and precise (accuracy over 95%) intelligent SQL review capabilities for SQL quality management teams. It conducts comprehensive checks from multiple dimensions including syntax specifications, performance efficiency, and security, providing targeted optimization suggestions to help enterprises improve SQL performance and application efficiency.

For OceanBase database's distributed characteristics, PawSQL provides specialized distributed query optimization suggestions, with applicable rules exceeding 190.

2.3 Operating Phase: Performance Patroller

PawSQL Database Performance Patroller automatically captures slow queries generated in the database periodically and provides SQL optimization suggestions, including automatic SQL rewriting, intelligent index recommendations, and existing redundant index analysis. It automatically inspects database objects periodically, identifying potential performance, security, and maintainability issues, and provides optimization suggestions. OceanBase Inspection

III. Summary

PawSQL for OceanBase database provides you with a one-stop performance optimization solution. From daily query optimization to complex distributed scenario handling, PawSQL can help you tackle challenges with ease. 🚀 Experience PawSQL now and unlock OceanBase database's performance potential!

PawSQL for Jetbrains - Optimize your SQL Queries within Jetbrains IDEs

· 2 min read
PawSQL Team
Optimize your SQL Queries by Clicks!

💡 Introduction

PawSQL for Jetbrains (PawSQL Client) is a powerful Jetbrains plugin that seamlessly integrates PawSQL engine's intelligent SQL optimization capabilities into your development environment. It supports smart index recommendations, query rewriting, and automated performance validation. Working in conjunction with PawSQL Cloud (https://pawsql.com) or enterprise private deployment of the PawSQL optimization platform, it provides developers with one-click SQL performance optimization capabilities.

🎯 One-Click SQL Optimization in IDEA/DataGrip

Select the SQL text to be optimized, right-click, and choose which workspace to perform SQL optimization Image

✨ Core Features

  • 🚀 One-Click Optimization - Intelligently analyze and optimize SQL query performance
  • 📊 Index Optimization - Smart recommendations for optimal index combinations, comprehensively improving query efficiency
  • 🔄 Query Rewriting - Automatically generate equivalent SQL statements with better performance
  • 📈 Performance Validation - Precisely quantify optimization effects, providing detailed performance improvement data

🔧 Installation Guide

Install from Jetbrains Marketplace

  1. Launch Jetbrains IDE tool (such as IDEA/DataGrip)
  2. Search for "PawSQL Client" in Plugins and install

 Install from Jetbrains Marketplace

Install from Disk

  1. Download this plugin from Jetbrains Marketplace

Download this plugin from Jetbrains Marketplace

  1. Install it from Disk

Install it from Disk

⚙️ Initial Configuration

Image

  1. Fill in the following information in the configuration interface:
  • PawSQL Server address (e.g., https://pawsql.com)
  • Account information
  • Password
  1. Click save to complete configuration

🌐 About PawSQL

PawSQL focuses on database performance optimization automation and intelligence, providing solutions covering the entire process of SQL development, testing, and operations. It widely supports various mainstream commercial and open-source databases including KingbaseES, offering developers and enterprises a one-stop innovative SQL optimization solution. It effectively solves database SQL performance and quality issues, improves database system stability, application performance, and infrastructure utilization, saving enterprises significant maintenance costs and time investment.

How Does PawSQL Optimize Correlated Scalar Subqueries?

· 4 min read
PawSQL Team
Optimize your SQL Queries by Clicks!

In the field of data analysis, correlated scalar subqueries are undoubtedly a double-edged sword: they are powerful, capable of solving many complex problems, while simultaneously presenting significant challenges to database optimizers. Currently, only commercial database giants like Oracle have achieved relatively outstanding performance in this area. Among domestic databases, only PolarDB provides some support for correlated subqueries. This article will not only introduce how PawSQL performs cost-based rewriting optimization for correlated scalar subqueries but also demonstrate how PawSQL identifies and merges multiple similar scalar subqueries in a query to further improve scalar subquery optimization performance. With PawSQL, you can experience Oracle-like rewriting optimization capabilities on databases like MySQL and PostgreSQL.

*## 🌟 Introduction to Correlated Scalar Subqueries

In the world of SQL, correlated scalar subqueries are a powerful tool that allows subqueries to depend on column values from the outer query. This is in stark contrast to non-correlated scalar subqueries that are independent of the outer query. Correlated scalar subqueries calculate results for each row by referencing columns from the outer query.

Example:

SELECT employee_name
FROM employees e
WHERE salary > (SELECT AVG(salary)
FROM employees
WHERE department_id = e.department_id);

In this example, the subquery calculates the average salary for each department and compares it with the salary in the main query, demonstrating the powerful functionality of correlated scalar subqueries.

🏎️ Challenges and Opportunities: From an Optimizer's Perspective

Although correlated scalar subqueries are powerful, they present significant challenges to database optimizers:

  • Repeated Calculations: On large datasets, subqueries may be recalculated repeatedly, impacting performance.
  • High Computational Overhead: Complex calculations, such as aggregate functions, can lead to query performance degradation.
  • Query Rewriting Difficulties: Transforming scalar subqueries into join operations or other forms is not always easy.
  • Data Dependency: Optimization effectiveness depends on data distribution and table structure, requiring flexible optimizer responses.

For correlated scalar subqueries, while de-correlation does not always perform better than correlated subqueries, cost-based rewriting optimization strategies offer a new perspective. Currently, only a few databases like Oracle and PolarDB have implemented these advanced optimization techniques.

🚀 PawSQL: A New Realm of Correlated Scalar Subquery Optimization

PawSQL optimizes correlated scalar subqueries through:

  1. Cost-Based Rewriting: Supporting de-correlation of scalar subqueries in conditions and select columns.
  2. Merge Rewriting: Optimizing multiple structurally similar scalar subqueries.

🎯 Case Study

Original Query: The original query used two correlated subqueries, calculating the total order price and order count for each customer on a specific date. This structure is typically inefficient, as it requires repeated execution of two subqueries for each customer.

SELECT c_custkey,
(SELECT SUM(o_totalprice)
FROM ORDERS
WHERE o_custkey = c_custkey AND o_orderdate = '2020-04-16') AS total,
(SELECT COUNT(*)
FROM ORDERS
WHERE o_custkey = c_custkey AND o_orderdate = '2020-04-16') AS cnt
FROM CUSTOMER

Rewritten Query: PawSQL's optimization engine merges the two correlated subqueries into a derived table, then associates it with the main query through a left outer join.

SELECT c_custkey, SUM_ AS total, count_ AS cnt
FROM CUSTOMER LEFT OUTER JOIN (
SELECT o_custkey, SUM(o_totalprice) AS SUM_, COUNT(*) AS count_
FROM ORDERS
WHERE o_orderdate = '2020-04-16'
GROUP BY o_custkey) AS SQ ON o_custkey = c_custkey

Performance Improvement: After optimization, performance is expected to improve by 1,131.26%, a significant enhancement!

Scalar Correlated Subquery Execution Plan Comparison

Execution Plan Improvements:

  • Preliminary aggregation of orders table data significantly reduces the volume of data to be processed
  • Eliminates repeated subquery executions by merging two subqueries into one
  • Uses hash join to efficiently associate customer and aggregated orders data

This optimization case demonstrates the effectiveness of PawSQL's correlated scalar subquery rewriting technique. By merging multiple correlated subqueries into a single derived table and using an outer join, the optimizer can significantly reduce redundant calculations and data access.

🌐 About PawSQL

PawSQL is dedicated to automatic and intelligent database performance optimization. The products provided by PawSQL include:

  • PawSQL Cloud, an online automated SQL optimization tool that supports SQL auditing, intelligent query rewriting, cost-based index recommendations, suitable for database administrators and data application developers.
  • PawSQL Advisor, an IntelliJ plugin that is suitable for data application developers and can be installed via the IDEA/DataGrip marketplace by searching for "PawSQL Advisor" by name.

Contact Us

Email: service@pawsql.com

Website: https://www.pawsql.com

PawSQL's Smart Index Recommendation Helps Improve Window Function Performance by 50 Times

· 3 min read
PawSQL Team
Optimize your SQL Queries by Clicks!

🌟 Introduction

In the data-driven modern world, SQL query speed is key to rapid application response. Especially for complex queries involving window functions, performance bottlenecks can become an obstacle without appropriate index support. This article will explore how PawSQL significantly improves SQL query performance for window functions through intelligent index recommendations.

🔍 Case Analysis

Through a practical case, we will demonstrate how PawSQL optimizes a query containing window functions.

📝 Original Query

SELECT *
FROM (
SELECT o.o_custkey, o.o_totalprice,
RANK() OVER (PARTITION BY o.o_custkey ORDER BY o.o_totalprice) AS rn
FROM orders AS o
WHERE o.o_orderdate = '1996-06-20'
) AS A
WHERE A.rn = 1

This query aims to find the lowest order amount for each customer on June 20, 1996.

🎩 PawSQL's Optimization Techniques

PawSQL conducted an in-depth analysis of the query and proposed the following optimization recommendations: Recommending Indexes for Window Functions

🌈 Secrets of Performance Improvement

The optimization details page shows that the indexes recommended by PawSQL can improve query performance by approximately 5181.55%. How is this achieved?

Window Function Optimization Performance Verification

1. Precise Index Matching

The new index PAWSQL_IDX1878194728 perfectly matches the query requirements:

  • o_orderdate as the first column supports rapid data filtering.
  • The combination of o_custkey and o_totalprice provides support for window function partitioning and sorting.

2. Avoiding Sorting Operations

Since the index is already sorted by o_custkey and o_totalprice, the database can directly utilize the index order, eliminating additional sorting steps.

3. The Power of Covering Indexes

The new index contains all columns required by the query, achieving a "covering index". This means the database can retrieve all data directly from the index without accessing actual data pages, significantly reducing I/O operations.

4. Changes in Execution Plan

Comparison of Window Function Optimization Execution Plans

Before Optimization:

  • Uses Bitmap index scan and heap scan.
  • Requires additional sorting operations.
  • Execution time: 22.926 ms

After Optimization:

  • Uses index-only scan.
  • No additional sorting needed.
  • Execution time reduced to 0.452 ms

📚 Best Practices

  1. Regular Analysis: Use PawSQL to periodically analyze your queries, especially complex queries containing window functions.
  2. Balance Trade-offs: While the new index improves query performance, it also increases storage overhead and affects write performance. Real-world applications require careful consideration.
  3. Remove Redundancy: Promptly clean up old indexes covered by the new index, such as odtidx in this example.
  4. Holistic Perspective: Consider the query patterns of the entire application, and don't optimize individual queries at the expense of overall performance.

📈 Conclusion

PawSQL demonstrates how to significantly improve SQL query performance for window functions through intelligent index recommendations. By creating indexes that precisely match query requirements, we can substantially reduce execution time and improve application response speed. In the big data era, such optimization not only enhances efficiency but also saves valuable computational resources.

Remember, database optimization is an ongoing process. Regularly using tools like PawSQL for analysis and optimization will help keep your application in peak performance condition.

🌐 About PawSQL

PawSQL is dedicated to automatic and intelligent database performance optimization. The products provided by PawSQL include:

  • PawSQL Cloud, an online automated SQL optimization tool that supports SQL auditing, intelligent query rewriting, cost-based index recommendations, suitable for database administrators and data application developers.
  • PawSQL Advisor, an IntelliJ plugin that is suitable for data application developers and can be installed via the IDEA/DataGrip marketplace by searching for "PawSQL Advisor" by name.

Contact Us

Email: service@pawsql.com

Website: https://www.pawsql.com