03. Common PostgreSQL Indexes and Optimization

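An index can improve the query performance of a database. However, indexes must also be read and written, and they take up additional storage space, so understanding indexes and using them appropriately is essential for database optimization. This article describes how to use PostgreSQL indexes efficiently.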

Suppose the following table exists:

CREATE TABLE test (
  id integer,
  name text
);

insert into test
select v,'val:'||v from generate_series(1, 10000000) v;

We often need to run queries like the following:

SELECT name FROM test WHERE id = 10000;

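Without an index, the database has to scan the entire table to find the matching data. The EXPLAIN command shows the execution plan, that is, the concrete steps PostgreSQL takes to execute the SQL statement: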

explain analyze
SELECT name FROM test WHERE id = 10000;
QUERY PLAN                                                                                                              |
------------------------------------------------------------------------------------------------------------------------|
Gather  (cost=1000.00..107137.70 rows=1 width=11) (actual time=50.266..12082.777 rows=1 loops=1)                        |
  Workers Planned: 2                                                                                                    |
  Workers Launched: 2                                                                                                   |
  ->  Parallel Seq Scan on test  (cost=0.00..106137.60 rows=1 width=11) (actual time=7674.992..11553.964 rows=0 loops=3)|
        Filter: (id = 10000)                                                                                            |
        Rows Removed by Filter: 3333333                                                                                 |
Planning Time: 16.480 ms                                                                                                |
Execution Time: 12093.016 ms                                                                                            |

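Parallel Seq Scan indicates a parallel sequential scan, which took about 12 s to execute. The table contains a large amount of data while the query returns only one row, so this approach is clearly inefficient.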

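If an index exists on the id column, the matching row can be found quickly through the index. Let's create one first (creating an index takes some time):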

CREATE INDEX test_id_index ON test (id);

explain analyze
SELECT name FROM test WHERE id = 10000;
QUERY PLAN                                                                                                           |
---------------------------------------------------------------------------------------------------------------------|
Index Scan using test_id_index on test  (cost=0.43..8.45 rows=1 width=11) (actual time=20.410..20.412 rows=1 loops=1)|
  Index Cond: (id = 10000)                                                                                           |
Planning Time: 14.989 ms                                                                                             |
Execution Time: 20.521 ms                                                                                            |

Index Scan indicates an index scan, which took about 20 ms to execute.

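Indexes do not only speed up queries. UPDATE and DELETE statements with a WHERE clause can also benefit from an index, because a row must be found before it can be modified.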
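
As a rough illustration, the plan of an UPDATE on the test table can be inspected in the same way as a query. This is a minimal sketch that assumes the test_id_index created earlier still exists; the new name value is made up, and the statement runs inside a transaction that is rolled back so the data stays unchanged.

```sql
BEGIN;

-- EXPLAIN ANALYZE really executes the UPDATE, hence the surrounding transaction.
EXPLAIN ANALYZE
UPDATE test SET name = 'val:changed' WHERE id = 10000;

-- Undo the modification made by EXPLAIN ANALYZE.
ROLLBACK;
```

With the index in place, the plan would typically show an Index Scan node beneath the Update node.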

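In addition, the optimizer can use a B-tree index for the pattern-matching operators LIKE and ~ when the pattern does not start with a wildcard, for example: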

col  LIKE 'foo%' 
col  ~ '^foo'

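For the case-insensitive operators ILIKE and ~*, a B-tree index can also be used if the pattern starts with non-alphabetic characters, which are not affected by case conversion.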
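
One caveat is worth a small sketch: in a database whose collation is not C, a default B-tree index generally cannot serve LIKE prefix searches, and an index built with the text_pattern_ops operator class is needed instead. The example below reuses the test table; the index name test_name_pattern_index is hypothetical.

```sql
-- Hypothetical index name. text_pattern_ops compares strings character by
-- character, which is what prefix matching needs when the collation is not C.
CREATE INDEX test_name_pattern_index ON test (name text_pattern_ops);

-- The pattern is anchored at the start, so the index is a candidate.
EXPLAIN
SELECT * FROM test WHERE name LIKE 'val:1000%';

-- The pattern starts with a wildcard, so the index cannot narrow the search.
EXPLAIN
SELECT * FROM test WHERE name LIKE '%1000';
```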

SELECT col1, col2
  FROM t
 WHERE col1 BETWEEN 100 AND 200
 ORDER BY col1;

Creating a hash index requires the HASH keyword:

CREATE INDEX index_name 
ON table_name USING HASH (column_name);

The CREATE INDEX statement creates an index, and the USING clause specifies the index type; see below for details.
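
Hash indexes only support the equality operator (=). A minimal sketch on the test table, assuming a hypothetical index name test_name_hash_index:

```sql
-- Hypothetical index name.
CREATE INDEX test_name_hash_index ON test USING HASH (name);

-- Equality comparison: the hash index is a candidate.
EXPLAIN
SELECT * FROM test WHERE name = 'val:10000';

-- Range comparison: a hash index cannot be used for this predicate.
EXPLAIN
SELECT * FROM test WHERE name > 'val:10000';
```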

SELECT * FROM places ORDER BY location <-> point '(101,456)' LIMIT 10;
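
For the nearest-neighbor query above to be served by an index, the location column needs a GiST index. The table definition and index name below are assumptions made for illustration; only the query itself comes from the text.

```sql
-- Assumed table layout: the original only shows the query.
CREATE TABLE places (
  id       integer,
  location point
);

-- Hypothetical index name. The GiST operator class for point supports
-- ordering by the distance operator <->.
CREATE INDEX places_location_index ON places USING GIST (location);

EXPLAIN
SELECT * FROM places ORDER BY location <-> point '(101,456)' LIMIT 10;
```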

CREATE INDEX bloomidx ON tbloom USING bloom (i1,i2,i3)
       WITH (length=80, col1=2, col2=2, col3=4);

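The signature length is 80 bits here; the maximum allowed is 4096 bits.
col1 through col32 specify the number of bits for each index column; the default is 2 and the maximum allowed is 4095 bits.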
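
The statement above can be tried end to end roughly as follows. This is only a sketch: bloom ships as a contrib extension that must be installed first, and the table contents and the values in the final query are made up.

```sql
-- bloom is a contrib extension and has to be enabled in the database.
CREATE EXTENSION IF NOT EXISTS bloom;

-- Assumed table: three integer columns filled with random values.
CREATE TABLE tbloom AS
SELECT (random() * 1000000)::int AS i1,
       (random() * 1000000)::int AS i2,
       (random() * 1000000)::int AS i3
  FROM generate_series(1, 1000000);

CREATE INDEX bloomidx ON tbloom USING bloom (i1, i2, i3)
       WITH (length=80, col1=2, col2=2, col3=4);

-- A bloom index supports equality tests on any combination of its columns.
EXPLAIN
SELECT * FROM tbloom WHERE i2 = 898732 AND i3 = 123451;
```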

PostgreSQL uses the CREATE INDEX statement to create a new index:

CREATE INDEX index_name ON table_name 
[USING method]
(column_name [ASC | DESC] [NULLS FIRST | NULLS LAST]);
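
The sort options matter mainly when a query asks for an ordering that a default index cannot deliver. A default B-tree index (ASC NULLS LAST) scanned backwards yields DESC NULLS FIRST, so a query that wants DESC NULLS LAST needs an index declared that way. A small sketch on the test table, with a hypothetical index name:

```sql
-- Hypothetical index name; the declared order matches the ORDER BY below.
CREATE INDEX test_id_desc_index ON test (id DESC NULLS LAST);

EXPLAIN
SELECT id FROM test ORDER BY id DESC NULLS LAST LIMIT 10;
```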

CREATE INDEX test_name_index ON test (name);

explain analyze
SELECT * FROM test WHERE name IS NULL;
QUERY PLAN                                                                                                           |
---------------------------------------------------------------------------------------------------------------------|
Index Scan using test_name_index on test  (cost=0.43..5.77 rows=1 width=15) (actual time=0.036..0.037 rows=0 loops=1)|
  Index Cond: (name IS NULL)                                                                                         |
Planning Time: 1.067 ms                                                                                              |
Execution Time: 0.048 ms                                                                                             |

The IS NULL operator on an indexed column can likewise be optimized with the index.

When creating an index, the UNIQUE keyword can be used to specify a unique index:

CREATE UNIQUE INDEX index_name
ON table_name (column_name [ASC | DESC] [NULLS FIRST | NULLS LAST]);
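
A short sketch of how a unique index behaves, using a hypothetical table accounts_demo that is not part of the original example data. The second INSERT is expected to fail on purpose.

```sql
-- Hypothetical table and index names.
CREATE TABLE accounts_demo (email text);
CREATE UNIQUE INDEX accounts_demo_email_index ON accounts_demo (email);

INSERT INTO accounts_demo VALUES ('a@example.com');

-- Rejected: duplicate key value violates unique constraint "accounts_demo_email_index"
INSERT INTO accounts_demo VALUES ('a@example.com');

-- NULL values are treated as distinct by default, so both rows are accepted.
INSERT INTO accounts_demo VALUES (NULL), (NULL);
```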

CREATE [UNIQUE] INDEX index_name ON table_name
[USING method]
(column1 [ASC | DESC] [NULLS FIRST | NULLS LAST], ...);

WHERE c1 = v1 and c2 = v2 and c3 = v3;
WHERE c1 = v1 and c2 = v2;
WHERE c1 = v1;

WHERE c2 = v2;
WHERE c3 = v3;
WHERE c2 = v2 and c3 = v3;
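
The two groups of conditions above can be compared against a three-column index. The sketch below assumes a hypothetical table mc_demo with an index on (c1, c2, c3). The first group constrains the leading column c1, which is where a multicolumn B-tree index is most effective; the second group leaves c1 out, so the planner would normally have to read the whole index or fall back to a sequential scan.

```sql
-- Hypothetical table, data, and index names.
CREATE TABLE mc_demo (c1 int, c2 int, c3 int);

INSERT INTO mc_demo
SELECT v % 1000, v % 100, v FROM generate_series(1, 1000000) v;

CREATE INDEX mc_demo_index ON mc_demo (c1, c2, c3);
ANALYZE mc_demo;

-- Leading column constrained: the index can narrow the search range.
EXPLAIN
SELECT * FROM mc_demo WHERE c1 = 1 AND c2 = 1;

-- Leading column missing: compare this plan with the one above.
EXPLAIN
SELECT * FROM mc_demo WHERE c2 = 1 AND c3 = 1;
```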

CREATE [UNIQUE] INDEX index_name 
ON table_name (expression);

explain analyze
SELECT * FROM test WHERE upper(name) ='VAL:10000';
QUERY PLAN                                                                                                                 |
---------------------------------------------------------------------------------------------------------------------------|
Gather  (cost=1000.00..122556.19 rows=50001 width=15) (actual time=18.629..7310.422 rows=1 loops=1)                        |
  Workers Planned: 2                                                                                                       |
  Workers Launched: 2                                                                                                      |
  ->  Parallel Seq Scan on test  (cost=0.00..116556.09 rows=20834 width=15) (actual time=4746.266..7171.452 rows=0 loops=3)|
        Filter: (upper(name) = 'VAL:10000'::text)                                                                          |
        Rows Removed by Filter: 3333333                                                                                    |
Planning Time: 0.100 ms                                                                                                    |
Execution Time: 7310.444 ms                                                                                                |

drop index test_name_index;
create index test_name_index on test(upper(name));

explain analyze
SELECT * FROM test WHERE upper(name) ='VAL:10000';
QUERY PLAN                                                                                                                     |
-------------------------------------------------------------------------------------------------------------------------------|
Bitmap Heap Scan on test  (cost=1159.93..57095.47 rows=50000 width=15) (actual time=17.046..17.047 rows=1 loops=1)             |
  Recheck Cond: (upper(name) = 'VAL:10000'::text)                                                                              |
  Heap Blocks: exact=1                                                                                                         |
  ->  Bitmap Index Scan on test_name_index  (cost=0.00..1147.43 rows=50000 width=0) (actual time=17.032..17.032 rows=1 loops=1)|
        Index Cond: (upper(name) = 'VAL:10000'::text)                                                                          |
Planning Time: 1.985 ms                                                                                                        |
Execution Time: 17.080 ms                                                                                                      |

create table orders(order_id int primary key, order_ts timestamp, finished boolean);

create index orders_unfinished_index
on orders (order_id)
where finished is not true;

explain analyze
select order_id
from orders
where finished is not true;
QUERY PLAN                                                                                                                      |
--------------------------------------------------------------------------------------------------------------------------------|
Bitmap Heap Scan on orders  (cost=4.38..24.33 rows=995 width=4) (actual time=0.010..0.010 rows=0 loops=1)                       |
  Recheck Cond: (finished IS NOT TRUE)                                                                                          |
  ->  Bitmap Index Scan on orders_unfinished_index  (cost=0.00..4.13 rows=995 width=0) (actual time=0.004..0.004 rows=0 loops=1)|
Planning Time: 0.130 ms                                                                                                         |
Execution Time: 0.049 ms                                                                                                        |
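
A partial index is only considered when the query condition implies the index predicate. The following sketch contrasts two queries against the orders table defined above; the additional filter values are arbitrary.

```sql
-- The WHERE clause implies "finished IS NOT TRUE", so
-- orders_unfinished_index is a candidate.
EXPLAIN
SELECT order_id
FROM orders
WHERE finished IS NOT TRUE AND order_id < 100;

-- The WHERE clause does not imply the index predicate, so the
-- partial index cannot be used for this query.
EXPLAIN
SELECT order_id
FROM orders
WHERE finished IS TRUE;
```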

CREATE TABLE t (a int, b int, c int);
CREATE UNIQUE INDEX idx_t_ab ON t USING btree (a, b) INCLUDE (c);

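The statements above create a multicolumn index on columns a and b, and use INCLUDE to store the value of column c in the leaf nodes of the index. The following query can use an Index-Only Scan: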

explain analyze
select a, b, c 
from t 
where a = 100 and b = 200;
QUERY PLAN                                                                                                      |
----------------------------------------------------------------------------------------------------------------|
Index Only Scan using idx_t_ab on t  (cost=0.15..8.17 rows=1 width=12) (actual time=0.007..0.007 rows=0 loops=1)|
  Index Cond: ((a = 100) AND (b = 200))                                                                         |
  Heap Fetches: 0                                                                                               |
Planning Time: 0.078 ms                                                                                         |
Execution Time: 0.021 ms                                                                                        |
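
An index-only scan can skip the heap only for pages that are marked all-visible in the visibility map, which VACUUM maintains. A sketch with made-up data for table t; the Heap Fetches line in the output shows how many rows still required a heap visit.

```sql
-- Made-up data; (a, b) must stay unique because of idx_t_ab.
INSERT INTO t
SELECT v, v, v FROM generate_series(1, 100000) v;

-- Updates the visibility map and the planner statistics.
VACUUM ANALYZE t;

EXPLAIN ANALYZE
SELECT a, b, c
FROM t
WHERE a = 100 AND b = 100;
```

Note that column c is only stored in the index; it is not part of the uniqueness check, which applies to (a, b) alone.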

select * from pg_indexes where tablename = 'test';
schemaname|tablename|indexname      |tablespace|indexdef                                                             |
----------|---------|---------------|----------|---------------------------------------------------------------------|
public    |test     |test_id_index  |          |CREATE INDEX test_id_index ON public.test USING btree (id)           |
public    |test     |test_name_index|          |CREATE INDEX test_name_index ON public.test USING btree (upper(name))|

When connected with the psql client, the \d table_name command shows the structure of a table, including its indexes.
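
Beyond the definitions, it is often useful to see how large each index is and whether it is used at all. One possible query, based on the pg_stat_user_indexes statistics view (the column aliases are arbitrary):

```sql
SELECT indexrelname                                 AS index_name,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
       idx_scan                                     AS scans
  FROM pg_stat_user_indexes
 WHERE relname = 'test';
```

An index whose scan count stays at zero over a long period may be a candidate for removal.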

ALTER INDEX index_name RENAME TO new_name;
ALTER INDEX index_name SET TABLESPACE tablespace_name;

REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM }  index_name;

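The two ALTER INDEX statements rename an index and move it to another tablespace, respectively; REINDEX rebuilds index data and supports rebuilding at several levels.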

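Once an index has been created, the system updates it automatically whenever the data is modified. However, we need to run the ANALYZE command regularly to refresh the database statistics so that the optimizer can make sensible use of indexes.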
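
Applied to the example table, a routine maintenance pass might look like the sketch below. It assumes test_id_index still exists; the CONCURRENTLY variant is available from PostgreSQL 12 onward and is left commented out.

```sql
-- Rebuild a single index; this blocks writes to the table while it runs.
REINDEX INDEX test_id_index;

-- PostgreSQL 12 and later: rebuild without blocking writes.
-- REINDEX INDEX CONCURRENTLY test_id_index;

-- Refresh planner statistics for the table.
ANALYZE test;
```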

DROP INDEX index_name [ CASCADE | RESTRICT ];

DROP INDEX test_id_index, test_name_index;
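
Two variants of DROP INDEX are handy in scripts and on busy systems. The sketch below reuses the index names from this article: IF EXISTS avoids an error when the index is already gone, and CONCURRENTLY avoids blocking concurrent reads and writes on the table, although it accepts only one index per statement and cannot run inside a transaction block.

```sql
-- No error if the index does not exist.
DROP INDEX IF EXISTS test_id_index;

-- Drop without blocking concurrent access to the table.
DROP INDEX CONCURRENTLY IF EXISTS test_name_index;
```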
