文章/答案/技术大牛

发布

社区首页 >问答首页 >理解为什么没有索引的Postgres查询速度更快

问理解为什么没有索引的Postgres查询速度更快
EN

Stack Overflow用户

提问于 2021-12-09 14:36:23

回答 2查看 763关注 0票数 1

我们继承了Postgresql驱动的数据仓库，存在一些严重的性能问题。我们已经为我们的一个客户选择了一个数据库，并且正在对查询进行基准测试。我们对这些查询进行了筛选，并找到了几个在所有查询中选择的通用表，我们认为这是我们糟糕性能的核心所在。我们特别关注冷启动性能，当没有数据加载到共享缓冲区，因为这是我们的客户的常见场景。我们进行了一个大的查询，并将其简化为最慢的部分；

explain (analyze, buffers, costs, format json)
        select
            poi."SKUId" as "SKUId",
            poi."ConvertedLineTotal" as "totalrevenue",
            poi."TotalDispachCost" as "totaldispachcost",
            poi."Quantity" as "quantitysold"
        from public.processedorder o
        join public.processedorder_item poi on poi."OrderId" = o."OrderId"
        WHERE o."ReceivedDate" >= '2020-09-01' and o."ReceivedDate" <= '2021-01-01';

事实上，把它进一步简化成一张似乎特别慢的桌子；

explain (analyze, buffers, costs, format json)
select o."OrderId", o."ChannelId"
from public.processedorder o
WHERE o."ReceivedDate" >= '2020-09-01' and o."ReceivedDate" <= '2021-01-01'

这个processedorders表大约有210万行，每个月添加大约35/40k行。

这张桌子看起来像这样--它相当宽；

CREATE TABLE public.processedorder (
    "OrderId" int4 NOT NULL,
    "ChannelId" int4 NOT NULL,
    "ShippingId" int4 NOT NULL,
    "CountryId" int4 NOT NULL,
    "LocationId" int4 NOT NULL,
    "PackagingId" int4 NOT NULL,
    "ConvertedTotal" numeric(18, 6) NOT NULL,
    "ConvertedSubtotal" numeric(18, 6) NOT NULL,
    "ConvertedShippingCost" numeric(18, 6) NOT NULL,
    "ConvertedShippingTax" numeric(18, 6) NOT NULL,
    "ConvertedTax" numeric(18, 6) NOT NULL,
    "ConvertedDiscount" numeric(18, 6) NOT NULL,
    "ConversionRate" numeric(18, 6) NOT NULL,
    "Currency" varchar(3) NOT NULL,
    "OriginalTotal" numeric(18, 6) NOT NULL,
    "OriginalSubtotal" numeric(18, 6) NOT NULL,
    "OriginalShippingCost" numeric(18, 6) NOT NULL,
    "OrignalShippingTax" numeric(18, 6) NOT NULL,
    "OriginalTax" numeric(18, 6) NOT NULL,
    "OriginalDiscount" numeric(18, 6) NOT NULL,
    "ReceivedDate" timestamp NOT NULL,
    "DispatchByDate" timestamp NOT NULL,
    "ProcessedDate" timestamp NOT NULL,
    "HoldOrCancel" bool NOT NULL,
    "CustomerHash" varchar(100) NOT NULL,
    "EmailHash" varchar(100) NOT NULL,
    "GetPostalCode" varchar(10) NOT NULL,
    "TagId" uuid NOT NULL,
    "timestamp" timestamp NOT NULL,
    "IsRMA" bool NOT NULL DEFAULT false,
    "ConversionType" int4 NOT NULL DEFAULT 0,
    "ItemWeight" numeric(18, 6) NULL,
    "TotalWeight" numeric(18, 6) NULL,
    "PackageWeight" numeric(18, 6) NULL,
    "PackageCount" int4 NULL,
    CONSTRAINT processedorder_tagid_unique UNIQUE ("TagId")
)
WITH (
    fillfactor=50
);

我们的困惑是，在数据库的本地副本上运行最小的查询，在receivedDate上使用一个简单的索引，它在4秒内返回结果-

create INDEX if not exists ix_processedorder_btree_receieveddate ON public.processedorder USING btree ("ReceivedDate" DESC);

在这里可以看到执行计划，https://explain.tensor.ru/archive/explain/639d403ef7bf772f698502ed98ae3f63:0:2021-12-08#explain

Hash Join  (cost=166953.18..286705.36 rows=201855 width=19) (actual time=3078.441..4176.251 rows=198552 loops=1)
  Hash Cond: (poi."OrderId" = o."OrderId")
  Buffers: shared hit=3 read=160623
  ->  Seq Scan on processedorder_item poi  (cost=0.00..108605.28 rows=2434228 width=23) (actual time=0.158..667.435 rows=2434228 loops=1)
        Buffers: shared read=84263
  ->  Hash  (cost=164773.85..164773.85 rows=174346 width=4) (actual time=3077.990..3077.991 rows=173668 loops=1)
        Buckets: 262144 (originally 262144)  Batches: 1 (originally 1)  Memory Usage: 8154kB
        Buffers: shared read=76360
        ->  Bitmap Heap Scan on processedorder o  (cost=3703.48..164773.85 rows=174346 width=4) (actual time=27.285..3028.721 rows=173668 loops=1)
              Recheck Cond: (("ReceivedDate" >= '2020-09-01 00:00:00'::timestamp without time zone) AND ("ReceivedDate" <= '2021-01-01 00:00:00'::timestamp without time zone))
              Heap Blocks: exact=75882
              Buffers: shared read=76360
              ->  Bitmap Index Scan on ix_receiveddate  (cost=0.00..3659.89 rows=174346 width=0) (actual time=17.815..17.815 rows=173668 loops=1)
                    Index Cond: (("ReceivedDate" >= '2020-09-01 00:00:00'::timestamp without time zone) AND ("ReceivedDate" <= '2021-01-01 00:00:00'::timestamp without time zone))
                    Buffers: shared read=478

然后，我们将这个索引应用于我们的暂存服务器(DB的相同副本)并运行查询，但这一次需要44秒钟；https://explain.tensor.ru/archive/explain/9610c603972ba89aac4e223072f27575:0:2021-12-08

Gather  (cost=112168.21..275565.45 rows=174360 width=19) (actual time=42401.776..44549.996 rows=145082 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  Buffers: shared hit=149058 read=50771
  ->  Hash Join  (cost=111168.21..257129.45 rows=72650 width=19) (actual time=42397.903..44518.824 rows=48361 loops=3)
        Hash Cond: (poi."OrderId" = o."OrderId")
        Inner Unique: true
        Buffers: shared hit=445001 read=161024
        ->  Parallel Seq Scan on processedorder_item poi  (cost=0.00..117223.50 rows=880850 width=23) (actual time=0.302..1426.753 rows=702469 loops=3)
              Filter: ((NOT "ContainsComposites") AND ("SKUId" <> 0))
              Rows Removed by Filter: 108940
              Buffers: shared read=84260
        ->  Hash  (cost=105532.45..105532.45 rows=173408 width=4) (actual time=42396.156..42396.156 rows=173668 loops=3)
              Buckets: 262144 (originally 262144)  Batches: 1 (originally 1)  Memory Usage: 8154kB
              Buffers: shared hit=444920 read=76764
              ->  Index Scan using ix_processedorder_receieveddate on processedorder o  (cost=0.43..105532.45 rows=173408 width=4) (actual time=0.827..42152.428 rows=173668 loops=3)
                    Index Cond: (("ReceivedDate" >= '2020-09-01 00:00:00'::timestamp without time zone) AND ("ReceivedDate" <= '2021-01-01 00:00:00'::timestamp without time zone))
                    Buffers: shared hit=444920 read=76764

最后，根据常识，我们只需删除暂存服务器上的索引，并发现它在4秒内返回数据(就像我们的本地机器有索引一样) https://explain.tensor.ru/archive/explain/486512e19b45d5cbe4b893fdecc434b8:0:2021-12-08。

Gather  (cost=1000.00..200395.65 rows=177078 width=4) (actual time=2.556..4695.986 rows=173668 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  Buffers: shared read=41868
  ->  Parallel Seq Scan on processedorder o  (cost=0.00..181687.85 rows=73782 width=4) (actual time=0.796..4663.511 rows=57889 loops=3)
        Filter: (("ReceivedDate" >= '2020-09-01 00:00:00'::timestamp without time zone) AND ("ReceivedDate" <= '2021-01-01 00:00:00'::timestamp without time zone))
        Rows Removed by Filter: 642941
        Buffers: shared read=151028

在每次查询运行之前，我从SSH：echo 3 > /proc/sys/vm/drop_caches; service postgresql restart;执行

我们还在测试之前运行了一个vacuum full; analyze;。

有人能解释为什么会发生这种情况，因为这对我们来说毫无意义--考虑到我们正在查询一小部分数据(订单记录跨度为9年，我们只选择了3个月)，我预计查询对索引的执行速度最快。

服务器本身是Postgres 10.4，运行在Amazon E2 i3.2xlarge实例上，在承载psql数据的RAID 0中运行两个io2 EBS块存储驱动器。

work_mem is 150MB
shared_buffers is set to 15Gb (60gb server total ram)
effective_io_concurrency = 256 
effective_cache_size=45GB

-按照弗兰克的建议，我们试图添加一个似乎没有帮助的新索引

Gather  (cost=1000.86..218445.54 rows=201063 width=19) (actual time=4.412..61616.676 rows=198552 loops=1)
  Output: poi."SKUId", poi."ConvertedLineTotal", poi."TotalDispachCost", poi."Quantity"
  Workers Planned: 2
  Workers Launched: 2
  Buffers: shared hit=239331 read=45293
  ->  Nested Loop  (cost=0.86..197339.24 rows=83776 width=19) (actual time=3.214..61548.378 rows=66184 loops=3)
        Output: poi."SKUId", poi."ConvertedLineTotal", poi."TotalDispachCost", poi."Quantity"
        Buffers: shared hit=748815 read=136548
        Worker 0: actual time=2.494..61658.415 rows=65876 loops=1
          Buffers: shared hit=247252 read=45606
        Worker 1: actual time=3.033..61667.490 rows=69100 loops=1
          Buffers: shared hit=262232 read=45649
        ->  Parallel Index Only Scan using ix_processedorder_btree_receieveddate_orderid on public.processedorder o  (cost=0.43..112293.34 rows=72359 width=4) (actual time=1.811..40429.474 rows=57889 loops=3)
              Output: o."ReceivedDate", o."OrderId"
              Index Cond: ((o."ReceivedDate" >= '2020-09-01 00:00:00'::timestamp without time zone) AND (o."ReceivedDate" <= '2021-01-01 00:00:00'::timestamp without time zone))
              Buffers: shared hit=97131 read=76571
              Worker 0: actual time=1.195..40625.809 rows=57420 loops=1
                Buffers: shared hit=31847 read=25583
              Worker 1: actual time=1.850..40463.813 rows=60469 loops=1
                Buffers: shared hit=34898 read=25584
        ->  Index Scan using ix_processedorder_item_orderid on public.processedorder_item poi  (cost=0.43..1.12 rows=2 width=23) (actual time=0.316..0.361 rows=1 loops=173668)
              Output: poi."OrderItemId", poi."OrderId", poi."SKUId", poi."Quantity", poi."ConvertedLineTotal", poi."ConvertedLineSubtotal", poi."ConvertedLineTax", poi."ConvertedLineDiscount", poi."ConversionRate", poi."OriginalLineTotal", poi."OriginalLineSubtotal", poi."OriginalLineTax", poi."OriginalLineDiscount", poi."Currency", poi."ContainsComposites", poi."TotalDispachCost", poi."TagId", poi."timestamp"
              Index Cond: (poi."OrderId" = o."OrderId")
              Buffers: shared hit=651684 read=59977
              Worker 0: actual time=0.316..0.362 rows=1 loops=57420
                Buffers: shared hit=215405 read=20023
              Worker 1: actual time=0.303..0.347 rows=1 loops=60469
                Buffers: shared hit=227334 read=20065

-更新2

我们重新运行了这两个查询中较大的一个

explain (analyze, buffers, verbose, costs, format json)
        select
            poi."SKUId" as "SKUId",
            poi."ConvertedLineTotal" as "totalrevenue",
            poi."TotalDispachCost" as "totaldispachcost",
            poi."Quantity" as "quantitysold"
        from public.processedorder o
        join public.processedorder_item poi on poi."OrderId" = o."OrderId"
        WHERE o."ReceivedDate" >= '2020-09-01' and o."ReceivedDate" <= '2021-01-01';

这一次，在启用了预定时间的情况下--我们运行了两次，一次是索引，另一次是没有索引--这就是计划；

与索引：https://explain.tensor.ru/archive/explain/d763d7e1754c4ddac8bb61e403b135d2:0:2021-12-09

Gather  (cost=111153.17..252254.44 rows=201029 width=19) (actual time=36978.100..39083.705 rows=198552 loops=1)
  Output: poi."SKUId", poi."ConvertedLineTotal", poi."TotalDispachCost", poi."Quantity"
  Workers Planned: 2
  Workers Launched: 2
  Buffers: shared hit=138354 read=60159
  I/O Timings: read=16453.227
  ->  Hash Join  (cost=110153.17..231151.54 rows=83762 width=19) (actual time=36974.257..39044.116 rows=66184 loops=3)
        Output: poi."SKUId", poi."ConvertedLineTotal", poi."TotalDispachCost", poi."Quantity"
        Hash Cond: (poi."OrderId" = o."OrderId")
        Buffers: shared hit=444230 read=160640
        I/O Timings: read=37254.816
        Worker 0: actual time=36972.516..39140.086 rows=79787 loops=1
          Buffers: shared hit=155175 read=48507
          I/O Timings: read=9382.648
        Worker 1: actual time=36972.438..39138.754 rows=77496 loops=1
          Buffers: shared hit=150701 read=51974
          I/O Timings: read=11418.941
        ->  Parallel Seq Scan on public.processedorder_item poi  (cost=0.00..114682.68 rows=1014089 width=23) (actual time=0.262..1212.102 rows=811409 loops=3)
              Output: poi."OrderItemId", poi."OrderId", poi."SKUId", poi."Quantity", poi."ConvertedLineTotal", poi."ConvertedLineSubtotal", poi."ConvertedLineTax", poi."ConvertedLineDiscount", poi."ConversionRate", poi."OriginalLineTotal", poi."OriginalLineSubtotal", poi."OriginalLineTax", poi."OriginalLineDiscount", poi."Currency", poi."ContainsComposites", poi."TotalDispachCost", poi."TagId", poi."timestamp"
              Buffers: shared read=84260
              I/O Timings: read=1574.316
              Worker 0: actual time=0.073..1258.453 rows=870378 loops=1
                Buffers: shared read=30133
                I/O Timings: read=535.914
              Worker 1: actual time=0.021..1256.769 rows=841299 loops=1
                Buffers: shared read=29126
                I/O Timings: read=538.814
        ->  Hash  (cost=104509.15..104509.15 rows=173662 width=4) (actual time=36972.511..36972.511 rows=173668 loops=3)
              Output: o."OrderId"
              Buckets: 262144 (originally 262144)  Batches: 1 (originally 1)  Memory Usage: 8154kB
              Buffers: shared hit=444149 read=76380
              I/O Timings: read=35680.500
              Worker 0: actual time=36970.996..36970.996 rows=173668 loops=1
                Buffers: shared hit=155136 read=18374
                I/O Timings: read=8846.735
              Worker 1: actual time=36970.783..36970.783 rows=173668 loops=1
                Buffers: shared hit=150662 read=22848
                I/O Timings: read=10880.127
              ->  Index Scan using ix_processedorder_btree_receieveddate on public.processedorder o  (cost=0.43..104509.15 rows=173662 width=4) (actual time=0.617..36736.881 rows=173668 loops=3)
                    Output: o."OrderId"
                    Index Cond: ((o."ReceivedDate" >= '2020-09-01 00:00:00'::timestamp without time zone) AND (o."ReceivedDate" <= '2021-01-01 00:00:00'::timestamp without time zone))
                    Buffers: shared hit=444149 read=76380
                    I/O Timings: read=35680.500
                    Worker 0: actual time=0.018..36741.158 rows=173668 loops=1
                      Buffers: shared hit=155136 read=18374
                      I/O Timings: read=8846.735
                    Worker 1: actual time=0.035..36733.684 rows=173668 loops=1
                      Buffers: shared hit=150662 read=22848
                      I/O Timings: read=10880.127

然后，在没有索引的情况下运行得更快；https://explain.tensor.ru/archive/explain/d0815f4b0c9baf3bdd512bc94051e768:0:2021-12-09

Gather  (cost=231259.20..372360.47 rows=201029 width=19) (actual time=4829.302..7920.614 rows=198552 loops=1)
  Output: poi."SKUId", poi."ConvertedLineTotal", poi."TotalDispachCost", poi."Quantity"
  Workers Planned: 2
  Workers Launched: 2
  Buffers: shared hit=106720 read=69983
  I/O Timings: read=2431.673
  ->  Hash Join  (cost=230259.20..351257.57 rows=83762 width=19) (actual time=4825.285..7877.981 rows=66184 loops=3)
        Output: poi."SKUId", poi."ConvertedLineTotal", poi."TotalDispachCost", poi."Quantity"
        Hash Cond: (poi."OrderId" = o."OrderId")
        Buffers: shared hit=302179 read=235288
        I/O Timings: read=7545.729
        Worker 0: actual time=4823.181..7978.501 rows=81394 loops=1
          Buffers: shared hit=87613 read=93424
          I/O Timings: read=2556.079
        Worker 1: actual time=4823.645..7979.241 rows=76022 loops=1
          Buffers: shared hit=107846 read=71881
          I/O Timings: read=2557.977
        ->  Parallel Seq Scan on public.processedorder_item poi  (cost=0.00..114682.68 rows=1014089 width=23) (actual time=0.267..2122.529 rows=811409 loops=3)
              Output: poi."OrderItemId", poi."OrderId", poi."SKUId", poi."Quantity", poi."ConvertedLineTotal", poi."ConvertedLineSubtotal", poi."ConvertedLineTax", poi."ConvertedLineDiscount", poi."ConversionRate", poi."OriginalLineTotal", poi."OriginalLineSubtotal", poi."OriginalLineTax", poi."OriginalLineDiscount", poi."Currency", poi."ContainsComposites", poi."TotalDispachCost", poi."TagId", poi."timestamp"
              Buffers: shared read=84260
              I/O Timings: read=4135.860
              Worker 0: actual time=0.034..2171.677 rows=865244 loops=1
                Buffers: shared read=29949
                I/O Timings: read=1394.407
              Worker 1: actual time=0.068..2174.408 rows=827170 loops=1
                Buffers: shared read=28639
                I/O Timings: read=1395.723
        ->  Hash  (cost=224615.18..224615.18 rows=173662 width=4) (actual time=4823.318..4823.318 rows=173668 loops=3)
              Output: o."OrderId"
              Buckets: 262144 (originally 262144)  Batches: 1 (originally 1)  Memory Usage: 8154kB
              Buffers: shared hit=302056 read=151028
              I/O Timings: read=3409.869
              Worker 0: actual time=4820.884..4820.884 rows=173668 loops=1
                Buffers: shared hit=87553 read=63475
                I/O Timings: read=1161.672
              Worker 1: actual time=4822.104..4822.104 rows=173668 loops=1
                Buffers: shared hit=107786 read=43242
                I/O Timings: read=1162.254
              ->  Seq Scan on public.processedorder o  (cost=0.00..224615.18 rows=173662 width=4) (actual time=0.744..4644.291 rows=173668 loops=3)
                    Output: o."OrderId"
                    Filter: ((o."ReceivedDate" >= '2020-09-01 00:00:00'::timestamp without time zone) AND (o."ReceivedDate" <= '2021-01-01 00:00:00'::timestamp without time zone))
                    Rows Removed by Filter: 1928823
                    Buffers: shared hit=302056 read=151028
                    I/O Timings: read=3409.869
                    Worker 0: actual time=0.040..4651.203 rows=173668 loops=1
                      Buffers: shared hit=87553 read=63475
                      I/O Timings: read=1161.672
                    Worker 1: actual time=0.035..4637.415 rows=173668 loops=1
                      Buffers: shared hit=107786 read=43242
                      I/O Timings: read=1162.254

amazon-ebs

postgresql

amazon-ec2

indexing

query-optimization

回答 2

Stack Overflow用户

回答已采纳

发布于 2021-12-09 16:00:42

从您最近的两个执行计划中，我收集到以下内容：

使用索引的慢速计划从磁盘读取processedorder的22848个块，耗时10.88秒
使用顺序扫描的快速计划为processedorder从磁盘读取43242个块，这需要1.162秒

所以可以有两种解释：

在快速情况下，这些块实际上是在内核缓存中缓存的，因此这是一种缓存效果。通过清除内核缓存进行实验。
随机I/O比优化器认为的更昂贵。在这种情况下，您可以考虑提高random_page_cost。

票数 2

Stack Overflow用户

发布于 2021-12-09 15:03:09

第一个查询将受益于对数据的索引和单个索引中的id：

CREATE INDEX ix_processedorder_btree_receieveddate ON public.processedorder USING btree ("ReceivedDate" DESC, "OrderId");

这会改变查询计划吗？

票数 0

页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持

原文链接：

https://stackoverflow.com/questions/70291737

复制

相似问题

问理解为什么没有索引的Postgres查询速度更快
EN

回答 2

Stack Overflow用户

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问理解为什么没有索引的Postgres查询速度更快EN

回答 2

Stack Overflow用户

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问理解为什么没有索引的Postgres查询速度更快
EN