文章/答案/技术大牛

发布

社区首页 >问答首页 >尽管使用了索引，但查询非常缓慢。

问尽管使用了索引，但查询非常缓慢。
EN

Database Administration用户

提问于 2019-12-22 20:11:05

回答 1查看 247关注 0票数 0

我有一个表，用于存储来自我的设备的大量modbus测量数据：

CREATE TABLE public.tv_smartdevicemeasurement_modbus (
    measurement_id serial NOT NULL,
    insert_time timestamp with time zone NOT NULL,
    data jsonb NOT NULL,
    smart_device_id integer NOT NULL,
    CONSTRAINT tv_smartdevicemeasurement_modbus_pkey PRIMARY KEY (measurement_id),
    CONSTRAINT tv_smartdevicemeasur_smart_device_id_62c12ed0_fk_tv_smartd
        FOREIGN KEY (smart_device_id)
        REFERENCES public.tv_smartdevice_modbus (device_id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
        DEFERRABLE INITIALLY DEFERRED
);

CREATE INDEX idxgin
    ON public.tv_smartdevicemeasurement_modbus USING gin (data);

CREATE INDEX tv_smartdev_insert__0ae03a_idx
    ON public.tv_smartdevicemeasurement_modbus USING btree (insert_time);

CREATE INDEX tv_smartdev_insert__1f0611_idx
    ON public.tv_smartdevicemeasurement_modbus USING btree (insert_time DESC);

CREATE INDEX tv_smartdevicemeasurement_modbus_smart_device_id_62c12ed0
    ON public.tv_smartdevicemeasurement_modbus USING btree (smart_device_id);

现在，这个表大约有2亿行(而且它每天都在增长)。我需要使用像下面这样的一个简单的datetime子句进行选择：

SELECT data
FROM tv_smartdevicemeasurement_modbus
WHERE (tv_smartdevicemeasurement_modbus.insert_time > '2019-08-01'
AND tv_smartdevicemeasurement_modbus.insert_time < '2019-9-25')

但是这个简单且没有麻烦的查询占用了8秒的时间，就像在此报告EXPLAIN ANALYZE VERBOSE一样。

相同的使用track_io_timing已启用。

如何解决这个问题？是否有有效的长期解决办法？

postgresql-performance

postgresql

index

execution-plan

performance

回答 1

Database Administration用户

发布于 2019-12-27 02:55:01

(insert_time)和(insert_time DESC)有两个指标。B树索引可以以几乎相同的速度向后扫描.insert_time是NOT NULL，所以没有任何意义。在任何情况下都要放弃其中的一个。

我做了一些假设信息丢失了：

现任邮政总局12。
您可以重新设计表，并将表单独锁定一段时间。
这张表基本上是“只附加”的。
新行按insert_time顺序添加。
当前表实际上是无序和/或膨胀的(导致查询计划中的索引扫描，而不是位图索引扫描)

我会把桌子改写成这样：

BEGIN;
-- SET maintenance_work_mem = ???   -- set as high as you can afford temporarily

-- drop idx first to free space
DROP INDEX public.idxgin;     -- acquires exclusive lock on the table
DROP INDEX public.tv_smartdev_insert__0ae03a_idx;
DROP INDEX public.tv_smartdevicemeasurement_modbus_smart_device_id_62c12ed0;
-- DROP INDEX public.tv_smartdev_insert__1f0611_idx; -- might help SELECT, drop later

ALTER TABLE public.tv_smartdevicemeasurement_modbus ALTER measurement_id DROP DEFAULT;
DROP SEQUENCE public.tv_smartdevicemeasurement_modbus_measurement_id_seq;  -- drop owned sequence
ALTER TABLE public.tv_smartdevicemeasurement_modbus
         RENAME TO tv_smartdevicemeasurement_modbus_old;    -- free org. name

CREATE TABLE public.tv_smartdevicemeasurement_modbus (
   measurement_id   serial PRIMARY KEY    -- consider IDENTITY column instead, see below
 , smart_device_id  integer NOT NULL      -- reordering saves 4-8 bytes alignment padding per row
 , insert_time      timestamp with time zone NOT NULL
 , data             jsonb NOT NULL
 , CONSTRAINT tv_smartdevicemeasur_smart_device_id_62c12ed0_fk_tv_smartd
      FOREIGN KEY (smart_device_id)
      REFERENCES public.tv_smartdevice_modbus (device_id) DEFERRABLE INITIALLY DEFERRED  -- WHY deferrable / deferred? see below
);

INSERT INTO public.tv_smartdevicemeasurement_modbus
      (measurement_id, smart_device_id, insert_time, data)
SELECT measurement_id, smart_device_id, insert_time, data
FROM   public.tv_smartdevicemeasurement_modbus_old
ORDER  BY insert_time DESC;   -- CLUSTER while rewriting

DROP TABLE tv_smartdevicemeasurement_modbus_old;

CREATE INDEX tv_smartdev_insert__1f0611_idx ON public.tv_smartdevicemeasurement_modbus (insert_time DESC);
-- CREATE INDEX tv_smartdev_insert__0ae03a_idx ON public.tv_smartdevicemeasurement_modbus (insert_time);  -- nope!

CREATE INDEX idxgin ON public.tv_smartdevicemeasurement_modbus USING gin (data);  -- ?
CREATE INDEX tv_smartdevicemeasurement_modbus_smart_device_id_62c12ed0 ON public.tv_smartdevicemeasurement_modbus (smart_device_id); -- ?

COMMIT;

VACUUM ANALYZE public.tv_smartdevicemeasurement_modbus;

这将重写表，节省一些空间(这也有助于性能)。最重要的是，它根据你的主要索引对表进行聚类--并在它的同时消除所有可能的膨胀。这将有助于数据的局部性，并使Postgres读取更少的数据页。除非您的data列很大，否则现在应该在查询计划中看到位图索引扫描。如果data很小，请考虑一个覆盖指数，以获得仅用于索引的扫描。请参见：

带有主键和外键的查询运行速度是否比仅使用主键的查询快？

更改的列顺序如何节省空间？

PostgreSQL中的空间计算与节省

DEFERRABLE INITIALLY DEFERRED？这很少是必要的。而且更便宜。请参见：

设置所有延迟的约束，不按预期工作

serial和IDENTITY？请参见：

如何将表ID从串行更改为标识？

然后像这样查询：

SELECT data
FROM   tv_smartdevicemeasurement_modbus
WHERE  insert_time >= '2019-08-01'  -- included
AND    insert_time <  '2019-09-25'; -- excluded

请注意，这些日期文本是根据本地时区设置来解释的。考虑真正的timestamptz输入是明确的。请参见：

在Rails和PostgreSQL中完全忽略时区

这两个索引与手头的查询正交：idxgin和tv_smartdevicemeasurement_modbus_smart_device_id_62c12ed0。除非需要做不相关的事情。

票数 2

页面原文内容由Database Administration提供。腾讯云小微IT领域专用引擎提供翻译支持

原文链接：

https://dba.stackexchange.com/questions/256137

复制

相似问题

问尽管使用了索引，但查询非常缓慢。
EN

回答 1

Database Administration用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问尽管使用了索引，但查询非常缓慢。EN

回答 1

Database Administration用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问尽管使用了索引，但查询非常缓慢。
EN