How can I speed up ORDER BY in Postgres?

Question

whats @whats

PostgreSQL

Hello. I would appreciate some advice on the following situation.
This is the catalog tree table:
e20b3a1fb3fc20782168cfdaa3f7d2b536603606

This is the mapping table:
f2e687765aa089f2719ce10fbb043721aeff1fc4

There is a query that builds this tree in tabular form:

SELECT DISTINCT t0.groupname, t1.groupname,  t2.groupname
FROM grouptree as t0 
left join models as m on m.id = t0.idmodel 
left join grouptree as t1 on t1.parent = t0.groupno AND t1.idmodel = t0.idmodel
left join grouptree as t2 on t2.parent = t1.groupno AND t2.idmodel = t1.idmodel
where t0.parent = 0 AND m.typeauto = 0

The WHERE clause selects the first level of the tree and restricts the rows to one vehicle type.
Here is the query plan:

"HashAggregate  (cost=144714.35..145034.98 rows=32063 width=117) (actual time=1209.785..1211.691 rows=10716 loops=1)"
"  ->  Nested Loop Left Join  (cost=2379.07..144473.88 rows=32063 width=117) (actual time=28.075..840.266 rows=1391881 loops=1)"
"        ->  Nested Loop Left Join  (cost=2378.64..128016.41 rows=32063 width=86) (actual time=28.066..171.952 rows=264099 loops=1)"
"              ->  Hash Join  (cost=2378.21..25071.47 rows=32063 width=47) (actual time=28.057..69.569 rows=30423 loops=1)"
"                    Hash Cond: (t0.idmodel = m.id)"
"                    ->  Bitmap Heap Scan on grouptree t0  (cost=660.19..22638.32 rows=35066 width=47) (actual time=20.854..35.009 rows=34960 loops=1)"
"                          Recheck Cond: (parent = 0)"
"                          ->  Bitmap Index Scan on "2-parent-to-idmodel"  (cost=0.00..651.42 rows=35066 width=0) (actual time=15.255..15.255 rows=34960 loops=1)"
"                                Index Cond: (parent = 0)"
"                    ->  Hash  (cost=1346.41..1346.41 rows=29729 width=4) (actual time=7.171..7.171 rows=29715 loops=1)"
"                          Buckets: 4096  Batches: 1  Memory Usage: 1045kB"
"                          ->  Seq Scan on models m  (cost=0.00..1346.41 rows=29729 width=4) (actual time=0.004..4.488 rows=29715 loops=1)"
"                                Filter: (typeauto = 0)"
"                                Rows Removed by Filter: 2798"
"              ->  Index Scan using "2-parent-to-idmodel" on grouptree t1  (cost=0.43..3.20 rows=1 width=51) (actual time=0.001..0.002 rows=9 loops=30423)"
"                    Index Cond: ((parent = t0.groupno) AND (idmodel = t0.idmodel))"
"        ->  Index Scan using "2-parent-to-idmodel" on grouptree t2  (cost=0.43..0.50 rows=1 width=47) (actual time=0.001..0.002 rows=5 loops=264099)"
"              Index Cond: ((parent = t1.groupno) AND (idmodel = t1.idmodel))"
"Total runtime: 1212.145 ms"

Everything is fine and works as expected. But I would like to sort the data.
I add just one line at the end:

...
order by t0.groupname ASC

The picture changes completely, and the execution time grows almost 20-fold (from about 1.2 s to about 21.9 s).

"Unique  (cost=146873.58..147194.21 rows=32063 width=117) (actual time=21456.087..21864.988 rows=10716 loops=1)"
"  Output: t0.groupname, t1.groupname, t2.groupname"
"  Buffers: shared hit=1240970"
"  ->  Sort  (cost=146873.58..146953.73 rows=32063 width=117) (actual time=21456.085..21594.183 rows=1391881 loops=1)"
"        Output: t0.groupname, t1.groupname, t2.groupname"
"        Sort Key: t0.groupname, t1.groupname, t2.groupname"
"        Sort Method: quicksort  Memory: 312756kB"
"        Buffers: shared hit=1240970"
"        ->  Nested Loop Left Join  (cost=2379.07..144473.88 rows=32063 width=117) (actual time=25.376..867.222 rows=1391881 loops=1)"
"              Output: t0.groupname, t1.groupname, t2.groupname"
"              Buffers: shared hit=1240970"
"              ->  Nested Loop Left Join  (cost=2378.64..128016.41 rows=32063 width=86) (actual time=25.366..172.208 rows=264099 loops=1)"
"                    Output: t0.groupname, t1.groupname, t1.idmodel, t1.groupno"
"                    Buffers: shared hit=162206"
"                    ->  Hash Join  (cost=2378.21..25071.47 rows=32063 width=47) (actual time=25.357..66.067 rows=30423 loops=1)"
"                          Output: t0.groupname, t0.idmodel, t0.groupno"
"                          Hash Cond: (t0.idmodel = m.id)"
"                          Buffers: shared hit=21731"
"                          ->  Bitmap Heap Scan on public.grouptree t0  (cost=660.19..22638.32 rows=35066 width=47) (actual time=17.457..31.343 rows=34960 loops=1)"
"                                Output: t0.idmodel, t0.groupno, t0.parent, t0.groupname, t0.groupnameen, t0.pictureindex, t0.mark, t0.sortorder"
"                                Recheck Cond: (t0.parent = 0)"
"                                Buffers: shared hit=20791"
"                                ->  Bitmap Index Scan on "2-parent-to-idmodel"  (cost=0.00..651.42 rows=35066 width=0) (actual time=14.128..14.128 rows=34960 loops=1)"
"                                      Index Cond: (t0.parent = 0)"
"                                      Buffers: shared hit=98"
"                          ->  Hash  (cost=1346.41..1346.41 rows=29729 width=4) (actual time=7.868..7.868 rows=29715 loops=1)"
"                                Output: m.id"
"                                Buckets: 4096  Batches: 1  Memory Usage: 1045kB"
"                                Buffers: shared hit=940"
"                                ->  Seq Scan on public.models m  (cost=0.00..1346.41 rows=29729 width=4) (actual time=0.003..5.048 rows=29715 loops=1)"
"                                      Output: m.id"
"                                      Filter: (m.typeauto = 0)"
"                                      Rows Removed by Filter: 2798"
"                                      Buffers: shared hit=940"
"                    ->  Index Scan using "2-parent-to-idmodel" on public.grouptree t1  (cost=0.43..3.20 rows=1 width=51) (actual time=0.001..0.002 rows=9 loops=30423)"
"                          Output: t1.idmodel, t1.groupno, t1.parent, t1.groupname, t1.groupnameen, t1.pictureindex, t1.mark, t1.sortorder"
"                          Index Cond: ((t1.parent = t0.groupno) AND (t1.idmodel = t0.idmodel))"
"                          Buffers: shared hit=140475"
"              ->  Index Scan using "2-parent-to-idmodel" on public.grouptree t2  (cost=0.43..0.50 rows=1 width=47) (actual time=0.001..0.002 rows=5 loops=264099)"
"                    Output: t2.idmodel, t2.groupno, t2.parent, t2.groupname, t2.groupnameen, t2.pictureindex, t2.mark, t2.sortorder"
"                    Index Cond: ((t2.parent = t1.groupno) AND (t2.idmodel = t1.idmodel))"
"                    Buffers: shared hit=1078764"
"Total runtime: 21879.380 ms"

The plan shows that the query gets stuck on the sort. Moreover, it sorts by three columns at once, although I specified only one. What am I doing wrong?
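A possible reading of the three-column sort key, offered as a hedged note rather than a confirmed diagnosis: when DISTINCT is executed with the Sort + Unique strategy, the Unique node needs its input sorted on every DISTINCT column, and a sort on (t0.groupname, t1.groupname, t2.groupname) also satisfies ORDER BY t0.groupname, so the planner can use a single sort for both. The sketch below tries to reproduce this on a small table. The `demo` table and its columns are hypothetical and not part of the original schema, and the exact plan may vary between PostgreSQL versions.

```sql
-- Hypothetical demo table, not part of the original schema.
CREATE TEMP TABLE demo (a text, b text);

INSERT INTO demo
SELECT 'a' || (i % 10), 'b' || (i % 7)
FROM generate_series(1, 1000) AS i;

ANALYZE demo;

-- Disable hash aggregation for this session only, so that the planner
-- falls back to the sort-based DISTINCT strategy.
SET enable_hashagg = off;

-- The Sort node is expected to list both a and b as sort keys,
-- although ORDER BY names only a.
EXPLAIN SELECT DISTINCT a, b FROM demo ORDER BY a;

-- Restore the default planner setting.
RESET enable_hashagg;
```

In the original plan the expensive part is that this sort runs over the 1,391,881 joined rows before duplicates are removed, whereas the plan without ORDER BY removes duplicates with HashAggregate.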

Asked more than three years ago

Answers: 2


Answer 1 · 2014-04-21 19:35:02

work_mem is set to 1GB, so the sort is done in memory. Which other parameters should I list to make the question more informative?
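As a rough sketch of what is commonly worth reporting alongside a slow sort (an assumption about what helps readers, not a list requested in the thread), the following statements show the server version, the current work_mem, and the database collation. The collation is included because text comparisons under a non-"C" collation are generally more expensive than plain byte-wise comparisons; whether that matters for this particular query is not established here.

```sql
-- Server version: planner behaviour differs between releases.
SELECT version();

-- Memory available to each individual sort or hash operation.
SHOW work_mem;

-- Collation of the current database, which governs how text values
-- such as groupname are compared during a sort.
SELECT datcollate
FROM pg_database
WHERE datname = current_database();
```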

By the way, if I write it like this:

select * from (SELECT DISTINCT t0.groupname as s1, t1.groupname,  t2.groupname
FROM grouptree as t0 
left join models as m on m.id = t0.idmodel 
left join grouptree as t1 on t1.parent = t0.groupno AND t1.idmodel = t0.idmodel
left join grouptree as t2 on t2.parent = t1.groupno AND t2.idmodel = t1.idmodel
where t0.parent = 0 AND m.typeauto = 0) as t
order by t.s1 ASC

which I consider fundamentally the wrong approach, then the query finishes in about a second:

"Sort  (cost=147755.31..147835.47 rows=32063 width=117) (actual time=1241.491..1241.930 rows=10716 loops=1)"
"  Output: t0.groupname, t1.groupname, t2.groupname"
"  Sort Key: t0.groupname"
"  Sort Method: quicksort  Memory: 2612kB"
"  Buffers: shared hit=1240970"
"  ->  HashAggregate  (cost=144714.35..145034.98 rows=32063 width=117) (actual time=1228.137..1229.846 rows=10716 loops=1)"
"        Output: t0.groupname, t1.groupname, t2.groupname"
"        Buffers: shared hit=1240970"
"        ->  Nested Loop Left Join  (cost=2379.07..144473.88 rows=32063 width=117) (actual time=29.937..852.961 rows=1391881 loops=1)"
"              Output: t0.groupname, t1.groupname, t2.groupname"
"              Buffers: shared hit=1240970"
"              ->  Nested Loop Left Join  (cost=2378.64..128016.41 rows=32063 width=86) (actual time=29.927..172.365 rows=264099 loops=1)"
"                    Output: t0.groupname, t1.groupname, t1.idmodel, t1.groupno"
"                    Buffers: shared hit=162206"
"                    ->  Hash Join  (cost=2378.21..25071.47 rows=32063 width=47) (actual time=29.915..67.004 rows=30423 loops=1)"
"                          Output: t0.groupname, t0.idmodel, t0.groupno"
"                          Hash Cond: (t0.idmodel = m.id)"
"                          Buffers: shared hit=21731"
"                          ->  Bitmap Heap Scan on public.grouptree t0  (cost=660.19..22638.32 rows=35066 width=47) (actual time=22.059..35.273 rows=34960 loops=1)"
"                                Output: t0.idmodel, t0.groupno, t0.parent, t0.groupname, t0.groupnameen, t0.pictureindex, t0.mark, t0.sortorder"
"                                Recheck Cond: (t0.parent = 0)"
"                                Buffers: shared hit=20791"
"                                ->  Bitmap Index Scan on "2-parent-to-idmodel"  (cost=0.00..651.42 rows=35066 width=0) (actual time=14.175..14.175 rows=34960 loops=1)"
"                                      Index Cond: (t0.parent = 0)"
"                                      Buffers: shared hit=98"
"                          ->  Hash  (cost=1346.41..1346.41 rows=29729 width=4) (actual time=7.824..7.824 rows=29715 loops=1)"
"                                Output: m.id"
"                                Buckets: 4096  Batches: 1  Memory Usage: 1045kB"
"                                Buffers: shared hit=940"
"                                ->  Seq Scan on public.models m  (cost=0.00..1346.41 rows=29729 width=4) (actual time=0.003..5.017 rows=29715 loops=1)"
"                                      Output: m.id"
"                                      Filter: (m.typeauto = 0)"
"                                      Rows Removed by Filter: 2798"
"                                      Buffers: shared hit=940"
"                    ->  Index Scan using "2-parent-to-idmodel" on public.grouptree t1  (cost=0.43..3.20 rows=1 width=51) (actual time=0.001..0.002 rows=9 loops=30423)"
"                          Output: t1.idmodel, t1.groupno, t1.parent, t1.groupname, t1.groupnameen, t1.pictureindex, t1.mark, t1.sortorder"
"                          Index Cond: ((t1.parent = t0.groupno) AND (t1.idmodel = t0.idmodel))"
"                          Buffers: shared hit=140475"
"              ->  Index Scan using "2-parent-to-idmodel" on public.grouptree t2  (cost=0.43..0.50 rows=1 width=47) (actual time=0.001..0.002 rows=5 loops=264099)"
"                    Output: t2.idmodel, t2.groupno, t2.parent, t2.groupname, t2.groupnameen, t2.pictureindex, t2.mark, t2.sortorder"
"                    Index Cond: ((t2.parent = t1.groupno) AND (t2.idmodel = t1.idmodel))"
"                    Buffers: shared hit=1078764"
"Total runtime: 1242.392 ms"

What is the right way to sort this?

Answer 2 · 2014-04-25 14:03:04

Here is the answer from more experienced colleagues (I am still learning myself):
That particular query cannot be sped up as it stands. The ORDER BY problem is worked around by putting the query into a WITH clause, or into a subquery with OFFSET 0, and applying the sort only afterwards.
In other words, the task as posed, with 1.4M rows going into the sort and 10k rows coming out, is not workable.

So that is how it is, not encouraging at all.
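The two workarounds mentioned in this answer could look roughly like the sketch below. It reuses the query from the question. The aliases `s2` and `s3` are added here only to give the output columns distinct names and do not appear in the original. Whether either variant produces the faster HashAggregate plan depends on the PostgreSQL version and statistics, so checking with EXPLAIN ANALYZE is advisable.

```sql
-- Variant 1: subquery with OFFSET 0. OFFSET 0 keeps the planner from
-- flattening the subquery, so DISTINCT is evaluated first and the outer
-- ORDER BY only has to sort the deduplicated rows.
SELECT *
FROM (
    SELECT DISTINCT t0.groupname AS s1, t1.groupname AS s2, t2.groupname AS s3
    FROM grouptree AS t0
    LEFT JOIN models AS m ON m.id = t0.idmodel
    LEFT JOIN grouptree AS t1 ON t1.parent = t0.groupno AND t1.idmodel = t0.idmodel
    LEFT JOIN grouptree AS t2 ON t2.parent = t1.groupno AND t2.idmodel = t1.idmodel
    WHERE t0.parent = 0 AND m.typeauto = 0
    OFFSET 0
) AS t
ORDER BY t.s1 ASC;

-- Variant 2: the same query as a common table expression.
-- Before PostgreSQL 12 a CTE is always computed separately; on 12 and
-- later, write "WITH t AS MATERIALIZED (...)" to keep that behaviour.
WITH t AS (
    SELECT DISTINCT t0.groupname AS s1, t1.groupname AS s2, t2.groupname AS s3
    FROM grouptree AS t0
    LEFT JOIN models AS m ON m.id = t0.idmodel
    LEFT JOIN grouptree AS t1 ON t1.parent = t0.groupno AND t1.idmodel = t0.idmodel
    LEFT JOIN grouptree AS t2 ON t2.parent = t1.groupno AND t2.idmodel = t1.idmodel
    WHERE t0.parent = 0 AND m.typeauto = 0
)
SELECT *
FROM t
ORDER BY t.s1 ASC;
```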
