[pgsql-jp: 41929] Re: [Reply: Question] Reducing the execution time of a search query that uses a subquery

Wed, 29 Mar 2017 15:57:00 JST

2017-03-29 14:41 GMT+09:00 Tsunakawa, Takayuki <tsunakawa.takay ＠ jp.fujitsu.com>:
> This is Tsunakawa.
>
>> What puzzles me here is why this is a Bitmap Heap Scan.
>> Both video and content, which the search uses, have indexes.
>> Why is it not an Index Scan?
>
> The documentation explains it as follows.
>
> https://www.postgresql.org/docs/devel/static/using-explain.html
>
> Here the planner has decided to use a two-step plan: the child plan node visits an index to find the locations of rows matching the index condition, and then the upper plan node actually fetches those rows from the table itself. Fetching rows separately is much more expensive than reading them sequentially, but because not all the pages of the table have to be visited, this is still cheaper than a sequential scan. (The reason for using two plan levels is that the upper plan node sorts the row locations identified by the index into physical order before reading them, to minimize the cost of separate fetches. The “bitmap” mentioned in the node names is the mechanism that does the sorting.)
>
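> In other words, it is probably because many rows look likely to match the search condition.
> With an Index Scan, the table is read each time one matching record is found in the index,
> which leads to many random accesses to the table.
> Instead, the Bitmap Index Scan collects the TIDs of the matching records from the index in bulk,
> and the Bitmap Heap Scan sorts those TIDs into on-disk block order before reading the table for each TID.
> I think the planner judged that this would cut disk seek time and be faster.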
>
>

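This is Sawada.

In general it is as Tsunakawa-san says, but I believe the GIN index that pg_bigm uses does not support Index Scan.
What a GIN index can do is Bitmap Index Scan → Bitmap Heap Scan. The Bitmap Index
Scan looks at the index and records the IDs (TIDs) of the rows that match the condition in a bitmap. The subsequent Bitmap Heap
Scan then uses that bitmap (in this example, the bitmap obtained by ANDing the two bitmaps) to fetch the actual rows.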
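
To make the flow above concrete, here is a rough toy model in Python. It is only a sketch and not PostgreSQL's implementation: the heap contents, the search keyword, and the function names are all made up for illustration, and a plain Python set stands in for the bitmap.

```python
# Toy model of Bitmap Index Scan -> BitmapAnd -> Bitmap Heap Scan.
# Everything here (data, names, keyword) is hypothetical.

# A TID is (block number, offset within the block).
heap = {
    (0, 1): {"video": "cat clip", "content": "a cat sleeping"},
    (0, 2): {"video": "dog clip", "content": "a cat and a dog"},
    (1, 1): {"video": "cat show", "content": "birds only"},
    (2, 1): {"video": "cat cafe", "content": "cat cafe tour"},
}


def bitmap_index_scan(column, keyword):
    """Return the set of TIDs whose column contains keyword.

    A real index finds these without reading the heap; this toy
    version scans the dict only to stand in for the index lookup.
    """
    return {tid for tid, row in heap.items() if keyword in row[column]}


def bitmap_heap_scan(bitmap):
    """Fetch the rows for the TIDs in the bitmap, in (block, offset) order."""
    return [(tid, heap[tid]) for tid in sorted(bitmap)]


video_bitmap = bitmap_index_scan("video", "cat")
content_bitmap = bitmap_index_scan("content", "cat")

# BitmapAnd: keep only the TIDs present in both bitmaps.
combined = video_bitmap & content_bitmap

rows = bitmap_heap_scan(combined)
for tid, row in rows:
    print(tid, row)
```

The point of the sketch is that no row is fetched until the two bitmaps have been combined, and that the fetch then walks the heap in block order rather than in the order the index happened to return the TIDs.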

> This one was slightly slower in execution time than the IN clause version.

Cache hits also looked like they were a factor. It might be a good idea to run EXPLAIN with the BUFFERS option as well.

--
Masahiko Sawada