[pgsql-jp: 41595] Re: order by での indexの利用について

2014年 3月 14日 (金) 16:31:48 JST

prodです。

 高塚様アドバイスありがとうございます。

早速試してみました。


Limit  (cost=619010.46..619011.71 rows=500 width=422) (actual time=22451.444..22452.316 rows=500 loops=1)
  ->  Sort  (cost=619010.46..621444.27 rows=973526 width=422) (actual time=22451.441..22451.748 rows=500 loops=1)
        Sort Key: a.col4, a.col5, b.col6
        Sort Method: top-N heapsort  Memory: 145kB
        ->  Merge Join  (cost=543857.35..570500.71 rows=973526 width=422) (actual time=12595.953..19267.417 rows=1007366 loops=1)
              Merge Cond: (a.col1 = b.col1)
              ->  Index Scan using pk_table_a on table_a a  (cost=0.00..8643.31 rows=143940 width=27) (actual time=0.018..296.207 rows=143942 loops=1)
                    Filter: ((col4 >= 20110313) AND (col4 <= 20140313) AND ((col2)::text = '000001'::text) AND ((col3)::text = '1'::text))
              ->  Materialize  (cost=543857.22..549127.31 rows=1054017 width=401) (actual time=12595.902..15956.599 rows=1054298 loops=1)
                    ->  Sort  (cost=543857.22..546492.26 rows=1054017 width=401) (actual time=12595.897..14551.292 rows=1054298 loops=1)
                          Sort Key: b.col1
                          Sort Method: external merge  Disk: 176920kB
                          ->  Seq Scan on table_b b  (cost=0.00..52933.17 rows=1054017 width=401) (actual time=0.007..2574.585 rows=1054298 loops=1)
Total runtime: 22490.104 ms


残念ながら、改善はされず、Hash Join を選んだオプティマイザの正解
といった感じです。

reindex tableとanalyze 
も試しましたが、改善はありませんでした。


----- Original Message -----
> From: TAKATSUKA Haruka <harukat ＠ postgresql.jp>
> To: PostgreSQL Japanese Mailing List <pgsql-jp ＠ ml.postgresql.jp>
> Cc: 
> Date: 2014/3/14, Fri 15:32
> Subject: [pgsql-jp: 41594] Re: order by での indexの利用について
> 
> 高塚です。
> 
> とりあえず、set enable_hashjoin TO off; で merge join にさせると
> どうなるかを試したいですね。
> 
> On Fri, 14 Mar 2014 15:00:06 +0900 (JST)
> prod2011 ＠ yahoo.co.jp wrote:
> 
>> 
>> 
>>  こんにちは。Prodです。
>> 
>>  花田様、高塚様
>>  ご返信ありがとうございます。
>> 
>>  こちらからの返信が遅くなり、申し訳ございません。
>> 
>>  また、提示したSQLですが、簡略した時に、
>>  簡略化しすぎてしまい、別物の状態になってしまっておりました。
>>  あらためて、SQLを張り付けさせていただきます。
>>  大変申し訳ございません。深くお詫びいたします。
>> 
>>  --SQL------------------------------------
>> 
>> 
>>    SELECT
>>          a.col1
>>      ,   a.col2
>>      ,   a.col3
>>      ,   a.col4
>>      ,   a.col5
>>      ,   b.col6
>>      ,   b.col7 ～ b.col39
>>    FROMtable_a a
>> 
>>  INNER JOIN table_b b ON a.col1 = b.col1
>>   where a.col2 = '000001'
>>          AND a.col3 = '1'
>>          AND a.col4 >= '20110313'
>>          AND a.col4 <= '20140313'
>>   ORDER BY
>>          a.col4
>>          ,a.col5
>>          ,b.col6
>>  offset 0 limit 500
>> 
>>  --複合インデックスで以下を設定しております。
>>  table_a  :  table_a_index2 (col2,col3,col4,col5)
>>  table_b  :  pk_table_b     (col1,col6) ※こちらはテーブルのキー
>> 
>>  ここで、order by のb.col6があるとき、ないとき、で全然速度がことなります。
>>  目標としている速度は、2以内秒 です。
>> 
>>  explain analyze 結果
>> 
>>  -- order by に b.col6 があるとき
>>  QUERY PLAN
>>  Limit  (cost=238444.56..238445.81 rows=500 width=422) (actual 
> time=11188.332..11189.201 rows=500 loops=1)
>>    ->  Sort  (cost=238444.56..240878.37 rows=973526 width=422) (actual 
> time=11188.329..11188.627 rows=500 loops=1)
>>          Sort Key: a.col4, a.col5, b.col6
>>          Sort Method: top-N heapsort  Memory: 145kB
>>          ->  Hash Join  (cost=8309.25..189934.81 rows=973526 width=422) 
> (actual time=412.471..8010.280 rows=1007366 loops=1)
>>                Hash Cond: (b.col1 = a.col1)
>>                ->  Seq Scan on table_b b  (cost=0.00..52933.17 
> rows=1054017 width=401) (actual time=0.008..2496.485 rows=1054298 loops=1)
>>                ->  Hash  (cost=5596.00..5596.00 rows=143940 width=27) 
> (actual time=412.280..412.280 rows=143942 loops=1)
>>                      Buckets: 8192  Batches: 4  Memory Usage: 1851kB
>>                      ->  Seq Scan on table_a a  (cost=0.00..5596.00 
> rows=143940 width=27) (actual time=0.007..222.787 rows=143942 loops=1)
>>                            Filter: ((col4 >= 20110313) AND (col4 <= 
> 20140313) AND ((col2)::text = '000001'::text) AND ((col3)::text = 
> '1'::text))
>>  Total runtime: 11191.317 ms
>> 
>> 
>>  -- order by に b.col6 がないとき
>> 
>>  QUERY PLAN
>>  Limit  (cost=0.00..1045.56 rows=500 width=422) (actual time=0.137..3.933 
> rows=500 loops=1)
>>    ->  Nested Loop  (cost=0.00..2035757.68 rows=973526 width=422) (actual 
> time=0.134..3.262 rows=500 loops=1)
>>          ->  Index Scan using table_a_index2 on table_a a 
>  (cost=0.00..169939.99 rows=143940 width=27) (actual time=0.110..0.205 rows=72 
> loops=1)
>>                Index Cond: (((col2)::text = '000001'::text) AND 
> ((col3)::text = '1'::text) AND (col4 >= 20110313) AND (col4 <= 
> 20140313))
>>          ->  Index Scan using pk_table_b on table_b b  (cost=0.00..12.87 
> rows=7 width=401) (actual time=0.009..0.018 rows=7 loops=72)
>>                Index Cond: (col1 = a.col1)
>>  Total runtime: 4.541 ms
>> 
>> 
>>  テーブルの物理サイズです。
>> 
>>  table_a 22257664
>> 
>>  table_b 347283456
>> 
> (後略)
> ______________________________________________________________________
> 日本PostgreSQLユーザ会  高塚 遙  http://www.postgresql.jp 
> 〒171-0022 東京都豊島区南池袋2-32-8 8F  SRA OSS, Inc. 内
> TEL: 03-5979-2729  FAX: 03-5979-2702  E-Mail:harukat ＠ postgresql.jp
>