2005-12-11

Table partitioning

Postgresql could benefit from an implementation of table partitioning. Also see: http://groups.google.com/group/pgsql.general/tree/browse_frm/thread/e72ead84aa16bc8e/64d61f78425b119b?rnum=1&q=partition&_done=%2Fgroup%2Fpgsql.general%2Fbrowse_frm%2Fthread%2Fe72ead84aa16bc8e%2F121bab4de037ce6a%3Fq%3Dpartition%26rnum%3D7%26#doc_64d61f78425b119b

All in one table

dms3_test=> select count(doc.id) from document as doc, da da_client, da
da_group, da da_type, da da_subtype where doc.id=da_client.doc_id and
doc.id=da_group.doc_id and doc.id=da_type.doc_id and
doc.id=da_subtype.doc_id and da_client.an='ssis_client' and
da_group.an='ssis_group' and da_type.an='ssis_type' and
da_subtype.an='ssis_subtype' and da_client.av='client1' and
da_group.av='group1' and da_type.av='type1' and da_subtype.av='subtype1';

 count 
-------
   140
(1 row)

Time: 60882.964 ms


Separated out
 
 
dms3_test=> select count(doc.id) from document as doc, ssis_client as sc,
ssis_group as sg, ssis_type as st, ssis_subtype as sst where
doc.id=sc.doc_id and doc.id=sg.doc_id and doc.id=st.doc_id and
doc.id=sst.doc_id and sc.value='client1' and sg.value='group1' and
st.value='type1' and sst.value='subtype1';

 count 
-------
   140
(1 row)

Time: 3912.358 ms

Explain analyze of all in one table
 
 
dms3_test=> explain analyze select doc.id from document as doc, da
da_client, da da_group, da da_type, da da_subtype where
doc.id=da_client.doc_id and doc.id=da_group.doc_id and
doc.id=da_type.doc_id and doc.id=da_subtype.doc_id and
da_client.an='ssis_client' and da_group.an='ssis_group' and
da_type.an='ssis_type' and da_subtype.an='ssis_subtype' and
da_client.av='client1' and da_group.av='group1' and da_type.av='type1' and
da_subtype.av='subtype1'

                                                                             QUERY PLAN                                                                             
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=40335.69..53871.76 rows=1 width=4) (actual time=17319.458..50292.429 rows=140 loops=1)
   ->  Nested Loop  (cost=40335.69..53864.64 rows=2 width=16) (actual time=17319.392..50260.780 rows=1072 loops=1)
         ->  Nested Loop  (cost=40335.69..53330.49 rows=150 width=12) (actual time=17289.476..35508.467 rows=10153 loops=1)
               ->  Merge Join  (cost=40335.69..40494.44 rows=3558 width=8) (actual time=17263.290..17594.400 rows=10153 loops=1)
                     Merge Cond: ("outer".doc_id = "inner".doc_id)
                     ->  Sort  (cost=20817.14..20849.19 rows=12818 width=4) (actual time=10145.947..10314.729 rows=101159 loops=1)
                           Sort Key: da_client.doc_id
                           ->  Bitmap Heap Scan on da da_client  (cost=155.91..19942.58 rows=12818 width=4) (actual time=1165.917..9800.025 rows=101159 loops=1)
                                 Recheck Cond: ((an = 'ssis_client'::text) AND (av = 'client1'::text))
                                 ->  Bitmap Index Scan on da_an_av_idx  (cost=0.00..155.91 rows=12818 width=0) (actual time=1156.362..1156.362 rows=101159 loops=1)
                                       Index Cond: ((an = 'ssis_client'::text) AND (av = 'client1'::text))
                     ->  Sort  (cost=19518.54..19548.09 rows=11817 width=4) (actual time=7117.330..7188.340 rows=101190 loops=1)
                           Sort Key: da_subtype.doc_id
                           ->  Bitmap Heap Scan on da da_subtype  (cost=143.90..18719.21 rows=11817 width=4) (actual time=913.707..6798.627 rows=101190 loops=1)
                                 Recheck Cond: ((an = 'ssis_subtype'::text) AND (av = 'subtype1'::text))
                                 ->  Bitmap Index Scan on da_an_av_idx  (cost=0.00..143.90 rows=11817 width=0) (actual time=904.720..904.720 rows=101190 loops=1)
                                       Index Cond: ((an = 'ssis_subtype'::text) AND (av = 'subtype1'::text))
               ->  Index Scan using document_pkey on document doc  (cost=0.00..3.60 rows=1 width=4) (actual time=1.762..1.763 rows=1 loops=10153)
                     Index Cond: (doc.id = "outer".doc_id)
         ->  Index Scan using da_pkey on da da_type  (cost=0.00..3.55 rows=1 width=4) (actual time=1.451..1.452 rows=0 loops=10153)
               Index Cond: (("outer".id = da_type.doc_id) AND (da_type.an = 'ssis_type'::text) AND (da_type.av = 'type1'::text))
   ->  Index Scan using da_pkey on da da_group  (cost=0.00..3.55 rows=1 width=4) (actual time=0.028..0.028 rows=0 loops=1072)
         Index Cond: (("outer".id = da_group.doc_id) AND (da_group.an = 'ssis_group'::text) AND (da_group.av = 'group1'::text))
 Total runtime: 50296.288 ms
(24 rows)

Time: 50301.853 ms

Explain analyze of separated out
 
 
dms3_test=> explain analyze select doc.id from document as doc, ssis_client
as sc, ssis_group as sg, ssis_type as st, ssis_subtype as sst where
doc.id=sc.doc_id and doc.id=sg.doc_id and doc.id=st.doc_id and
doc.id=sst.doc_id and sc.value='client1' and sg.value='group1' and
st.value='type1' and sst.value='subtype1';

                                                                           QUERY PLAN                                                                           
----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=48763.53..76529.21 rows=108 width=4) (actual time=3947.647..5401.517 rows=140 loops=1)
   ->  Nested Loop  (cost=48763.53..76139.59 rows=108 width=16) (actual time=3947.592..5377.496 rows=140 loops=1)
         ->  Hash Join  (cost=48763.53..71836.53 rows=1019 width=12) (actual time=3947.454..5027.993 rows=993 loops=1)
               Hash Cond: ("outer".doc_id = "inner".doc_id)
               ->  Seq Scan on ssis_group sg  (cost=0.00..22537.39 rows=105085 width=4) (actual time=169.102..1161.228 rows=100924 loops=1)
                     Filter: (value = 'group1'::text)
               ->  Hash  (cost=48739.02..48739.02 rows=9802 width=8) (actual time=3777.630..3777.630 rows=10153 loops=1)
                     ->  Hash Join  (cost=23168.26..48739.02 rows=9802 width=8) (actual time=2108.108..3770.001 rows=10153 loops=1)
                           Hash Cond: ("outer".doc_id = "inner".doc_id)
                           ->  Seq Scan on ssis_client sc  (cost=0.00..22537.39 rows=100706 width=4) (actual time=201.259..1662.239 rows=101159 loops=1)
                                 Filter: (value = 'client1'::text)
                           ->  Hash  (cost=22537.39..22537.39 rows=98349 width=4) (actual time=1904.210..1904.210 rows=101190 loops=1)
                                 ->  Seq Scan on ssis_subtype sst  (cost=0.00..22537.39 rows=98349 width=4) (actual time=234.704..1828.887 rows=101190 loops=1)
                                       Filter: (value = 'subtype1'::text)
         ->  Index Scan using ssis_type_pkey on ssis_type st  (cost=0.00..4.21 rows=1 width=4) (actual time=0.351..0.351 rows=0 loops=993)
               Index Cond: ("outer".doc_id = st.doc_id)
               Filter: (value = 'type1'::text)
   ->  Index Scan using document_pkey on document doc  (cost=0.00..3.60 rows=1 width=4) (actual time=0.169..0.169 rows=1 loops=140)
         Index Cond: (doc.id = "outer".doc_id)
 Total runtime: 5401.958 ms
(20 rows)

Time: 5406.568 ms 
 
 
(originally from http://microjet.ath.cx/WebWiki/2005.12.11_TablePartitioning.html)

No comments: