[pgsql-jp: 41366] トランザクションID周回後の事象につきまして

MIDORI Ikegami prjcer8 @ mistymode.com
2013年 3月 8日 (金) 11:01:06 JST


池上と申します。

pg_dump以外のバックアップ方法につきましてアドバイス頂きたく、投稿させて頂きました。

以下のpostgreSQLのバージョンで、トランザクションIDが周回したクラスタがあります。
psql (PostgreSQL) 7.4.30 LInux

●トランザクションID周回後の状況
template1=# SELECT datname, datvacuumxid, datfrozenxid,
age(datfrozenxid) from pg_database;
      datname       | datvacuumxid | datfrozenxid |     age
--------------------+--------------+--------------+-------------
 aaaaaaaaa          |          464 |          464 | -1139367901
 bb                 |          464 |          464 | -1139367901
 ccccccc            |          464 |          464 | -1139367901
 ddddddd            |          464 |          464 | -1139367901
 eeeeeee            |          464 |          464 | -1139367901
 hoge               |          464 |          464 | -1139367901
 hogefuga           |          464 |          464 | -1139367901

(7 rows)

この状況から、2度ほどVACUUMをかけて、以下のような状態となりました。
bash-3.00$ vacuumdb -a -v -z

●VACUUM実施後
template1=# SELECT datname,datvacuumxid,datfrozenxid,age(datfrozenxid)
FROM pg_database;
        datname        | datvacuumxid | datfrozenxid |     age
-----------------------+--------------+--------------+-------------
 test                  |   3155603132 |   2081861309 |  1073749791
 testtest-com          |   3155603221 |   2081861398 |  1073749702
 test000               |   3155603307 |   2081861484 |  1073749616
 test001               |   3155603398 |   2081861575 |  1073749525
 template1             |   3155603533 |   2081861710 |  1073749390
 template0             |          464 |          464 | -1139356660
 testtest000000        |   3155603619 |   2081861796 |  1073749304
 testtest001           |   3155603715 |   2081861892 |  1073749208
 test00002             |   3155603834 |   2081862011 |  1073749089
 aa                    |   3155604087 |   2081862264 |  1073748836
 aaaaaa                |   3155604487 |   2081862664 |  1073748436
 kkk                   |   3155606078 |   2081864255 |  1073746845
 0000                  |   3155607520 |   2081865697 |  1073745403
 0001                  |   3155607906 |   2081866083 |  1073745017
 aaaaaaaaa             |   3155608292 |   2081866469 |  1073744631
 bb                    |   3155608411 |   2081866588 |  1073744512
 ccccccc               |   3155608506 |   2081866683 |  1073744417
 ddddddd               |   3155608908 |   2081867085 |  1073744015
 eeeeeee               |   3155608994 |   2081867171 |  1073743929
 hoge                  |   3155609080 |   2081867257 |  1073743843
 hogefuga              |   3155609166 |   2081867343 |  1073743757

(21 rows)

VACUUM実施後は、それまでシステム的に不可視状態であったデータベースのIDや、
オーナー情報等が取得出来る状態にはなったのですが、この状態でpg_dumpallを実施すると
以下のエラーが出て中断してしまいます。

bash-3.00$ pg_dumpall -Upostgres -v > testdump.`date +%Y%m%d`
pg_dumpall: executing SELECT usename, usesysid, passwd, usecreatedb,
usesuper, valuntil FROM pg_shadow WHERE usesysid <> (SELECT datdba
FROM pg_database WHERE datname = 'template0')
pg_dumpall: executing SELECT useconfig[1] FROM pg_shadow WHERE usename
= '000000';
pg_dumpall: executing SELECT useconfig[1] FROM pg_shadow WHERE usename
= '000001';
pg_dumpall: executing SELECT useconfig[1] FROM pg_shadow WHERE usename
= '000002';
pg_dumpall: executing SELECT useconfig[1] FROM pg_shadow WHERE usename
= '000003';
pg_dumpall: executing SELECT useconfig[1] FROM pg_shadow WHERE usename
= '000004';
pg_dumpall: executing SELECT useconfig[1] FROM pg_shadow WHERE usename
= '000005';
pg_dumpall: executing SELECT useconfig[1] FROM pg_shadow WHERE usename
= '000006';
pg_dumpall: executing SELECT useconfig[1] FROM pg_shadow WHERE usename
= '000007';
pg_dumpall: executing SELECT useconfig[1] FROM pg_shadow WHERE usename
= '000008';


pg_dumpall: executing SELECT groname, grosysid, grolist FROM pg_group
pg_dumpall: executing SELECT datname, coalesce(usename, (select
usename from pg_shadow where usesysid=(select datdba from pg_database
where datname='template0'))), pg_encoding_to_char(d.encoding),
datistemplate, datpath, datacl FROM pg_database d LEFT JOIN pg_shadow
u ON (datdba = usesysid) WHERE datallowconn ORDER BY 1
pg_dumpall: executing SELECT datconfig[1] FROM pg_database WHERE
datname = 'test';
pg_dumpall: executing SELECT datconfig[1] FROM pg_database WHERE
datname = 'testtest-com ';
pg_dumpall: executing SELECT datconfig[1] FROM pg_database WHERE
datname = 'test000';
pg_dumpall: executing SELECT datconfig[1] FROM pg_database WHERE
datname = 'test001';
pg_dumpall: executing SELECT datconfig[1] FROM pg_database WHERE
datname = 'testtest000000';
pg_dumpall: executing SELECT datconfig[1] FROM pg_database WHERE
datname = 'testtest001';
pg_dumpall: executing SELECT datconfig[1] FROM pg_database WHERE
datname = 'test00002';
pg_dumpall: executing SELECT datconfig[1] FROM pg_database WHERE datname = 'aa';
pg_dumpall: executing SELECT datconfig[1] FROM pg_database WHERE
datname = 'aaaaaa';
pg_dumpall: executing SELECT datconfig[1] FROM pg_database WHERE
datname = 'kkk';
pg_dumpall: executing SELECT datconfig[1] FROM pg_database WHERE
datname = '0000';
pg_dumpall: executing SELECT datconfig[1] FROM pg_database WHERE
datname = '0001';
pg_dumpall: executing SELECT datconfig[1] FROM pg_database WHERE
datname = 'aaaaaaaaa';
pg_dumpall: executing SELECT datconfig[1] FROM pg_database WHERE datname = 'bb';
pg_dumpall: executing SELECT datconfig[1] FROM pg_database WHERE
datname = 'ccccccc';
pg_dumpall: executing SELECT datconfig[1] FROM pg_database WHERE
datname = 'ddddddd';
pg_dumpall: executing SELECT datconfig[1] FROM pg_database WHERE
datname = 'eeeeeee';
pg_dumpall: executing SELECT datconfig[1] FROM pg_database WHERE
datname = 'hoge';
pg_dumpall: executing SELECT datconfig[1] FROM pg_database WHERE
datname = 'hogefuga';
pg_dumpall: executing SELECT datname FROM pg_database WHERE
datallowconn ORDER BY 1
pg_dumpall: dumping database "test"...
pg_dumpall: running "pg_dump  -U 'postgres' -v -Fp 'test'"
pg_dump: saving encoding
pg_dump: saving database definition
pg_dump: reading schemas
pg_dump: query to obtain list of schemas failed: ERROR:  more than one
row returned by a subquery used as an expression
pg_dump: *** aborted because of error
pg_dumpall: pg_dump failed on database "test", exiting


pg_dumpallではなく、任意のデータベースに対してpg_dumpを実施した際も結果は同じでした。

-bash-3.00$ pg_dump -Upostgres -v hogefuga > hogefugadump.`date +%Y%m%d`
pg_dump: saving encoding
pg_dump: saving database definition
pg_dump: reading schemas
pg_dump: query to obtain list of schemas failed: ERROR:  more than one
row returned by a subquery used as an expression
pg_dump: *** aborted because of error

原因を調査していたところ、以下の事象が発生している事が発覚致しました。
どうも、同じ名称のスキーマが2つずつ存在しているようなのです。

hogefuga=# \dn
        List of schemas
        Name        |  Owner
--------------------+----------
 information_schema | postgres
 information_schema | postgres
 pg_catalog         | postgres
 pg_catalog         | postgres
 pg_temp_1          | postgres
 pg_temp_1          | postgres
 pg_toast           | postgres
 pg_toast           | postgres
 public             | postgres
 public             | postgres
(10 rows)

更に、postgresユーザーも2ユーザー出来てしまっています。
いずれもusesysidは同一となっております。
postgresユーザーを削除しようと思いDROP USER postgres;を実行しても、
★印のほうはお化けのように残ってしまい、削除が出来ません。

hogefuga=# SELECT * FROM pg_user where usename='postgres';
 usename  | usesysid | usecreatedb | usesuper | usecatupd |  passwd  |
valuntil | useconfig
----------+----------+-------------+----------+-----------+----------+----------+-----------
 postgres |        1 | t           | t        | t         | ******** |
         | ★
 postgres |        1 | t           | t        | t         | ******** |
infinity |
(2 rows)

VACUUMによって強制的に周回を戻した代償なのかな、とは思っているのですが、
オンラインでバックアップが取れない事が問題です。
postgreSQL7系という事もありますので、PITRも利用出来ません。
何かバックアップで他の代替方法等ご存知の方がいらっしゃいましたら、
お知恵を拝借頂けましたら幸いです。
(最悪、コールドバックアップしかない事は覚悟しております)


宜しくお願い致します。


pgsql-jp メーリングリストの案内