[pgsql-jp: 36131] pg_clogディレクトリ内のログが参照できずPANIC
松本 貴臣
matsumoto_takaomi @ nextcom.co.jp
2005年 10月 14日 (金) 08:45:59 JST
お世話になります。
松本と申します。
readhat+java+postgres7.3.6を採用し、システムを構築しております。
以下のようなPANICが頻発し、原因の究明にあたっておりますが、
行き詰まっております。
皆様のお知恵を頂戴できればと思い、ご連絡差し上げた次第でございます。
[事象]
以下のようなログが発生し、コネクションがクローズされる。
=========================================================
postgres[28659]: [3] PANIC: open of /home/postgres/data/pg_clog/001B
failed: そのようなファイルやディレクトリはありません
postgres[28659]: [4] LOG: statement: select * from pgstattuple('tbl')
postgres[2279]: [1] LOG: server process (pid 28659) was terminated by
signal 6
postgres[2279]: [2] LOG: terminating any other active server processes
postgres[28026]: [1202-1] WARNING: Message from PostgreSQL backend:
(省略)
=========================================================
ログを遡ってみたところ、pg_clogの001Bは、以下のような形で
removeされているようでした。
========================================================
postgres[28582]: [106] LOG: recycled transaction log file 0000028600000075
postgres[28582]: [107] LOG: recycled transaction log file 0000028600000074
postgres[28582]: [108] LOG: removing commit log file 001B
========================================================
今回の処理としては、rootのcronにて以下のような形にしております。
(vacuumの前後の状態を統計として蓄積しようとしております)
========================================================
1. select * from pgstattuple('tbl')
2. vacuumdb -a -U postgres -z
3. select * from pgstattuple('tbl')
========================================================
「1」では問題なく処理が終了しているようです。
「2」の最後で「removing commit log file 001B」されているようでした。
「3」1とまったく同じ処理を実施しているのですが、
以下のようなログが出力されました。
(長くなってしまい、申し訳ございません。)
以後、PANICが出るたびにコネクションが切断され、
サービスに影響が出てしまっております。
vacuum処理の中で、実施された「removing commit log file 001B」が
原因と考えておりますが、色々とWebで検索したもののこれといったものが
見つからず、投稿させていただいた次第です。
(postgresの後の[ ]内のPIDからvacuumが実施した処理で
あると考えております。)
[ご質問]
* これはDBが破損した状態で復旧はできないのでしょうか。
* pg_resetxlogを使えるかもと思ったのですが、
「最後の手段」的な記述があったため、躊躇っております。
他にリスクの低い方法はございませんでしょうか。
* なぜ消してはいけないと思われるコミットログをvacuumは
消してしまったのでしょうか・・・・。
乱文で申し訳ございません。
以上、よろしくお願いいたします。
========================================================
postgres[28659]: [3] PANIC: open of /home/postgres/data/pg_clog/001B
failed: そのようなファイルやディレクトリはありません
postgres[28659]: [4] LOG: statement: select * from pgstattuple('tbl')
postgres[2279]: [1] LOG: server process (pid 28659) was terminated by
signal 6
postgres[2279]: [2] LOG: terminating any other active server processes
postgres[28026]: [1202-1] WARNING: Message from PostgreSQL backend:
postgres[28026]: [1202-2] ^IThe Postmaster has informed me that some
other backend
postgres[28026]: [1202-3] ^Idied abnormally and possibly corrupted
shared memory.
postgres[28026]: [1202-4] ^II have rolled back the current transaction
and am
postgres[28026]: [1202-5] ^Igoing to terminate your database system
connection and exit.
postgres[28663]: [3] LOG: connection received: host=[local]
postgres[28177]: [392-1] WARNING: Message from PostgreSQL backend:
postgres[28027]: [586-1] WARNING: Message from PostgreSQL backend:
postgres[28172]: [388-1] WARNING: Message from PostgreSQL backend:
postgres[28026]: [1202-6] ^IPlease reconnect to the database system and
repeat your query.
postgres[28663]: [4] FATAL: The database system is in recovery mode
postgres[28177]: [392-2] ^IThe Postmaster has informed me that some
other backend
postgres[28027]: [586-2] ^IThe Postmaster has informed me that some
other backend
postgres[28172]: [388-2] ^IThe Postmaster has informed me that some
other backend
postgres[28667]: [3] LOG: connection received: host=[local]
postgres[28177]: [392-3] ^Idied abnormally and possibly corrupted shared
memory.
postgres[28027]: [586-3] ^Idied abnormally and possibly corrupted shared
memory.
postgres[28172]: [388-3] ^Idied abnormally and possibly corrupted shared
memory.
postgres[28667]: [4] FATAL: The database system is in recovery mode
postgres[28177]: [392-4] ^II have rolled back the current transaction and am
postgres[28027]: [586-4] ^II have rolled back the current transaction and am
postgres[28172]: [388-4] ^II have rolled back the current transaction and am
postgres[28671]: [3] LOG: connection received: host=[local]
postgres[28177]: [392-5] ^Igoing to terminate your database system
connection and exit.
postgres[28027]: [586-5] ^Igoing to terminate your database system
connection and exit.
postgres[28172]: [388-5] ^Igoing to terminate your database system
connection and exit.
postgres[28671]: [4] FATAL: The database system is in recovery mode
postgres[28177]: [392-6] ^IPlease reconnect to the database system and
repeat your query.
postgres[28027]: [586-6] ^IPlease reconnect to the database system and
repeat your query.
postgres[28172]: [388-6] ^IPlease reconnect to the database system and
repeat your query.
postgres[28675]: [3] LOG: connection received: host=[local]
postgres[28675]: [4] FATAL: The database system is in recovery mode
postgres[28679]: [3] LOG: connection received: host=[local]
postgres[28679]: [4] FATAL: The database system is in recovery mode
postgres[28683]: [3] LOG: connection received: host=[local]
postgres[28683]: [4] FATAL: The database system is in recovery mode
postgres[28687]: [3] LOG: connection received: host=[local]
postgres[28687]: [4] FATAL: The database system is in recovery mode
postgres[28691]: [3] LOG: connection received: host=[local]
postgres[28691]: [4] FATAL: The database system is in recovery mode
postgres[28695]: [3] LOG: connection received: host=[local]
postgres[28695]: [4] FATAL: The database system is in recovery mode
postgres[2279]: [3] LOG: all server processes terminated;
reinitializing shared memory and semaphores
postgres[28700]: [4] LOG: connection received: host=[local]
postgres[28700]: [5] FATAL: The database system is starting up
postgres[28704]: [4] LOG: connection received: host=[local]
postgres[28704]: [5] FATAL: The database system is starting up
postgres[28699]: [4] LOG: database system was interrupted at 2005-10-13
01:09:16 JST
postgres[28699]: [5] LOG: checkpoint record is at 286/77FD28A4
postgres[28699]: [6] LOG: redo record is at 286/77FD28A4; undo record
is at 0/0; shutdown FALSE
postgres[28699]: [7] LOG: next transaction id: 29988496; next oid: 3678961
postgres[28699]: [8] LOG: database system was not properly shut down;
automatic recovery in progress
postgres[28708]: [4] LOG: connection received: host=[local]
postgres[28708]: [5] FATAL: The database system is starting up
postgres[28712]: [4] LOG: connection received: host=[local]
postgres[28712]: [5] FATAL: The database system is starting up
postgres[28716]: [4] LOG: connection received: host=[local]
postgres[28699]: [9] LOG: ReadRecord: record with zero length at
286/77FD28E4
postgres[28716]: [5] FATAL: The database system is starting up
postgres[28699]: [10] LOG: redo is not required
postgres[28720]: [4] LOG: connection received: host=[local]
postgres[28720]: [5] FATAL: The database system is starting up
postgres[28724]: [4] LOG: connection received: host=[local]
postgres[28724]: [5] FATAL: The database system is starting up
postgres[28728]: [4] LOG: connection received: host=[local]
postgres[28728]: [5] FATAL: The database system is starting up
postgres[28732]: [4] LOG: connection received: host=[local]
postgres[28732]: [5] FATAL: The database system is starting up
postgres[28736]: [4] LOG: connection received: host=[local]
postgres[28736]: [5] FATAL: The database system is starting up
postgres[28740]: [4] LOG: connection received: host=[local]
postgres[28740]: [5] FATAL: The database system is starting up
postgres[28744]: [4] LOG: connection received: host=[local]
postgres[28744]: [5] FATAL: The database system is starting up
postgres[28748]: [4] LOG: connection received: host=[local]
postgres[28748]: [5] FATAL: The database system is starting up
postgres[28752]: [4] LOG: connection received: host=[local]
postgres[28752]: [5] FATAL: The database system is starting up
postgres[28756]: [4] LOG: connection received: host=[local]
postgres[28756]: [5] FATAL: The database system is starting up
postgres[28760]: [4] LOG: connection received: host=[local]
postgres[28760]: [5] FATAL: The database system is starting up
postgres[28764]: [4] LOG: connection received: host=[local]
postgres[28764]: [5] FATAL: The database system is starting up
postgres[28699]: [11] LOG: recycled transaction log file 0000028600000076
postgres[28699]: [12] LOG: database system is ready
========================================================
pgsql-jp メーリングリストの案内