[pgcluster: 730] Recovery fails

山本 翔 yamamoto_90jp @ yahoo.co.jp
Fri, 11 Mar 2005 19:09:06 JST


Hello, my name is Yamamoto.

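I downloaded PGCluster 1.1.1a and am evaluating its recovery mode on a
Linux 2.6.9 kernel in the environment described below.
When I start one of the two cluster DBs in recovery mode, errors appear in
the logs of both the cluster DB and the replication server, and the node
does not recover correctly.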

I have not been able to determine the cause, so I would appreciate any advice.

Replication servers   2
Load balancers        2
Cluster DBs           2

----(Cluster DB configuration)
#
# Replicate Server Information
#
<Replicate_Server_Info>
    <Host_Name>replicate1</Host_Name>
    <Port>8001</Port>
    <Recovery_Port>8101</Recovery_Port>
    <LifeCheck_Port>8201</LifeCheck_Port>
</Replicate_Server_Info>
<Replicate_Server_Info>
    <Host_Name>replicate2</Host_Name>
    <Port>8001</Port>
    <Recovery_Port>8101</Recovery_Port>
    <LifeCheck_Port>8201</LifeCheck_Port>
</Replicate_Server_Info>
#
# Cluster Server Information
#
<Recovery_Port>7101</Recovery_Port>
<LifeCheck_Port>7201</LifeCheck_Port>
<Rsync_Path>/usr/local/bin/rsync</Rsync_Path>
<Rsync_Option>/usr/local/openssh/bin/ssh -1</Rsync_Option>
<When_Stand_Alone>read_write</When_Stand_Alone>
<Status_Log_File>/tmp/cluster.sts</Status_Log_File>
<Error_Log_File>/tmp/cluster.log</Error_Log_File>

----(Cluster DB log)

$ /usr/local/pgsql/bin/pg_ctl start -o "-R" -D /usr/local/pgsql/data
Start in recovery mode!
Please wait until a data synchronization finishes from Master DB...
postmaster successfully started
1st recovery step of [global] directory...OK
1st recovery step of [base] directory...OK
1st recovery step of [pg_clog] directory...OK
1st recovery step of [pg_xlog] directory...OK
2nd recovery step of [global] directory...OK
2nd recovery step of [base] directory...OK
2nd recovery step of [pg_clog] directory...OK
2nd recovery step of [pg_xlog] directory...OK
LOG:  could not bind IPv4 socket: Address already in use
HINT:  Is another postmaster already running on port 5432? If not, wait a few seconds and retry.
LOG:  database system was interrupted at 2005-03-11 18:55:31 JST
LOG:  could not open file "/usr/local/pgsql/data/pg_xlog/0000000000000001" (log file 0, segment 1): No such file or directory
LOG:  invalid primary checkpoint record
LOG:  could not open file "/usr/local/pgsql/data/pg_xlog/0000000000000001" (log file 0, segment 1): No such file or directory
LOG:  invalid secondary checkpoint record
PANIC:  could not locate a valid checkpoint record
LOG:  startup process (PID 1737) was terminated by signal 6
LOG:  aborting startup due to startup process failure
rename "/usr/local/pgsql/data/base/1/.16612.bqPWJV" -> "base/1/16612": No such file or directory
rename "/usr/local/pgsql/data/pg_xlog/.0000000000000001.PCXXNZ" -> "pg_xlog/0000000000000001": No such file or directory
rsync error: some files could not be transferred (code 23) at main.c(1048)
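
The first error in the log above is "could not bind IPv4 socket: Address already in use", with a hint asking whether another postmaster is already running on port 5432. The post does not say what, if anything, was holding the port. The following is only a minimal sketch of one way to check from the recovering host whether a TCP port is already bound. The function name `port_in_use` and the default address are assumptions made for this illustration and are not part of PGCluster or PostgreSQL.

```python
import errno
import socket


def port_in_use(port, host="0.0.0.0"):
    """Return True if binding a TCP socket to (host, port) fails with EADDRINUSE.

    Hypothetical helper for illustration. The probe does not set SO_REUSEADDR,
    so it can also report True while old connections on the port are still
    lingering in TIME_WAIT.
    """
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        probe.bind((host, port))
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return True
        raise  # any other failure (e.g. permission denied) is not "in use"
    finally:
        probe.close()
    return False


if __name__ == "__main__":
    # 5432 is the port named in the HINT line of the log above.
    print("port 5432 in use:", port_in_use(5432))
```

Running this just before `pg_ctl start` would show whether some process already holds the port. It cannot tell which process that is.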


----(Replication server configuration)
#
# Cluster Server Information
#
<Cluster_Server_Info>
    <Host_Name>clusterdb1</Host_Name>
    <Port>5432</Port>
    <Recovery_Port>7101</Recovery_Port>
    <LifeCheck_Port>7201</LifeCheck_Port>
</Cluster_Server_Info>
<Cluster_Server_Info>
    <Host_Name>clusterdb2</Host_Name>
    <Port>5432</Port>
    <Recovery_Port>7101</Recovery_Port>
    <LifeCheck_Port>7201</LifeCheck_Port>
</Cluster_Server_Info>
#
# Replicate Server Information
#
#
# LoadBalancer Server Information
#
<LoadBalance_Server_Info>
    <Host_Name>certification1</Host_Name>
    <Recovery_Port>6101</Recovery_Port>
    <LifeCheck_Port>6201</LifeCheck_Port>
</LoadBalance_Server_Info>
<LoadBalance_Server_Info>
    <Host_Name>certification2</Host_Name>
    <Recovery_Port>6101</Recovery_Port>
    <LifeCheck_Port>6201</LifeCheck_Port>
</LoadBalance_Server_Info>
#
# Replicate Server Information
#
<Status_Log_File>/tmp/pgreplicate.sts</Status_Log_File>
<Error_Log_File>/tmp/pgreplicate.log</Error_Log_File>
<Replication_Port>8001</Replication_Port>
<Recovery_Port>8101</Recovery_Port>
<LifeCheck_Port>8201</LifeCheck_Port>
<RLOG_Port>8301</RLOG_Port>
<Response_Mode>normal</Response_Mode>
<Use_Replication_Log>no</Use_Replication_Log>
<Reserved_Connections>1</Reserved_Connections>
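
Since the cluster DB and the replication server each list the other's ports, one thing that can be checked mechanically is whether the two files agree. The sketch below does this for abridged copies of the fragments quoted in this post. It rests on assumptions: that the top-level `Recovery_Port` and `LifeCheck_Port` in each file are that server's own listening ports, that every cluster DB uses the same settings (only one cluster DB file is shown), and that the files can be read as XML once `#` comment lines are dropped. The helper names `parse_conf` and `find_mismatches` are invented for this illustration.

```python
import xml.etree.ElementTree as ET

# Abridged copies of the two configuration fragments quoted in this post.
CLUSTER_CONF = """
# Replicate Server Information
<Replicate_Server_Info>
    <Host_Name>replicate1</Host_Name>
    <Port>8001</Port>
    <Recovery_Port>8101</Recovery_Port>
    <LifeCheck_Port>8201</LifeCheck_Port>
</Replicate_Server_Info>
# Cluster Server Information
<Recovery_Port>7101</Recovery_Port>
<LifeCheck_Port>7201</LifeCheck_Port>
"""

REPLICATE_CONF = """
# Cluster Server Information
<Cluster_Server_Info>
    <Host_Name>clusterdb2</Host_Name>
    <Port>5432</Port>
    <Recovery_Port>7101</Recovery_Port>
    <LifeCheck_Port>7201</LifeCheck_Port>
</Cluster_Server_Info>
# Replicate Server Information
<Replication_Port>8001</Replication_Port>
<Recovery_Port>8101</Recovery_Port>
<LifeCheck_Port>8201</LifeCheck_Port>
"""


def parse_conf(text):
    """Drop '#' comment lines and wrap the rest in a synthetic root element."""
    lines = [ln for ln in text.splitlines() if not ln.lstrip().startswith("#")]
    return ET.fromstring("<conf>" + "\n".join(lines) + "</conf>")


def find_mismatches(cluster_text, replicate_text):
    """Return (host, tag) pairs where one file disagrees with the other."""
    cluster = parse_conf(cluster_text)
    replicate = parse_conf(replicate_text)
    problems = []
    # The cluster DB's own ports vs. what the replication server lists for it.
    for tag in ("Recovery_Port", "LifeCheck_Port"):
        own = cluster.findtext(tag)  # direct children of the root only
        for node in replicate.findall("Cluster_Server_Info"):
            if node.findtext(tag) != own:
                problems.append((node.findtext("Host_Name"), tag))
    # The replication server's own ports vs. what the cluster DB lists for it.
    pairs = (
        ("Replication_Port", "Port"),
        ("Recovery_Port", "Recovery_Port"),
        ("LifeCheck_Port", "LifeCheck_Port"),
    )
    for own_tag, ref_tag in pairs:
        own = replicate.findtext(own_tag)
        for node in cluster.findall("Replicate_Server_Info"):
            if node.findtext(ref_tag) != own:
                problems.append((node.findtext("Host_Name"), ref_tag))
    return problems


mismatches = find_mismatches(CLUSTER_CONF, REPLICATE_CONF)
print(mismatches)
```

For the values quoted in this post the list comes back empty, so under the stated assumptions the port numbers in the two files agree with each other.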


----(Replication server log)
DEBUG:replicate_main():replicate main 8001 port bind OK
DEBUG:PGRreplicate_packet_send():cmdSts=N
DEBUG:PGRreplicate_packet_send():cmdType=
DEBUG:PGRreplicate_packet_send():rlog=0
DEBUG:PGRreplicate_packet_send():request_id=0
DEBUG:PGRreplicate_packet_send():replicate_id=0
DEBUG:PGRreplicate_packet_send():port=0
DEBUG:PGRreplicate_packet_send():pid=0
DEBUG:PGRreplicate_packet_send():from_host=replicate1
DEBUG:PGRreplicate_packet_send():dbName=template1
DEBUG:PGRreplicate_packet_send():userName=postgres
DEBUG:PGRreplicate_packet_send():recieve sec=0
DEBUG:PGRreplicate_packet_send():recieve usec=0
DEBUG:PGRreplicate_packet_send():query_size=65
DEBUG:PGRreplicate_packet_send():query=SELECT PGR_SYSTEM_COMMAND_FUNCTION(1,'replicate1',8001,8101,8201)
DEBUG:sem_lock[1]
DEBUG:pgr_createConn():PQsetdbLogin host[clusterdb1] port[5432] db[template1] user[postgres]
DEBUG:pgr_createConn():PQsetdbLogin host[clusterdb2] port[5432] db[template1] user[postgres]
ERROR:pgr_createConn():PQsetdbLogin failed. close socket
ERROR:pgr_createConn():PQsetdbLogin failed. close socket
ERROR:pgr_createConn():PQsetdbLogin failed. close socket
ERROR:pgr_createConn():PQsetdbLogin failed. close socket
ERROR:pgr_createConn():PQsetdbLogin failed. close socket
ERROR:pgr_createConn():PQsetdbLogin  timeout
ERROR:setTransactionTbl():New Transaction but pgr_createConn5432 @ clusterdb2 failed
DEBUG:deleteTransactionTbl(): getTransactionTbl failed
DEBUG:pgr_createConn():PQsetdbLogin ok
DEBUG:sem_unlock[1]
DEBUG:pgrecovery_loop():receive packet no:1
DEBUG:first_setup_recovery():1st setup target clusterdb2
DEBUG:first_setup_recovery():1st setup port 5432
DEBUG:pgr_createConn():PQsetdbLogin host[clusterdb1] port[5432] db[template1] user[postgres]
DEBUG:pgr_createConn():PQsetdbLogin ok
DEBUG:send_sync_data():sync_command(SELECT PGR_SYSTEM_COMMAND_FUNCTION(3,0,0,0,1) )
DEBUG:pgrecovery_loop():1st master clusterdb1 - 5432
DEBUG:pgrecovery_loop():1st target clusterdb2 - 5432
DEBUG:pgrecovery_loop():receive packet no:5
DEBUG:send_sync_data():sync_command(SELECT PGR_SYSTEM_COMMAND_FUNCTION(3,0,0,0,1) )
DEBUG:pgrecovery_loop():2nd master clusterdb1 - 5432
DEBUG:pgrecovery_loop():2nd target clusterdb2 - 5432
DEBUG:pgrecovery_loop():second_setup_recovery end :1
DEBUG:pgrecovery_loop():receive packet no:9
DEBUG:pgrecovery_loop():last master clusterdb1 - 5432
DEBUG:pgrecovery_loop():last target clusterdb2 - 5432
DEBUG:PGRsend_queue():master clusterdb1 - 5432
DEBUG:PGRsend_queue():target clusterdb2 - 5432
ERROR:PGRget_recovery_queue_file_for_read():could not open recovery queue file as /usr/local/pgsql/tmp/.pgr_recovery.1. reason: No such file or directory
DEBUG:pgrecovery_loop():PGRsend_queue ok
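
Most of the replication server log above is DEBUG output, which makes the ERROR lines easy to overlook. The sketch below pulls out only the ERROR lines and counts them per reporting function. It assumes that each log entry sits on a single line in the `LEVEL:function():message` form seen above. The sample text is an abridged copy of lines from this post, and the name `count_errors` is invented for this illustration.

```python
import re
from collections import Counter

# Matches lines such as "ERROR:pgr_createConn():PQsetdbLogin  timeout".
ERROR_LINE = re.compile(r"^ERROR:(\w+)\(\):(.*)$")


def count_errors(log_text):
    """Count ERROR lines per reporting function; other lines are ignored."""
    counts = Counter()
    for line in log_text.splitlines():
        match = ERROR_LINE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


# Abridged sample taken from the replication server log in this post.
SAMPLE_LOG = """\
DEBUG:pgr_createConn():PQsetdbLogin host[clusterdb2] port[5432] db[template1] user[postgres]
ERROR:pgr_createConn():PQsetdbLogin failed. close socket
ERROR:pgr_createConn():PQsetdbLogin failed. close socket
ERROR:pgr_createConn():PQsetdbLogin  timeout
ERROR:setTransactionTbl():New Transaction but pgr_createConn5432 @ clusterdb2 failed
DEBUG:pgr_createConn():PQsetdbLogin ok
ERROR:PGRget_recovery_queue_file_for_read():could not open recovery queue file as /usr/local/pgsql/tmp/.pgr_recovery.1. reason: No such file or directory
DEBUG:pgrecovery_loop():PGRsend_queue ok
"""

summary = count_errors(SAMPLE_LOG)
for func, n in summary.most_common():
    print(f"{n:3d}  {func}")
```

The counts only show where errors were reported. They do not explain why the connection to clusterdb2 failed or why the recovery queue file was missing.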





