[pgcluster: 340] 1.0.7RC2が上手く動いてくれません。
Kouji Hamada
hamada @ hamlock.jp
2004年 6月 9日 (水) 19:15:23 JST
浜田と申します。長い文章になってしまってすみません。
今まで1.0.6bフルパッケージ版を使用していて正常稼動していたのですが
LOCK管理に問題があるという事だったので、今回から最新版の1.0.7RC2フ
ルパッケージ版をインストールして動かそうとしてるのですが、psql -l
を入力して結果が返ってくるまで10分程かかったりと、上手く動いてく
れません。
【構成及び設定】
クラスタサーバ:2台
レプリケーションサーバ:1台
PGClusterバージョン:1.0.7RC2フルパッケージ版
OSバージョン:RedHat7.3(3台ともOSインストールからやりました)
IPアドレスは以下のように振っています。
192.168.0.210 test0.aaa.jp(レプリケーションサーバ)
192.168.0.211 test1.aaa.jp(クラスタサーバ(マスタDB))
192.168.0.212 test2.aaa.jp(クラスタサーバ(2号機))
クラスタサーバ2台
/usr/local/pgsql/data/cluster.confの内容は以下の通りです。
クラスタサーバ2台同じ設定内容です。
<Replicate_Server_Info>
<Host_Name> test0.aaa.jp </Host_Name>
<Port> 8777 </Port>
<Recovery_Port> 7778 </Recovery_Port>
</Replicate_Server_Info>
#<Replicate_Server_Info>
# <Host_Name> replicate2.postgres.jp </Host_Name>
# <Port> 8777 </Port>
# <Recovery_Port> 7778 </Recovery_Port>
#</Replicate_Server_Info>
#<Replicate_Server_Info>
# <Host_Name> replicate3.postgres.jp </Host_Name>
# <Port> 8777 </Port>
# <Recovery_Port> 7778 </Recovery_Port>
#</Replicate_Server_Info>
<Recovery_Port> 7779 </Recovery_Port>
<Rsync_Path> /usr/bin/rsync </Rsync_Path>
<Rsync_Option> ssh -1 </Rsync_Option>
<When_Stand_Alone> read_only </When_Stand_Alone>
<Not_Replicate_Info>
<DB_Name> test_db </DB_Name>
<Table_Name> log_table </Table_Name>
</Not_Replicate_Info>
/usr/local/pgsql/data/pg_hda.confの内容は以下の通りです。
クラスタサーバ2台同じ設定内容です。
local all all trust
host all all 127.0.0.1 255.255.255.255 trust
host all all 192.168.0.0 255.255.255.0 trust
/usr/local/pgsql/data/postgresql.confの内容は以下の通りです。
クラスタサーバ2台同じ設定内容です。
tcpip_socket = true
silent_mode = true
レプリケーションサーバの設定
/usr/local/pgsql/data/pgreplicate.confの内容は以下の通りです。
<Cluster_Server_Info>
<Host_Name> test1.aaa.jp </Host_Name>
<Port> 5432 </Port>
<Recovery_Port> 7779 </Recovery_Port>
</Cluster_Server_Info>
<Cluster_Server_Info>
<Host_Name> test2.aaa.jp </Host_Name>
<Port> 5432 </Port>
<Recovery_Port> 7779 </Recovery_Port>
</Cluster_Server_Info>
#<Cluster_Server_Info>
# <Host_Name> cluster3.postgres.jp </Host_Name>
# <Port> 5432 </Port>
# <Recovery_Port> 7779 </Recovery_Port>
#</Cluster_Server_Info>
#<LoadBalance_Server_Info>
# <Host_Name> loadbalancer.postgres.jp </Host_Name>
# <Recovery_Port> 7780 </Recovery_Port>
#</LoadBalance_Server_Info>
#<Replicate_Server_Info>
# <Host_Name> test0.aaa.jp </Host_Name>
# <Port> 8777 </Port>
# <Recovery_Port> 7778 </Recovery_Port>
#</Replicate_Server_Info>
<Replication_Port> 8777 </Replication_Port>
<Recovery_Port> 7778 </Recovery_Port>
<Response_Mode> normal </Response_Mode>
/usr/local/pgsql/data/pg_hda.confの内容は変更していません。
local all all trust
host all all 127.0.0.1 255.255.255.255 trust
host all all 192.168.0.0 255.255.255.0 trust
/usr/local/pgsql/data/postgresql.confの内容は以下の通りです。
tcpip_socket = true
silent_mode = true
rsyncの鍵もサーバー間で交換済みですパスワード無しで相互にファイル交換
出来るか動作確認済みなのでFQDN形式による名前の解決は出来てるように見えます。
(rsync コマンドを叩いてテストしました。)
【発生した現象1】
1.0.6bの頃と同じようにレプリケーションサーバー、クラスタサーバ(マスタDB)
クラスタサーバ(2号機)の順に立ち上げると、レプリケーションサーバーの
pgreplicate.stsにエラーが出ます。
Tue Jun 9 14:11:46 2004 port(5432) host:test1.aaa.jp start use
Tue Jun 9 14:11:46 2004 port(5432) host:test2.aaa.jp start use
Tue Jun 9 14:11:46 2004 cascade(test0.aaa.jp) port(8777) start use
Tue Jun 9 14:11:46 2004 cascade(test0.aaa.jp) port(8777) become top
Tue Jun 9 14:11:46 2004 port(5432) host:test1.aaa.jp error
Tue Jun 9 14:11:46 2004 port(5432) host:test2.aaa.jp error
1.0.7RC2は立ち上げ方が違うんでしょうか。それとも設定方法に誤りがあるので
しょうか。
【発生した現象2】
【発生した現象1】のエラーを回避する為に、クラスタサーバ(マスタDB)
クラスタサーバ(2号機)、レプリケーションサーバーの順に立ち上げます。
レプリケーションサーバーのpgreplicate.stsもOKです。
Wed Jun 9 14:45:01 2004 port(5432) host:test1.aaa.jp start use
Wed Jun 9 14:45:01 2004 port(5432) host:test2.aaa.jp start use
Wed Jun 9 14:45:01 2004 cascade(test0.aaa.jp) port(8777) start use
Wed Jun 9 14:45:01 2004 cascade(test0.aaa.jp) port(8777) become top
マスターDBのコンソールでpsql -lと入力して結果が返ってくるまで10分程
掛かります。クラスタ2台とも同じ動作をします。
List of databases
Name | Owner | Encoding
-----------+----------+----------
template0 | postgres | UNICODE
template1 | postgres | UNICODE
(2 rows)
レプリケーションサーバーのpgreplicate.logです。
Wed Jun 9 14:15:31 2004 PQsetdbLogin failed. close socket!!
Wed Jun 9 14:15:31 2004 PQsetdbLogin failed. close socket!!
Wed Jun 9 14:15:31 2004 PQsetdbLogin failed. close socket!!
Wed Jun 9 14:15:31 2004 PQsetdbLogin failed. close socket!!
Wed Jun 9 14:15:31 2004 PQsetdbLogin failed. close socket!!
Wed Jun 9 14:15:31 2004 dbPersistLogin timeout
Wed Jun 9 14:15:31 2004 New Transaction but pgr_createConn failed
Wed Jun 9 14:15:31 2004 setTransactionTbl failed
Wed Jun 9 14:15:31 2004 PQsetdbLogin failed. close socket!!
Wed Jun 9 14:15:31 2004 PQsetdbLogin failed. close socket!!
Wed Jun 9 14:15:31 2004 PQsetdbLogin failed. close socket!!
Wed Jun 9 14:15:31 2004 PQsetdbLogin failed. close socket!!
Wed Jun 9 14:15:31 2004 PQsetdbLogin failed. close socket!!
Wed Jun 9 14:15:31 2004 dbPersistLogin timeout
Wed Jun 9 14:15:31 2004 New Transaction but pgr_createConn failed
Wed Jun 9 14:15:31 2004 setTransactionTbl failed
Wed Jun 9 14:25:58 2004 session closed
Wed Jun 9 15:15:01 2004 session closed
Wed Jun 9 15:25:01 2004 session closed
Wed Jun 9 15:35:01 2004 session closed
Wed Jun 9 15:45:00 2004 session closed
Wed Jun 9 15:45:01 2004 session closed
Wed Jun 9 15:55:02 2004 session closed
Wed Jun 9 16:05:02 2004 session closed
Wed Jun 9 17:26:40 2004 session closed
Wed Jun 9 17:36:40 2004 session closed
Wed Jun 9 17:36:40 2004 session closed
Wed Jun 9 17:46:40 2004 session closed
[postgres @ test0 data]$ pgreplicate -D /usr/local/pgsql/data -W /usr/local/pgsql/
data -l -n -v
DEBUG(init_server_tbl): /usr/local/pgsql/data/pgreplicate.log open ok
DEBUG(init_server_tbl): /usr/local/pgsql/data/pgreplicate.sts open ok
DEBUG(init_server_tbl): PGR_Get_Conf_Data ok
DEBUG(init_server_tbl): LoadBalanceTbl allocate ok
DEBUG(init_server_tbl): CascadeTbl shmget ok
DEBUG(init_server_tbl): CascadeTbl shmat ok
DEBUG(init_server_tbl): CascadeInf shmget ok
DEBUG(init_server_tbl): CascadeInf shmat ok
DEBUG(init_server_tbl): CommitLog shmget ok
DEBUG(init_server_tbl): Commit_Log_Tbl shmat ok
DEBUG(init_server_tbl): Conf data read ok
DEBUG(init_server_tbl): HostTbl shmget ok
DEBUG(init_server_tbl): HostTbl shmat ok
DEBUG(write_log_file): LockWaitTbl shmget ok
DEBUG(write_log_file): LockWaitTbl shmat ok
DEBUG(replicate_main): replicate main 8777 port bind OK
DEBUG(PGRreplicate_packet_send): cmdSts=N
DEBUG(PGRreplicate_packet_send): cmdType=
DEBUG(PGRreplicate_packet_send): port=0
DEBUG(PGRreplicate_packet_send): pid=0
DEBUG(PGRreplicate_packet_send): except_host=
DEBUG(PGRreplicate_packet_send): from_host=test0.aaa.jp
DEBUG(PGRreplicate_packet_send): dbName=template1
DEBUG(PGRreplicate_packet_send): userName=postgres
DEBUG(PGRreplicate_packet_send): recieve sec=0
DEBUG(PGRreplicate_packet_send): recieve usec=0
DEBUG(PGRreplicate_packet_send): query_size=62
DEBUG(PGRreplicate_packet_send): query=SELECT PGR_SYSTEM_COMMAND_FUNCTION(1,'tes
t0.aaa.jp',8777,7778)
DEBUG(PGRreplicate_packet_send): useFlag[2]
DEBUG(PGRis_same_host): not same host:
DEBUG(is_need_response): sem_lock[1]
DEBUG(PGRreplicate_packet_send_each_server): except:0@ host:5432 @ test1.aaa.jp
DEBUG(PGRreplicate_packet_send_each_server): send replicate to:test1.aaa.jp
DEBUG(PGRsend_replicate_packet_to_server): host(test1.aaa.jp) : port(5432)
DEBUG(getTransactionTbl): not found in getTransactionTbl
DEBUG(pgr_createConn): PQsetdbLogin host[test1.aaa.jp] port[5432] db[template1]
user[postgres]
DEBUG(PGRrecovery_main): PGRrecovery_main bind port 7778
DEBUG(pgr_createConn): PQsetdbLogin ok!!
DEBUG(insertTransactionTbl): db:template1 port:5432 user:postgres host:test1.aaa
.jp query:SELECT PGR_SYSTEM_COMMAND_FUNCTION(1,'test0.aaa.jp',8777,7778)
DEBUG(insertTransactionTbl): sem_lock[2]
DEBUG(PGRsem_lock): sem_unlock[1]
DEBUG(getTransactionTbl): hit !! transaction tbl host test1.aaa.jp db:template1
pid:0
DEBUG(PGRis_same_host): not same host:
DEBUG(PGRreplicate_packet_send_each_server): except:0@ host:5432 @ test2.aaa.jp
DEBUG(PGRreplicate_packet_send_each_server): send replicate to:test2.aaa.jp
DEBUG(PGRsend_replicate_packet_to_server): host(test2.aaa.jp) : port(5432)
DEBUG(getTransactionTbl): not found in getTransactionTbl
DEBUG(pgr_createConn): PQsetdbLogin host[test2.aaa.jp] port[5432] db[template1]
user[postgres]
DEBUG(pgr_createConn): PQsetdbLogin ok!!
DEBUG(insertTransactionTbl): db:template1 port:5432 user:postgres host:test2.aaa
.jp query:SELECT PGR_SYSTEM_COMMAND_FUNCTION(1,'test0.aaa.jp',8777,7778)
DEBUG(insertTransactionTbl): sem_lock[3]
DEBUG(PGRsem_lock): sem_unlock[2]
DEBUG(getTransactionTbl): hit !! transaction tbl host test2.aaa.jp db:template1
pid:0
DEBUG(getTransactionTbl): sem_unlock[3]
DEBUG(PGRsem_unlock): PGRreplicate_packet_send end
DEBUG(PGRsem_unlock): wait replicate
DEBUG(PGRsem_unlock): replicate main: selected
DEBUG(replicate_loop): wait replicate
DEBUG(replicate_loop): replicate main: selected
DEBUG(replicate_loop): wait replicate
DEBUG(replicate_loop): replicate main: selected
DEBUG(replicate_loop): wait replicate
DEBUG(replicate_loop): replicate_loop selected
DEBUG(PGRread_packet): query size=31
DEBUG(PGRread_packet): read[31] query[SET client_encoding = 'UNICODE']
DEBUG(PGRread_packet): query :: SET client_encoding = 'UNICODE'
DEBUG(PGRreplicate_packet_send): cmdSts=Q
DEBUG(PGRreplicate_packet_send): cmdType=T
DEBUG(PGRreplicate_packet_send): port=5432
DEBUG(PGRreplicate_packet_send): pid=2481
DEBUG(PGRreplicate_packet_send): except_host=test1.aaa.jp
DEBUG(PGRreplicate_packet_send): from_host=test1.aaa.jp
DEBUG(PGRreplicate_packet_send): dbName=template1
DEBUG(PGRreplicate_packet_send): userName=postgres
DEBUG(PGRreplicate_packet_send): recieve sec=1086769000
DEBUG(PGRreplicate_packet_send): recieve usec=305849
DEBUG(PGRreplicate_packet_send): query_size=31
DEBUG(PGRreplicate_packet_send): query=SET client_encoding = 'UNICODE'
DEBUG(PGRreplicate_packet_send): useFlag[2]
DEBUG(PGRis_same_host): 5432 @ test1.aaa.jp return trigger
DEBUG(is_need_sync_time): sem_lock[1]
DEBUG(PGRreturn_result): PGRreturn_result[]
DEBUG(PGRreturn_result): 128 send
DEBUG(PGRreturn_result): status of PGRreturn_result[0]
DEBUG(PGRreturn_result): sem_lock[2]
DEBUG(PGRsem_lock): sem_unlock[1]
DEBUG(PGRreplicate_packet_send_each_server): except:5432 @ test1.aaa.jp host:5432@
test2.aaa.jp
DEBUG(PGRreplicate_packet_send_each_server): send replicate to:test2.aaa.jp
DEBUG(PGRsend_replicate_packet_to_server): host(test2.aaa.jp) : port(5432)
DEBUG(getTransactionTbl): not found in getTransactionTbl
DEBUG(pgr_createConn): PQsetdbLogin host[test2.aaa.jp] port[5432] db[template1]
user[postgres]
DEBUG(replicate_loop): replicate_loop selected
DEBUG(PGRread_packet): query size=31
DEBUG(PGRread_packet): read[31] query[SET client_encoding = 'UNICODE']
DEBUG(PGRread_packet): query :: SET client_encoding = 'UNICODE'
DEBUG(PGRreplicate_packet_send): cmdSts=Q
DEBUG(PGRreplicate_packet_send): cmdType=T
DEBUG(PGRreplicate_packet_send): port=5432
DEBUG(PGRreplicate_packet_send): pid=2483
DEBUG(PGRreplicate_packet_send): except_host=test1.aaa.jp
DEBUG(PGRreplicate_packet_send): from_host=test1.aaa.jp
DEBUG(PGRreplicate_packet_send): dbName=template1
DEBUG(PGRreplicate_packet_send): userName=postgres
DEBUG(PGRreplicate_packet_send): recieve sec=1086769000
DEBUG(PGRreplicate_packet_send): recieve usec=311010
DEBUG(PGRreplicate_packet_send): query_size=31
DEBUG(PGRreplicate_packet_send): query=SET client_encoding = 'UNICODE'
DEBUG(PGRreplicate_packet_send): useFlag[2]
DEBUG(PGRis_same_host): 5432 @ test1.aaa.jp return trigger
DEBUG(is_need_sync_time): sem_lock[1]
DEBUG(PGRreturn_result): PGRreturn_result[]
DEBUG(PGRreturn_result): 128 send
DEBUG(PGRreturn_result): status of PGRreturn_result[0]
DEBUG(PGRreturn_result): sem_lock[2]
DEBUG(replicate_loop): replicate_loop selected
DEBUG(PGRread_packet): query size=31
DEBUG(PGRread_packet): read[31] query[SET client_encoding = 'UNICODE']
DEBUG(PGRread_packet): query :: SET client_encoding = 'UNICODE'
DEBUG(PGRreplicate_packet_send): cmdSts=Q
DEBUG(PGRreplicate_packet_send): cmdType=T
DEBUG(PGRreplicate_packet_send): port=5432
DEBUG(PGRreplicate_packet_send): pid=2405
DEBUG(PGRreplicate_packet_send): except_host=test2.aaa.jp
DEBUG(PGRreplicate_packet_send): from_host=test2.aaa.jp
DEBUG(PGRreplicate_packet_send): dbName=template1
DEBUG(PGRreplicate_packet_send): userName=postgres
DEBUG(PGRreplicate_packet_send): recieve sec=1086769000
DEBUG(PGRreplicate_packet_send): recieve usec=313834
DEBUG(PGRreplicate_packet_send): query_size=31
DEBUG(PGRreplicate_packet_send): query=SET client_encoding = 'UNICODE'
DEBUG(PGRreplicate_packet_send): useFlag[2]
DEBUG(is_need_response): sem_lock[1]
DEBUG(replicate_loop): replicate main: selected
DEBUG(replicate_loop): wait replicate
DEBUG(replicate_loop): replicate_loop selected
DEBUG(PGRread_packet): query size=31
DEBUG(PGRread_packet): read[31] query[SET client_encoding = 'UNICODE']
DEBUG(PGRread_packet): query :: SET client_encoding = 'UNICODE'
DEBUG(PGRreplicate_packet_send): cmdSts=Q
DEBUG(PGRreplicate_packet_send): cmdType=T
DEBUG(PGRreplicate_packet_send): port=5432
DEBUG(PGRreplicate_packet_send): pid=2415
DEBUG(PGRreplicate_packet_send): except_host=test2.aaa.jp
DEBUG(PGRreplicate_packet_send): from_host=test2.aaa.jp
DEBUG(PGRreplicate_packet_send): dbName=template1
DEBUG(PGRreplicate_packet_send): userName=postgres
DEBUG(PGRreplicate_packet_send): recieve sec=1086769000
DEBUG(PGRreplicate_packet_send): recieve usec=321690
DEBUG(PGRreplicate_packet_send): query_size=31
DEBUG(PGRreplicate_packet_send): query=SET client_encoding = 'UNICODE'
DEBUG(PGRreplicate_packet_send): useFlag[2]
DEBUG(is_need_response): sem_lock[1]
DEBUG(replicate_loop): wait replicate
DEBUG(replicate_loop): wait replicate
DEBUG(replicate_loop): wait replicate
DEBUG(replicate_loop): wait replicate
DEBUG(replicate_loop): wait replicate
DEBUG(replicate_loop): wait replicate
DEBUG(replicate_loop): wait replicate
DEBUG(replicate_loop): wait replicate
DEBUG(replicate_loop): wait replicate
DEBUG(pgr_createConn): PQsetdbLogin ok!!
DEBUG(insertTransactionTbl): db:template1 port:5432 user:postgres host:test2.aaa
.jp query:SET client_encoding = 'UNICODE'
DEBUG(insertTransactionTbl): sync_command(SELECT PGR_SYSTEM_COMMAND_FUNCTION(3,1
086769000,305849) )
DEBUG(insertTransactionTbl): PQexec send :SET client_encoding = 'UNICODE'
DEBUG(insertTransactionTbl): PQexec end
DEBUG(getTransactionTbl): hit !! transaction tbl host test2.aaa.jp db:template1
pid:2481
DEBUG(getTransactionTbl): sem_lock[3]
DEBUG(PGRsem_lock): sem_unlock[2]
DEBUG(getTransactionTbl): sem_unlock[3]
DEBUG(PGRsem_unlock): PGRreplicate_packet_send end
DEBUG(PGRsem_unlock): replicate_loop selected
DEBUG(PGRread_packet): connection closed
ERROR(PGRread_packet): session closed
DEBUG(PGRsem_lock): sem_unlock[1]
DEBUG(PGRreplicate_packet_send_each_server): except:5432 @ test1.aaa.jp host:5432@
test2.aaa.jp
DEBUG(PGRreplicate_packet_send_each_server): send replicate to:test2.aaa.jp
DEBUG(PGRsend_replicate_packet_to_server): host(test2.aaa.jp) : port(5432)
DEBUG(getTransactionTbl): not found in getTransactionTbl
DEBUG(pgr_createConn): PQsetdbLogin host[test2.aaa.jp] port[5432] db[template1]
user[postgres]
DEBUG(child_wait): replicate main: selected
DEBUG(PGRreplicate_packet_send_each_server): except:5432 @ test2.aaa.jp host:5432@
test1.aaa.jp
DEBUG(PGRreplicate_packet_send_each_server): send replicate to:test1.aaa.jp
DEBUG(PGRsend_replicate_packet_to_server): host(test1.aaa.jp) : port(5432)
DEBUG(getTransactionTbl): not found in getTransactionTbl
DEBUG(pgr_createConn): PQsetdbLogin host[test1.aaa.jp] port[5432] db[template1]
user[postgres]
DEBUG(replicate_loop): wait replicate
DEBUG(replicate_loop): replicate main: selected
DEBUG(replicate_loop): wait replicate
DEBUG(replicate_loop): replicate_loop selected
この後も同じようなログがずっと繰り返し続いています。
この時点でも十分長いですが、これ以下は意図的にカットしました。
現時点ではpostgresqlのチューニング等一切行なっておりません。
各種設定ファイルは1.0.6bと同じような感じで設定しました。
1.0.7RC2になって新たに設定する項目などありましたら教えて下さい。
よろしくお願いします。
pgcluster メーリングリストの案内