[pgcluster: 258] Re: Replication is not working properly

miyazawa makoto miyazawa_pgcluster_mailling @ yahoo.co.jp
Tue Apr 20 17:29:22 JST 2004


This is Miyazawa in Tokyo.
Thank you for the quick reply.

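--- mitani <mitani @ sraw.co.jp> wrote:
> This is Mitani in Hiroshima.
>
> > (1) What is the root cause of records inserted on system 0
> > not being replicated to system 1?
> Judging from the log, the replication server was able to resolve
> the name of the system-1 server, but apparently could not resolve
> the name of the system-0 server.
>
> As a rule, please specify server names in FQDN form rather than
> as bare host names.
>
> When IPv6 is enabled on the network, connections fail with a
> name-resolution error, for reasons I do not fully understand
> (the underlying cause is that IPv6 is not supported). So please
> make sure IPv6 is turned OFF in the network settings.
>
> Also, it appears you tested with 1.0.6cv9. Several bugs have been
> fixed since then, so please also try the latest version, 1.0.6cv13.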

First, I confirmed with ifconfig that IPv6 is not running on the servers.

System 0
$ /sbin/ifconfig -a
lo0: flags=1000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
eri0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet xx.xx.xx.xx netmask ffffff00 broadcast xx.xx.xx.255

System 1
$ /sbin/ifconfig -a
lo0: flags=1000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
hme0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet xx.xx.xx.xx netmask ffffff00 broadcast xx.xx.xx.255
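
The same check can be scripted. The following is a minimal sketch, assuming Solaris-style `ifconfig -a` output like the above; the function name `ipv6_interfaces` and the sample text are illustrative only and are not part of pgcluster.

```python
import re
from typing import List

# Sample modelled on the system-0 output in this message.
SAMPLE = """\
lo0: flags=1000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
eri0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet xx.xx.xx.xx netmask ffffff00 broadcast xx.xx.xx.255
"""

def ipv6_interfaces(ifconfig_output: str) -> List[str]:
    """Return interfaces that carry an IPv6 flag or an inet6 address."""
    found: List[str] = []
    current = None
    for line in ifconfig_output.splitlines():
        # Interface header, e.g. "lo0: flags=1000849<UP,...,IPv4> mtu ..."
        header = re.match(r"^(\S+?):\s+flags=\w+<([^>]*)>", line)
        if header:
            current = header.group(1)
            if "IPv6" in header.group(2).split(",") and current not in found:
                found.append(current)
        elif current and line.strip().startswith("inet6") and current not in found:
            found.append(current)
    return found

print(ipv6_interfaces(SAMPLE))
```

An empty list means no interface in the pasted output reports IPv6.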

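I then changed the server names in the configuration files to FQDN form
and repeated yesterday's checks, starting the components in this order:

    (1) cluster DB on system 1
    (2) replication server on system 0
    (3) cluster DB on system 0

The same symptom as yesterday still occurs.
As Mitani-san pointed out, name resolution does not seem to be working.
I can no longer tell what the actual problem is, so I think I will try
the latest version.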
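
One way to test name resolution independently of pgreplicate is to ask the system resolver for IPv4 addresses only. This is a minimal sketch, assuming Python is available on the replication server host; `check_ipv4_resolution` is a hypothetical helper, and whether pgreplicate goes through the same resolver path is an assumption.

```python
import socket

def check_ipv4_resolution(hostnames, resolver=socket.getaddrinfo):
    """Map each hostname to its IPv4 addresses; an empty list means it did not resolve.

    `resolver` defaults to the system resolver and can be replaced for testing.
    """
    results = {}
    for name in hostnames:
        try:
            # AF_INET restricts the lookup to IPv4, matching the advice to avoid IPv6.
            infos = resolver(name, None, socket.AF_INET, socket.SOCK_STREAM)
            results[name] = sorted({info[4][0] for info in infos})
        except socket.gaierror:
            results[name] = []
    return results
```

Calling `check_ipv4_resolution(["test1.adtest.foo.co.jp", "test2.adtest.foo.co.jp"])` on the host that runs the replication server would show whether both FQDNs from the log resolve there.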

That is all for this report.

The log at step (2) is as follows.

$ /usr/local/pgsql/bin/pgreplicate -D /usr/local/pgsql/etc -nv
DEBUG(init_server_tbl): /usr/local/pgsql/etc/pgreplicate.log open ok
DEBUG(init_server_tbl): PGR_Get_Conf_Data ok
DEBUG(init_server_tbl): LoadBalanceTbl allocate ok
DEBUG(init_server_tbl): Conf data read ok
DEBUG(init_server_tbl): HostTbl shmget ok
DEBUG(init_server_tbl): HostTbl shmat ok
DEBUG(pgr_set_log): LockWaitTbl shmget ok
DEBUG(pgr_set_log): LockWaitTbl shmat ok
DEBUG(PGRrecovery_main): PGRrecovery_main bind port 7778
DEBUG(PGRrecovery_main): wait recovery
DEBUG(replicate_main): replicate main 8777 port bind OK
DEBUG(replicate_packet_send): cmdSts=N
DEBUG(replicate_packet_send): cmdType=
DEBUG(replicate_packet_send): port=0
DEBUG(replicate_packet_send): pid=0
DEBUG(replicate_packet_send): except_host=
DEBUG(replicate_packet_send): from_host=test1
DEBUG(replicate_packet_send): dbName=template1
DEBUG(replicate_packet_send): userName=postgres
DEBUG(replicate_packet_send): recieve sec=0
DEBUG(replicate_packet_send): recieve usec=0
DEBUG(replicate_packet_send): query_size=54
DEBUG(replicate_packet_send): query=SELECT PGR_SYSTEM_COMMAND_FUNCTION(1,test1,8777,7778)
DEBUG(replicate_packet_send): useFlag[2]
DEBUG(PGRis_same_host): not same host:
DEBUG(is_need_response): same_host[0] mode[1] current[0]
DEBUG(is_need_response): sem_lock[1]
DEBUG(replicate_packet_send_each_server): except:0@ host:5432 @ test1.adtest.foo.co.jp
DEBUG(replicate_packet_send_each_server): send replicate to:test1.adtest.foo.co.jp
DEBUG(PGRsend_replicate_packet_to_server): host(test1.adtest.foo.co.jp) : port(5432)
DEBUG(getDBServerTbl): search host is (test1.adtest.foo.co.jp)
DEBUG(get_ip_by_name): not found
DEBUG(setDBServerTbl): host:test1.adtest.foo.co.jp dbName:template1
DEBUG(getDBServerTbl): search host is (test1.adtest.foo.co.jp)
DEBUG(get_ip_by_name): not found
ERROR(pgr_createConn): PQsetdbLogin failed. close socket!!
ERROR(pgr_createConn): PQsetdbLogin failed. close socket!!
ERROR(pgr_createConn): PQsetdbLogin failed. close socket!!
ERROR(pgr_createConn): PQsetdbLogin failed. close socket!!
ERROR(pgr_createConn): PQsetdbLogin failed. close socket!!
ERROR(pgr_createConn): dbPersistLogin  timeout
ERROR(pgr_createConn): pgr_createConn failed
ERROR(pgr_createConn): setDBServerTbl failed
DEBUG(pgr_createConn): sem_unlock[1]
DEBUG(getDBServerTbl): search host is (test1.adtest.foo.co.jp)
DEBUG(get_ip_by_name): not found
DEBUG(PGRis_same_host): not same host:
DEBUG(is_need_response): same_host[0] mode[1] current[1]
DEBUG(is_need_response): sem_lock[2]
DEBUG(replicate_packet_send_each_server): except:0@ host:5432 @ test2.adtest.foo.co.jp
DEBUG(replicate_packet_send_each_server): send replicate to:test2.adtest.foo.co.jp
DEBUG(PGRsend_replicate_packet_to_server): host(test2.adtest.foo.co.jp) : port(5432)
DEBUG(getDBServerTbl): search host is (test2.adtest.foo.co.jp)
DEBUG(get_ip_by_name): not found
DEBUG(setDBServerTbl): host:test2.adtest.foo.co.jp dbName:template1
DEBUG(getDBServerTbl): search host is (test2.adtest.foo.co.jp)
DEBUG(get_ip_by_name): not found
DEBUG(pgr_createConn): PQsetdbLogin ok!!
DEBUG(get_ip_by_name): send_replicate_packet_to_server query=SELECT PGR_SYSTEM_COMMAND_FUNCTION(1,test1,8777,7778)
DEBUG(get_ip_by_name): db:template1 port:5432 user:postgres host:test2.adtest.foo.co.jp query:SELECT PGR_SYSTEM_COMMAND_FUNCTION(1,test1,8777,7778)
ERROR(get_ip_by_name): PQexec error
DEBUG(get_ip_by_name): sem_unlock[2]
DEBUG(getDBServerTbl): search host is (test2.adtest.foo.co.jp)
DEBUG(get_ip_by_name): search host(1112880650):port(5432):db(template1)
DEBUG(get_ip_by_name): found
DEBUG(get_ip_by_name): replicate_packet_send end
DEBUG(get_ip_by_name): wait replicate
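
A log like this is easier to read once the ERROR entries are tallied. The sketch below is illustrative only: it assumes every entry starts with `DEBUG(` or `ERROR(` followed by a function name, treats any other non-blank line as a continuation of the previous entry (as can happen when a log is pasted into mail), and attributes each `PQsetdbLogin failed` to the most recent `send replicate to:` target. That attribution is a heuristic, not something the log guarantees, and the names `join_wrapped` and `summarize` are hypothetical.

```python
import re
from collections import Counter

ENTRY = re.compile(r"^(DEBUG|ERROR)\((\w+)\):\s*(.*)$")

def join_wrapped(lines):
    """Merge continuation lines into the entry they belong to."""
    entries = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        if ENTRY.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line.strip()
    return entries

def summarize(lines):
    """Return (error counts by (function, message), login failures by target host)."""
    errors = Counter()
    failed_hosts = Counter()
    target = None
    for entry in join_wrapped(lines):
        m = ENTRY.match(entry)
        if not m:
            continue
        level, func, msg = m.groups()
        sent = re.search(r"send replicate to:(\S+)", msg)
        if sent:
            target = sent.group(1)  # remember the host currently being contacted
        if level == "ERROR":
            errors[(func, msg)] += 1
            if "PQsetdbLogin failed" in msg and target:
                failed_hosts[target] += 1
    return errors, failed_hosts
```

Passing the lines of the saved log to `summarize` gives a per-host count of failed logins.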

Next, I ran the following command on system 0, which produced the log below.

psql -U postgres template1
(no response for about 10 to 20 minutes)

DEBUG(get_ip_by_name): replicate main: selected
DEBUG(replicate_loop): replicate_loop selected
DEBUG(read_packet): query size=5
DEBUG(read_packet): read[5] query[BEGIN]
DEBUG(read_packet): query :: BEGIN
DEBUG(replicate_packet_send): cmdSts=T
DEBUG(replicate_packet_send): cmdType=B
DEBUG(replicate_packet_send): port=5432
DEBUG(replicate_packet_send): pid=6786
DEBUG(replicate_packet_send): except_host=test1
DEBUG(replicate_packet_send): from_host=test1
DEBUG(replicate_packet_send): dbName=template1
DEBUG(replicate_packet_send): userName=postgres
DEBUG(replicate_packet_send): recieve sec=1082437309
DEBUG(replicate_packet_send): recieve usec=395971
DEBUG(replicate_packet_send): query_size=5
DEBUG(replicate_packet_send): query=BEGIN
DEBUG(replicate_packet_send): useFlag[2]
DEBUG(get_ip_by_name): same host:5432 @ 4155360a - 5432 @ 4155360a
DEBUG(get_ip_by_name): 5432 @ test1.adtest.foo.co.jp return trigger
DEBUG(PGRis_need_sync_time): sem_lock[1]
DEBUG(return_result): 128[3,1082437309,395971]
DEBUG(return_result): return_result[3,1082437309,395971]
DEBUG(return_result): 128 send
DEBUG(return_result): wait for answer
DEBUG(return_result): read_answer selected
DEBUG(read_packet): query size=25
DEBUG(read_packet): read[25] query[PGR_QUERY_DONE_NOTICE_CMD]
DEBUG(read_packet): answer[PGR_QUERY_DONE_NOTICE_CMD]
DEBUG(read_packet): QUERY DONE
DEBUG(read_packet): status of return_result[0]
DEBUG(read_packet): sem_unlock[1]
DEBUG(get_ip_by_name): not same host:5432 @ 4155360a - 5432 @ 4255360a
DEBUG(get_ip_by_name): sem_lock[2]
DEBUG(replicate_packet_send_each_server): except:5432 @ test1 host:5432 @ test2.adtest.foo.co.jp
DEBUG(replicate_packet_send_each_server): send replicate to:test2.adtest.foo.co.jp
DEBUG(PGRsend_replicate_packet_to_server): host(test2.adtest.foo.co.jp) : port(5432)
DEBUG(get_ip_by_name): not found in transaction tbl host test1 db:template1 pid:6786
DEBUG(get_ip_by_name): not found in getTransactionTbl
DEBUG(get_ip_by_name): not found in transaction tbl host test1 db:template1 pid:6786
DEBUG(replicate_loop): wait replicate
DEBUG(pgr_createConn): PQsetdbLogin ok!!
DEBUG(insertTransactionTbl): db:template1 port:5432 user:postgres host:test2.adtest.foo.co.jp query:BEGIN
DEBUG(insertTransactionTbl): sync_command(SELECT PGR_SYSTEM_COMMAND_FUNCTION(3,1082437309,395971) )
DEBUG(insertTransactionTbl): PQexec send :BEGIN
DEBUG(insertTransactionTbl): PQstatus(0)
DEBUG(child_wait): replicate main: selected
DEBUG(replicate_loop): replicate_loop selected
DEBUG(read_packet): query size=75
DEBUG(read_packet): read[75] query[SELECT PGR_SYSTEM_COMMAND_FUNCTION(2,test1.adtest.foo.co.jp,8777,7778)]
DEBUG(read_packet): query :: SELECT PGR_SYSTEM_COMMAND_FUNCTION(2,test1.adtest.foo.co.jp,8777,7778)
DEBUG(replicate_packet_send): cmdSts=N
DEBUG(replicate_packet_send): cmdType=
DEBUG(replicate_packet_send): port=4194305
DEBUG(replicate_packet_send): pid=1
DEBUG(replicate_packet_send): except_host=
DEBUG(replicate_packet_send): from_host=
DEBUG(replicate_packet_send): dbName=
DEBUG(replicate_packet_send): userName=
DEBUG(replicate_packet_send): recieve sec=1082437309
DEBUG(replicate_packet_send): recieve usec=654003
DEBUG(replicate_packet_send): query_size=75
DEBUG(replicate_packet_send): query=SELECT PGR_SYSTEM_COMMAND_FUNCTION(2,test1.adtest.foo.co.jp,8777,7778)
DEBUG(replicate_packet_send): useFlag[2]
DEBUG(PGRis_same_host): not same host:
DEBUG(is_need_response): same_host[0] mode[1] current[0]
DEBUG(is_need_response): sem_lock[1]
DEBUG(replicate_packet_send_each_server): except:4194305@ host:5432 @ test1.adtest.foo.co.jp
DEBUG(replicate_packet_send_each_server): send replicate to:test1.adtest.foo.co.jp
DEBUG(PGRsend_replicate_packet_to_server): host(test1.adtest.foo.co.jp) : port(5432)
DEBUG(getDBServerTbl): search host is (test1.adtest.foo.co.jp)
DEBUG(get_ip_by_name): not found
DEBUG(setDBServerTbl): host:test1.adtest.foo.co.jp dbName:
DEBUG(getDBServerTbl): search host is (test1.adtest.foo.co.jp)
DEBUG(get_ip_by_name): not found
ERROR(pgr_createConn): PQsetdbLogin failed. close socket!!
DEBUG(replicate_loop): wait replicate
ERROR(pgr_createConn): PQsetdbLogin failed. close socket!!
ERROR(pgr_createConn): PQsetdbLogin failed. close socket!!
ERROR(pgr_createConn): PQsetdbLogin failed. close socket!!
ERROR(pgr_createConn): PQsetdbLogin failed. close socket!!
ERROR(pgr_createConn): dbPersistLogin  timeout
ERROR(pgr_createConn): pgr_createConn failed
ERROR(pgr_createConn): setDBServerTbl failed
DEBUG(pgr_createConn): sem_unlock[1]
DEBUG(PGRis_same_host): not same host:
DEBUG(is_need_response): same_host[0] mode[1] current[1]
DEBUG(is_need_response): sem_lock[2]
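
The numeric host values in these logs are related: 1112880650 in `search host(1112880650)` is the decimal form of the hexadecimal 4255360a printed in the `same host` / `not same host` comparisons. Assuming these are packed 32-bit IPv4 addresses (the log itself does not say so, and the byte order is not known), a sketch like the following shows both possible readings. `decode_packed_ipv4` is a hypothetical helper.

```python
import socket
import struct

def decode_packed_ipv4(value: int) -> dict:
    """Return the dotted-quad reading of a 32-bit value for both byte orders."""
    return {
        # Most significant byte is the first octet.
        "big_endian": socket.inet_ntoa(struct.pack(">I", value)),
        # Least significant byte is the first octet.
        "little_endian": socket.inet_ntoa(struct.pack("<I", value)),
    }

# The decimal and hexadecimal forms in the log are the same number.
assert int("4255360a", 16) == 1112880650

for hex_value in ("4155360a", "4255360a"):
    print(hex_value, decode_packed_ipv4(int(hex_value, 16)))
```

Whichever reading matches the real interface addresses, the two values differ in a single byte, which is consistent with two distinct hosts being compared.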

I then ran the following command on system 1, which produced the log below.
psql -U postgres template1
(no response for about 10 to 20 minutes)

DEBUG(replicate_loop): replicate main: selected
DEBUG(replicate_loop): replicate_loop selected
DEBUG(read_packet): query size=5
DEBUG(read_packet): read[5] query[BEGIN]
DEBUG(read_packet): query :: BEGIN
DEBUG(replicate_packet_send): cmdSts=T
DEBUG(replicate_packet_send): cmdType=B
DEBUG(replicate_packet_send): port=5432
DEBUG(replicate_packet_send): pid=1752
DEBUG(replicate_packet_send): except_host=test2
DEBUG(replicate_packet_send): from_host=test2
DEBUG(replicate_packet_send): dbName=template1
DEBUG(replicate_packet_send): userName=postgres
DEBUG(replicate_packet_send): recieve sec=1082437981
DEBUG(replicate_packet_send): recieve usec=686821
DEBUG(replicate_packet_send): query_size=5
DEBUG(replicate_packet_send): query=BEGIN
DEBUG(replicate_packet_send): useFlag[2]
DEBUG(get_ip_by_name): not same host:5432 @ 4255360a - 5432 @ 4155360a
DEBUG(is_need_response): same_host[0] mode[1] current[0]
DEBUG(is_need_response): sem_lock[1]
DEBUG(replicate_packet_send_each_server): except:5432 @ test2 host:5432 @ test1.adtest.foo.co.jp
DEBUG(replicate_packet_send_each_server): send replicate to:test1.adtest.foo.co.jp
DEBUG(PGRsend_replicate_packet_to_server): host(test1.adtest.foo.co.jp) : port(5432)
DEBUG(get_ip_by_name): not found in transaction tbl host test2 db:template1 pid:1752
DEBUG(get_ip_by_name): not found in getTransactionTbl
DEBUG(get_ip_by_name): not found in transaction tbl host test2 db:template1 pid:1752
DEBUG(pgr_createConn): PQsetdbLogin ok!!
DEBUG(insertTransactionTbl): db:template1 port:5432 user:postgres host:test1.adtest.foo.co.jp query:BEGIN
DEBUG(insertTransactionTbl): sync_command(SELECT PGR_SYSTEM_COMMAND_FUNCTION(3,1082437981,686821) )
DEBUG(insertTransactionTbl): PQexec send :BEGIN
DEBUG(insertTransactionTbl): PQstatus(0)
DEBUG(replicate_loop): wait replicate
DEBUG(replicate_loop): replicate main: selected
DEBUG(replicate_loop): replicate_loop selected
DEBUG(read_packet): query size=54
DEBUG(read_packet): read[54] query[SELECT PGR_SYSTEM_COMMAND_FUNCTION(2,test1,8777,7779)]
DEBUG(read_packet): query :: SELECT PGR_SYSTEM_COMMAND_FUNCTION(2,test1,8777,7779)
DEBUG(replicate_packet_send): cmdSts=N
DEBUG(replicate_packet_send): cmdType=
DEBUG(replicate_packet_send): port=4194305
DEBUG(replicate_packet_send): pid=1
DEBUG(replicate_packet_send): except_host=
DEBUG(replicate_packet_send): from_host=
DEBUG(replicate_packet_send): dbName=
DEBUG(replicate_packet_send): userName=
DEBUG(replicate_packet_send): recieve sec=1082437981
DEBUG(replicate_packet_send): recieve usec=962491
DEBUG(replicate_packet_send): query_size=54
DEBUG(replicate_packet_send): query=SELECT PGR_SYSTEM_COMMAND_FUNCTION(2,test1,8777,7779)
DEBUG(replicate_packet_send): useFlag[2]
DEBUG(PGRis_same_host): not same host:
DEBUG(is_need_response): same_host[0] mode[1] current[0]
DEBUG(is_need_response): sem_lock[1]
DEBUG(replicate_loop): wait replicate

At this point, I ran the following command on system 0:
insert into shinamono values('alily',54321);
The row was not replicated to system 1.

