[pgcluster: 549] Re: FreeBSD 5.2.1 + PGCluster 1.0.7 createdb fails

me477494 @ members.interq.or.jp
Sat Sep 18 23:32:15 JST 2004


This is Ikuta.
Thank you for your continued help.

Mitani-san, thank you for your reply.

===============================================================================
>> ▼/etc/hosts
>> 192.168.0.21            lb1.xxxxxx.com lb1
>> 192.168.0.21            lb1.xxxxxx.com.
>Sorry, I am not familiar with this, so please tell me.
>What is the purpose of the entry on the second line?

That entry was written automatically when the OS was installed,
so I had left it as it was.

I do not think the entry is actually used,
so I have deleted that line.

================================================================================
For ease of testing, I ran the tests in the following order:
1. Reduce the cluster DB to a single node
2. Run the cluster DB and the replication server on the same server

To state the result first:
the situation did not improve.

With either setup,
after I run the psql command,
a considerable amount of time passes
before the following line appears in the replication server log:
"DEBUG(pgr_createConn): PQsetdbLogin ok!!"
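
To put a number on "a considerable amount of time", one option is to time the name lookup and the TCP connection separately. The following is only a minimal sketch: the function name `time_connect` is hypothetical, and it measures a plain TCP connect to the PostgreSQL port, not the full login that PQsetdbLogin performs, so it can narrow down where the delay is but cannot reproduce it exactly.

```python
import socket
import time


def time_connect(host, port, timeout=60.0):
    """Return (resolve_seconds, connect_seconds) for one TCP connection attempt."""
    t0 = time.monotonic()
    # Step 1: name resolution (goes through /etc/hosts and/or DNS,
    # depending on how the resolver is configured on this machine).
    family, socktype, proto, _, addr = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM)[0]
    t1 = time.monotonic()
    # Step 2: the TCP connection itself.
    with socket.socket(family, socktype, proto) as s:
        s.settimeout(timeout)
        s.connect(addr)
    t2 = time.monotonic()
    return t1 - t0, t2 - t1


# Example call, using the host name and port from the configuration in this mail:
#   resolve_s, connect_s = time_connect("cl1.xxxxxx.com", 5432)
#   print(f"resolve: {resolve_s:.3f}s  connect: {connect_s:.3f}s")
```

If the first number is large, the time is going into name resolution; if the second is large, it is going into the connection itself.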

================================================================================
>How about using only one Cluster DB and connecting cl1 through cl3 one at a time?

Starting from the environment described in my previous mail,
I removed cl2 and cl3 and ran the test.

▼Test server configuration
lb1.xxxxxx.com 192.168.0.21
cl1.xxxxxx.com 192.168.0.31
rp1.xxxxxx.com 192.168.0.41

▼Configuration file contents
--<pglb.conf>--------------------------------------------------------------------
<Cluster_Server_Info>
    <Host_Name> cl1.xxxxxx.com </Host_Name>
    <Port> 5432 </Port>
    <Max_Connect> 125 </Max_Connect>
</Cluster_Server_Info>

<Backend_Socket_Dir>    /tmp     </Backend_Socket_Dir>
<Receive_Port>          5432     </Receive_Port>
<Recovery_Port>         7780     </Recovery_Port>
<Max_Cluster_Num>        350     </Max_Cluster_Num>
<Use_Connection_Pooling>  no     </Use_Connection_Pooling>
--------------------------------------------------------------------------------

--<pgreplicate.conf>------------------------------------------------------------
<Cluster_Server_Info>
    <Host_Name> cl1.xxxxxx.com </Host_Name>
    <Port> 5432 </Port>
    <Recovery_Port> 7779 </Recovery_Port>
</Cluster_Server_Info>

<LoadBalance_Server_Info>
        <Host_Name> lb1.xxxxxx.com </Host_Name>
        <Recovery_Port> 7780 </Recovery_Port>
</LoadBalance_Server_Info>

<Replication_Port> 8777 </Replication_Port>
<Recovery_Port> 7778 </Recovery_Port>
--------------------------------------------------------------------------------

Note: cluster.conf was not modified.
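
As a cross-check of the two files above, the values that appear in both can be compared mechanically. The sketch below is not PGCluster's own parser; it is a small regex-based reader that only handles the simple tag style shown here, and the helper names (`get`, `top_level`, `cross_check`) are hypothetical. It also assumes, based only on the values posted (both 7780), that the top-level Recovery_Port in pglb.conf is meant to match the Recovery_Port under LoadBalance_Server_Info in pgreplicate.conf.

```python
import re

# Excerpts of the two files as posted above.
PGLB_CONF = """\
<Cluster_Server_Info>
    <Host_Name> cl1.xxxxxx.com </Host_Name>
    <Port> 5432 </Port>
    <Max_Connect> 125 </Max_Connect>
</Cluster_Server_Info>
<Receive_Port>          5432     </Receive_Port>
<Recovery_Port>         7780     </Recovery_Port>
"""

PGREPLICATE_CONF = """\
<Cluster_Server_Info>
    <Host_Name> cl1.xxxxxx.com </Host_Name>
    <Port> 5432 </Port>
    <Recovery_Port> 7779 </Recovery_Port>
</Cluster_Server_Info>
<LoadBalance_Server_Info>
    <Host_Name> lb1.xxxxxx.com </Host_Name>
    <Recovery_Port> 7780 </Recovery_Port>
</LoadBalance_Server_Info>
<Replication_Port> 8777 </Replication_Port>
<Recovery_Port> 7778 </Recovery_Port>
"""


def get(text, tag):
    """Return the stripped content of the first <tag>...</tag> in text, or None."""
    m = re.search(rf"<{tag}>(.*?)</{tag}>", text, re.S)
    return m.group(1).strip() if m else None


def top_level(text):
    """Drop the nested *_Info sections so only top-level settings remain."""
    return re.sub(r"<(\w+_Info)>.*?</\1>", "", text, flags=re.S)


def cross_check(pglb, pgreplicate):
    """Return a list of human-readable mismatches between the two files."""
    problems = []
    lb_cluster = get(pglb, "Cluster_Server_Info")
    rp_cluster = get(pgreplicate, "Cluster_Server_Info")
    for tag in ("Host_Name", "Port"):
        if get(lb_cluster, tag) != get(rp_cluster, tag):
            problems.append(f"Cluster_Server_Info/{tag} differs")
    # Assumption: these two ports are supposed to be the same value.
    lb_recovery = get(top_level(pglb), "Recovery_Port")
    rp_lb_section = get(pgreplicate, "LoadBalance_Server_Info")
    if lb_recovery != get(rp_lb_section, "Recovery_Port"):
        problems.append("load balancer Recovery_Port differs")
    return problems


print(cross_check(PGLB_CONF, PGREPLICATE_CONF))  # prints []
```

For the files as posted, the check finds no mismatch, which suggests the delay is not caused by an obvious host or port inconsistency between these two files.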

▼Replication server log
--<At replication server startup>-----------------------------------------------
DEBUG(init_server_tbl): /usr/local/pgsql/data/pgreplicate.sts open ok

DEBUG(init_server_tbl): PGR_Get_Conf_Data ok
DEBUG(init_server_tbl): LoadBalanceTbl allocate ok
DEBUG(init_server_tbl): CascadeTbl shmget ok
DEBUG(init_server_tbl): CascadeTbl shmat ok
DEBUG(init_server_tbl): CascadeInf shmget ok
DEBUG(init_server_tbl): CascadeInf shmat ok
DEBUG(init_server_tbl): CommitLog shmget ok
DEBUG(init_server_tbl): Commit_Log_Tbl shmat ok
DEBUG(init_server_tbl): Conf data read ok
DEBUG(init_server_tbl): HostTbl shmget ok
DEBUG(init_server_tbl): HostTbl shmat ok
DEBUG(write_log_file): LockWaitTbl shmget ok
DEBUG(write_log_file): LockWaitTbl shmat ok
DEBUG(replicate_main): replicate main 8777 port bind OK

DEBUG(PGRreplicate_packet_send): cmdSts=N

DEBUG(PGRreplicate_packet_send): cmdType=

DEBUG(PGRreplicate_packet_send): port=0

DEBUG(PGRreplicate_packet_send): pid=0

DEBUG(PGRreplicate_packet_send): except_host=

DEBUG(PGRreplicate_packet_send): from_host=rp1.xxxxxx.com

DEBUG(PGRreplicate_packet_send): dbName=template1

DEBUG(PGRreplicate_packet_send): userName=postgres

DEBUG(PGRreplicate_packet_send): recieve sec=0

DEBUG(PGRreplicate_packet_send): recieve usec=0

DEBUG(PGRreplicate_packet_send): query_size=64

DEBUG(PGRreplicate_packet_send): query=SELECT PGR_SYSTEM_COMMAND_FUNCTION(1,'rp1.xxxxxx.com',8777,7778)

DEBUG(PGRreplicate_packet_send): useFlag[2]
DEBUG(PGRis_same_host): not same host:
DEBUG(is_need_response): sem_lock[1]
DEBUG(PGRreplicate_packet_send_each_server): except:0@ host:5432 @ cl1.xxxxxx.com

DEBUG(PGRreplicate_packet_send_each_server): send replicate to:cl1.xxxxxx.com

DEBUG(PGRsend_replicate_packet_to_server): host(cl1.xxxxxx.com) : port(5432)
DEBUG(getTransactionTbl): not found in getTransactionTbl
DEBUG(pgr_createConn): PQsetdbLogin host[cl1.xxxxxx.com] port[5432] db[template1] user[postgres]
DEBUG(PGRrecovery_main): PGRrecovery_main bind port 7778
--------------------------------------------------------------------------------

--<When connecting with psql from the load balancer>----------------------------
DEBUG(pgr_createConn): PQsetdbLogin ok!!
DEBUG(insertTransactionTbl): db:template1 port:5432 user:postgres host:cl1.xxxxxx.com query:SELECT PGR_SYSTEM_COMMAND_FUNCTION(1,'rp1.xxxxxx.com',8777,7778)
DEBUG(insertTransactionTbl): sem_lock[2]
DEBUG(PGRsem_lock): sem_unlock[1]
DEBUG(getTransactionTbl): hit !! transaction tbl host cl1.xxxxxx.com db:template1 pid:0
DEBUG(getTransactionTbl): sem_unlock[2]
DEBUG(PGRsem_unlock): PGRreplicate_packet_send end
DEBUG(PGRsem_unlock): wait replicate

DEBUG(PGRsem_unlock): replicate main: selected

DEBUG(replicate_loop): replicate_loop selected

DEBUG(PGRread_packet): query size=28
DEBUG(replicate_loop): wait replicate

DEBUG(replicate_loop): replicate main: selected

DEBUG(PGRread_packet): read[28] query[SET client_encoding = 'SJIS']
DEBUG(PGRread_packet): query :: SET client_encoding = 'SJIS'

DEBUG(PGRreplicate_packet_send): cmdSts=Q

DEBUG(PGRreplicate_packet_send): cmdType=T

DEBUG(PGRreplicate_packet_send): port=5432

DEBUG(PGRreplicate_packet_send): pid=43897
DEBUG(replicate_loop): replicate_loop selected

DEBUG(PGRread_packet): query size=28
DEBUG(PGRread_packet): read[28] query[SET client_encoding = 'SJIS']
DEBUG(PGRread_packet): query :: SET client_encoding = 'SJIS'

DEBUG(PGRreplicate_packet_send): cmdSts=Q

DEBUG(PGRreplicate_packet_send): cmdType=T

DEBUG(PGRreplicate_packet_send): port=5432

DEBUG(PGRreplicate_packet_send): pid=43901

DEBUG(PGRreplicate_packet_send): except_host=cl1.xxxxxx.com

DEBUG(PGRreplicate_packet_send): from_host=cl1.xxxxxx.com

DEBUG(replicate_loop): wait replicate


DEBUG(PGRreplicate_packet_send): except_host=cl1.xxxxxx.com

DEBUG(PGRreplicate_packet_send): from_host=cl1.xxxxxx.com

DEBUG(PGRreplicate_packet_send): dbName=template1

DEBUG(PGRreplicate_packet_send): userName=postgres

DEBUG(PGRreplicate_packet_send): recieve sec=1095473625

DEBUG(PGRreplicate_packet_send): recieve usec=899030

DEBUG(PGRreplicate_packet_send): query_size=28

DEBUG(PGRreplicate_packet_send): query=SET client_encoding = 'SJIS'

DEBUG(PGRreplicate_packet_send): useFlag[2]
DEBUG(PGRreplicate_packet_send): dbName=mc

DEBUG(PGRreplicate_packet_send): userName=postgres

DEBUG(PGRreplicate_packet_send): recieve sec=1095473625

DEBUG(PGRreplicate_packet_send): recieve usec=899272

DEBUG(PGRreplicate_packet_send): query_size=28

DEBUG(PGRreplicate_packet_send): query=SET client_encoding = 'SJIS'

DEBUG(PGRreplicate_packet_send): useFlag[2]
DEBUG(PGRis_same_host): 5432 @ cl1.xxxxxx.com return trigger
DEBUG(is_need_sync_time): sem_lock[1]
DEBUG(PGRreturn_result): PGRreturn_result[]
DEBUG(PGRreturn_result): 128 send
DEBUG(PGRreturn_result): status of PGRreturn_result[0]
DEBUG(PGRreturn_result): sem_lock[2]
DEBUG(PGRsem_lock): sem_unlock[1]
DEBUG(PGRis_same_host): 5432 @ cl1.xxxxxx.com return trigger
DEBUG(is_need_sync_time): sem_lock[1]
DEBUG(PGRreturn_result): PGRreturn_result[]
DEBUG(PGRreturn_result): 128 send
DEBUG(PGRreturn_result): status of PGRreturn_result[0]
DEBUG(PGRreturn_result): sem_lock[2]
DEBUG(delete_template): sem_unlock[2]
DEBUG(PGRsem_lock): sem_unlock[1]
DEBUG(getTransactionTbl): sem_unlock[2]
DEBUG(PGRsem_unlock): PGRreplicate_packet_send end
DEBUG(PGRsem_unlock): PGRreplicate_packet_send end
DEBUG(PGRsem_unlock): replicate_loop selected

ERROR(PGRread_packet): recv failed: (Connection reset by peer)
ERROR(PGRread_packet): session closed
DEBUG(child_wait): replicate main: selected

DEBUG(PGRsem_unlock): replicate_loop selected

DEBUG(PGRread_packet): query size=25
DEBUG(PGRread_packet): read[25] query[PGR_QUERY_DONE_NOTICE_CMD]
DEBUG(PGRread_packet): replicate_loop selected

DEBUG(PGRread_packet): query size=25
DEBUG(PGRread_packet): read[25] query[PGR_QUERY_DONE_NOTICE_CMD]
DEBUG(PGRread_packet): replicate_loop selected

DEBUG(PGRread_packet): query size=25
DEBUG(PGRread_packet): read[25] query[PGR_QUERY_DONE_NOTICE_CMD]
--------------------------------------------------------------------------------

================================================================================
>As a test, please try running three Cluster DBs and the replication server on a single machine.

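Of the environment described in my previous mail,
I took the cl1 server, which had been running as a cluster DB,
installed the replication server on it,
and ran the test using the cl1 server alone.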

▼Test server configuration
cl1.xxxxxx.com 192.168.0.31

▼Configuration file contents
--<pgreplicate.conf>------------------------------------------------------------
<Cluster_Server_Info>
    <Host_Name>   cl1.xxxxxx.com  </Host_Name>
    <Port>        5432                </Port>
    <Recovery_Port>       7779        </Recovery_Port>
</Cluster_Server_Info>

<Replication_Port>    8777            </Replication_Port>
<Recovery_Port>       7778            </Recovery_Port>
--------------------------------------------------------------------------------

--<cluster.conf>----------------------------------------------------------------
<Replicate_Server_Info>
        <Host_Name> cl1.xxxxxx.com </Host_Name>
        <Port> 8777 </Port>
        <Recovery_Port> 7778 </Recovery_Port>
</Replicate_Server_Info>

<Recovery_Port> 7779 </Recovery_Port>
<Rsync_Path> /usr/local/bin/rsync </Rsync_Path>
<Rsync_Option> ssh -1 </Rsync_Option>
<When_Stand_Alone> read_only  </When_Stand_Alone>
--------------------------------------------------------------------------------

▼Replication server log
--<At replication server startup>-----------------------------------------------
DEBUG(init_server_tbl): /usr/local/pgsql/data/pgreplicate.sts open ok

DEBUG(init_server_tbl): PGR_Get_Conf_Data ok
DEBUG(init_server_tbl): LoadBalanceTbl allocate ok
DEBUG(init_server_tbl): CascadeTbl shmget ok
DEBUG(init_server_tbl): CascadeTbl shmat ok
DEBUG(init_server_tbl): CascadeInf shmget ok
DEBUG(init_server_tbl): CascadeInf shmat ok
DEBUG(init_server_tbl): CommitLog shmget ok
DEBUG(init_server_tbl): Commit_Log_Tbl shmat ok
DEBUG(init_server_tbl): Conf data read ok
DEBUG(init_server_tbl): HostTbl shmget ok
DEBUG(init_server_tbl): HostTbl shmat ok
DEBUG(write_log_file): LockWaitTbl shmget ok
DEBUG(write_log_file): LockWaitTbl shmat ok
DEBUG(replicate_main): replicate main 8777 port bind OK

DEBUG(PGRreplicate_packet_send): cmdSts=N

DEBUG(PGRreplicate_packet_send): cmdType=

DEBUG(PGRreplicate_packet_send): port=0

DEBUG(PGRreplicate_packet_send): pid=0

DEBUG(PGRreplicate_packet_send): except_host=

DEBUG(PGRreplicate_packet_send): from_host=cl1.xxxxxx.com

DEBUG(PGRreplicate_packet_send): dbName=template1

DEBUG(PGRreplicate_packet_send): userName=postgres

DEBUG(PGRreplicate_packet_send): recieve sec=0

DEBUG(PGRreplicate_packet_send): recieve usec=0

DEBUG(PGRreplicate_packet_send): query_size=64

DEBUG(PGRreplicate_packet_send): query=SELECT PGR_SYSTEM_COMMAND_FUNCTION(1,'cl1.xxxxxx.com',8777,7778)

DEBUG(PGRreplicate_packet_send): useFlag[2]
DEBUG(PGRis_same_host): not same host:
DEBUG(is_need_response): sem_lock[1]
DEBUG(PGRreplicate_packet_send_each_server): except:0@ host:5432 @ cl1.xxxxxx.com

DEBUG(PGRreplicate_packet_send_each_server): send replicate to:cl1.xxxxxx.com

DEBUG(PGRsend_replicate_packet_to_server): host(cl1.xxxxxx.com) : port(5432)
DEBUG(getTransactionTbl): not found in getTransactionTbl
DEBUG(pgr_createConn): PQsetdbLogin host[cl1.xxxxxx.com] port[5432] db[template1] user[postgres]
DEBUG(PGRrecovery_main): PGRrecovery_main bind port 7778
--------------------------------------------------------------------------------

--<When connecting with psql from the master DB>--------------------------------
DEBUG(pgr_createConn): PQsetdbLogin ok!!
DEBUG(insertTransactionTbl): db:template1 port:5432 user:postgres host:cl1.xxxxxx.com query:SELECT PGR_SYSTEM_COMMAND_FUNCTION(1,'cl1.xxxxxx.com',8777,7778)
DEBUG(insertTransactionTbl): sem_lock[2]
DEBUG(PGRsem_lock): sem_unlock[1]
DEBUG(getTransactionTbl): hit !! transaction tbl host cl1.xxxxxx.com db:template1 pid:0
DEBUG(getTransactionTbl): sem_unlock[2]
DEBUG(PGRsem_unlock): PGRreplicate_packet_send end
DEBUG(PGRsem_unlock): wait replicate

DEBUG(PGRsem_unlock): replicate main: selected

DEBUG(replicate_loop): wait replicate

DEBUG(replicate_loop): replicate main: selected

DEBUG(replicate_loop): wait replicate

DEBUG(replicate_loop): replicate_loop selected

DEBUG(PGRread_packet): query size=28
DEBUG(PGRread_packet): read[28] query[SET client_encoding = 'SJIS']
DEBUG(PGRread_packet): query :: SET client_encoding = 'SJIS'

DEBUG(PGRreplicate_packet_send): cmdSts=Q

DEBUG(PGRreplicate_packet_send): cmdType=T

DEBUG(PGRreplicate_packet_send): port=5432

DEBUG(PGRreplicate_packet_send): pid=46234

DEBUG(PGRreplicate_packet_send): except_host=cl1.xxxxxx.com

DEBUG(PGRreplicate_packet_send): from_host=cl1.xxxxxx.com

DEBUG(PGRreplicate_packet_send): dbName=template1

DEBUG(PGRreplicate_packet_send): userName=postgres

DEBUG(PGRreplicate_packet_send): recieve sec=1095511546

DEBUG(PGRreplicate_packet_send): recieve usec=257212

DEBUG(PGRreplicate_packet_send): query_size=28

DEBUG(PGRreplicate_packet_send): query=SET client_encoding = 'SJIS'

DEBUG(PGRreplicate_packet_send): useFlag[2]
DEBUG(PGRis_same_host): 5432 @ cl1.xxxxxx.com return trigger
DEBUG(is_need_sync_time): sem_lock[1]
DEBUG(PGRreturn_result): PGRreturn_result[]
DEBUG(replicate_loop): replicate_loop selected

DEBUG(PGRread_packet): query size=28
DEBUG(PGRread_packet): read[28] query[SET client_encoding = 'SJIS']
DEBUG(PGRread_packet): query :: SET client_encoding = 'SJIS'

DEBUG(PGRreplicate_packet_send): cmdSts=Q

DEBUG(PGRreplicate_packet_send): cmdType=T

DEBUG(PGRreplicate_packet_send): port=5432

DEBUG(PGRreplicate_packet_send): pid=46237

DEBUG(PGRreplicate_packet_send): except_host=cl1.xxxxxx.com

DEBUG(PGRreplicate_packet_send): from_host=cl1.xxxxxx.com

DEBUG(PGRreplicate_packet_send): dbName=mc

DEBUG(PGRreplicate_packet_send): userName=postgres

DEBUG(PGRreplicate_packet_send): recieve sec=1095511546

DEBUG(PGRreplicate_packet_send): recieve usec=257670

DEBUG(PGRreplicate_packet_send): query_size=28

DEBUG(PGRreplicate_packet_send): query=SET client_encoding = 'SJIS'

DEBUG(PGRreplicate_packet_send): useFlag[2]
DEBUG(PGRis_same_host): 5432 @ cl1.xxxxxx.com return trigger
DEBUG(is_need_sync_time): sem_lock[1]
DEBUG(PGRreturn_result): 128 send
DEBUG(PGRreturn_result): status of PGRreturn_result[0]
DEBUG(PGRreturn_result): sem_lock[2]
DEBUG(PGRsem_lock): sem_unlock[1]
DEBUG(PGRreturn_result): PGRreturn_result[]
DEBUG(getTransactionTbl): sem_unlock[2]
DEBUG(PGRsem_unlock): PGRreplicate_packet_send end
DEBUG(PGRsem_unlock): replicate_loop selected

ERROR(PGRread_packet): recv failed: (Connection reset by peer)
ERROR(PGRread_packet): session closed
DEBUG(PGRreturn_result): 128 send
DEBUG(PGRreturn_result): status of PGRreturn_result[0]
DEBUG(PGRreturn_result): sem_lock[2]
DEBUG(PGRsem_lock): sem_unlock[1]
DEBUG(delete_template): sem_unlock[2]
DEBUG(PGRsem_unlock): PGRreplicate_packet_send end
DEBUG(PGRsem_unlock): replicate_loop selected

DEBUG(PGRread_packet): query size=25
DEBUG(PGRread_packet): read[25] query[PGR_QUERY_DONE_NOTICE_CMD]
DEBUG(child_wait): replicate main: selected

DEBUG(PGRread_packet): replicate_loop selected

DEBUG(PGRread_packet): query size=25
DEBUG(PGRread_packet): read[25] query[PGR_QUERY_DONE_NOTICE_CMD]
DEBUG(PGRread_packet): replicate_loop selected

DEBUG(PGRread_packet): query size=25
DEBUG(PGRread_packet): read[25] query[PGR_QUERY_DONE_NOTICE_CMD]
--------------------------------------------------------------------------------

================================================================================
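Since the connection is slow even within a single server,
I suspect the problem is not in the network equipment
but in the server configuration itself.

However, the only thing that comes to mind is IPfilter,
so I stopped IPfilter with the command below
and ran the cluster DB and the replication server on the same server,
but the situation did not improve.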

----------------------------------------
/etc/rc.d/ipfilter stop
----------------------------------------

That is all.

If you notice anything,
I would be grateful if you could point it out.



