[pgcluster: 950] Re: ロードバランサが固まります

a.mitani @ sra-europe.com a.mitani @ sra-europe.com
2006年 12月 27日 (水) 17:06:37 JST


篠崎さん,こんにちは.

ロードバランサがロックしたときのnetstatの状態が分かりましたら教えてください.
あと,ロードバランサのホスト名にlocalhostを指定されているようですが,ここを
ホスト名にしてみてください.
それから,設定ファイルの内容がPGClusterのバージョンと合っていないようですの
で,念のため,他のバージョンのライブラリなどが混在していないか確認してみてく
ださい.
なお,1.5.0の最新版はrc12ですので,こちらもお試しください.

三谷@オランダ

>
> 今回は、また別の件でメールさせていただきます。
>
> システムの安定度を確認するために、
> PHPからselectとinsertをトランザクションを張らずに1回呼ぶプログラムを
> 外部から(http経由で)毎秒40回程度回すベンチを流してみたのですが、
> だいたい20分から30分程度でロードバランサが固まってしまいます。
>
> ロードバランサをリスタートすると復活するのですが、
> 書き込みが間に合ってない感じでしょうか?
>
> なにか対策等ありましたら、ご教授ください。
> よろしくお願いいたします。
>
> PGCluster-1.5.0rc10
>
> pglb.conf
>
> <Cluster_Server_Info>
>     <Host_Name>   db01             </Host_Name>
>     <Port>        5432                </Port>
>     <Max_Connect> 128                </Max_Connect>
> </Cluster_Server_Info>
> <Cluster_Server_Info>
>     <Host_Name>   db02             </Host_Name>
>     <Port>        5432                </Port>
>     <Max_Connect> 128                </Max_Connect>
> </Cluster_Server_Info>
> <Cluster_Server_Info>
>     <Host_Name>   db03             </Host_Name>
>     <Port>        5432                </Port>
>     <Max_Connect> 128                </Max_Connect>
> </Cluster_Server_Info>
>
> <Backend_Socket_Dir>    /tmp     </Backend_Socket_Dir>
> <Receive_Port>          5432     </Receive_Port>
> <Recovery_Port>         6001     </Recovery_Port>
> <Max_Cluster_Num>        4     </Max_Cluster_Num>
> <Use_Connection_Pooling>  yes     </Use_Connection_Pooling>
>
> pgreplicate.conf
>
> <Cluster_Server_Info>
>     <Host_Name>           db01              </Host_Name>
>     <Port>                5432                 </Port>
>     <Recovery_Port>       7001                 </Recovery_Port>
> </Cluster_Server_Info>
> <Cluster_Server_Info>
>     <Host_Name>           db02              </Host_Name>
>     <Port>                5432                 </Port>
>     <Recovery_Port>       7001                 </Recovery_Port>
> </Cluster_Server_Info>
> <Cluster_Server_Info>
>     <Host_Name>           db03              </Host_Name>
>     <Port>                5432                 </Port>
>     <Recovery_Port>       7001                 </Recovery_Port>
> </Cluster_Server_Info>
>
> <LoadBalance_Server_Info>
>     <Host_Name>       localhost                   </Host_Name>
>     <Recovery_Port>   6001                      </Recovery_Port>
> </LoadBalance_Server_Info>
>
> <LoadBalance_Server_Info>
>     <Host_Name>       lb02                   </Host_Name>
>     <Recovery_Port>   6001                      </Recovery_Port>
> </LoadBalance_Server_Info>
>
> <Replication_Port>    8001            </Replication_Port>
> <Recovery_Port>       8101            </Recovery_Port>
> <Response_Mode>       normal          </Response_Mode>
> <Status_Log_File>     /dbdata/pgreplicate.sts </Status_Log_File>
> <Error_Log_File>      /dbdata/pgreplicate.log </Error_Log_File>
> <Use_Replication_Log> yes              </Use_Replication_Log>
> <RLOG_Port>           8301            </RLOG_Port>
>
> 止まった時あたりの
> ロードバランサログ以下のログが繰り返されます。
>
> 2006-12-26 11:52:51 [29260] DEBUG:pool_process_query():read kind from
> backend C
> 2006-12-26 11:52:51 [29260] DEBUG:pool_process_query():read kind from
> backend pending data Z len: 5 po: 11
> 2006-12-26 11:52:51 [29260] DEBUG:ReadyForQuery(): message length: 5
> 2006-12-26 11:52:51 [29260] DEBUG:ReadyForQuery(): transaction state: I
> 2006-12-26 11:52:51 [29260]
> DEBUG:pool_connection_pool_timer():pool_connection_pool_timer: called
> 2006-12-26 11:52:51 [29222] DEBUG:pool_process_query():read kind from
> backend C
> 2006-12-26 11:52:51 [29222] DEBUG:pool_process_query():read kind from
> backend pending data Z len: 5 po: 11
> 2006-12-26 11:52:51 [29222] DEBUG:ReadyForQuery(): message length: 5
> 2006-12-26 11:52:51 [29222] DEBUG:ReadyForQuery(): transaction state: I
> 2006-12-26 11:52:51 [29222]
> DEBUG:pool_connection_pool_timer():pool_connection_pool_timer: called
> 2006-12-26 11:52:51 [29343] DEBUG:pool_process_query():read kind from
> backend C
> 2006-12-26 11:52:51 [29343] DEBUG:pool_process_query():read kind from
> backend pending data Z len: 5 po: 11
> 2006-12-26 11:52:51 [29343] DEBUG:ReadyForQuery(): message length: 5
> 2006-12-26 11:52:51 [29343] DEBUG:ReadyForQuery(): transaction state: I
> 2006-12-26 11:52:51 [29343]
> DEBUG:pool_connection_pool_timer():pool_connection_pool_timer: called
> 2006-12-26 11:52:51 [29391] DEBUG:pool_process_query():read kind from
> backend C
> 2006-12-26 11:52:51 [29391] DEBUG:pool_process_query():read kind from
> backend pending data Z len: 5 po: 11
> 2006-12-26 11:52:51 [29391] DEBUG:ReadyForQuery(): message length: 5
> 2006-12-26 11:52:51 [29391] DEBUG:ReadyForQuery(): transaction state: I
> 2006-12-26 11:52:51 [29391]
> DEBUG:pool_connection_pool_timer():pool_connection_pool_timer: called
> 2006-12-26 11:52:51 [29470] DEBUG:pool_process_query():read kind from
> backend C
> 2006-12-26 11:52:51 [29470] DEBUG:pool_process_query():read kind from
> backend pending data Z len: 5 po: 11
> 2006-12-26 11:52:51 [29470] DEBUG:ReadyForQuery(): message length: 5
> 2006-12-26 11:52:51 [29470] DEBUG:ReadyForQuery(): transaction state: I
> 2006-12-26 11:52:51 [29470]
> DEBUG:pool_connection_pool_timer():pool_connection_pool_timer: called
> 2006-12-26 11:52:51 [29471] DEBUG:pool_process_query():read kind from
> backend C
> 2006-12-26 11:52:51 [29471] DEBUG:pool_process_query():read kind from
> backend pending data Z len: 5 po: 11
> 2006-12-26 11:52:51 [29471] DEBUG:ReadyForQuery(): message length: 5
> 2006-12-26 11:52:51 [29471] DEBUG:ReadyForQuery(): transaction state: I
> 2006-12-26 11:52:51 [29471]
> DEBUG:pool_connection_pool_timer():pool_connection_pool_timer: called
>
>
>
> レプリログ以下のログ51分台が止まった時、ロードバランサを停止した時に53分
> のログが一気に出力されました。
>
> 2006-12-26 11:52:51 [29746] DEBUG:same host
> 2006-12-26 11:52:51 [30169] DEBUG:PGRread_packet():PG_read_query returns
> PGR_QUERY_DONE_NOTICE_CMD.
> 2006-12-26 11:52:51 [29184] DEBUG:start thread_send_cluster()
> 2006-12-26 11:52:51 [29184]
> DEBUG:send_replicate_packet_to_server():sync_command(SELECT
> PGR_SYSTEM_COMMAND_FUNCTION(8,9574,0,1) )
> 2006-12-26 11:52:51 [29184] DEBUG:start thread_send_cluster()
> 2006-12-26 11:52:51 [29184]
> DEBUG:send_replicate_packet_to_server():sync_command(SELECT
> PGR_SYSTEM_COMMAND_FUNCTION(8,9574,0,1) )
> 2006-12-26 11:52:51 [29184]
> DEBUG:send_replicate_packet_to_server():sync_command returns
> SYSTEM_COMMAND
> 2006-12-26 11:52:51 [29184]
> DEBUG:send_replicate_packet_to_server():sync_command returns
> SYSTEM_COMMAND
> 2006-12-26 11:52:51 [29184] DEBUG:send_replicate_packet_to_server():PQexec
> send :RESET ALL
> 2006-12-26 11:52:51 [29184] DEBUG:send_replicate_packet_to_server():PQexec
> returns :RESET
> 2006-12-26 11:52:51 [29184] DEBUG:thread_send_cluster():return value from
> send_replicate_packet_to_server() is 0
> 2006-12-26 11:52:51 [29184] DEBUG:thread_send_cluster():pthread_exit[0]
> 2006-12-26 11:52:51 [29184] DEBUG:send_replicate_packet_to_server():PQexec
> send :RESET ALL
> 2006-12-26 11:52:51 [29184] DEBUG:send_replicate_packet_to_server():PQexec
> returns :RESET
> 2006-12-26 11:52:51 [29184] DEBUG:thread_send_cluster():return value from
> send_replicate_packet_to_server() is 0
> 2006-12-26 11:52:51 [29184] DEBUG:thread_send_cluster():pthread_exit[1]
> 2006-12-26 11:52:51 [29746]
> DEBUG:PGRreturn_result():PGRreturn_result[8,9574,0,1]
> 2006-12-26 11:52:51 [29746] DEBUG:PGRreturn_result():128 send
> 2006-12-26 11:52:51 [29746] DEBUG:end thread_send_source()
> 2006-12-26 11:52:51 [29746] DEBUG:sem_unlock[1]
> 2006-12-26 11:52:51 [29738] DEBUG:sem_lock [1] got it
> 2006-12-26 11:52:51 [29738] DEBUG:same host
> 2006-12-26 11:52:51 [29184] DEBUG:start thread_send_cluster()
> 2006-12-26 11:52:51 [29184]
> DEBUG:send_replicate_packet_to_server():sync_command(SELECT
> PGR_SYSTEM_COMMAND_FUNCTION(8,9575,0,1) )
> 2006-12-26 11:52:51 [29184] DEBUG:start thread_send_cluster()
> 2006-12-26 11:52:51 [29184]
> DEBUG:send_replicate_packet_to_server():sync_command(SELECT
> PGR_SYSTEM_COMMAND_FUNCTION(8,9575,0,1) )
> 2006-12-26 11:52:51 [29184]
> DEBUG:send_replicate_packet_to_server():sync_command returns
> SYSTEM_COMMAND
> 2006-12-26 11:52:51 [29184]
> DEBUG:send_replicate_packet_to_server():sync_command returns
> SYSTEM_COMMAND
> 2006-12-26 11:52:51 [29184] DEBUG:send_replicate_packet_to_server():PQexec
> send :RESET ALL
> 2006-12-26 11:52:51 [29184] DEBUG:send_replicate_packet_to_server():PQexec
> returns :RESET
> 2006-12-26 11:52:51 [29184] DEBUG:thread_send_cluster():return value from
> send_replicate_packet_to_server() is 0
> 2006-12-26 11:52:51 [29184] DEBUG:thread_send_cluster():pthread_exit[0]
> 2006-12-26 11:52:51 [29184] DEBUG:send_replicate_packet_to_server():PQexec
> send :RESET ALL
> 2006-12-26 11:52:51 [29184] DEBUG:send_replicate_packet_to_server():PQexec
> returns :RESET
> 2006-12-26 11:52:51 [29184] DEBUG:thread_send_cluster():return value from
> send_replicate_packet_to_server() is 0
> 2006-12-26 11:52:51 [29184] DEBUG:thread_send_cluster():pthread_exit[1]
> 2006-12-26 11:52:51 [29738]
> DEBUG:PGRreturn_result():PGRreturn_result[8,9575,0,1]
> 2006-12-26 11:52:51 [29738] DEBUG:PGRreturn_result():128 send
> 2006-12-26 11:52:51 [29738] DEBUG:end thread_send_source()
> 2006-12-26 11:52:51 [29738] DEBUG:sem_unlock[1]
> 2006-12-26 11:52:51 [29738] DEBUG:PGRread_packet():PG_read_query returns
> PGR_QUERY_DONE_NOTICE_CMD.
> 2006-12-26 11:52:51 [29746] DEBUG:PGRread_packet():PG_read_query returns
> PGR_QUERY_DONE_NOTICE_CMD.
> 2006-12-26 11:58:53 [29673] DEBUG:PGRread_packet():PG_read_query returns .
> 2006-12-26 11:58:53 [29717] DEBUG:PGRread_packet():PG_read_query returns .
> 2006-12-26 11:58:53 [30007] DEBUG:PGRread_packet():PG_read_query returns .
> 2006-12-26 11:58:53 [29697] DEBUG:PGRread_packet():PG_read_query returns .
> 2006-12-26 11:58:53 [30010] DEBUG:PGRread_packet():PG_read_query returns .
> 2006-12-26 11:58:53 [29704] DEBUG:PGRread_packet():PG_read_query returns .
> 2006-12-26 11:58:53 [30021] DEBUG:PGRread_packet():PG_read_query returns .
> 2006-12-26 11:58:53 [29720] DEBUG:PGRread_packet():PG_read_query returns .
> 2006-12-26 11:58:53 [30032] DEBUG:PGRread_packet():PG_read_query returns .
> 2006-12-26 11:58:53 [29185] DEBUG:PGRread_packet():PG_read_query returns .
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog():got result:cmdSys='S'
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog(): CMD_STS_TRANSACTION_ABORT
> 2006-12-26 11:58:53 [29185] DEBUG:reconfirm_commit():PGR_NOT_YET_COMMIT
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog(): SYS_CALL process done
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog():process result done:cmdSys='S'
> 2006-12-26 11:58:53 [29185] DEBUG:PGRread_packet():PG_read_query returns .
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog():got result:cmdSys='S'
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog(): CMD_STS_TRANSACTION_ABORT
> 2006-12-26 11:58:53 [29185] DEBUG:reconfirm_commit():PGR_NOT_YET_COMMIT
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog(): SYS_CALL process done
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog():process result done:cmdSys='S'
> 2006-12-26 11:58:53 [29185] DEBUG:PGRread_packet():PG_read_query returns .
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog():got result:cmdSys='S'
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog(): CMD_STS_TRANSACTION_ABORT
> 2006-12-26 11:58:53 [29185] DEBUG:reconfirm_commit():PGR_NOT_YET_COMMIT
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog(): SYS_CALL process done
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog():process result done:cmdSys='S'
> 2006-12-26 11:58:53 [29185] DEBUG:PGRread_packet():PG_read_query returns .
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog():got result:cmdSys='S'
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog(): CMD_STS_TRANSACTION_ABORT
> 2006-12-26 11:58:53 [29185] DEBUG:reconfirm_commit():PGR_NOT_YET_COMMIT
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog(): SYS_CALL process done
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog():process result done:cmdSys='S'
> 2006-12-26 11:58:53 [29185] DEBUG:PGRread_packet():PG_read_query returns .
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog():got result:cmdSys='S'
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog(): CMD_STS_TRANSACTION_ABORT
> 2006-12-26 11:58:53 [29185] DEBUG:reconfirm_commit():PGR_NOT_YET_COMMIT
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog(): SYS_CALL process done
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog():process result done:cmdSys='S'
> 2006-12-26 11:58:53 [29185] DEBUG:PGRread_packet():PG_read_query returns .
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog():got result:cmdSys='S'
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog(): CMD_STS_TRANSACTION_ABORT
> 2006-12-26 11:58:53 [29185] DEBUG:reconfirm_commit():PGR_NOT_YET_COMMIT
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog(): SYS_CALL process done
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog():process result done:cmdSys='S'
> 2006-12-26 11:58:53 [29185] DEBUG:PGRread_packet():PG_read_query returns .
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog():got result:cmdSys='S'
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog(): CMD_STS_TRANSACTION_ABORT
> 2006-12-26 11:58:53 [29185] DEBUG:reconfirm_commit():PGR_NOT_YET_COMMIT
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog(): SYS_CALL process done
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog():process result done:cmdSys='S'
> 2006-12-26 11:58:53 [29185] DEBUG:PGRread_packet():PG_read_query returns .
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog():got result:cmdSys='S'
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog(): CMD_STS_TRANSACTION_ABORT
> 2006-12-26 11:58:53 [29185] DEBUG:reconfirm_commit():PGR_NOT_YET_COMMIT
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog(): SYS_CALL process done
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog():process result done:cmdSys='S'
> 2006-12-26 11:58:53 [29185] DEBUG:PGRread_packet():PG_read_query returns .
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog():got result:cmdSys='S'
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog(): CMD_STS_TRANSACTION_ABORT
> 2006-12-26 11:58:53 [29185] DEBUG:reconfirm_commit():PGR_NOT_YET_COMMIT
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog(): SYS_CALL process done
> 2006-12-26 11:58:53 [29185] DEBUG:do_rlog():process result done:cmdSys='S'
> 2006-12-26 11:58:53 [29673] DEBUG:PGRreturn_result():PGRreturn_result[5,0]
> 2006-12-26 11:58:53 [29717] DEBUG:PGRreturn_result():PGRreturn_result[5,0]
> 2006-12-26 11:58:53 [29717] DEBUG:PGRreturn_result():128 send
> 2006-12-26 11:58:53 [29717] DEBUG:replicate_loop():session closed
> 2006-12-26 11:58:53 [29717] DEBUG:PGRdo_replicate():query ::
> PGR_CLOSE_CONNECTION
>




pgcluster メーリングリストの案内