[pgcluster: 597] Load balancer behavior when a cluster DB is stopped

鈴木 暢人 suzuki.nobuhito @ kcn.fujitsu.com
Tue Nov 16 21:25:29 JST 2004


Hello, and nice to meet you.
This is Suzuki from Fujitsu KCN.

I am currently testing the behavior of PGCluster in the configuration below.

Configuration
 Load balancer                  : Solaris 9 SPARC (×1)
 Cluster DB + replication server: Solaris 9 SPARC (×2)
 PGCluster version 1.0.8rc4

                         |
              ((Load Balance Server))
              ( hostname: pglbhost  )
              ( ipaddr:172.16.110.3 )
                         |
 ----------+-------------+------------+----------
           |                          |
  ((  Cluster DB 1    ))    ((  Cluster DB 2    ))
  ((Replication Server))    ((Replication Server))
  ( hostname:clusterdb1)    ( hostname:clusterdb2 )
  (ipaddr:172.16.110.4 )    ( ipaddr:172.16.110.6 )


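Replication appears to be working correctly. However, to test
a failure scenario I stopped the cluster DB on clusterdb1
(pg_ctl stop) and then tried to connect to the load balancer.
The connection fails with an error and the load balancer itself stops.
(A core file is written.)

I cannot determine the cause. Could you please advise?

Thank you in advance.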



# Stop clusterdb1
clusterdb1 % pg_ctl stop -m immediate 

# Query from pglbhost
pglbhost %  psql -h 172.16.110.3 -l
DEBUG:PGRscan_cluster:2 ClusterDB can be used
DEBUG:PGRscan_cluster:clusterdb1 [5431],useFlag->2 max->3 use_num->0

DEBUG:PGRdo_child():I am 7412
DEBUG:do_accept():I am 7412 accept fd 7
DEBUG:read_startup_packet():Protocol Major: 2 Minor: 0 database: template1 user: postgres
ERROR:connect_inet_domain_socket(): connect() failed: Connection refused
DEBUG:PGRset_status_on_cluster_tbl():host:clusterdb1 port:5431 max:3 use:2 status98
DEBUG:PGRset_status_on_cluster_tbl():host:clusterdb1 port:5431 max:3 use:98 status99
psql: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
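
To confirm the backend state independently of pglb, the two cluster DB ports can be probed directly. The following is only a minimal Python sketch and is not part of PGCluster. The helper name `is_listening` and the 2-second timeout are assumptions made for illustration; the host names and ports in the comment are the ones listed in pglb.conf.

```python
import socket


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "Connection refused", timeouts, and name-resolution failures.
        return False


# Example use with the backends listed in pglb.conf:
#   for host, port in [("clusterdb1", 5431), ("clusterdb2", 5433)]:
#       print(host, port, is_listening(host, port))
```

After `pg_ctl stop` on clusterdb1, such a probe would be expected to report clusterdb1:5431 as unreachable while clusterdb2:5433 still accepts connections, which matches the "Connection refused" line in the log above.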


--- loadbalancer(pglbhost) ---
=== pglb.conf === 
<Cluster_Server_Info>
    <Host_Name>   clusterdb1            </Host_Name>
    <Port>        5431                </Port>
    <Max_Connect> 3                  </Max_Connect>
</Cluster_Server_Info>
<Cluster_Server_Info>
    <Host_Name>   clusterdb2            </Host_Name>
    <Port>        5433                </Port>
    <Max_Connect> 3                  </Max_Connect>
</Cluster_Server_Info>
<Backend_Socket_Dir>    /tmp     </Backend_Socket_Dir>
<Receive_Port>          5432     </Receive_Port>
<Recovery_Port>         7780     </Recovery_Port>
<Max_Cluster_Num>          5     </Max_Cluster_Num>
<Use_Connection_Pooling> yes     </Use_Connection_Pooling>
<Max_Pool_Each_Server>     1     </Max_Pool_Each_Server>
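
To cross-check the backend entries above against the `port` setting in each node's postgresql.conf, the file can be read mechanically. This is a rough Python sketch that assumes the tag layout shown above. The function name `parse_backends` is hypothetical, and the regular expressions are a simplification, not PGCluster's own parser.

```python
import re

# One match per <Cluster_Server_Info> ... </Cluster_Server_Info> block.
BLOCK_RE = re.compile(r"<Cluster_Server_Info>(.*?)</Cluster_Server_Info>", re.DOTALL)
# One match per <Tag> value </Tag> pair inside a block; whitespace is trimmed.
FIELD_RE = re.compile(r"<(\w+)>\s*(.*?)\s*</\1>", re.DOTALL)


def parse_backends(conf_text):
    """Return one dict per <Cluster_Server_Info> block, keyed by tag name."""
    backends = []
    for block in BLOCK_RE.findall(conf_text):
        backends.append({tag: value for tag, value in FIELD_RE.findall(block)})
    return backends
```

All values are returned as strings, so a port would need `int(...)` before being passed to a socket call.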

--- cluster+replication 1(clusterdb1) ---
=== postgresql.conf === 
tcpip_socket = true
max_connections = 16
port = 5431
shared_buffers = 32
syslog = 1
syslog_facility = 'LOCAL0'
syslog_ident = 'postgres'
LC_MESSAGES = 'ja'
LC_MONETARY = 'ja'
LC_NUMERIC = 'ja'
LC_TIME = 'ja'

=== cluster.conf ===
<Replicate_Server_Info>
        <Host_Name> clusterdb1 </Host_Name>
        <Port> 8001 </Port>
        <Recovery_Port> 8101 </Recovery_Port>
</Replicate_Server_Info>
<Replicate_Server_Info>
        <Host_Name> clusterdb2  </Host_Name>
        <Port> 8001 </Port>
        <Recovery_Port> 8101 </Recovery_Port>
</Replicate_Server_Info>
<Recovery_Port> 7779 </Recovery_Port>
<Rsync_Path> /usr/local/bin/rsync </Rsync_Path>
<Rsync_Option> ssh -1 </Rsync_Option>
<When_Stand_Alone> read_only  </When_Stand_Alone>

=== pgreplicate.conf ===
<Cluster_Server_Info>
    <Host_Name>   clusterdb1            </Host_Name>
    <Port>        5431                </Port>
    <Recovery_Port>       7779        </Recovery_Port>
</Cluster_Server_Info>
<Cluster_Server_Info>
    <Host_Name>   clusterdb2            </Host_Name>
    <Port>        5433                </Port>
    <Recovery_Port>       7779        </Recovery_Port>
</Cluster_Server_Info>
<LoadBalance_Server_Info>
        <Host_Name>   pglbhost                </Host_Name>
        <Recovery_Port>       7780            </Recovery_Port>
</LoadBalance_Server_Info>
<Replicate_Server_Info>
        <Host_Name> clusterdb1 </Host_Name>
        <Port> 8001 </Port>
        <Recovery_Port> 8101 </Recovery_Port>
</Replicate_Server_Info>
<Replication_Port>    8001            </Replication_Port>
<Recovery_Port>       8101            </Recovery_Port>
<Response_Mode>       normal          </Response_Mode>


--- cluster+replication 2(clusterdb2) ---
=== postgresql.conf ===
tcpip_socket = true
max_connections = 16
port = 5433
shared_buffers = 32
syslog = 1
syslog_facility = 'LOCAL0'
syslog_ident = 'postgres'
LC_MESSAGES = 'ja'
LC_MONETARY = 'ja'
LC_NUMERIC = 'ja'
LC_TIME = 'ja'

=== cluster.conf ===
<Replicate_Server_Info>
        <Host_Name> clusterdb1 </Host_Name>
        <Port> 8001 </Port>
        <Recovery_Port> 8101 </Recovery_Port>
</Replicate_Server_Info>
<Replicate_Server_Info>
        <Host_Name> clusterdb2  </Host_Name>
        <Port> 8002 </Port>
        <Recovery_Port> 8102 </Recovery_Port>
</Replicate_Server_Info>
<Recovery_Port> 7779 </Recovery_Port>
<Rsync_Path> /usr/local/bin/rsync </Rsync_Path>
<Rsync_Option> ssh -1 </Rsync_Option>
<When_Stand_Alone> read_only  </When_Stand_Alone>

=== pgreplicate.conf ===
<Cluster_Server_Info>
    <Host_Name>   clusterdb1            </Host_Name>
    <Port>        5431                </Port>
    <Recovery_Port>       7779        </Recovery_Port>
</Cluster_Server_Info>
<Cluster_Server_Info>
    <Host_Name>   clusterdb2            </Host_Name>
    <Port>        5433                </Port>
    <Recovery_Port>       7779        </Recovery_Port>
</Cluster_Server_Info>
<LoadBalance_Server_Info>
        <Host_Name>   pglbhost                </Host_Name>
        <Recovery_Port>       7780            </Recovery_Port>
</LoadBalance_Server_Info>
<Replicate_Server_Info>
        <Host_Name> clusterdb1 </Host_Name>
        <Port> 8001 </Port>
        <Recovery_Port> 8101 </Recovery_Port>
</Replicate_Server_Info>
<Replication_Port>    8001            </Replication_Port>
<Recovery_Port>       8101            </Recovery_Port>
<Response_Mode>       normal          </Response_Mode>

--- Debugger output for the load balancer core ---
 gdb -c core /usr/local/pgcluster/bin/pglb
GNU gdb 6.3
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "sparc-sun-solaris2.9"...
Core was generated by `/usr/local/pgcluster/bin/pglb -D /usr/local/pgcluster/etc -lnv'.
Program terminated with signal 10, Bus error.
Reading symbols from /usr/local/pgcluster/lib/libpq.so.3...done.
Loaded symbols for /usr/local/pgcluster/lib/libpq.so.3
Reading symbols from /usr/local/lib/libreadline.so.4...done.
Loaded symbols for /usr/local/lib/libreadline.so.4
Reading symbols from /usr/lib/libcurses.so.1...done.
Loaded symbols for /usr/lib/libcurses.so.1
Reading symbols from /usr/lib/librt.so.1...done.
Loaded symbols for /usr/lib/librt.so.1
Reading symbols from /usr/lib/libresolv.so.2...done.
Loaded symbols for /usr/lib/libresolv.so.2
Reading symbols from /usr/lib/libgen.so.1...done.
Loaded symbols for /usr/lib/libgen.so.1
Reading symbols from /usr/lib/libsocket.so.1...done.
Loaded symbols for /usr/lib/libsocket.so.1
Reading symbols from /usr/lib/libnsl.so.1...done.
Loaded symbols for /usr/lib/libnsl.so.1
Reading symbols from /usr/lib/libdl.so.1...done.
Loaded symbols for /usr/lib/libdl.so.1
Reading symbols from /usr/lib/libm.so.1...done.
Loaded symbols for /usr/lib/libm.so.1
Reading symbols from /usr/lib/libc.so.1...done.
Loaded symbols for /usr/lib/libc.so.1
Reading symbols from /usr/lib/libaio.so.1...done.
Loaded symbols for /usr/lib/libaio.so.1
Reading symbols from /usr/lib/libmd5.so.1...done.
Loaded symbols for /usr/lib/libmd5.so.1
Reading symbols from /usr/lib/libmp.so.2...done.
Loaded symbols for /usr/lib/libmp.so.2
Reading symbols from /usr/platform/SUNW,Sun-Fire-V240/lib/libc_psr.so.1...done.
Loaded symbols for /usr/platform/SUNW,Sun-Fire-V240/lib/libc_psr.so.1
#0  0xff047a7c in _free_unlocked () from /usr/lib/libc.so.1
(gdb) backtrace
#0  0xff047a7c in _free_unlocked () from /usr/lib/libc.so.1
#1  0xff047a34 in free () from /usr/lib/libc.so.1
#2  0x00017670 in pool_finish () at pool_connection_pool.c:505
#3  0x00013be4 in child_end (sig=15) at child.c:1162
#4  <signal handler called>
#5  0xff09f2bc in _sigsuspend () from /usr/lib/libc.so.1
#6  0xff053990 in _libc_sleep () from /usr/lib/libc.so.1
#7  0x00012ef4 in notice_backend_error () at child.c:524
#8  0x0001750c in create_cp (cp=0x3c4c0, secondary_backend=0)
    at pool_connection_pool.c:440
#9  0x00017580 in new_connection (p=0x3b638) at pool_connection_pool.c:459
#10 0x00016f40 in pool_create_cp () at pool_connection_pool.c:206
#11 0x00013a08 in connect_backend (sp=0x3c4a0, frontend=0x3bc38)
    at child.c:1071
#12 0x00012a80 in PGRdo_child (use_pool=0) at child.c:280
#13 0x00014234 in PGRload_balance () at load_balance.c:124
#14 0x00014fb4 in load_balance_main () at main.c:546
#15 0x00015940 in main (argc=0, argv=0xffbffb64) at main.c:1033
-- 
鈴木 暢人 <suzuki.nobuhito @ kcn.fujitsu.com>



