[pgcluster: 475] Re: リカバリー中に更新系クエリを発行するとエラーになる

kazunari takahashi kazunari.takahashi @ ctc-g.co.jp
2004年 8月 10日 (火) 12:04:09 JST


高橋です.

申し訳ありませんが、現象を再現しようとしましたが、できませんでした.
しかし、別の現象が発生しています.

> ロードバランサに対してクエリーを投げた場合,リカバリー中であればServer B
> がクエリーを受けるはずです.

ロードバランサでsnoopを取っているのですが、
serverBに投げている形跡がありません.
serverAに投げています.


> リカバリー中にロードバランサにクエリーを投げた手順を教えて頂けませんでしょ
> うか.

$ psql -h localhost -U test test
で接続し、
insert into 〜を発行したのみです.
別セッションでの接続はしてません.

レプリケーションサーバはreliableになっています.

また、本日発生した別の現象です.

##########################################
#現象
##########################################

・リカバリ中に発行したSQLがmasterDBに2回発行される

####################################
#環境
####################################
pgcluster-1.0.7av12

pglb × 1 (solaris8 sparc) host名:loadblancer
clusterDB ×3 (solaris8 sparc) host名:serverA , serverB , serverC
rgrp × 1 (solaris8 sparc)

##########################################
#環境
##########################################


##########################################
#手順
##########################################

est=> create table test(id serial , name varchar(30));
NOTICE:  CREATE TABLE will create implicit sequence 'test_id_seq' for SERIAL col
umn 'test.id'
CREATE TABLE
test=>
test=>
test=> insert into test (name) values('a');
INSERT 49749 1

※※ここでserverCを落とす

test=> insert into test (name) values('b');
INSERT 49750 1
test=> insert into test (name) values('b');
INSERT 49751 1

※※ここでserverCをリカバリーにする

test=> insert into test (name) values('c');
INSERT 49752 1

※※ OK!のメッセージが出る

test=> insert into test (name) values('c');
INSERT 49753 1
プロンプトがしばらく帰ってこない.しばらくすると帰ってくる.

----------------------
結果
----------------------
serverA:
test=> select * from test ;
 id | name
----+------
  1 | a
  2 | b
  3 | b
  4 | c
  5 | c
  6 | c
(6 rows)

serverB,C:
test=> select * from test;
 id | name
----+------
  1 | a
  2 | b
  3 | b
  4 | c
  5 | c
(5 rows)


------------------------------
pglb.conf
------------------------------
loadbalancer% more pglb.conf
#============================================================
#          Load Balance Server configuration file
#-------------------------------------------------------------
# file: pglb.conf
#-------------------------------------------------------------
# This file controls:
#       o which hosts are db cluster server
#       o which port  use connect to db cluster server
#       o how many connections are allowed on each DB server
#============================================================
#-------------------------------------------------------------
# set cluster DB server information
#               o Host_Name : hostname
#               o Port : connection for postmaster
#               o Max_Connection : maximun number of connection to postmaster
#-------------------------------------------------------------
<Cluster_Server_Info>
    <Host_Name>   serverA  </Host_Name>
    <Port>        5432                </Port>
    <Max_Connect> 32                   </Max_Connect>
</Cluster_Server_Info>
<Cluster_Server_Info>
    <Host_Name>   serverB   </Host_Name>
    <Port>        5432                </Port>
    <Max_Connect> 32                  </Max_Connect>
</Cluster_Server_Info>
<Cluster_Server_Info>
    <Host_Name>   serverC   </Host_Name>
    <Port>        5432                </Port>
    <Max_Connect> 32                  </Max_Connect>
</Cluster_Server_Info>
#-------------------------------------------------------------
# set Load Balance server information
#               o Backend_Socket_Dir : Unix domain socket path for the backend
#               o Receive_Port : connection from client
#               o Recovery_Port : connection for recovery process
#               o Max_Cluster_Num : maximun number of cluster DB servers
#               o Use_Connection_Pooling : use connection pool [yes/no]
#               o Max_Pool_Each_Server : number of pool connections/DB server
#-------------------------------------------------------------
<Backend_Socket_Dir>    /tmp     </Backend_Socket_Dir>
<Receive_Port>          5432     </Receive_Port>
<Recovery_Port>         7780     </Recovery_Port>
<Max_Cluster_Num>        128     </Max_Cluster_Num>
<Use_Connection_Pooling>  no     </Use_Connection_Pooling>


---------------------------------------------
リカバリー中にinsert文を発行した後のpglbでのsnoop結果
---------------------------------------------

   loadbalancer -> serverA    TCP D=5432 S=32853     Ack=2174623320 Seq=2078280968 Len=38 Win=24820
   serverA ->loadbalancer     TCP D=32853 S=5432     Ack=2078281006 Seq=2174623320 Len=24 Win=24820
   loadbalancer -> serverA    TCP D=5432 S=32853     Ack=2174623344 Seq=2078281006 Len=0 Win=24820
   loadbalancer -> serverA    TCP D=5432 S=32853     Ack=2174623344 Seq=2078281006 Len=38 Win=24820
   serverA ->loadbalancer     TCP D=32853 S=5432     Ack=2078281044 Seq=2174623344 Len=0 Win=24820
   serverA ->loadbalancer     TCP D=32853 S=5432     Ack=2078281044 Seq=2174623344 Len=24 Win=24820

----------------------------------------------
pglbのdebugログ
----------------------------------------------
DEBUG(PGRcreate_recv_socket): recovery selected

DEBUG(PGRset_key_of_cluster): received no:101

DEBUG(PGRset_status_on_cluster_tbl): host:serverC port:5432 max:32 use:2 status99
DEBUG(PGRclose_sock): recovery wait

DEBUG(pool_read): read kind from backend P
DEBUG(pool_read_string): pool_read_string: readsize: 1024 readp: 0
DEBUG(pool_read_string): pool_read_string: buf:blank: readlen: 23
DEBUG(mystrlen): strlength: 6
DEBUG(mystrlen): pending data saved: :CINSERT 49751 1: length: 17 return data: :
blank: length: 6
DEBUG(pool_read): read kind from backend pending data C len: 16 po: 1
DEBUG(mystrlen): cp->len(16) cp->po(1)
DEBUG(mystrlen): strlength(15) bufsz(1024)
DEBUG(pool_read): read kind from backend pending data Z len: 0 po: 17
DEBUG(PGRclose_sock): recovery selected

DEBUG(PGRset_key_of_cluster): received no:1

DEBUG(PGRset_key_of_cluster): add_db host:serverC port:5432 max:32
DEBUG(PGRset_status_on_cluster_tbl): host:serverC port:5432 max:32 use:99 status2
DEBUG(PGRset_status_on_cluster_tbl): host:serverC port:5432 max:32 use:2 status3
DEBUG(PGRclose_sock): recovery wait

DEBUG(PGRclose_sock): recovery selected

DEBUG(PGRset_key_of_cluster): received no:3

DEBUG(PGRsearch_cluster_tbl): stop_db host:serverA port:5432 max:32
DEBUG(PGRset_status_on_cluster_tbl): host:serverA port:5432 max:32 use:2 status3
DEBUG(PGRclose_sock): recovery wait

DEBUG(pool_read): read kind from frontend Q(51)
DEBUG(pool_read_string): pool_read_string: readsize: 1024 readp: 0
DEBUG(pool_read_string): pool_read_string: buf:insert into test (name) values('c');: readlen: 37
DEBUG(mystrlen): strlength: 37
DEBUG(mystrlen): pool_read_string: string: :insert into test (name) values('c');
:
DEBUG(mystrlen): pool_read_string: final read 37 string: :insert into test (name) values('c');:
DEBUG(mystrlen): pool_read_string: total result 37 string: :insert into test (name) values('c');:
DEBUG(pool_read): read kind from backend P
DEBUG(pool_read_string): pool_read_string: readsize: 1024 readp: 0
DEBUG(pool_read_string): pool_read_string: buf:blank: readlen: 23
DEBUG(mystrlen): strlength: 6
DEBUG(mystrlen): pending data saved: :CINSERT 49752 1: length: 17 return data: :blank: length: 6
DEBUG(pool_read): read kind from backend pending data C len: 16 po: 1
DEBUG(mystrlen): cp->len(16) cp->po(1)
DEBUG(mystrlen): strlength(15) bufsz(1024)
DEBUG(pool_read): read kind from backend pending data Z len: 0 po: 17
DEBUG(pool_read): read kind from frontend Q(51)
DEBUG(pool_read_string): pool_read_string: readsize: 1024 readp: 0
DEBUG(pool_read_string): pool_read_string: buf:insert into test (name) values('c');: readlen: 37
DEBUG(mystrlen): strlength: 37
DEBUG(mystrlen): pool_read_string: string: :insert into test (name) values('c');
:
DEBUG(mystrlen): pool_read_string: final read 37 string: :insert into test (name) values('c');:
DEBUG(mystrlen): pool_read_string: total result 37 string: :insert into test (name) values('c');:
DEBUG(PGRget_child_status): load balance wait

DEBUG(PGRclose_sock): recovery selected

DEBUG(PGRset_key_of_cluster): received no:9

DEBUG(PGRsearch_cluster_tbl): start_db host:serverA port:5432 max:32
DEBUG(PGRset_status_on_cluster_tbl): host:serverA port:5432 max:32 use:3 status2
DEBUG(PGRclose_sock): recovery wait

DEBUG(PGRclose_sock): recovery selected

DEBUG(PGRset_key_of_cluster): received no:9

DEBUG(PGRsearch_cluster_tbl): start_db host:serverC port:5432 max:32
DEBUG(PGRset_status_on_cluster_tbl): host:serverC port:5432 max:32 use:3 status2
DEBUG(PGRclose_sock): recovery wait

DEBUG(pool_read): read kind from backend P
DEBUG(pool_read_string): pool_read_string: readsize: 1024 readp: 0
DEBUG(pool_read_string): pool_read_string: buf:blank: readlen: 23
DEBUG(mystrlen): strlength: 6
DEBUG(mystrlen): pending data saved: :CINSERT 49753 1: length: 17 return data: :blank: length: 6
DEBUG(pool_read): read kind from backend pending data C len: 16 po: 1
DEBUG(mystrlen): cp->len(16) cp->po(1)
DEBUG(mystrlen): strlength(15) bufsz(1024)
DEBUG(pool_read): read kind from backend pending data Z len: 0 po: 17
DEBUG(PGRget_child_status): load balance wait

DEBUG(PGRclose_sock): recovery wait

DEBUG(PGRget_child_status): load balance wait

DEBUG(PGRclose_sock): recovery wait


---------------------------------------
pgrepのdebuglog
---------------------------------------


DEBUG(pgrecovery_loop): recovery accept port 7778

DEBUG(read_packet): receive packet no:1

DEBUG(first_setup_recovery): 1st setup target serverC

DEBUG(first_setup_recovery): 1st setup port 5432

DEBUG(first_setup_recovery): check another recovery process

DEBUG(PGRsem_unlock): add recovery target to host table

DEBUG(send_recovery_packet): set RECOVERY_PGDATA_REQ packet data

DEBUG(PGRsend_replicate_packet_to_server): host(serverA) : port(5432)
DEBUG(getTransactionTbl): not found in getTransactionTbl
DEBUG(pgr_createConn): PQsetdbLogin host[serverA] port[5432] db[template1] user[postgres]
DEBUG(replicate_loop): wait replicate

DEBUG(pgr_createConn): PQsetdbLogin ok!!
DEBUG(insertTransactionTbl): db:template1 port:5432 user:postgres host:serverA query:VACUUM
DEBUG(insertTransactionTbl): sync_command(SELECT PGR_SYSTEM_COMMAND_FUNCTION(3,0,0,2) )
DEBUG(insertTransactionTbl): PQexec send :VACUUM
DEBUG(insertTransactionTbl): PQexec end
DEBUG(getTransactionTbl): hit !! transaction tbl host serverA db:template1 pid
:642
DEBUG(getTransactionTbl): send packet to master serverA recoveryPort 7779

DEBUG(PGRsem_unlock): wait answer from master server

DEBUG(read_packet_from_master): wait

DEBUG(read_packet): get answer from master

DEBUG(send_recovery_packet): 1st master serverA - 5432
DEBUG(send_recovery_packet): 1st target serverC - 5432
DEBUG(replicate_loop): wait replicate

DEBUG(PGRsem_unlock): replicate_loop selected


→→  クエリの発行

DEBUG(PGRread_packet): query :: insert into test (name) values('c')

DEBUG(PGRreplicate_packet_send): cmdSts=Q

DEBUG(PGRreplicate_packet_send): cmdType=I

DEBUG(PGRreplicate_packet_send): port=5432

DEBUG(PGRreplicate_packet_send): pid=517

DEBUG(PGRreplicate_packet_send): from_host=serverA

DEBUG(PGRreplicate_packet_send): dbName=test

DEBUG(PGRreplicate_packet_send): userName=test

DEBUG(PGRreplicate_packet_send): recieve sec=1092104488

DEBUG(PGRreplicate_packet_send): recieve usec=436371

DEBUG(PGRreplicate_packet_send): query_size=35

DEBUG(PGRreplicate_packet_send): query=insert into test (name) values('c')

DEBUG(PGRreplicate_packet_send): useFlag[2]
DEBUG(PGRis_same_host): 5432 @ serverA return trigger
DEBUG(is_need_sync_time): sem_lock[1]
DEBUG(PGRreturn_result): PGRreturn_result[3,1092104488,436371,2]
DEBUG(PGRreturn_result): [128 / 128] send
DEBUG(PGRreturn_result): 128 send
DEBUG(PGRreturn_result): wait for answer
DEBUG(PGRread_packet): answer[PGR_QUERY_DONE_NOTICE_CMD]
DEBUG(PGRread_packet): QUERY DONE
DEBUG(PGRread_packet): status of PGRreturn_result[0]
DEBUG(PGRread_packet): sem_lock[2]
DEBUG(PGRsem_lock): sem_unlock[1]
DEBUG(PGRreplicate_packet_send_each_server): except:5432 @ serverA host:5432 @ serverB

→→ serverBへレプリケーション

DEBUG(PGRreplicate_packet_send_each_server): send replicate to:serverB

DEBUG(PGRsend_replicate_packet_to_server): host(serverB) : port(5432)
DEBUG(getTransactionTbl): hit !! transaction tbl host serverB db:test pid:517
DEBUG(getTransactionTbl): db:test port:5432 user:test host:serverB query:insert into test (name) values('c')
DEBUG(getTransactionTbl): sync_command(SELECT PGR_SYSTEM_COMMAND_FUNCTION(3,1092104488,436371,2) )
DEBUG(getTransactionTbl): PQexec send :insert into test (name) values('c')
DEBUG(getTransactionTbl): PQexec end
DEBUG(delete_template): sem_lock[3]
DEBUG(PGRsem_lock): sem_unlock[2]
DEBUG(delete_template): PGRreplicate_packet_send end
DEBUG(PGRreturn_result): PGRreturn_result[6]
DEBUG(PGRreturn_result): [128 / 128] send
DEBUG(PGRreturn_result): 128 send
DEBUG(PGRreturn_result): status of PGRreturn_result[0]
DEBUG(delete_template): sem_unlock[3]
DEBUG(replicate_loop): wait replicate

DEBUG(read_packet): receive packet no:5

DEBUG(PGRsend_replicate_packet_to_server): host(serverA) : port(5432)
DEBUG(getTransactionTbl): not found in getTransactionTbl
DEBUG(pgr_createConn): PQsetdbLogin host[serverA] port[5432] db[template1] user[postgres]
DEBUG(pgr_createConn): PQsetdbLogin ok!!
DEBUG(insertTransactionTbl): db:template1 port:5432 user:postgres host:serverA query:VACUUM FULL
DEBUG(insertTransactionTbl): sync_command(SELECT PGR_SYSTEM_COMMAND_FUNCTION(3,0,0,2) )
DEBUG(insertTransactionTbl): PQexec send :VACUUM FULL
DEBUG(insertTransactionTbl): PQexec end
DEBUG(getTransactionTbl): hit !! transaction tbl host serverA db:template1 pid:642
DEBUG(read_packet_from_master): wait

DEBUG(send_recovery_packet): 2nd master serverA - 5432
DEBUG(send_recovery_packet): 2nd target serverC - 5432
DEBUG(send_recovery_packet): second_setup_recovery end :1
DEBUG(PGRsem_unlock): replicate_loop selected

→→  2回目のクエリの発行

DEBUG(PGRread_packet): query :: insert into test (name) values('c')

DEBUG(PGRset_queue): msgsnd header rtn:0
DEBUG(PGRset_queue): msgsnd query[insert into test (name) values('c')] rtn:0
DEBUG(PGRreplicate_packet_send): cmdSts=Q

DEBUG(PGRreplicate_packet_send): cmdType=I

DEBUG(PGRreplicate_packet_send): port=5432

DEBUG(PGRreplicate_packet_send): pid=517

DEBUG(PGRreplicate_packet_send): from_host=serverA

DEBUG(PGRreplicate_packet_send): dbName=test

DEBUG(PGRreplicate_packet_send): userName=test

DEBUG(PGRreplicate_packet_send): recieve sec=1092104499

DEBUG(PGRreplicate_packet_send): recieve usec=799667

DEBUG(PGRreplicate_packet_send): query_size=35

DEBUG(PGRreplicate_packet_send): query=insert into test (name) values('c')

DEBUG(PGRreplicate_packet_send): useFlag[2]
DEBUG(PGRget_master): serverA skipped

DEBUG(is_need_response): sem_lock[1]
DEBUG(PGRreplicate_packet_send_each_server): except:5432 @ serverA host:5432 @ serverB

DEBUG(PGRreplicate_packet_send_each_server): send replicate to:serverB

DEBUG(PGRsend_replicate_packet_to_server): host(serverB) : port(5432)
DEBUG(getTransactionTbl): hit !! transaction tbl host serverB db:test pid:517
DEBUG(getTransactionTbl): db:test port:5432 user:test host:serverB query:insert into test (name) values('c')
DEBUG(getTransactionTbl): sync_command(SELECT PGR_SYSTEM_COMMAND_FUNCTION(3,1092104499,799667,2) )
DEBUG(replicate_loop): wait replicate

DEBUG(getTransactionTbl): PQexec send :insert into test (name) values('c')
DEBUG(getTransactionTbl): PQexec end
DEBUG(delete_template): sem_lock[2]
DEBUG(PGRsem_lock): sem_unlock[1]
DEBUG(delete_template): PGRreplicate_packet_send end
DEBUG(PGRreturn_result): PGRreturn_result[6]
DEBUG(PGRreturn_result): [128 / 128] send
DEBUG(PGRreturn_result): 128 send
DEBUG(PGRreturn_result): status of PGRreturn_result[0]
DEBUG(PGRreturn_result): sem_unlock[2]
DEBUG(PGRsem_unlock): replicate_loop selected

DEBUG(replicate_loop): wait replicate

DEBUG(pgrecovery_loop): recovery accept port 7778

DEBUG(read_packet): receive packet no:9

DEBUG(read_packet): last master serverA - 5432
DEBUG(read_packet): last target serverC - 5432
DEBUG(PGRsend_queue): master serverA - 5432

DEBUG(PGRget_HostTbl): target serverC - 5432

DEBUG(PGRget_HostTbl): msgrcv[insert into test (name) values('c')]
DEBUG(PGRsend_replicate_packet_to_server): host(serverA) : port(5432)
DEBUG(getTransactionTbl): not found in getTransactionTbl
DEBUG(pgr_createConn): PQsetdbLogin host[serverA] port[5432] db[test] user[test]
DEBUG(pgr_createConn): PQsetdbLogin ok!!
DEBUG(insertTransactionTbl): db:test port:5432 user:test host:serverA query:insert into test (name) values('c')
DEBUG(insertTransactionTbl): sync_command(SELECT PGR_SYSTEM_COMMAND_FUNCTION(3,1092104499,799667,2) )
DEBUG(insertTransactionTbl): PQexec send :insert into test (name) values('c')
DEBUG(insertTransactionTbl): PQexec end
DEBUG(PGRsend_replicate_packet_to_server): host(serverC) : port(5432)
DEBUG(getTransactionTbl): not found in transaction tbl host serverA db:test pid:517
DEBUG(getTransactionTbl): not found in getTransactionTbl
DEBUG(getTransactionTbl): not found in transaction tbl host serverA db:test pid:517
DEBUG(pgr_createConn): PQsetdbLogin host[serverC] port[5432] db[test] user[test]
DEBUG(pgr_createConn): PQsetdbLogin ok!!
DEBUG(insertTransactionTbl): db:test port:5432 user:test host:serverC query:insert into test (name) values('c')
DEBUG(insertTransactionTbl): sync_command(SELECT PGR_SYSTEM_COMMAND_FUNCTION(3,1092104499,799667,2) )
DEBUG(insertTransactionTbl): PQexec send :insert into test (name) values('c')
DEBUG(insertTransactionTbl): PQexec end
DEBUG(delete_template): send_queue return status 0

DEBUG(delete_template): PGRsend_queue ok
DEBUG(replicate_loop): wait replicate


以上 宜しくお願いいたします.
高橋 一成 <kazunari.takahashi @ ctc-g.co.jp>




pgcluster メーリングリストの案内