Redis 4.0.14: Sentinel automatic failover fails
Symptoms:
Sentinel log output:
-failover-abort-not-elected
24617:X 10 Jun 21:44:21.323 # +sdown master mymaster 192.168.141.129 6379
24617:X 10 Jun 21:44:21.400 # +odown master mymaster 192.168.141.129 6379 #quorum 3/2
24617:X 10 Jun 21:44:21.400 # +new-epoch 8
24617:X 10 Jun 21:44:21.400 # +try-failover master mymaster 192.168.141.129 6379
24617:X 10 Jun 21:44:21.402 # +vote-for-leader 0e85c75b573140a05addc38543f18cfbb3dc74f3 8
24617:X 10 Jun 21:44:21.402 # a443f3621568e41b03d1a442a0d0ec02b5c7e6be voted for a443f3621568e41b03d1a442a0d0ec02b5c7e6be 8
24617:X 10 Jun 21:44:21.403 # 1a5a45d59dbddeaf1e30b63b27cee5605c0b7ed5 voted for a443f3621568e41b03d1a442a0d0ec02b5c7e6be 8
24617:X 10 Jun 21:44:32.377 # -failover-abort-not-elected master mymaster 192.168.141.129 6379
24617:X 10 Jun 21:44:32.441 # Next failover delay: I will not start a failover before Wed Jun 10 21:50:22 2020
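To see why this attempt was aborted, the vote lines can be tallied per epoch. The following is only a minimal Python sketch, assuming the log format shown above; the names `tally_votes` and `FAILED_LOG` are hypothetical and not part of Redis.

```python
import re
from collections import Counter

# Sentinel run IDs are 40 hex characters in the logs above.
SELF_VOTE = re.compile(r"\+vote-for-leader ([0-9a-f]{40}) (\d+)")
PEER_VOTE = re.compile(r"# ([0-9a-f]{40}) voted for ([0-9a-f]{40}) (\d+)")


def tally_votes(log_lines):
    """Count votes per candidate and epoch, as seen in one Sentinel's log."""
    tally = {}
    for line in log_lines:
        own = SELF_VOTE.search(line)
        peer = PEER_VOTE.search(line)
        if own:
            # The vote cast by the Sentinel that wrote this log.
            candidate, epoch = own.group(1), int(own.group(2))
        elif peer:
            # A vote reported back by another Sentinel.
            candidate, epoch = peer.group(2), int(peer.group(3))
        else:
            continue
        tally.setdefault(epoch, Counter())[candidate] += 1
    return tally


# The three vote lines copied from the failed log above.
FAILED_LOG = [
    "24617:X 10 Jun 21:44:21.402 # +vote-for-leader 0e85c75b573140a05addc38543f18cfbb3dc74f3 8",
    "24617:X 10 Jun 21:44:21.402 # a443f3621568e41b03d1a442a0d0ec02b5c7e6be voted for a443f3621568e41b03d1a442a0d0ec02b5c7e6be 8",
    "24617:X 10 Jun 21:44:21.403 # 1a5a45d59dbddeaf1e30b63b27cee5605c0b7ed5 voted for a443f3621568e41b03d1a442a0d0ec02b5c7e6be 8",
]

for epoch, votes in tally_votes(FAILED_LOG).items():
    for candidate, count in votes.most_common():
        print(epoch, candidate[:8], count)
```

For the failed log this reports two votes for a443f362 and only one (its own) for 0e85c75b, which is consistent with the -failover-abort-not-elected line: this Sentinel tried to become leader in epoch 8 but did not win the election.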
Or:
No vote-for-leader at all:
27333:X 10 Jun 22:07:56.622 # +sdown master mymaster 192.168.141.129 6380
27333:X 10 Jun 22:07:58.201 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 192.168.141.129 6380
27333:X 10 Jun 22:07:59.292 # +sdown sentinel f0622ba9d068b62c8cab6ceb70fc55f48e94c602 192.168.141.129 26381 @ mymaster 192.168.141.129 6380
27333:X 10 Jun 22:08:00.485 # +sdown sentinel 4785172698bd33321bab5719faaadee10f41907c 192.168.141.129 26380 @ mymaster 192.168.141.129 6380
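In this second symptom the Sentinel marks both of its peers as +sdown, so with a quorum of 2 it has nobody left to agree with and never reaches +odown. A rough Python sketch for spotting that pattern, assuming the log format above; `sdown_instances` and `NO_VOTE_LOG` are hypothetical names.

```python
import re

# Matches e.g. "+sdown sentinel <id> 192.168.141.129 26381 @ mymaster ..."
SDOWN = re.compile(r"([+-])sdown (master|slave|sentinel) \S+ (\S+) (\d+)")


def sdown_instances(log_lines):
    """Return the set of (role, ip, port) currently flagged subjectively down."""
    down = set()
    for line in log_lines:
        m = SDOWN.search(line)
        if not m:
            continue
        sign, role, ip, port = m.groups()
        key = (role, ip, int(port))
        if sign == "+":
            down.add(key)
        else:
            # "-sdown" means the instance is reachable again.
            down.discard(key)
    return down


# Lines copied from the log above.
NO_VOTE_LOG = [
    "27333:X 10 Jun 22:07:56.622 # +sdown master mymaster 192.168.141.129 6380",
    "27333:X 10 Jun 22:07:58.201 # +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 192.168.141.129 6380",
    "27333:X 10 Jun 22:07:59.292 # +sdown sentinel f0622ba9d068b62c8cab6ceb70fc55f48e94c602 192.168.141.129 26381 @ mymaster 192.168.141.129 6380",
    "27333:X 10 Jun 22:08:00.485 # +sdown sentinel 4785172698bd33321bab5719faaadee10f41907c 192.168.141.129 26380 @ mymaster 192.168.141.129 6380",
]

down = sdown_instances(NO_VOTE_LOG)
unreachable_sentinels = sorted(port for role, _ip, port in down if role == "sentinel")
print(unreachable_sentinels)
```

If the list contains every other Sentinel, the problem is most likely connectivity between the Sentinels themselves rather than the master.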
Redis configuration file
6379.conf
Main contents:
bind 192.168.141.129 127.0.0.1
protected-mode yes
port 6379
Sentinel configuration files (incorrect):
26379.conf
port 26379
sentinel monitor mymaster 192.168.141.129 6379 2
26380.conf
port 26380
sentinel monitor mymaster 192.168.141.129 6379 2
26381.conf
port 26381
sentinel monitor mymaster 192.168.141.129 6379 2
Relevant commands:
Start the 6379 master:
redis-server 6379.conf
Start the 6380 server:
redis-server 6380.conf --slaveof 192.168.141.129 6379
or
redis-server 6380.conf --slaveof 127.0.0.1 6379
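Either command works when the Sentinels are configured correctly. If the Sentinels are misconfigured (no bind IP), the Sentinel logs differ between the two once the master becomes unreachable.
Start a Sentinel: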
redis-server 26379.conf --sentinel
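Correct configuration
Sentinel configuration file:
26379.conf
Add a bind line to the Sentinel configuration file, matching the one in the Redis configuration (6379.conf, etc.). The likely reason this matters: with no bind and no password, protected mode makes a Sentinel accept connections only from the loopback interface, so the Sentinels cannot reach each other at 192.168.141.129.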
port 26379
sentinel monitor mymaster 192.168.141.129 6379 2
bind 192.168.141.129 127.0.0.1
Alternatively, changing the IP in the monitor line to 127.0.0.1 also works:
port 26379
sentinel monitor mymaster 127.0.0.1 6379 2
bind 192.168.141.129 127.0.0.1
Apply the same change to 26380.conf and the other Sentinel files.
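The fix above comes down to one missing directive, which is easy to check before starting the Sentinels. A small Python sketch, assuming the configuration text has already been read into a string; `has_bind` and the two sample strings are hypothetical.

```python
def has_bind(conf_text):
    """Return True if the configuration text contains a bind directive."""
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.split()[0].lower() == "bind":
            return True
    return False


# The incorrect and corrected 26379.conf from above.
INCORRECT_CONF = """port 26379
sentinel monitor mymaster 192.168.141.129 6379 2
"""

CORRECT_CONF = """port 26379
sentinel monitor mymaster 192.168.141.129 6379 2
bind 192.168.141.129 127.0.0.1
"""

for name, text in (("incorrect", INCORRECT_CONF), ("correct", CORRECT_CONF)):
    print(name, "has bind:", has_bind(text))
```

In practice the text would come from reading each of 26379.conf, 26380.conf and 26381.conf; this sketch only checks that a bind line is present, not that its addresses are right for the host.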
Logs of a successful automatic failover
Sentinel 26379
[root@oracle test]# redis-server 26379.conf --sentinel
24588:X 10 Jun 21:26:29.624 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
24588:X 10 Jun 21:26:29.624 # Redis version=4.0.14, bits=64, commit=00000000, modified=0, pid=24588, just started
24588:X 10 Jun 21:26:29.624 # Configuration loaded
24588:X 10 Jun 21:26:29.625 * Increased maximum number of open files to 10032 (it was originally set to 1024).
_._
_.-``__ ''-._
_.-`` `. `_. ''-._ Redis 4.0.14 (00000000/0) 64 bit
.-`` .-```. ```\/ _.,_ ''-._
( ' , .-` | `, ) Running in sentinel mode
|`-._`-...-` __...-.``-._|'` _.-'| Port: 26379
| `-._ `._ / _.-' | PID: 24588
`-._ `-._ `-./ _.-' _.-'
|`-._`-._ `-.__.-' _.-'_.-'|
| `-._`-._ _.-'_.-' | http://redis.io
`-._ `-._`-.__.-'_.-' _.-'
|`-._`-._ `-.__.-' _.-'_.-'|
| `-._`-._ _.-'_.-' |
`-._ `-._`-.__.-'_.-' _.-'
`-._ `-.__.-' _.-'
`-._ _.-'
`-.__.-'
24588:X 10 Jun 21:26:29.626 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
24588:X 10 Jun 21:26:29.626 # Sentinel ID is 1a5a45d59dbddeaf1e30b63b27cee5605c0b7ed5
24588:X 10 Jun 21:26:29.626 # +monitor master mymaster 192.168.141.129 6379 quorum 2
24588:X 10 Jun 21:26:59.684 # +sdown sentinel a5bfebbed1ead357c6a285d33dab3fa2e3af5537 192.168.141.129 0 @ mymaster 192.168.141.129 6379
-------------------- After the master becomes unreachable: -------------------
24588:X 10 Jun 21:28:32.689 # +sdown master mymaster 192.168.141.129 6379
24588:X 10 Jun 21:28:32.736 # +new-epoch 6
24588:X 10 Jun 21:28:32.737 # +vote-for-leader a443f3621568e41b03d1a442a0d0ec02b5c7e6be 6
24588:X 10 Jun 21:28:32.790 # +odown master mymaster 192.168.141.129 6379 #quorum 4/2
24588:X 10 Jun 21:28:32.790 # Next failover delay: I will not start a failover before Wed Jun 10 21:34:32 2020
24588:X 10 Jun 21:28:33.458 # +config-update-from sentinel a443f3621568e41b03d1a442a0d0ec02b5c7e6be 192.168.141.129 26381 @ mymaster 192.168.141.129 6379
24588:X 10 Jun 21:28:33.458 # +switch-master mymaster 192.168.141.129 6379 192.168.141.129 6381
24588:X 10 Jun 21:28:33.458 * +slave slave 192.168.141.129:6380 192.168.141.129 6380 @ mymaster 192.168.141.129 6381
24588:X 10 Jun 21:28:33.458 * +slave slave 192.168.141.129:6379 192.168.141.129 6379 @ mymaster 192.168.141.129 6381
24588:X 10 Jun 21:29:03.491 # +sdown slave 192.168.141.129:6379 192.168.141.129 6379 @ mymaster 192.168.141.129 6381
Sentinel 26380
[root@oracle test]# redis-server 26380.conf --sentinel
24617:X 10 Jun 21:26:46.400 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
24617:X 10 Jun 21:26:46.400 # Redis version=4.0.14, bits=64, commit=00000000, modified=0, pid=24617, just started
24617:X 10 Jun 21:26:46.400 # Configuration loaded
24617:X 10 Jun 21:26:46.401 * Increased maximum number of open files to 10032 (it was originally set to 1024).
_._
_.-``__ ''-._
_.-`` `. `_. ''-._ Redis 4.0.14 (00000000/0) 64 bit
.-`` .-```. ```\/ _.,_ ''-._
( ' , .-` | `, ) Running in sentinel mode
|`-._`-...-` __...-.``-._|'` _.-'| Port: 26380
| `-._ `._ / _.-' | PID: 24617
`-._ `-._ `-./ _.-' _.-'
|`-._`-._ `-.__.-' _.-'_.-'|
| `-._`-._ _.-'_.-' | http://redis.io
`-._ `-._`-.__.-'_.-' _.-'
|`-._`-._ `-.__.-' _.-'_.-'|
| `-._`-._ _.-'_.-' |
`-._ `-._`-.__.-'_.-' _.-'
`-._ `-.__.-' _.-'
`-._ _.-'
`-.__.-'
24617:X 10 Jun 21:26:46.402 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
24617:X 10 Jun 21:26:46.402 # Sentinel ID is 0e85c75b573140a05addc38543f18cfbb3dc74f3
24617:X 10 Jun 21:26:46.402 # +monitor master mymaster 192.168.141.129 6379 quorum 2
24617:X 10 Jun 21:28:32.652 # +sdown master mymaster 192.168.141.129 6379
24617:X 10 Jun 21:28:32.736 # +new-epoch 6
24617:X 10 Jun 21:28:32.737 # +vote-for-leader a443f3621568e41b03d1a442a0d0ec02b5c7e6be 6
24617:X 10 Jun 21:28:33.458 # +config-update-from sentinel a443f3621568e41b03d1a442a0d0ec02b5c7e6be 192.168.141.129 26381 @ mymaster 192.168.141.129 6379
24617:X 10 Jun 21:28:33.458 # +switch-master mymaster 192.168.141.129 6379 192.168.141.129 6381
24617:X 10 Jun 21:28:33.458 * +slave slave 192.168.141.129:6380 192.168.141.129 6380 @ mymaster 192.168.141.129 6381
24617:X 10 Jun 21:28:33.458 * +slave slave 192.168.141.129:6379 192.168.141.129 6379 @ mymaster 192.168.141.129 6381
24617:X 10 Jun 21:29:03.472 # +sdown slave 192.168.141.129:6379 192.168.141.129 6379 @ mymaster 192.168.141.129 6381
Sentinel 26381
[root@oracle test]# redis-server 26381.conf --sentinel
24626:X 10 Jun 21:26:52.408 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
24626:X 10 Jun 21:26:52.408 # Redis version=4.0.14, bits=64, commit=00000000, modified=0, pid=24626, just started
24626:X 10 Jun 21:26:52.408 # Configuration loaded
24626:X 10 Jun 21:26:52.408 * Increased maximum number of open files to 10032 (it was originally set to 1024).
_._
_.-``__ ''-._
_.-`` `. `_. ''-._ Redis 4.0.14 (00000000/0) 64 bit
.-`` .-```. ```\/ _.,_ ''-._
( ' , .-` | `, ) Running in sentinel mode
|`-._`-...-` __...-.``-._|'` _.-'| Port: 26381
| `-._ `._ / _.-' | PID: 24626
`-._ `-._ `-./ _.-' _.-'
|`-._`-._ `-.__.-' _.-'_.-'|
| `-._`-._ _.-'_.-' | http://redis.io
`-._ `-._`-.__.-'_.-' _.-'
|`-._`-._ `-.__.-' _.-'_.-'|
| `-._`-._ _.-'_.-' |
`-._ `-._`-.__.-'_.-' _.-'
`-._ `-.__.-' _.-'
`-._ _.-'
`-.__.-'
24626:X 10 Jun 21:26:52.409 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
24626:X 10 Jun 21:26:52.410 # Sentinel ID is a443f3621568e41b03d1a442a0d0ec02b5c7e6be
24626:X 10 Jun 21:26:52.410 # +monitor master mymaster 192.168.141.129 6379 quorum 2
24626:X 10 Jun 21:28:32.674 # +sdown master mymaster 192.168.141.129 6379
24626:X 10 Jun 21:28:32.733 # +odown master mymaster 192.168.141.129 6379 #quorum 3/2
24626:X 10 Jun 21:28:32.733 # +new-epoch 6
24626:X 10 Jun 21:28:32.733 # +try-failover master mymaster 192.168.141.129 6379
24626:X 10 Jun 21:28:32.734 # +vote-for-leader a443f3621568e41b03d1a442a0d0ec02b5c7e6be 6
24626:X 10 Jun 21:28:32.737 # 1a5a45d59dbddeaf1e30b63b27cee5605c0b7ed5 voted for a443f3621568e41b03d1a442a0d0ec02b5c7e6be 6
24626:X 10 Jun 21:28:32.737 # 0e85c75b573140a05addc38543f18cfbb3dc74f3 voted for a443f3621568e41b03d1a442a0d0ec02b5c7e6be 6
24626:X 10 Jun 21:28:32.738 # e1949da6609ebbe9c123e4375179013495f2b230 voted for a443f3621568e41b03d1a442a0d0ec02b5c7e6be 6
24626:X 10 Jun 21:28:32.825 # +elected-leader master mymaster 192.168.141.129 6379
24626:X 10 Jun 21:28:32.825 # +failover-state-select-slave master mymaster 192.168.141.129 6379
24626:X 10 Jun 21:28:32.902 # +selected-slave slave 192.168.141.129:6381 192.168.141.129 6381 @ mymaster 192.168.141.129 6379
24626:X 10 Jun 21:28:32.902 * +failover-state-send-slaveof-noone slave 192.168.141.129:6381 192.168.141.129 6381 @ mymaster 192.168.141.129 6379
24626:X 10 Jun 21:28:32.986 * +failover-state-wait-promotion slave 192.168.141.129:6381 192.168.141.129 6381 @ mymaster 192.168.141.129 6379
24626:X 10 Jun 21:28:33.385 # +promoted-slave slave 192.168.141.129:6381 192.168.141.129 6381 @ mymaster 192.168.141.129 6379
24626:X 10 Jun 21:28:33.385 # +failover-state-reconf-slaves master mymaster 192.168.141.129 6379
24626:X 10 Jun 21:28:33.458 * +slave-reconf-sent slave 192.168.141.129:6380 192.168.141.129 6380 @ mymaster 192.168.141.129 6379
24626:X 10 Jun 21:28:33.826 # -odown master mymaster 192.168.141.129 6379
24626:X 10 Jun 21:28:34.441 * +slave-reconf-inprog slave 192.168.141.129:6380 192.168.141.129 6380 @ mymaster 192.168.141.129 6379
24626:X 10 Jun 21:28:34.441 * +slave-reconf-done slave 192.168.141.129:6380 192.168.141.129 6380 @ mymaster 192.168.141.129 6379
24626:X 10 Jun 21:28:34.518 # +failover-end master mymaster 192.168.141.129 6379
24626:X 10 Jun 21:28:34.518 # +switch-master mymaster 192.168.141.129 6379 192.168.141.129 6381
24626:X 10 Jun 21:28:34.518 * +slave slave 192.168.141.129:6380 192.168.141.129 6380 @ mymaster 192.168.141.129 6381
24626:X 10 Jun 21:28:34.518 * +slave slave 192.168.141.129:6379 192.168.141.129 6379 @ mymaster 192.168.141.129 6381
24626:X 10 Jun 21:29:04.533 # +sdown slave 192.168.141.129:6379 192.168.141.129 6379 @ mymaster 192.168.141.129 6381
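The leader's log (Sentinel 26381) also shows how long the failover took. A minimal Python sketch that pulls event timestamps out of such a log, assuming the line format above; `event_times` and `LEADER_LOG` are hypothetical names, the year 2020 is taken from the "Next failover delay" lines, and the `%b` month parsing assumes an English locale.

```python
import re
from datetime import datetime

# "<pid>:<role> <day> <month> <time> <#|*> <+event|-event> ..."
LINE = re.compile(r"^\d+:\w (\d+ \w+ \d+:\d+:\d+\.\d+) [#*] ([+-][\w-]+)")


def event_times(log_lines, year=2020):
    """Map each event name to the timestamp of its first occurrence."""
    events = {}
    for line in log_lines:
        m = LINE.match(line)
        if not m:
            continue
        stamp = datetime.strptime(f"{year} {m.group(1)}", "%Y %d %b %H:%M:%S.%f")
        events.setdefault(m.group(2), stamp)
    return events


# A few lines copied from the Sentinel 26381 log above.
LEADER_LOG = [
    "24626:X 10 Jun 21:28:32.733 # +try-failover master mymaster 192.168.141.129 6379",
    "24626:X 10 Jun 21:28:32.825 # +elected-leader master mymaster 192.168.141.129 6379",
    "24626:X 10 Jun 21:28:34.518 # +failover-end master mymaster 192.168.141.129 6379",
]

times = event_times(LEADER_LOG)
elapsed = (times["+failover-end"] - times["+try-failover"]).total_seconds()
print(f"failover took {elapsed:.3f} s")
```

For the three sample lines the difference between +try-failover (21:28:32.733) and +failover-end (21:28:34.518) is 1.785 s.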
Notes and caveats
After a Sentinel starts, it automatically rewrites its own configuration file. For example:
26379.conf before startup:
port 26379
sentinel monitor mymaster 127.0.0.1 6379 2
bind 192.168.141.129 127.0.0.1
After startup:
port 26379
bind 192.168.141.129 127.0.0.1
sentinel myid e7f398056cc17cdc1d1abf4ddaeff6485881c6e8
# Generated by CONFIG REWRITE
dir "/root/test"
sentinel deny-scripts-reconfig yes
sentinel monitor mymaster 192.168.141.129 6379 2
sentinel config-epoch mymaster 1
sentinel leader-epoch mymaster 1
sentinel known-slave mymaster 192.168.141.129 6380
sentinel known-slave mymaster 192.168.141.129 6381
sentinel known-sentinel mymaster 192.168.141.129 26380 33839d2c5fa9d5af14f89eaef58ac97bbc822523
sentinel known-sentinel mymaster 192.168.141.129 26381 8a3fa6cdd0736d8bc83ef8cfa1babb794f0001c7
sentinel current-epoch 1
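The rewritten file records what the Sentinel has discovered, so it can be used to check that each Sentinel really sees the other two. A short Python sketch, assuming the Redis 4.0 directive names shown above (`known-slave`, `known-sentinel`); `known_peers` and `REWRITTEN_CONF` are hypothetical names.

```python
def known_peers(conf_text):
    """Return (slaves, sentinels) as lists of (ip, port) from a Sentinel config."""
    slaves, sentinels = [], []
    for raw in conf_text.splitlines():
        parts = raw.split()
        # Expected shape: sentinel <directive> <master-name> <ip> <port> [<run-id>]
        if len(parts) < 5 or parts[0] != "sentinel":
            continue
        if parts[1] == "known-slave":
            slaves.append((parts[3], int(parts[4])))
        elif parts[1] == "known-sentinel":
            sentinels.append((parts[3], int(parts[4])))
    return slaves, sentinels


# Part of the rewritten 26379.conf from above.
REWRITTEN_CONF = """sentinel monitor mymaster 192.168.141.129 6379 2
sentinel known-slave mymaster 192.168.141.129 6380
sentinel known-slave mymaster 192.168.141.129 6381
sentinel known-sentinel mymaster 192.168.141.129 26380 33839d2c5fa9d5af14f89eaef58ac97bbc822523
sentinel known-sentinel mymaster 192.168.141.129 26381 8a3fa6cdd0736d8bc83ef8cfa1babb794f0001c7
"""

slaves, sentinels = known_peers(REWRITTEN_CONF)
print("slaves:", slaves)
print("sentinels:", sentinels)
```

With three Sentinels in total, each file should list the other two under known-sentinel; a shorter list suggests the Sentinels have not discovered each other.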
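If the Redis master (6379 in this example) goes down and is restarted after a new master (6380 in this example) has already been elected, the old master's configuration file 6379.conf is rewritten, and 6379 automatically becomes a slave of 6380 once it is back up: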
# Generated by CONFIG REWRITE
slaveof 192.168.141.129 6380
Restart log:
Note the CONFIG REWRITE operation in it.
30696:M 10 Jun 22:57:49.201 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
30696:M 10 Jun 22:57:49.201 # Server initialized
30696:M 10 Jun 22:57:49.201 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
30696:M 10 Jun 22:57:49.201 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
30696:M 10 Jun 22:57:49.201 * Reading RDB preamble from AOF file...
30696:M 10 Jun 22:57:49.201 * Reading the remaining AOF tail...
30696:M 10 Jun 22:57:49.201 * DB loaded from append only file: 0.000 seconds
30696:M 10 Jun 22:57:49.201 * Ready to accept connections
30696:S 10 Jun 22:57:59.252 * Before turning into a slave, using my master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
30696:S 10 Jun 22:57:59.252 * SLAVE OF 192.168.141.129:6380 enabled (user request from 'id=3 addr=192.168.141.129:52622 fd=9 name=sentinel-8a3fa6cd-cmd age=10 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=0 qbuf-free=32768 obl=36 oll=0 omem=0 events=r cmd=exec')
30696:S 10 Jun 22:57:59.253 # CONFIG REWRITE executed with success.
30696:S 10 Jun 22:57:59.285 * Connecting to MASTER 192.168.141.129:6380
30696:S 10 Jun 22:57:59.285 * MASTER <-> SLAVE sync started
30696:S 10 Jun 22:57:59.285 * Non blocking connect for SYNC fired the event.
30696:S 10 Jun 22:57:59.286 * Master replied to PING, replication can continue...
30696:S 10 Jun 22:57:59.286 * Trying a partial resynchronization (request 71f7c705e36cab2d3194888520c23278f331b63e:1).
30696:S 10 Jun 22:57:59.287 * Full resync from master: c4183d3ac691fcd9fb4831d2aaba82321a2fc8b6:157380
30696:S 10 Jun 22:57:59.287 * Discarding previously cached master state.
30696:S 10 Jun 22:57:59.320 * MASTER <-> SLAVE sync: receiving 198 bytes from master
30696:S 10 Jun 22:57:59.320 * MASTER <-> SLAVE sync: Flushing old data
30696:S 10 Jun 22:57:59.321 * MASTER <-> SLAVE sync: Loading DB in memory
30696:S 10 Jun 22:57:59.321 * MASTER <-> SLAVE sync: Finished with success
30696:S 10 Jun 22:57:59.321 * Background append only file rewriting started by pid 30709
30696:S 10 Jun 22:57:59.371 * AOF rewrite child asks to stop sending diffs.
30709:C 10 Jun 22:57:59.371 * Parent agreed to stop sending diffs. Finalizing AOF...
30709:C 10 Jun 22:57:59.371 * Concatenating 0.00 MB of AOF diff received from parent.
30709:C 10 Jun 22:57:59.371 * SYNC append only file rewrite performed
30709:C 10 Jun 22:57:59.372 * AOF rewrite: 0 MB of memory used by copy-on-write
30696:S 10 Jun 22:57:59.386 * Background AOF rewrite terminated with success
30696:S 10 Jun 22:57:59.386 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
30696:S 10 Jun 22:57:59.386 * Background AOF rewrite finished successfully
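After the Redis master goes down, the Sentinels automatically vote in a new master after about 30 s (the default down-after-milliseconds is 30000), and the Sentinel configuration file is rewritten once more: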
port 26379
bind 192.168.141.129 127.0.0.1
sentinel myid e7f398056cc17cdc1d1abf4ddaeff6485881c6e8
# Generated by CONFIG REWRITE
dir "/root/test"
sentinel deny-scripts-reconfig yes
sentinel monitor mymaster 192.168.141.129 6380 2
sentinel config-epoch mymaster 1
sentinel leader-epoch mymaster 1
sentinel known-slave mymaster 127.0.0.1 6379
sentinel known-slave mymaster 192.168.141.129 6381
sentinel known-sentinel mymaster 192.168.141.129 26380 33839d2c5fa9d5af14f89eaef58ac97bbc822523
sentinel known-sentinel mymaster 192.168.141.129 26381 8a3fa6cdd0736d8bc83ef8cfa1babb794f0001c7
sentinel current-epoch 1
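So, if the original master 6379 should become the master again after it restarts, editing 6379.conf (removing the final slaveof line) is not enough; the Sentinel configuration files must be edited as well. Only after both the Redis servers and the Sentinels are restarted can 6379 return to being the master.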
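The 6379.conf half of that cleanup can be sketched in Python as follows. This only removes the slaveof directive from configuration text; the Sentinel files still need their own edits, which are not covered here. The names `drop_slaveof` and `REWRITTEN_6379` are hypothetical.

```python
def drop_slaveof(conf_text):
    """Return the configuration text without any slaveof directive lines."""
    kept = []
    for line in conf_text.splitlines():
        words = line.split()
        if words and words[0].lower() == "slaveof":
            continue  # drop the directive added by CONFIG REWRITE
        kept.append(line)
    return "\n".join(kept) + "\n"


# Tail of 6379.conf after the failover, as shown above (port line added for context).
REWRITTEN_6379 = """port 6379
# Generated by CONFIG REWRITE
slaveof 192.168.141.129 6380
"""

print(drop_slaveof(REWRITTEN_6379), end="")
```

The result should be written back only while the 6379 server is stopped, since a running server may rewrite the file again.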