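After a keyspace has been created, its replication factor can still be changed. In other words, the number of copies kept of the data in a keyspace can be modified dynamically.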

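The Cassandra system keyspace system_auth has a default replication factor of 1, which means it has no redundancy at all. If the only node holding that data goes down, you can no longer log in to the cluster.
The official documentation therefore recommends setting its replication factor to the number of nodes in each datacenter, that is, replicating it to every node in the cluster.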

Check the replication factor of system_auth:

cqlsh> DESC KEYSPACE system_auth
CREATE KEYSPACE system_auth WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'} AND durable_writes = true;
...

Sure enough, the replication factor is only 1. Change it:

cqlsh> ALTER KEYSPACE system_auth WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1':2,'dc2':2};
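
With more datacenters, the statement can be generated rather than typed by hand. The following is a minimal sketch, not part of the original post: `build_alter_keyspace` is a hypothetical helper name, and the datacenter names `dc1` and `dc2` are simply the ones used in the statement above.

```shell
# Print an ALTER KEYSPACE statement that switches a keyspace to
# NetworkTopologyStrategy. Arguments after the keyspace name are DC=RF pairs.
# The function body runs in a subshell, so its variables stay local.
build_alter_keyspace() (
    keyspace=$1
    shift
    opts="'class': 'NetworkTopologyStrategy'"
    for pair in "$@"; do
        # ${pair%%=*} is the datacenter name, ${pair#*=} the replication factor
        opts="$opts, '${pair%%=*}': ${pair#*=}"
    done
    printf 'ALTER KEYSPACE %s WITH replication = {%s};\n' "$keyspace" "$opts"
)

build_alter_keyspace system_auth dc1=2 dc2=2
```

The output can be passed to cqlsh with its `-e` option. Raising the replication factor does not copy existing data by itself; running `nodetool repair system_auth` on the nodes is the usual follow-up step.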

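The durable_writes option controls whether writes go to the commit log. If it is set to false, write requests bypass the commit log and data may be lost.
The option defaults to true, meaning the commit log is written. Production systems should leave it set to true.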


===
[erq]

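By default, every xterm terminal window has the same title. I often have many terminal windows open, and although the command prompt is already customized, it would be even better if the title bar also reflected the current state of each xterm.

Changing the title bar

The window title can be changed with xterm escape sequences: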

  • ESC]0;stringBEL: set icon name and window title to string (the icon name is shown when the window is minimized)
  • ESC]1;stringBEL: set icon name to string
  • ESC]2;stringBEL: set window title to string

where ESC is the escape character (\033), and BEL is the bell character (\007).

So add the following line to .bashrc to make the xterm title bar show the current user name, host name, and current directory:

PROMPT_COMMAND='echo -ne "\033]0;${USER}@${HOSTNAME}[`basename ${PWD}`]\007"'
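
The same idea can be written with `printf`, which avoids relying on how `echo -ne` treats escapes. This is a sketch under assumptions, not the original configuration: the function names `xterm_title_seq` and `set_xterm_title` are made up for illustration, and `${PWD##*/}` stands in for `basename` (it yields an empty string in `/`, where `basename` would print `/`).

```shell
# Emit the OSC 0 sequence: ESC ] 0 ; <text> BEL sets icon name and window title.
xterm_title_seq() {
    printf '\033]0;%s\007' "$1"
}

# Hypothetical helper: title of the form user@host[current-directory-name].
set_xterm_title() {
    xterm_title_seq "${USER}@${HOSTNAME}[${PWD##*/}]"
}

# In ~/.bashrc, bash runs this before drawing each prompt.
PROMPT_COMMAND=set_xterm_title
```

Keeping the sequence in its own function also makes it usable for one-off titles, for example `xterm_title_seq "build box"`.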

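Changing the title bar over ssh

After logging in over ssh, the xterm window should show the remote host, the current user on that host, and the current path. It is enough to put the same line in the bashrc file on the remote host.

Vim title bar in the terminal

When Vim runs in a terminal, it changes the xterm title bar by default, but the title carries no host or user information. Add the following to ~/.vimrc: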

let &titlestring=$USER."@".hostname().": %t%M(%F)"
set title

For the meaning of the other format items, see:

:help titlestring
:help statusline


===
[erq]

# bash
set -o vi
PS1='${debian_chroot:+($debian_chroot)}\[\033[00;31m\]\u@\h\[\033[00m\]:\[\033[00;33m\]\w\[\033[00m\]\$ \[\033[00;32m\]'

# bash history
export HISTCONTROL=ignoredups
shopt -s histappend

export PATH=$HOME/bin:/sbin:/usr/sbin:/usr/local/sbin:/opt/bin:$PATH

# xterm
if [ "$TERM" == "xterm" ]; then
export TERM=xterm-256color
fi

# mac os x
if [ `uname` == "Darwin" ]; then
# directories first: total line, then directories, then everything else
alias ll='ls -lh | grep ^total && ls -lh | grep ^d && ls -lh | grep -v ^d | grep -v ^total'
fi

# freebsd
if [ `uname` == "FreeBSD" ]; then
# gnuls
alias ls='gnuls --color=auto --show-control-chars'
fi


# linux
if [ `uname` == "Linux" ]; then
alias ll='ls -lh --group-directories-first'
alias update='sudo apt-get update && sudo apt-get dist-upgrade -y && sudo apt-get autoremove -y'
fi

# general
alias la='ls -a'
alias ccd='clear;cd'
alias :q='exit'

# mail
export MAIL=$HOME/Maildir

#oracle
alias sqlplus='rlwrap sqlplus'
alias rman='rlwrap rman'

#export ORACLE_SID=
#export ORACLE_BASE=/u01/app/oracle
#export ORACLE_HOME=$ORACLE_BASE/product/10.2.0/db_1
#export PATH=$ORACLE_HOME/bin:$PATH
#export TNS_ADMIN=$ORACLE_HOME/network/admin
#export SQLPATH=$ORACLE_BASE/scripts

===
[erq]

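pgpool-II is middleware that sits between PostgreSQL servers and PostgreSQL database clients. It provides connection pooling, replication, load balancing, parallel query, and handling of connections beyond the server's connection limit.
Git repository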

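Postgres-XC (eXtensible Cluster) is a multi-master write-scalable PostgreSQL cluster based on shared-nothing architecture.

Its main features include write scalability, synchronous multi-master operation, and transparency to applications, which use it just like a conventional PostgreSQL server. However, it has no sharding mechanism; it is only a symmetric multi-master cluster that improves write performance and availability.

Postgres-XL is a general-purpose, ACID-compliant, open-source SQL database solution that scales out horizontally with ease and handles write-heavy OLTP workloads well. It is derived from PostgreSQL.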

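shutdown immediate sometimes hangs for a long time, usually because it is waiting for certain processes to exit.
Do not reach for shutdown abort lightly. After a shutdown abort, the next startup has to perform instance recovery, which is where problems tend to appear.

Reportedly, startup force aborts the running database and then starts it up again normally. I have not tried it, and it is probably best not to.

The best approach is still to find the processes being waited on, kill them, and then run shutdown immediate again.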
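
One common way to find candidate processes is to look for the server processes of remote client connections. The sketch below assumes a Unix host running dedicated server processes, whose command lines contain `(LOCAL=NO)`; neither assumption is stated in the text above, and `list_remote_server_pids` is a hypothetical helper name.

```shell
# Read `ps -ef` output on stdin and print the PIDs (column 2) of processes
# whose command line contains "(LOCAL=NO)", i.e. remote client connections.
list_remote_server_pids() {
    awk '/\(LOCAL=NO\)/ { print $2 }'
}
```

Typical use is `ps -ef | list_remote_server_pids`. Review the list before killing anything, since the filter cannot tell which session is actually blocking the shutdown.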


===
[erq]

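Data Guard (DG) had already been configured in maximum availability mode (maximize availability), so an unreachable physical standby should not have affected the primary database.
However, one physical fiber link failed and was dropping packets heavily. The primary timed out while shipping archived logs over that link and could not archive, so it slowed to a crawl and eventually became inaccessible.

The alert_orcl.log file contains the following errors: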

Fri Jul 17 00:34:32 2015
ORA-16198: LGWR received timedout error from KSR
LGWR: Attempting destination LOG_ARCHIVE_DEST_3 network reconnect (16198)
LGWR: Destination LOG_ARCHIVE_DEST_3 network reconnect abandoned
Fri Jul 17 00:34:32 2015
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_lgwr_1568.trc:
ORA-16198: Timeout incurred on internal channel during remote archival

LGWR: Network asynch I/O wait error 16198 log 1 service 'db_feich'
Fri Jul 17 00:34:32 2015
Destination LOG_ARCHIVE_DEST_3 is UNSYNCHRONIZED
LGWR: Failed to archive log 1 thread 1 sequence 4526 (16198)
Fri Jul 17 00:34:32 2015
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_lgwr_1568.trc:
ORA-16198: Timeout incurred on internal channel during remote archival

LGWR: Error 16198 closing archivelog file 'db_feich'
LGWR: Error 16198 disconnecting from destination LOG_ARCHIVE_DEST_3 standby host 'db_feich'
Fri Jul 17 00:34:42 2015
LGWR: Standby redo logfile selected to archive thread 1 sequence 4527
LGWR: Standby redo logfile selected for thread 1 sequence 4527 for destination LOG_ARCHIVE_DEST_2
Fri Jul 17 00:34:43 2015
Thread 1 advanced to log sequence 4527 (LGWR switch)
Current log# 2 seq# 4527 mem# 0: E:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO02.LOG
Fri Jul 17 00:37:18 2015
LGWR: Standby redo logfile selected to archive thread 1 sequence 4528
LGWR: Standby redo logfile selected for thread 1 sequence 4528 for destination LOG_ARCHIVE_DEST_2
Fri Jul 17 00:37:18 2015
Thread 1 advanced to log sequence 4528 (LGWR switch)
Current log# 3 seq# 4528 mem# 0: E:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO03.LOG
Fri Jul 17 00:40:00 2015
LGWR: Standby redo logfile selected to archive thread 1 sequence 4529
LGWR: Standby redo logfile selected for thread 1 sequence 4529 for destination LOG_ARCHIVE_DEST_2
Fri Jul 17 00:40:00 2015
Thread 1 advanced to log sequence 4529 (LGWR switch)
Current log# 1 seq# 4529 mem# 0: E:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO01.LOG
Fri Jul 17 00:40:05 2015
ARC1: LGWR is actively archiving destination LOG_ARCHIVE_DEST_3
ARC1: Standby redo logfile selected for thread 1 sequence 4528 for destination LOG_ARCHIVE_DEST_3
Fri Jul 17 00:40:06 2015
Thread 1 cannot allocate new log, sequence 4530
Checkpoint not complete
Current log# 1 seq# 4529 mem# 0: E:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO01.LOG
LNSc started with pid=88, OS id=3680
Fri Jul 17 00:40:16 2015
LGWR: Standby redo logfile selected to archive thread 1 sequence 4530
LGWR: Standby redo logfile selected for thread 1 sequence 4530 for destination LOG_ARCHIVE_DEST_3
LGWR: Standby redo logfile selected to archive thread 1 sequence 4530
LGWR: Standby redo logfile selected for thread 1 sequence 4530 for destination LOG_ARCHIVE_DEST_2
Fri Jul 17 00:40:21 2015
Thread 1 advanced to log sequence 4530 (LGWR switch)
Current log# 2 seq# 4530 mem# 0: E:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO02.LOG
Fri Jul 17 00:40:21 2015
ARC0: LGWR is actively archiving destination LOG_ARCHIVE_DEST_3
Fri Jul 17 00:40:44 2015
ARC0: Standby redo logfile selected for thread 1 sequence 4529 for destination LOG_ARCHIVE_DEST_3
Fri Jul 17 00:48:36 2015
ORA-16198: LGWR received timedout error from KSR
LGWR: Attempting destination LOG_ARCHIVE_DEST_3 network reconnect (16198)
LGWR: Destination LOG_ARCHIVE_DEST_3 network reconnect abandoned
Fri Jul 17 00:48:36 2015
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_lgwr_1568.trc:
ORA-16198: Timeout incurred on internal channel during remote archival

LGWR: Network asynch I/O wait error 16198 log 2 service 'db_feich'
Fri Jul 17 00:50:58 2015
LGWR: Failed to archive log 2 thread 1 sequence 4530 (16198)
Fri Jul 17 00:50:58 2015
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_lgwr_1568.trc:
ORA-16198: Timeout incurred on internal channel during remote archival

LGWR: Error 16198 closing archivelog file 'db_feich'
LGWR: Standby redo logfile selected to archive thread 1 sequence 4531
LGWR: Standby redo logfile selected for thread 1 sequence 4531 for destination LOG_ARCHIVE_DEST_2
Fri Jul 17 00:51:07 2015
Thread 1 advanced to log sequence 4531 (LGWR switch)
Current log# 1 seq# 4531 mem# 0: E:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO01.LOG
Fri Jul 17 00:54:12 2015
Thread 1 cannot allocate new log, sequence 4532
All online logs needed archiving
Current log# 1 seq# 4531 mem# 0: E:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO01.LOG
Fri Jul 17 00:56:10 2015
LGWR: Standby redo logfile selected to archive thread 1 sequence 4532
LGWR: Standby redo logfile selected for thread 1 sequence 4532 for destination LOG_ARCHIVE_DEST_2
Fri Jul 17 00:56:10 2015
Thread 1 advanced to log sequence 4532 (LGWR switch)
Current log# 2 seq# 4532 mem# 0: E:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO02.LOG
Fri Jul 17 00:56:10 2015
ARC0: LGWR is actively archiving destination LOG_ARCHIVE_DEST_3
ARC0: Standby redo logfile selected for thread 1 sequence 4531 for destination LOG_ARCHIVE_DEST_3
Fri Jul 17 00:58:53 2015
ARC1: Attempting destination LOG_ARCHIVE_DEST_3 network reconnect (12571)
ARC1: Destination LOG_ARCHIVE_DEST_3 network reconnect abandoned
Fri Jul 17 00:58:53 2015
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_arc1_2584.trc:
ORA-12571: TNS:packet writer failure

ARC1: I/O error 12571 archiving log 3 to 'db_feich'
Fri Jul 17 00:58:54 2015
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_arc1_2584.trc:
ORA-01041: internal error. hostdef extension doesn't exist

ARC1: Error 1041 Closing archive log file 'db_feich'
LNSc started with pid=46, OS id=3000
Fri Jul 17 00:59:30 2015
LGWR: Standby redo logfile selected to archive thread 1 sequence 4533
LGWR: Standby redo logfile selected for thread 1 sequence 4533 for destination LOG_ARCHIVE_DEST_3
LGWR: Standby redo logfile selected to archive thread 1 sequence 4533
LGWR: Standby redo logfile selected for thread 1 sequence 4533 for destination LOG_ARCHIVE_DEST_2
Fri Jul 17 00:59:35 2015
Thread 1 advanced to log sequence 4533 (LGWR switch)
Current log# 3 seq# 4533 mem# 0: E:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO03.LOG
Fri Jul 17 01:05:13 2015
ORA-16198: LGWR received timedout error from KSR
LGWR: Attempting destination LOG_ARCHIVE_DEST_3 network reconnect (16198)
LGWR: Destination LOG_ARCHIVE_DEST_3 network reconnect abandoned
Fri Jul 17 01:05:13 2015
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_lgwr_1568.trc:
ORA-16198: Timeout incurred on internal channel during remote archival

LGWR: Network asynch I/O wait error 16198 log 3 service 'db_feich'
Fri Jul 17 01:07:52 2015
Thread 1 cannot allocate new log, sequence 4534
All online logs needed archiving
Current log# 3 seq# 4533 mem# 0: E:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO03.LOG
Fri Jul 17 01:23:03 2015
ARC1: Attempting destination LOG_ARCHIVE_DEST_3 network reconnect (12571)
ARC1: Destination LOG_ARCHIVE_DEST_3 network reconnect abandoned
Fri Jul 17 01:23:03 2015
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_arc1_2584.trc:
ORA-12571: TNS:packet writer failure

FAL[server, ARC1]: FAL archive failed, see trace file.
Fri Jul 17 01:23:03 2015
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_arc1_2584.trc:
ORA-16055: FAL request rejected

ARCH: FAL archive failed. Archiver continuing
Fri Jul 17 02:01:41 2015
>>> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! pid=128
System State dumped to trace file e:\oracle\product\10.2.0\admin\orcl\udump\orcl_ora_2960.trc
Fri Jul 17 02:05:22 2015
ARC1: Attempting destination LOG_ARCHIVE_DEST_3 network reconnect (12571)
ARC1: Destination LOG_ARCHIVE_DEST_3 network reconnect abandoned
Fri Jul 17 02:05:22 2015
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_arc1_2584.trc:
ORA-12571: TNS:packet writer failure

FAL[server, ARC1]: FAL archive failed, see trace file.
Fri Jul 17 02:05:22 2015
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_arc1_2584.trc:
ORA-16055: FAL request rejected

ARCH: FAL archive failed. Archiver continuing
Fri Jul 17 02:05:25 2015
LGWR: Failed to archive log 3 thread 1 sequence 4533 (16198)
Fri Jul 17 02:05:25 2015
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_lgwr_1568.trc:
ORA-16198: Timeout incurred on internal channel during remote archival

LGWR: Error 16198 closing archivelog file 'db_feich'
LGWR: Error 16198 disconnecting from destination LOG_ARCHIVE_DEST_3 standby host 'db_feich'
Fri Jul 17 02:05:36 2015
LGWR: Standby redo logfile selected to archive thread 1 sequence 4534
LGWR: Standby redo logfile selected for thread 1 sequence 4534 for destination LOG_ARCHIVE_DEST_2
Fri Jul 17 02:05:36 2015
Thread 1 advanced to log sequence 4534 (LGWR switch)
Current log# 2 seq# 4534 mem# 0: E:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO02.LOG
Fri Jul 17 02:07:39 2015
Thread 1 cannot allocate new log, sequence 4535
All online logs needed archiving
Current log# 2 seq# 4534 mem# 0: E:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO02.LOG
LNSc started with pid=44, OS id=3356
Fri Jul 17 02:11:42 2015
Error 12637 received logging on to the standby
Fri Jul 17 02:11:42 2015
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_arc1_2584.trc:
ORA-12637: Packet receive failed

PING[ARC1]: Heartbeat failed to connect to standby 'db_feich'. Error is 12637.
Fri Jul 17 02:12:02 2015
Error 28547 received logging on to the standby
Fri Jul 17 02:12:06 2015
LGWR: Error 28547 creating archivelog file 'db_feich'
Fri Jul 17 02:12:06 2015
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_lgwr_1568.trc:
ORA-28547: connection to server failed, probable Oracle Net admin error

LGWR: Standby redo logfile selected to archive thread 1 sequence 4535
LGWR: Standby redo logfile selected for thread 1 sequence 4535 for destination LOG_ARCHIVE_DEST_2
Fri Jul 17 02:12:11 2015
Thread 1 advanced to log sequence 4535 (LGWR switch)
Current log# 3 seq# 4535 mem# 0: E:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO03.LOG
Fri Jul 17 02:14:11 2015
Thread 1 cannot allocate new log, sequence 4536
All online logs needed archiving
Current log# 3 seq# 4535 mem# 0: E:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO03.LOG
Fri Jul 17 02:17:20 2015
LGWR: Failed to archive log 3 thread 1 sequence 4535 (28547)
LNSc started with pid=46, OS id=4036
Fri Jul 17 02:17:33 2015
Error 12170 received logging on to the standby
Fri Jul 17 02:17:33 2015
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_arc1_2584.trc:
ORA-12170: TNS:Connect timeout occurred

PING[ARC1]: Heartbeat failed to connect to standby 'db_feich'. Error is 12170.
Fri Jul 17 02:17:44 2015
Error 12170 received logging on to the standby
Fri Jul 17 02:17:48 2015
LGWR: Error 12170 creating archivelog file 'db_feich'
Fri Jul 17 02:17:48 2015
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_lgwr_1568.trc:
ORA-12170: TNS:Connect timeout occurred

Fri Jul 17 02:17:53 2015
ARC0: Attempting destination LOG_ARCHIVE_DEST_3 network reconnect (12571)
ARC0: Destination LOG_ARCHIVE_DEST_3 network reconnect abandoned
Fri Jul 17 02:17:53 2015
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_arc0_2784.trc:
ORA-12571: TNS:packet writer failure

ARC0: I/O error 12571 archiving log 1 to 'db_feich'
Fri Jul 17 02:17:53 2015
LGWR: Standby redo logfile selected to archive thread 1 sequence 4536
LGWR: Standby redo logfile selected for thread 1 sequence 4536 for destination LOG_ARCHIVE_DEST_2
Fri Jul 17 02:17:53 2015
Thread 1 advanced to log sequence 4536 (LGWR switch)
Current log# 2 seq# 4536 mem# 0: E:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO02.LOG
Fri Jul 17 02:17:57 2015
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_arc0_2784.trc:
ORA-01041: internal error. hostdef extension doesn't exist

Fri Jul 17 02:17:57 2015
ARC0: Error 1041 Closing archive log file 'db_feich'
Fri Jul 17 02:19:45 2015
LGWR: Failed to archive log 2 thread 1 sequence 4536 (12170)
LGWR: Standby redo logfile selected to archive thread 1 sequence 4537
LGWR: Standby redo logfile selected for thread 1 sequence 4537 for destination LOG_ARCHIVE_DEST_2
Fri Jul 17 02:19:51 2015
Thread 1 advanced to log sequence 4537 (LGWR switch)
Current log# 1 seq# 4537 mem# 0: E:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO01.LOG
Fri Jul 17 02:21:23 2015
LGWR: Standby redo logfile selected to archive thread 1 sequence 4538
LGWR: Standby redo logfile selected for thread 1 sequence 4538 for destination LOG_ARCHIVE_DEST_2
Fri Jul 17 02:21:23 2015
Thread 1 advanced to log sequence 4538 (LGWR switch)
Current log# 3 seq# 4538 mem# 0: E:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO03.LOG
Fri Jul 17 02:22:55 2015
LGWR: Standby redo logfile selected to archive thread 1 sequence 4539
LGWR: Standby redo logfile selected for thread 1 sequence 4539 for destination LOG_ARCHIVE_DEST_2
Fri Jul 17 02:22:55 2015
Thread 1 advanced to log sequence 4539 (LGWR switch)
Current log# 2 seq# 4539 mem# 0: E:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO02.LOG
Fri Jul 17 02:22:55 2015
ARC0: LGWR is actively archiving destination LOG_ARCHIVE_DEST_3
Fri Jul 17 02:23:16 2015
Error 12170 received logging on to the standby
Fri Jul 17 02:23:16 2015
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_arc0_2784.trc:
ORA-12170: TNS:Connect timeout occurred

ARC0: Error 12170 Creating archive log file to 'db_feich'
Fri Jul 17 05:22:08 2015
LGWR: Standby redo logfile selected to archive thread 1 sequence 4540
LGWR: Standby redo logfile selected for thread 1 sequence 4540 for destination LOG_ARCHIVE_DEST_2
Fri Jul 17 05:22:08 2015
Thread 1 advanced to log sequence 4540 (LGWR switch)
Current log# 1 seq# 4540 mem# 0: E:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO01.LOG
Fri Jul 17 05:23:19 2015
Suppressing further error logging of LOG_ARCHIVE_DEST_3.
Fri Jul 17 05:23:40 2015
Error 12170 received logging on to the standby
Suppressing further error logging of LOG_ARCHIVE_DEST_3.

These errors started appearing shortly after midnight.
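
With an alert log this long, a quick tally of the ORA- codes helps show which error dominates. The following is a minimal sketch rather than anything from the original troubleshooting session; `count_ora_errors` is a hypothetical helper name, and it assumes a `grep` that supports `-o` (GNU and BSD grep both do).

```shell
# Read an alert log on stdin and print "<ORA code> <count>", most frequent first.
count_ora_errors() {
    grep -o 'ORA-[0-9][0-9]*' |
        sort | uniq -c |
        awk '{ print $2, $1 }' |
        sort -k2,2nr -k1,1
}
```

It would be run as `count_ora_errors < alert_orcl.log`. Codes with equal counts are listed in alphabetical order.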
The trace file contains the following errors:

*** 2015-07-17 00:34:32.348 57480 kcrr.c
Making upinblc request to LNSc (ocis 0x000000001677E828). Begin time is <07/17/2015 00:34:32> and NET_TIMEOUT is <180> seconds
NetServer pid:2832
LGWR had received a timeout previously. return timeout again
LGWR had received a timeout previously. return timeout again
Error 16198 closing standby archive log file at host 'db_feich'
ORA-16198: Timeout incurred on internal channel during remote archival
Archive destination LOG_ARCHIVE_DEST_3 made inactive: File close error
*** 2015-07-17 00:34:32.351 62692 kcrr.c
LGWR: Error 16198 closing archivelog file 'db_feich'
LGWR had received a timeout previously. return timeout again
Error 16198 detaching RFS from standby instance at host 'db_feich'
*** 2015-07-17 00:34:32.392 57269 kcrr.c
Making upidhs request to LNSc (ocis 0x000000001677E828). Begin time is <07/17/2015 00:34:32> and NET_TIMEOUT <180> seconds
NetServer pid:2832
*** 2015-07-17 00:34:36.392
*** 2015-07-17 00:34:36.392 57391 kcrr.c
Cleaninup up LNS12 [pid 2832] after network detach
*** 2015-07-17 00:34:36.397 54392 kcrr.c
LNSc [pid 2832] receiving termination signal..
.... killed successfully
.. pmon posted for async lns cleanup
*** 2015-07-17 00:34:37.597 62692 kcrr.c
LGWR: Error 16198 disconnecting from destination LOG_ARCHIVE_DEST_3 standby host 'db_feich'
Ignoring krslcmp() detach error 16198
Receiving message from LNSb
*** 2015-07-17 00:34:42.863 57625 kcrr.c
Making upinbls request to LNSb (ocis 0x000000001677E580). Begin time is <07/17/2015 00:34:37> and NET_TIMEOUT is <180> seconds
NetServer pid:3584
*** 2015-07-17 00:37:13.051
*** 2015-07-17 00:37:13.051 57480 kcrr.c
Making upinblc request to LNSb (ocis 0x000000001677E580). Begin time is <07/17/2015 00:37:13> and NET_TIMEOUT is <180> seconds
NetServer pid:3584
Receiving message from LNSb
Receiving message from LNSb
*** 2015-07-17 00:37:18.509 57625 kcrr.c
Making upinbls request to LNSb (ocis 0x000000001677E580). Begin time is <07/17/2015 00:37:13> and NET_TIMEOUT is <180> seconds
NetServer pid:3584
*** 2015-07-17 00:39:54.545
*** 2015-07-17 00:39:54.545 57480 kcrr.c
Making upinblc request to LNSb (ocis 0x000000001677E580). Begin time is <07/17/2015 00:39:54> and NET_TIMEOUT is <180> seconds
NetServer pid:3584
Receiving message from LNSb
Receiving message from LNSb
*** 2015-07-17 00:40:00.802 57625 kcrr.c
Making upinbls request to LNSb (ocis 0x000000001677E580). Begin time is <07/17/2015 00:39:54> and NET_TIMEOUT is <180> seconds
NetServer pid:3584
*** 2015-07-17 00:40:12.967
*** 2015-07-17 00:40:12.967 57480 kcrr.c
Making upinblc request to LNSb (ocis 0x000000001677E580). Begin time is <07/17/2015 00:40:12> and NET_TIMEOUT is <180> seconds
NetServer pid:3584
Receiving message from LNSb
*** 2015-07-17 00:40:13.066 55452 kcrr.c
Initializing NetServer[LNSc] for dest=db_feich mode SYNC
LNSc is not running anymore.
New SYNC LNSc needs to be started
Waiting for subscriber count on LGWR-LNSc channel to go to zero
Subscriber count went to zero - time now is <07/17/2015 00:40:13>
Starting LNSc ...
Waiting for LNSc to initialize itself
*** 2015-07-17 00:40:16.083 55743 kcrr.c
Netserver LNSc [pid 3680] for mode SYNC has been initialized
Performing a channel reset to ignore previous responses
Successfully started LNSc [pid 3680] for dest db_feich mode SYNC ocis=0x000000001677E828
*** 2015-07-17 00:40:16.083 56246 kcrr.c
Making upiahm request to LNSc [pid 3680]: Begin Time is <07/17/2015 00:40:13>. NET_TIMEOUT = <180> seconds
Waiting for LNSc to respond to upiahm
*** 2015-07-17 00:40:16.156 56410 kcrr.c
upiahm connect done status is 0
Receiving message from LNSc
Receiving message from LNSc
Receiving message from LNSc
*** 2015-07-17 00:40:16.508 57625 kcrr.c
Making upinbls request to LNSc (ocis 0x000000001677E828). Begin time is <07/17/2015 00:40:13> and NET_TIMEOUT is <180> seconds
NetServer pid:3680
Receiving message from LNSb
*** 2015-07-17 00:40:21.818 57625 kcrr.c
Making upinbls request to LNSb (ocis 0x000000001677E580). Begin time is <07/17/2015 00:40:16> and NET_TIMEOUT is <180> seconds
NetServer pid:3584
*** 2015-07-17 00:40:42.129
LGWR found LNSc alive.. waiting for msg
*** 2015-07-17 00:40:52.140
LGWR found LNSc alive.. waiting for msg
*** 2015-07-17 00:41:03.356
LGWR found LNSc alive.. waiting for msg
*** 2015-07-17 00:41:18.972

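According to the official documentation, when the standby is unavailable in maximum availability mode, Data Guard should automatically fall back to maximum performance behavior. That did not happen this time, and the primary was affected.
Since there was no timetable for repairing the physical link, the immediate priority was to bypass or disable the failed standby.

First, try downgrading Data Guard to maximum performance mode: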
[sql]
SQL>alter database set standby database to maximize performance;
[/sql]

This had no effect.

Disable the failed standby:
[sql]
SQL>alter system set log_archive_dest_state_3 = defer;
SQL>shutdown immediate
SQL>startup
[/sql]
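The problem was resolved, although the database had to be restarted.

The LOG_ARCHIVE_DEST_STATE_n parameter accepts the values alternate, reset, defer, and enable. Their meanings are as follows:

  • alternate
    Alternate (fallback) destination. It is used only when another archive destination fails.

  • defer
    The configuration is kept, but the destination is excluded from archiving until it is re-enabled.

  • enable
    Enabled. This is the default.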
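
Because a mistyped state or destination number is easy to miss, the statement can be generated and checked before it is pasted into SQL*Plus. This is only a sketch: `dest_state_sql` is a hypothetical helper, and the accepted states are exactly the four values listed above.

```shell
# Print the ALTER SYSTEM statement that sets LOG_ARCHIVE_DEST_STATE_n.
# Usage: dest_state_sql N STATE
dest_state_sql() {
    case $1 in
        ''|*[!0-9]*) echo "destination number must be numeric" >&2; return 1 ;;
    esac
    case $2 in
        alternate|reset|defer|enable) ;;
        *) echo "unknown state: $2" >&2; return 1 ;;
    esac
    printf 'alter system set log_archive_dest_state_%s = %s;\n' "$1" "$2"
}

dest_state_sql 3 defer
```

Once the link is repaired, `dest_state_sql 3 enable` produces the statement that brings the destination back.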

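UPDATE (06/05/2016):

If the database is restarted without first disabling the standby that has the network fault, startup may hang indefinitely at "Database mounted."

The alert.log shows that the primary is trying to ship archived logs to the standby with the network fault and stalls because the link is unreliable. Disabling that standby and then restarting the database resolves it.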

====
[erq]

All major browser platforms now support promises. IE still does not, of course, but Edge does.