本来DG已经配置为最大可用模式(maximize availability)了,也就是说如果物理备库无法访问,不应该影响主库的运行才对。 但是有一条光纤物理链路出现故障严重丢包,导致主库向其传输归档日志时出现超时错误,无法归档,导致主库运行十分缓慢,直至无法正常访问。
alert_orcl.log文件中有如下错误记录:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 Fri Jul 17 00 :34 :32 2015 ORA-16198 : LGWR received timedout error from KSR LGWR: Attempting destination LOG_ARCHIVE_DEST_3 network reconnect (16198 ) LGWR: Destination LOG_ARCHIVE_DEST_3 network reconnect abandoned Fri Jul 17 00 :34 :32 2015 Errors in file e:\\oracle\\product\\10.2 .0 \\admin\\orcl\\bdump\\orcl_lgwr_1568.trc: ORA-16198 : Timeout incurred on internal channel during remote archival LGWR: Network asynch I/O wait error 16198 log 1 service 'db_feich' Fri Jul 17 00 :34 :32 2015 Destination LOG_ARCHIVE_DEST_3 is UNSYNCHRONIZED LGWR: Failed to archive log 1 thread 1 sequence 4526 (16198 ) Fri Jul 17 00 :34 :32 2015 Errors in file e:\\oracle\\product\\10.2 .0 \\admin\\orcl\\bdump\\orcl_lgwr_1568.trc: ORA-16198 : Timeout incurred on internal channel during remote archival LGWR: Error 16198 closing archivelog file 'db_feich' LGWR: Error 16198 disconnecting from destination LOG_ARCHIVE_DEST_3 standby host 'db_feich' Fri Jul 17 00 :34 :42 2015 LGWR: Standby redo logfile selected to archive thread 1 sequence 4527 LGWR: Standby redo logfile selected for thread 1 sequence 4527 for destination LOG_ARCHIVE_DEST_2 Fri Jul 17 00 :34 :43 2015 Thread 1 advanced to log sequence 4527 (LGWR switch ) Current log# 2 seq# 4527 mem# 0: E:\\ORACLE\\PRODUCT\\10.2.0\\ORADATA\\ORCL\\REDO02.LOG Fri Jul 17 00 :37 :18 2015 LGWR: Standby redo logfile selected to archive thread 1 sequence 4528 LGWR: Standby redo logfile selected for thread 1 sequence 4528 for destination LOG_ARCHIVE_DEST_2 Fri Jul 17 00 :37 :18 2015 Thread 1 advanced to log sequence 4528 (LGWR switch ) Current log# 3 seq# 4528 mem# 0: E:\\ORACLE\\PRODUCT\\10.2.0\\ORADATA\\ORCL\\REDO03.LOG Fri Jul 17 00 :40 :00 2015 LGWR: Standby redo logfile selected to archive thread 1 sequence 4529 LGWR: Standby redo logfile selected for thread 1 sequence 4529 for destination LOG_ARCHIVE_DEST_2 Fri Jul 17 00 :40 :00 2015 Thread 1 advanced to log sequence 4529 (LGWR switch ) Current log# 1 seq# 4529 mem# 0: E:\\ORACLE\\PRODUCT\\10.2.0\\ORADATA\\ORCL\\REDO01.LOG Fri Jul 17 00 :40 :05 2015 ARC1: LGWR is actively archiving destination LOG_ARCHIVE_DEST_3 ARC1: Standby redo logfile selected for thread 1 sequence 4528 for destination LOG_ARCHIVE_DEST_3 Fri Jul 17 00 :40 :06 2015 Thread 1 cannot allocate new log, sequence 4530 Checkpoint not complete Current log# 1 seq# 4529 mem# 0: E:\\ORACLE\\PRODUCT\\10.2.0\\ORADATA\\ORCL\\REDO01.LOG LNSc started with pid=88 , OS id=3680 Fri Jul 17 00 :40 :16 2015 LGWR: Standby redo logfile selected to archive thread 1 sequence 4530 LGWR: Standby redo logfile selected for thread 1 sequence 4530 for destination LOG_ARCHIVE_DEST_3 LGWR: Standby redo logfile selected to archive thread 1 sequence 4530 LGWR: Standby redo logfile selected for thread 1 sequence 4530 for destination LOG_ARCHIVE_DEST_2 Fri Jul 17 00 :40 :21 2015 Thread 1 advanced to log sequence 4530 (LGWR switch ) Current log# 2 seq# 4530 mem# 0: E:\\ORACLE\\PRODUCT\\10.2.0\\ORADATA\\ORCL\\REDO02.LOG Fri Jul 17 00 :40 :21 2015 ARC0: LGWR is actively archiving destination LOG_ARCHIVE_DEST_3 Fri Jul 17 00 :40 :44 2015 ARC0: Standby redo logfile selected for thread 1 sequence 4529 for destination LOG_ARCHIVE_DEST_3 Fri Jul 17 00 :48 :36 2015 ORA-16198 : LGWR received timedout error from KSR LGWR: Attempting destination LOG_ARCHIVE_DEST_3 network reconnect (16198 ) LGWR: Destination LOG_ARCHIVE_DEST_3 network reconnect abandoned Fri Jul 17 00 :48 :36 2015 Errors in file e:\\oracle\\product\\10.2 .0 \\admin\\orcl\\bdump\\orcl_lgwr_1568.trc: ORA-16198 : Timeout incurred on internal channel during remote archival LGWR: Network asynch I/O wait error 16198 log 2 service 'db_feich' Fri Jul 17 00 :50 :58 2015 LGWR: Failed to archive log 2 thread 1 sequence 4530 (16198 ) Fri Jul 17 00 :50 :58 2015 Errors in file e:\\oracle\\product\\10.2 .0 \\admin\\orcl\\bdump\\orcl_lgwr_1568.trc: ORA-16198 : Timeout incurred on internal channel during remote archival LGWR: Error 16198 closing archivelog file 'db_feich' LGWR: Standby redo logfile selected to archive thread 1 sequence 4531 LGWR: Standby redo logfile selected for thread 1 sequence 4531 for destination LOG_ARCHIVE_DEST_2 Fri Jul 17 00 :51 :07 2015 Thread 1 advanced to log sequence 4531 (LGWR switch ) Current log# 1 seq# 4531 mem# 0: E:\\ORACLE\\PRODUCT\\10.2.0\\ORADATA\\ORCL\\REDO01.LOG Fri Jul 17 00 :54 :12 2015 Thread 1 cannot allocate new log, sequence 4532 All online logs needed archiving Current log# 1 seq# 4531 mem# 0: E:\\ORACLE\\PRODUCT\\10.2.0\\ORADATA\\ORCL\\REDO01.LOG Fri Jul 17 00 :56 :10 2015 LGWR: Standby redo logfile selected to archive thread 1 sequence 4532 LGWR: Standby redo logfile selected for thread 1 sequence 4532 for destination LOG_ARCHIVE_DEST_2 Fri Jul 17 00 :56 :10 2015 Thread 1 advanced to log sequence 4532 (LGWR switch ) Current log# 2 seq# 4532 mem# 0: E:\\ORACLE\\PRODUCT\\10.2.0\\ORADATA\\ORCL\\REDO02.LOG Fri Jul 17 00 :56 :10 2015 ARC0: LGWR is actively archiving destination LOG_ARCHIVE_DEST_3 ARC0: Standby redo logfile selected for thread 1 sequence 4531 for destination LOG_ARCHIVE_DEST_3 Fri Jul 17 00 :58 :53 2015 ARC1: Attempting destination LOG_ARCHIVE_DEST_3 network reconnect (12571 ) ARC1: Destination LOG_ARCHIVE_DEST_3 network reconnect abandoned Fri Jul 17 00 :58 :53 2015 Errors in file e:\\oracle\\product\\10.2 .0 \\admin\\orcl\\bdump\\orcl_arc1_2584.trc: ORA-12571 : TNS:packet writer failure ARC1: I/O error 12571 archiving log 3 to 'db_feich' Fri Jul 17 00 :58 :54 2015 Errors in file e:\\oracle\\product\\10.2 .0 \\admin\\orcl\\bdump\\orcl_arc1_2584.trc: ORA-01041 : internal error. hostdef extension doesn't exist ARC1: Error 1041 Closing archive log file ' db_feich' LNSc started with pid=46, OS id=3000 Fri Jul 17 00:59:30 2015 LGWR: Standby redo logfile selected to archive thread 1 sequence 4533 LGWR: Standby redo logfile selected for thread 1 sequence 4533 for destination LOG_ARCHIVE_DEST_3 LGWR: Standby redo logfile selected to archive thread 1 sequence 4533 LGWR: Standby redo logfile selected for thread 1 sequence 4533 for destination LOG_ARCHIVE_DEST_2 Fri Jul 17 00:59:35 2015 Thread 1 advanced to log sequence 4533 (LGWR switch) Current log# 3 seq# 4533 mem# 0: E:\\ORACLE\\PRODUCT\\10.2.0\\ORADATA\\ORCL\\REDO03.LOG Fri Jul 17 01:05:13 2015 ORA-16198: LGWR received timedout error from KSR LGWR: Attempting destination LOG_ARCHIVE_DEST_3 network reconnect (16198) LGWR: Destination LOG_ARCHIVE_DEST_3 network reconnect abandoned Fri Jul 17 01:05:13 2015 Errors in file e:\\oracle\\product\\10.2.0\\admin\\orcl\\bdump\\orcl_lgwr_1568.trc: ORA-16198: Timeout incurred on internal channel during remote archival LGWR: Network asynch I/O wait error 16198 log 3 service ' db_feich' Fri Jul 17 01:07:52 2015 Thread 1 cannot allocate new log, sequence 4534 All online logs needed archiving Current log# 3 seq# 4533 mem# 0: E:\\ORACLE\\PRODUCT\\10.2.0\\ORADATA\\ORCL\\REDO03.LOG Fri Jul 17 01:23:03 2015 ARC1: Attempting destination LOG_ARCHIVE_DEST_3 network reconnect (12571) ARC1: Destination LOG_ARCHIVE_DEST_3 network reconnect abandoned Fri Jul 17 01:23:03 2015 Errors in file e:\\oracle\\product\\10.2.0\\admin\\orcl\\bdump\\orcl_arc1_2584.trc: ORA-12571: TNS:packet writer failure FAL\[server, ARC1\]: FAL archive failed, see trace file. Fri Jul 17 01:23:03 2015 Errors in file e:\\oracle\\product\\10.2.0\\admin\\orcl\\bdump\\orcl_arc1_2584.trc: ORA-16055: FAL request rejected ARCH: FAL archive failed. Archiver continuing Fri Jul 17 02:01:41 2015 >>> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! pid=128 System State dumped to trace file e:\\oracle\\product\\10.2.0\\admin\\orcl\\udump\\orcl_ora_2960.trc Fri Jul 17 02:05:22 2015 ARC1: Attempting destination LOG_ARCHIVE_DEST_3 network reconnect (12571) ARC1: Destination LOG_ARCHIVE_DEST_3 network reconnect abandoned Fri Jul 17 02:05:22 2015 Errors in file e:\\oracle\\product\\10.2.0\\admin\\orcl\\bdump\\orcl_arc1_2584.trc: ORA-12571: TNS:packet writer failure FAL\[server, ARC1\]: FAL archive failed, see trace file. Fri Jul 17 02:05:22 2015 Errors in file e:\\oracle\\product\\10.2.0\\admin\\orcl\\bdump\\orcl_arc1_2584.trc: ORA-16055: FAL request rejected ARCH: FAL archive failed. Archiver continuing Fri Jul 17 02:05:25 2015 LGWR: Failed to archive log 3 thread 1 sequence 4533 (16198) Fri Jul 17 02:05:25 2015 Errors in file e:\\oracle\\product\\10.2.0\\admin\\orcl\\bdump\\orcl_lgwr_1568.trc: ORA-16198: Timeout incurred on internal channel during remote archival LGWR: Error 16198 closing archivelog file ' db_feich' LGWR: Error 16198 disconnecting from destination LOG_ARCHIVE_DEST_3 standby host ' db_feich' Fri Jul 17 02:05:36 2015 LGWR: Standby redo logfile selected to archive thread 1 sequence 4534 LGWR: Standby redo logfile selected for thread 1 sequence 4534 for destination LOG_ARCHIVE_DEST_2 Fri Jul 17 02:05:36 2015 Thread 1 advanced to log sequence 4534 (LGWR switch) Current log# 2 seq# 4534 mem# 0: E:\\ORACLE\\PRODUCT\\10.2.0\\ORADATA\\ORCL\\REDO02.LOG Fri Jul 17 02:07:39 2015 Thread 1 cannot allocate new log, sequence 4535 All online logs needed archiving Current log# 2 seq# 4534 mem# 0: E:\\ORACLE\\PRODUCT\\10.2.0\\ORADATA\\ORCL\\REDO02.LOG LNSc started with pid=44, OS id=3356 Fri Jul 17 02:11:42 2015 Error 12637 received logging on to the standby Fri Jul 17 02:11:42 2015 Errors in file e:\\oracle\\product\\10.2.0\\admin\\orcl\\bdump\\orcl_arc1_2584.trc: ORA-12637: Packet receive failed PING\[ARC1\]: Heartbeat failed to connect to standby ' db_feich'. Error is 12637. Fri Jul 17 02:12:02 2015 Error 28547 received logging on to the standby Fri Jul 17 02:12:06 2015 LGWR: Error 28547 creating archivelog file ' db_feich' Fri Jul 17 02:12:06 2015 Errors in file e:\\oracle\\product\\10.2.0\\admin\\orcl\\bdump\\orcl_lgwr_1568.trc: ORA-28547: connection to server failed, probable Oracle Net admin error LGWR: Standby redo logfile selected to archive thread 1 sequence 4535 LGWR: Standby redo logfile selected for thread 1 sequence 4535 for destination LOG_ARCHIVE_DEST_2 Fri Jul 17 02:12:11 2015 Thread 1 advanced to log sequence 4535 (LGWR switch) Current log# 3 seq# 4535 mem# 0: E:\\ORACLE\\PRODUCT\\10.2.0\\ORADATA\\ORCL\\REDO03.LOG Fri Jul 17 02:14:11 2015 Thread 1 cannot allocate new log, sequence 4536 All online logs needed archiving Current log# 3 seq# 4535 mem# 0: E:\\ORACLE\\PRODUCT\\10.2.0\\ORADATA\\ORCL\\REDO03.LOG Fri Jul 17 02:17:20 2015 LGWR: Failed to archive log 3 thread 1 sequence 4535 (28547) LNSc started with pid=46, OS id=4036 Fri Jul 17 02:17:33 2015 Error 12170 received logging on to the standby Fri Jul 17 02:17:33 2015 Errors in file e:\\oracle\\product\\10.2.0\\admin\\orcl\\bdump\\orcl_arc1_2584.trc: ORA-12170: TNS:Connect timeout occurred PING\[ARC1\]: Heartbeat failed to connect to standby ' db_feich'. Error is 12170. Fri Jul 17 02:17:44 2015 Error 12170 received logging on to the standby Fri Jul 17 02:17:48 2015 LGWR: Error 12170 creating archivelog file ' db_feich' Fri Jul 17 02:17:48 2015 Errors in file e:\\oracle\\product\\10.2.0\\admin\\orcl\\bdump\\orcl_lgwr_1568.trc: ORA-12170: TNS:Connect timeout occurred Fri Jul 17 02:17:53 2015 ARC0: Attempting destination LOG_ARCHIVE_DEST_3 network reconnect (12571) ARC0: Destination LOG_ARCHIVE_DEST_3 network reconnect abandoned Fri Jul 17 02:17:53 2015 Errors in file e:\\oracle\\product\\10.2.0\\admin\\orcl\\bdump\\orcl_arc0_2784.trc: ORA-12571: TNS:packet writer failure ARC0: I/O error 12571 archiving log 1 to ' db_feich' Fri Jul 17 02:17:53 2015 LGWR: Standby redo logfile selected to archive thread 1 sequence 4536 LGWR: Standby redo logfile selected for thread 1 sequence 4536 for destination LOG_ARCHIVE_DEST_2 Fri Jul 17 02:17:53 2015 Thread 1 advanced to log sequence 4536 (LGWR switch) Current log# 2 seq# 4536 mem# 0: E:\\ORACLE\\PRODUCT\\10.2.0\\ORADATA\\ORCL\\REDO02.LOG Fri Jul 17 02:17:57 2015 Errors in file e:\\oracle\\product\\10.2.0\\admin\\orcl\\bdump\\orcl_arc0_2784.trc: ORA-01041: internal error. hostdef extension doesn' t existFri Jul 17 02 :17 :57 2015 ARC0: Error 1041 Closing archive log file 'db_feich' Fri Jul 17 02 :19 :45 2015 LGWR: Failed to archive log 2 thread 1 sequence 4536 (12170 ) LGWR: Standby redo logfile selected to archive thread 1 sequence 4537 LGWR: Standby redo logfile selected for thread 1 sequence 4537 for destination LOG_ARCHIVE_DEST_2 Fri Jul 17 02 :19 :51 2015 Thread 1 advanced to log sequence 4537 (LGWR switch ) Current log# 1 seq# 4537 mem# 0: E:\\ORACLE\\PRODUCT\\10.2.0\\ORADATA\\ORCL\\REDO01.LOG Fri Jul 17 02 :21 :23 2015 LGWR: Standby redo logfile selected to archive thread 1 sequence 4538 LGWR: Standby redo logfile selected for thread 1 sequence 4538 for destination LOG_ARCHIVE_DEST_2 Fri Jul 17 02 :21 :23 2015 Thread 1 advanced to log sequence 4538 (LGWR switch ) Current log# 3 seq# 4538 mem# 0: E:\\ORACLE\\PRODUCT\\10.2.0\\ORADATA\\ORCL\\REDO03.LOG Fri Jul 17 02 :22 :55 2015 LGWR: Standby redo logfile selected to archive thread 1 sequence 4539 LGWR: Standby redo logfile selected for thread 1 sequence 4539 for destination LOG_ARCHIVE_DEST_2 Fri Jul 17 02 :22 :55 2015 Thread 1 advanced to log sequence 4539 (LGWR switch ) Current log# 2 seq# 4539 mem# 0: E:\\ORACLE\\PRODUCT\\10.2.0\\ORADATA\\ORCL\\REDO02.LOG Fri Jul 17 02 :22 :55 2015 ARC0: LGWR is actively archiving destination LOG_ARCHIVE_DEST_3 Fri Jul 17 02 :23 :16 2015 Error 12170 received logging on to the standbyFri Jul 17 02 :23 :16 2015 Errors in file e:\\oracle\\product\\10.2 .0 \\admin\\orcl\\bdump\\orcl_arc0_2784.trc: ORA-12170 : TNS:Connect timeout occurred ARC0: Error 12170 Creating archive log file to 'db_feich' Fri Jul 17 05 :22 :08 2015 LGWR: Standby redo logfile selected to archive thread 1 sequence 4540 LGWR: Standby redo logfile selected for thread 1 sequence 4540 for destination LOG_ARCHIVE_DEST_2 Fri Jul 17 05 :22 :08 2015 Thread 1 advanced to log sequence 4540 (LGWR switch ) Current log# 1 seq# 4540 mem# 0: E:\\ORACLE\\PRODUCT\\10.2.0\\ORADATA\\ORCL\\REDO01.LOG Fri Jul 17 05 :23 :19 2015 Suppressing further error logging of LOG_ARCHIVE_DEST_3. Fri Jul 17 05 :23 :40 2015 Error 12170 received logging on to the standbySuppressing further error logging of LOG_ARCHIVE_DEST_3.
这错误从早上0点多就出现了。 trace文件中有如下错误:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 *** 2015 -07 -17 00 :34 :32.348 57480 kcrr.c Making upinblc request to LNSc (ocis 0x000000001677E828 ). Begin time is <07 /17 /2015 00 :34 :32 > and NET_TIMEOUT is <180 > seconds NetServer pid:2832 LGWR had received a timeout previously. return timeout again LGWR had received a timeout previously. return timeout again Error 16198 closing standby archive log file at host 'db_feich' ORA-16198 : Timeout incurred on internal channel during remote archival Archive destination LOG_ARCHIVE_DEST_3 made inactive: File close error *** 2015 -07 -17 00 :34 :32.351 62692 kcrr.c LGWR: Error 16198 closing archivelog file 'db_feich' LGWR had received a timeout previously. return timeout again Error 16198 detaching RFS from standby instance at host 'db_feich' *** 2015 -07 -17 00 :34 :32.392 57269 kcrr.c Making upidhs request to LNSc (ocis 0x000000001677E828 ). Begin time is <07 /17 /2015 00 :34 :32 > and NET_TIMEOUT <180 > seconds NetServer pid:2832 *** 2015 -07 -17 00 :34 :36.392 *** 2015 -07 -17 00 :34 :36.392 57391 kcrr.c Cleaninup up LNS12 \[pid 2832 \] after network detach *** 2015 -07 -17 00 :34 :36.397 54392 kcrr.c LNSc \[pid 2832 \] receiving termination signal.. .... killed successfully .. pmon posted for async lns cleanup *** 2015 -07 -17 00 :34 :37.597 62692 kcrr.c LGWR: Error 16198 disconnecting from destination LOG_ARCHIVE_DEST_3 standby host 'db_feich' Ignoring krslcmp() detach error 16198 Receiving message from LNSb *** 2015 -07 -17 00 :34 :42.863 57625 kcrr.c Making upinbls request to LNSb (ocis 0x000000001677E580 ). Begin time is <07 /17 /2015 00 :34 :37 > and NET_TIMEOUT is <180 > seconds NetServer pid:3584 *** 2015 -07 -17 00 :37 :13.051 *** 2015 -07 -17 00 :37 :13.051 57480 kcrr.c Making upinblc request to LNSb (ocis 0x000000001677E580 ). Begin time is <07 /17 /2015 00 :37 :13 > and NET_TIMEOUT is <180 > seconds NetServer pid:3584 Receiving message from LNSb Receiving message from LNSb *** 2015 -07 -17 00 :37 :18.509 57625 kcrr.c Making upinbls request to LNSb (ocis 0x000000001677E580 ). Begin time is <07 /17 /2015 00 :37 :13 > and NET_TIMEOUT is <180 > seconds NetServer pid:3584 *** 2015 -07 -17 00 :39 :54.545 *** 2015 -07 -17 00 :39 :54.545 57480 kcrr.c Making upinblc request to LNSb (ocis 0x000000001677E580 ). Begin time is <07 /17 /2015 00 :39 :54 > and NET_TIMEOUT is <180 > seconds NetServer pid:3584 Receiving message from LNSb Receiving message from LNSb *** 2015 -07 -17 00 :40 :00 .802 57625 kcrr.c Making upinbls request to LNSb (ocis 0x000000001677E580 ). Begin time is <07 /17 /2015 00 :39 :54 > and NET_TIMEOUT is <180 > seconds NetServer pid:3584 *** 2015 -07 -17 00 :40 :12.967 *** 2015 -07 -17 00 :40 :12.967 57480 kcrr.c Making upinblc request to LNSb (ocis 0x000000001677E580 ). Begin time is <07 /17 /2015 00 :40 :12 > and NET_TIMEOUT is <180 > seconds NetServer pid:3584 Receiving message from LNSb *** 2015 -07 -17 00 :40 :13.066 55452 kcrr.c Initializing NetServer\[LNSc\] for dest=db_feich mode SYNC LNSc is not running anymore. New SYNC LNSc needs to be started Waiting for subscriber count on LGWR-LNSc channel to go to zero Subscriber count went to zero - time now is <07 /17 /2015 00 :40 :13 > Starting LNSc ... Waiting for LNSc to initialize itself *** 2015 -07 -17 00 :40 :16.083 55743 kcrr.c Netserver LNSc \[pid 3680 \] for mode SYNC has been initialized Performing a channel reset to ignore previous responses Successfully started LNSc \[pid 3680 \] for dest db_feich mode SYNC ocis=0x000000001677E828 *** 2015 -07 -17 00 :40 :16.083 56246 kcrr.c Making upiahm request to LNSc \[pid 3680 \]: Begin Time is <07 /17 /2015 00 :40 :13 >. NET_TIMEOUT = <180 > seconds Waiting for LNSc to respond to upiahm *** 2015 -07 -17 00 :40 :16.156 56410 kcrr.c upiahm connect done status is 0 Receiving message from LNSc Receiving message from LNSc Receiving message from LNSc *** 2015 -07 -17 00 :40 :16.508 57625 kcrr.c Making upinbls request to LNSc (ocis 0x000000001677E828 ). Begin time is <07 /17 /2015 00 :40 :13 > and NET_TIMEOUT is <180 > seconds NetServer pid:3680 Receiving message from LNSb *** 2015 -07 -17 00 :40 :21.818 57625 kcrr.c Making upinbls request to LNSb (ocis 0x000000001677E580 ). Begin time is <07 /17 /2015 00 :40 :16 > and NET_TIMEOUT is <180 > seconds NetServer pid:3584 *** 2015 -07 -17 00 :40 :42.129 LGWR found LNSc alive.. waiting for msg *** 2015 -07 -17 00 :40 :52.140 LGWR found LNSc alive.. waiting for msg *** 2015 -07 -17 00 :41 :03 .356 LGWR found LNSc alive.. waiting for msg *** 2015 -07 -17 00 :41 :18.972
按官方文档讲,DG最大可用模式下备库不可用是可以自动降级到最大性能模式的,但这次没有,主库的运行受到了影响。 因为物理链路修复没有时间表,因此当务之急时先略过或禁用出故障的备库。
首先尝试将DG降级到最大性能模式: [sql] SQL>alter database set standby database to maximize performance; [/sql]
无果
禁用出故障的备库: [sql] SQL>alter system set log_archive_dest_state_3 = defer; SQL>shutdown immediate SQL>startup [/sql] 故障解除。而且需要重新启动数据库。
LOG_ARCHIVE_DEST_STATE_n参数,可以取值alternate reset defer enable,其含义如下:
在未禁用有网络故障的standby备库的情况下,如果重新启动数据库,有可能startup过程会一直卡在Database mounted.处
从alert.log看,主库正在向有网络故障的备库同步归档日志文件,因为网络不通畅就卡住了,此时将备库禁用后,重新启动数据库就可以了。
==== [erq]