Solution for a RAC node that fails to start because of an HAIP fault


A reader asked me about a problem with his 11.2.0.2 RAC on AIX, which had no patches or PSUs applied. After a reboot, one of the nodes could not start normally. The ocssd log showed the following:

    2014-08-09 14:21:46.094: [    CSSD][5414]clssnmSendingThread: sent 4 join msgs to all nodes
    2014-08-09 14:21:46.421: [    CSSD][4900]clssgmWaitOnEventValue: after CmInfo State val 3, eval 1 waited 0s
    2014-08-09 14:21:47.042: [    CSSD][4129]clssnmvDHBValidateNCopy: node 1, rac01, has a disk HB, but no network HB, DHB has rcfg 217016033, wrtcnt, 255958157, LATS 1518247992, lastSeqNo 255958154, uniqueness 1406064021, timestamp 1407565306/1501758072
    2014-08-09 14:21:47.051: [    CSSD][3358]clssnmvDHBValidateNCopy: node 1, rac01, has a disk HB, but no network HB, DHB has rcfg 217016033, wrtcnt, 255958158, LATS 1518248002, lastSeqNo 255958155, uniqueness 1406064021, timestamp 1407565306/1501758190
    2014-08-09 14:21:47.421: [    CSSD][4900]clssgmWaitOnEventValue: after CmInfo State val 3, eval 1 waited 0
    2014-08-09 14:21:48.042: [    CSSD][4129]clssnmvDHBValidateNCopy: node 1, rac01, has a disk HB, but no network HB, DHB has rcfg 217016033, wrtcnt, 255958160, LATS 1518248993, lastSeqNo 255958157, uniqueness 1406064021, timestamp 1407565307/1501759080
    2014-08-09 14:21:48.052: [    CSSD][3358]clssnmvDHBValidateNCopy: node 1, rac01, has a disk HB, but no network HB, DHB has rcfg 217016033, wrtcnt, 255958161, LATS 1518249002, lastSeqNo 255958158, uniqueness 1406064021, timestamp 1407565307/1501759191
    2014-08-09 14:21:48.421: [    CSSD][4900]clssgmWaitOnEventValue: after CmInfo State val 3, eval 1 waited 0
    2014-08-09 14:21:49.043: [    CSSD][4129]clssnmvDHBValidateNCopy: node 1, rac01, has a disk HB, but no network HB, DHB has rcfg 217016033, wrtcnt, 255958163, LATS 1518249993, lastSeqNo 255958160, uniqueness 1406064021, timestamp 1407565308/1501760082
    2014-08-09 14:21:49.056: [    CSSD][3358]clssnmvDHBValidateNCopy: node 1, rac01, has a disk HB, but no network HB, DHB has rcfg 217016033, wrtcnt, 255958164, LATS 1518250007, lastSeqNo 255958161, uniqueness 1406064021, timestamp 1407565308/1501760193
    2014-08-09 14:21:49.421: [    CSSD][4900]clssgmWaitOnEventValue: after CmInfo State val 3, eval 1 waited 0
    2014-08-09 14:21:50.044: [    CSSD][4129]clssnmvDHBValidateNCopy: node 1, rac01, has a disk HB, but no network HB, DHB has rcfg 217016033, wrtcnt, 255958166, LATS 1518250994, lastSeqNo 255958163, uniqueness 1406064021, timestamp 1407565309/1501761090
    2014-08-09 14:21:50.057: [    CSSD][3358]clssnmvDHBValidateNCopy: node 1, rac01, has a disk HB, but no network HB, DHB has rcfg 217016033, wrtcnt, 255958167, LATS 1518251007, lastSeqNo 255958164, uniqueness 1406064021, timestamp 1407565309/1501761195
    2014-08-09 14:21:50.421: [    CSSD][4900]clssgmWaitOnEventValue: after CmInfo State val 3, eval 1 waited 0
    2014-08-09 14:21:51.046: [    CSSD][4129]clssnmvDHBValidateNCopy: node 1, rac01, has a disk HB, but no network HB, DHB has rcfg 217016033, wrtcnt, 255958169, LATS 1518251996, lastSeqNo 255958166, uniqueness 1406064021, timestamp 1407565310/1501762100
    2014-08-09 14:21:51.057: [    CSSD][3358]clssnmvDHBValidateNCopy: node 1, rac01, has a disk HB, but no network HB, DHB has rcfg 217016033, wrtcnt, 255958170, LATS 1518252008, lastSeqNo 255958167, uniqueness 1406064021, timestamp 1407565310/1501762205
    2014-08-09 14:21:51.102: [    CSSD][5414]clssnmSendingThread: sending join msg to all nodes
    2014-08-09 14:21:51.102: [    CSSD][5414]clssnmSendingThread: sent 5 join msgs to all nodes
    2014-08-09 14:21:51.421: [    CSSD][4900]clssgmWaitOnEventValue: after CmInfo State val 3, eval 1 waited 0
    2014-08-09 14:21:52.050: [    CSSD][4129]clssnmvDHBValidateNCopy: node 1, rac01, has a disk HB, but no network HB, DHB has rcfg 217016033, wrtcnt, 255958172, LATS 1518253000, lastSeqNo 255958169, uniqueness 1406064021, timestamp 1407565311/1501763110
    2014-08-09 14:21:52.058: [    CSSD][3358]clssnmvDHBValidateNCopy: node 1, rac01, has a disk HB, but no network HB, DHB has rcfg 217016033, wrtcnt, 255958173, LATS 1518253008, lastSeqNo 255958170, uniqueness 1406064021, timestamp 1407565311/1501763230
    2014-08-09 14:21:52.089: [    CSSD][5671]clssnmRcfgMgrThread: Local Join
    2014-08-09 14:21:52.089: [    CSSD][5671]clssnmLocalJoinEvent: begin on node(2), waittime 193000
    2014-08-09 14:21:52.089: [    CSSD][5671]clssnmLocalJoinEvent: set curtime (1518253039) for my node
    2014-08-09 14:21:52.089: [    CSSD][5671]clssnmLocalJoinEvent: scanning 32 nodes
    2014-08-09 14:21:52.089: [    CSSD][5671]clssnmLocalJoinEvent: Node rac01, number 1, is in an existing cluster with disk state 3
    2014-08-09 14:21:52.090: [    CSSD][5671]clssnmLocalJoinEvent: takeover aborted due to cluster member node found on disk
    2014-08-09 14:21:52.431: [    CSSD][4900]clssgmWaitOnEventValue: after CmInfo State val 3, eval 1 waited 0

From this output it is easy to conclude that there is a heartbeat problem. That reading is not wrong, but the heartbeat here is not the traditional heartbeat network as we usually understand it. I asked him to run the following query on a node where CRS was healthy, and the result explains the cause:

    SQL> select name, ip_address from v$cluster_interconnects;

    NAME   IP_ADDRESS
    ----   ---------------
    en0    169.254.116.242

Why is the interconnect IP in the 169 range? It clearly does not match what is configured in /etc/hosts.

The answer is the HAIP feature introduced in Oracle 11gR2. Oracle introduced it to provide redundancy for the interconnect network with its own technology, rather than relying on third-party mechanisms such as Linux bonding. Before 11.2.0.2, if the interconnect NICs were bonded at the OS level, Oracle used the OS bond. From 11.2.0.2 onward, if no interconnect redundancy is configured at the OS level, Oracle's own HAIP is enabled. So although you configured 192.168.1.100, Oracle actually uses the 169.254 range. The alert log shows this as well, so I will not go into it further here.

On the healthy node the interconnect interface has an IP address in the 169 range, while on the problem node no such address is present.

Oracle MOS suggests the following fix:

    crsctl start res ora.cluster_interconnect.haip -init

We tested it, running the command as root, and it did not work either. According to the Oracle MOS documentation, HAIP usually fails to start for one of the following reasons:

    1) A fault on the interconnect NIC
    2) Multicast not working
    3) A firewall or similar
    4) An Oracle bug

An interconnect NIC fault is easy to rule out: if there is only one interconnect NIC, pinging the other node's IP verifies it.

Multicast can be checked with the mcasttest.pl script provided by Oracle (see Grid Infrastructure Startup During Patching, Install or Upgrade May Fail Due to Multicasting Requirement (ID 1212703.1)). The result in this case was:

    $ ./mcasttest.pl -n rac02,rac01 -i en0
    # Setup for node rac02 #
    Checking node access rac02
    Checking node login rac02
    Checking/Creating Directory /tmp/mcasttest for binary on node rac02
    Distributing mcast2 binary to node rac02
    # Setup for node rac01 #
    Checking node access rac01
    Checking node login rac01
    Checking/Creating Directory /tmp/mcasttest for binary on node rac01
    Distributing mcast2 binary to node rac01
    # testing Multicast on all nodes #
    Test for Multicast address 230.0.1.0
    Aug 11 21:39:39 | Multicast Failed for en0 using address 230.0.1.0:42000
    Test for Multicast address 224.0.0.251
    Aug 11 21:40:09 | Multicast Failed for en0 using address 224.0.0.251:42001
    $

The script reports that both the 230 and the 224 multicast addresses fail, but that does not necessarily mean multicast is the cause, even though searching ocssd.log for the keyword mcast does turn up related messages. In my own 11.2.0.3 Linux RAC environment, CRS starts normally even when the mcasttest.pl test fails. Since the reader's platform is AIX, I also ruled out a firewall problem.

That leaves Bug 9974223 as the most likely cause. If you look into HAIP you will find that the feature has quite a few Oracle bugs. The MOS note on known HAIP issues in 11gR2/12c Grid Infrastructure (1640865.1) alone records 12 HAIP-related bugs.

Because his node 1 could not be touched, for safety we could not perform many operations. In my view HAIP can be disabled entirely when multiple interconnect NICs are not in use. When I checked the MOS documentation yesterday, it said specifically that HAIP cannot be disabled, but my testing showed that it can. My test went as follows:

    [root@rac1 bin]# ./crsctl modify res ora.cluster_interconnect.haip -attr ENABLED=0 -init
    [root@rac1 bin]# ./crsctl stop crs
    CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on rac1
    CRS-2673: Attempting to stop ora.crsd on rac1
    CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on rac1
    CRS-2673: Attempting to stop ora.oc4j on rac1
    CRS-2673: Attempting to stop ora.cvu on rac1
    CRS-2673: Attempting to stop ora.LISTENER_SCAN1.lsnr on rac1
    CRS-2673: Attempting to stop ora.GRID.dg on rac1
    CRS-2673: Attempting to stop ora.registry.acfs on rac1
    CRS-2673: Attempting to stop ora.rac1.vip on rac1
    CRS-2677: Stop of ora.rac1.vip on rac1 succeeded
    CRS-2672: Attempting to start ora.rac1.vip on rac2
    CRS-2677: Stop of ora.LISTENER_SCAN1.lsnr on rac1 succeeded
    CRS-2673: Attempting to stop ora.scan1.vip on rac1
    CRS-2677: Stop of ora.scan1.vip on rac1 succeeded
    CRS-2672: Attempting to start ora.scan1.vip on rac2
    CRS-2676: Start of ora.rac1.vip on rac2 succeeded
    CRS-2676: Start of ora.scan1.vip on rac2 succeeded
    CRS-2672: Attempting to start ora.LISTENER_SCAN1.lsnr on rac2
    CRS-2676: Start of ora.LISTENER_SCAN1.lsnr on rac2 succeeded
    CRS-2677: Stop of ora.registry.acfs on rac1 succeeded
    CRS-2677: Stop of ora.oc4j on rac1 succeeded
    CRS-2677: Stop of ora.cvu on rac1 succeeded
    CRS-2677: Stop of ora.GRID.dg on rac1 succeeded
    CRS-2673: Attempting to stop ora.asm on rac1
    CRS-2677: Stop of ora.asm on rac1 succeeded
    CRS-2673: Attempting to stop ora.ons on rac1
    CRS-2677: Stop of ora.ons on rac1 succeeded
    CRS-2673: Attempting to stop work on rac1
    CRS-2677: Stop of work on rac1 succeeded
    CRS-2792: Shutdown of Cluster Ready Services-managed resources on rac1 has completed
    CRS-2677: Stop of ora.crsd on rac1 succeeded
    CRS-2673: Attempting to stop ora.drivers.acfs on rac1
    CRS-2673: Attempting to stop ora.ctssd on rac1
    CRS-2673: Attempting to stop ora.evmd on rac1
    CRS-2673: Attempting to stop ora.asm on rac1
    CRS-2673: Attempting to stop ora.mdnsd on rac1
    CRS-2677: Stop of ora.mdnsd on rac1 succeeded
    CRS-2677: Stop of ora.evmd on rac1 succeeded
    CRS-2677: Stop of ora.ctssd on rac1 succeeded
    CRS-2677: Stop of ora.asm on rac1 succeeded
    CRS-2673: Attempting to stop ora.cluster_interconnect.haip on rac1
    CRS-2677: Stop of ora.cluster_interconnect.haip on rac1 succeeded
    CRS-2673: Attempting to stop ora.cssd on rac1
    CRS-2677: Stop of ora.cssd on rac1 succeeded
    CRS-2673: Attempting to stop ora.crf on rac1
    CRS-2677: Stop of ora.drivers.acfs on rac1 succeeded
    CRS-2677: Stop of ora.crf on rac1 succeeded
    CRS-2673: Attempting to stop ora.gipcd on rac1
    CRS-2677: Stop of ora.gipcd on rac1 succeeded
    CRS-2673: Attempting to stop ora.gpnpd on rac1
    CRS-2677: Stop of ora.gpnpd on rac1
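The two symptoms this article relies on (ocssd.log reporting a disk heartbeat without a network heartbeat, and an interconnect address in the 169.254 range) can be checked with a small script. What follows is only a minimal sketch: the function names `count_missing_network_hb` and `is_haip_address` are hypothetical helpers, not part of any Oracle tool, and the regular expression is an assumption based solely on the sample log lines quoted in this article.

```python
import ipaddress
import re
from collections import Counter

# Assumption: the message wording matches the ocssd.log lines quoted in the
# article ("node 1, rac01, has a disk HB, but no network HB").
NO_NETWORK_HB = re.compile(r"node (\d+), (\S+), has a disk HB, but no network HB")

# The link-local range in which the article observes the interconnect address.
LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")


def count_missing_network_hb(log_text):
    """Count 'disk HB, but no network HB' messages per node name."""
    return Counter(match.group(2) for match in NO_NETWORK_HB.finditer(log_text))


def is_haip_address(ip):
    """Return True if ip lies in 169.254.0.0/16."""
    return ipaddress.ip_address(ip) in LINK_LOCAL


if __name__ == "__main__":
    # Two shortened lines in the style of the log excerpt in the article.
    sample = (
        "2014-08-09 14:21:47.042: clssnmvDHBValidateNCopy: node 1, rac01, "
        "has a disk HB, but no network HB, DHB has rcfg 217016033\n"
        "2014-08-09 14:21:47.051: clssnmvDHBValidateNCopy: node 1, rac01, "
        "has a disk HB, but no network HB, DHB has rcfg 217016033\n"
    )
    print(count_missing_network_hb(sample))
    print(is_haip_address("169.254.116.242"), is_haip_address("192.168.1.100"))
```

A non-zero count for a node that is known to be up suggests that the joining node reaches the voting disk but not the interconnect, which is the situation described above. The address check only confirms that an address falls in the 169.254 range; it says nothing about whether the HAIP resource itself is running.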

