Applies to:

Oracle Server - Enterprise Edition - Version: 11.2.0.2 and later   [Release: 11.2 and later ]
HP-UX Itanium

Symptoms

On HP-UX Itanium

After upgrade database from 11.2.0.1 to 11.2.0.2.

All V_$DIAG views are INVALID except V_$DIAG_AMS_XACTION and V_$DIAG_INFO.

ORA-942 reported on recompiling these invalid views.

ORA-942 was already reported on the same in the upgrade log file while running 'Catadrvw'.

When starting the instance, following messages are seen:
  
Error with dbgriap_init_adr_pga: 48178
   ORA-48178: error encountered while reading an ADR block file during ADR initialization [/home/oracle/oraHome112/diag/rdbms/nora/nora/metadata/ADR_INTERNAL.mif]
   ORA-48122: error with opening the ADR block file [/home/oracle/oraHome112/diag/rdbms/nora/nora/metadata/ADR_INTERNAL.mif] [0]
   ORA-27037: unable to obtain file status
   HPUX-ia64 Error: 251: Function is not available
   Additional information: 46


Changes

Upgrade to 11.2.0.2 on HP-UX Itanium

Cause


Bug 10302203: ORA-942 REPORTED WHEN COMPILING V_$DIAG VIEWS

As from 11.2.0.2, the file 'ADR_INTERNAL.mif' is opened with HP-UX directio code using
VX_SETCACHE (Set caching advisories) which is only available with the HP OnLineJFS product.

This issue happens when the HP OnLineJFS product is not installed !

Can be verified with:
    
usr/sbin/swlist -l product | grep -i vxfs


If HP OnLineJFS is installed it will give following output:
    
JFS B.11.31 Base VxFS File System 4.1 for HP-UX
OnlineJFS B.11.31 Online features of the VxFS File System


Solution


Bug 10302203 is resolved in release 12.1 and patchset 11.2.0.3 (not available at the time of creation of this article.)

A one-off patch 10302203 is available for 11.2.0.2 on HP-UX Itanium and can be downloaded at MOS.

Workarounds:

1. Install the HP OnLineJFS product

-OR-

2. Set init.ora parameter _DIAG_ADR_ENABLED to FALSE

    ADR is disabled and all tracing goes to $ORACLE_HOME/rdbms/log

Posted by pat98

In this Document
  Purpose
  Pre 11.2 Database Issues in 11gR2 Grid Infrastructure Environment
  References


Applies to:

Oracle Server - Enterprise Edition - Version: 10.1.0.2 to 11.1.0.8 - Release: 10.1 to 11.1
Information in this document applies to any platform.

Purpose

The note lists most known issues in 11gR2 Grid Infrastructure (in short GI) + pre-11gR2 database environment.

Even workaround is available in some cases, it's recommended to apply patches whenever possible. Refer to note 1064804.1 for instructions to patch in mixed environment.

For CRS PSU/bundle patch information, refer to note 405820.1 for 10.2 and note 810663.1 for 11.1

Pre 11.2 Database Issues in 11gR2 Grid Infrastructure Environment

1. Error creating or starting pre-11.2 database:

If it happens while creating database, DBCA fails with ORA-29702 and traces shows:

ORA-01501: CREATE DATABASE failed
ORA-00200: control file could not be created
ORA-00202: control file: '+DG_DATA/racdb/control01.ctl'
ORA-17502: ksfdcre:4 Failed to create file +DG_DATA/racdb/control01.ctl
ORA-15001: diskgroup "DG_DATA" does not exist or is not mounted
ORA-15077: could not locate ASM instance serving a required diskgroup


If it happens while starting existing database, sqlplus startup fails:

ORA-01078: failure in processing system parameters
ORA-01565: error in identifying file '+DATA/prod/spfileprod.ora'
ORA-17503: ksfdopn:2 Failed to open file +DATA/prod/spfileprod.ora
ORA-15077: could not locate ASM instance serving a required diskgroup
ORA-29701: unable to connect to Cluster Manager


Database alert.log shows the following message:

ORA-27504: IPC error creating OSD context
ORA-27300: OS system dependent operation:skgxnqtsz failed with status: 0
ORA-27301: OS failure message: Error 0
ORA-27302: failure occurred at: SKGXN not av
clsssinit ret = 21
interconnect information is not available from OCR
  WARNING: No cluster interconnect has been specified. Depending on
           the communication driver configured Oracle cluster traffic
           may be directed to the public interface of this machine.
           Oracle recommends that RAC clustered databases be configured
           with a private interconnect for enhanced security and
           performance


Solution:

To start a pre11gR2 database in 11gR2 Grid Infrastructure environment, node(s) must be pinned. To pin node(s), as root execute:

$GRID_HOME/bin/crsctl pin css -n <racnode1> <racnode2> <racnode3>

To find out whether node(s) is pinned or not:

$GRID_HOME/bin/olsnodes -t -n


2. If datafiles are located in ASM, DBCA fails to create database with error: 

"DBCA could not startup the ASM instance configured on this node. To processd with database creation using ASM you need the ASM instance to be up and running. Do you want to recreate the ASM instance on this node?"

DBCA trace (10g in $RDBMS_HOME/cfgtoollogs/dbca and 11g in $ORACLE_BASE/cfgtools/dbca) shows the following exception:

oracle.sysman.assistants.util.CommonUtils.getListenerProperties(CommonUtils.java:421)
oracle.sysman.assistants.util.asm.ASMAttributes.getConnection(ASMAttributes.java:150)
oracle.sysman.assistants.util.asm.ASMInstanceRAC.validateLocalASMConnection(ASMInstanceRAC.java:811)
oracle.sysman.assistants.util.asm.ASMInstanceRAC.validateASM(ASMInstanceRAC.java:595)
oracle.sysman.assistants.util.asm.ASMInstanceRAC.validateASM(ASMInstanceRAC.java:522)
oracle.sysman.assistants.util.asm.ASMInstanceRAC.validateASM(ASMInstanceRAC.java:515)
oracle.sysman.assistants.dbca.ui.StorageOptionsPage.validate(StorageOptionsPage.java:496)
oracle.sysman.assistants.util.wizard.WizardPageExt.wizardValidatePage(WizardPageExt.java:206)
....

java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:151)
java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:145)
java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:137)
java.awt.EventDispatchThread.run(EventDispatchThread.java:100)
[AWT-EventQueue-0] [10:4:5:781] [StorageOptionsPage.validate:611]  ASM present but not startable, querying user..


Solution:

Due to unpublished bug 8288940 (fixed in 10.2.0.5), DBCA will fail if database files are located in ASM. patch 8288940 is platform independent and is available for 10.2.0.3, 10.2.0.4, 11.1.0.6 and 11.1.0.7 as a .jar file and needs to be applied to database home. For other RDBMS version where there's no patch, please refer to workaround section of bug 8520511 in Database Readme <Oracle Database Readme 11g Release 2 (11.2) -> Bug 8520511>


3. SRVCTL fails to start instance if OCR is located in an ASM diskgroup or with different permission/ownership. 

The racgimon log (located in $RDBMS_HOME/log/$HOST/racg/imon_${DBNAME}.log) shows the following message:

2009-10-17 11:20:22.093: [  OCROSD][7866809875]utopen:6':failed in stat OCR file/disk +DATA, errno=2, os err string=No such file or directory
2009-10-17 11:20:22.093: [  OCROSD][7866809875]utopen:7:failed to open OCR file/disk +DATA , errno=2, os err string=No such file or directory
2009-10-17 11:20:22.093: [  OCRRAW][7866809875]proprinit: Could not open raw device
2009-10-17 11:20:22.093: [ default][7866809875]a_init:7!: Backend init unsuccessful : [26]
..
2009-10-17 11:20:22.094: [ CSSCLNT][7866809875]clsssinit: Unable to access OCR device in OCR init.PROC-26: Error while accessing the physical storage OperatingSystem error [No such file or directory] [2]
2009-10-17 11:20:22.094: [    RACG][7866809875] [23974][7866809875][ora.default]: racgimon exiting clsz init failed

Solution:

Due to unpublished bug 8262786, if OCR is located in ASM or with different permission/ownership, srvctl will fail to start earlier database version.Fix for unpublished bug 8262786 is included in 10.2.0.4 CRS PSU4, 10.2.0.5, 11.1.0.7 CRS PSU4, and Windows 10.2.0.4 Patch 36 and needs to be applied to database home. The workaround is to use sqlplus to start the database instead of srvctl

@ 8312004 closed as dup

4. Database fails to start after restart of GI.

$GRID_HOME/log/$HOST/agent/crsd/application_<dbuser>/application_<dbuser>.log shows:

2009-11-05 14:31:19.922: [    AGFW][1342593344] Agent received the message: RESOURCE_START[ora.db10.db102.inst 1 1] ID 4098:632
..
2009-11-05 14:31:19.924: [    AGFW][1275476288] Executing command: start for resource: ora.db10.db102.inst 1 1
2009-11-05 14:31:19.924: [ora.db10.db102.inst][1275476288] [start] START action called.
2009-11-05 14:31:19.924: [ora.db10.db102.inst][1275476288] [start] Executing action script: /home/app/oracle/product/10.2/db/bin/racgwrap[start]
..
2009-11-05 14:31:22.781: [ora.db10.db102.inst][1275476288] [start] Enter user-name: Connected to an idle instance.
2009-11-05 14:31:22.781: [ora.db10.db102.inst][1275476288] [start]
2009-11-05 14:31:22.782: [ora.db10.db102.inst][1275476288] [start] SQL> ORA-01565: error in identifying file '+DATA/db10/spfiledb10.ora'
2009-11-05 14:31:22.782: [ora.db10.db102.inst][1275476288] [start] ORA-17503: ksfdopn:2 Failed to open file +DATA/db10/spfiledb10.ora
2009-11-05 14:31:22.782: [ora.db10.db102.inst][1275476288] [start] ORA-15077: could not locate ASM instance serving a required diskgroup
2009-11-05 14:31:22.782: [ora.db10.db102.inst][1275476288] [start]
2009-11-05 14:31:22.782: [ora.db10.db102.inst][1275476288] [start] ORA-01078: failure in processing system parameters

After GI restart, status of diskgroup:

$GRID_HOME/bin/crsctl stat res -t
..
ora.DATA.dg
              OFFLINE OFFLINE racnode1
              OFFLINE OFFLINE racnode2
..

Solution:

Due to unpublished bug 8448079, while stopping GI, ASM init.ora parameter asm_diskgroups will be nullified and some diskgroups will remain OFFLINE after restart of GI which cause pre-11.2 database fails to start. Fix for unpublished bug 8448079 is included in the 11.2.0.2, patch 8448079 exists for certain platform and needs to be applied to GI home.  The workaround is to add dependence of diskgroup to each instance:

$GRID_HOME/bin/crsctl modify res ora.db10.db102.inst -attr "REQUIRED_RESOURCES='ora.racnode2.ASM2.asm,ora.DATA.dg'"




5. SRVCTL fails to start service

For example:

$RDBMS_HOME/bin/srvctl start service -d b1 -s sb1

$GRID_HOME/bin/crsctl stat res

..
NAME=ora.b1.sb1.cs
TYPE=application
TARGET=ONLINE
STATE=ONLINE on eyrac1f

NAME=ora.b1.sb1.b11.srv
TYPE=application
TARGET=OFFLINE
STATE=OFFLINE

NAME=ora.b1.sb1.b12.srv
TYPE=application
TARGET=OFFLINE
STATE=OFFLINE

NAME=ora.eons
TYPE=ora.eons.type
TARGET=ONLINE           , ONLINE
STATE=ONLINE on eyrac1f, ONLINE on eyrac2f

Solution:

Due to unpublished bug 8373758, pre-11.2 srvctl will fail to start a service if the service is followed by a 11gR2 new resource in "crsctl stat res" output.  In the above example, ora.eons is a new resource in 11.2 and pre-11.2 srvctl can't parse its status properly. Fix for unpublished bug 8373758 is included in 10.2.0.4 CRS PSU4, 10.2.0.5, 11.1.0.7 CRS PSU2 and needs to be applied to database home.  The workaround is to:

    A. Create a dummy pre11.2 resource entry alphabetically after service entry.  In the example above, creating a dummy resource ora.b2.db should workaround the problem

        $RDBMS_HOME/bin/srvctl add database -d b2 -o $RDBMS_HOME

    B. Try to start all services for the database: $RDBMS_HOME/bin/srvctl start service -d b1

ORA-15025: could not open disk '/dev/rdsk/disk1'
ORA-27041: unable to open file
SVR4 Error: 13: Permission denied

And execution of setasmgidwrap fails with:

$GRID_HOME/bin/setasmgidwrap o=/home/oracle/10.2/bin/oracle
KFSG-00312: not an Oracle binary: '/home/oracle/10.2/bin/oracle'


Solution:


Due to bug 9575578, setasmgidwrap fails with pre-11.2 oracle binary. Fix for bug 9575578 is included in 11.2.0.2, patch 9575578 exists for certain platform and needs to be applied to GI home.


7. After removal of pre-11.2 CRS home, the following error reported while trying to start or stop database, or stop cluster:

CRS-5809: Failed to execute 'ACTION_SCRIPT' value of '/ocw/crs10/bin/racgwrap' for 'ora.db10.db'. Error information 'cmd /ocw/crs10/bin/racgwrap not found'
CRS-2680: Clean of 'ora.db10.db' on 'node1' failed

If GI is being stopped, the following will be reported:

CRS-2794: Shutdown of Cluster Ready Services-managed resources on 'node1' has failed
CRS-2675: Stop of 'ora.crsd' on 'node1' failed
CRS-4000: Command Stop failed, or completed with errors.


Solution:

Due to bug 9257105, even upgrade finishes successfully, OCR configuration for pre-11.2 database still points to pre-11.2 CRS home. Fix for bug 9257105 is included in 11.2.0.1.2 and 11.2.0.2, unfortunately the fix itself is having regression which is being worked in unpublished bug 9678856. The workaround is to:

    A. As pre-11.2 database owner, execute the following command for each pre-11.2 database:

crsctl modify res ora.<dbname>.db -attr "ACTION_SCRIPT=$GRID_HOME/bin/racgwrap"

For example

crsctl modify res ora.db10.db -attr "ACTION_SCRIPT=/ocw/grid/bin/racgwrap"

    Or

    B. As pre-11.2 database owner, recreate database resource in OCR with note 1069369.1


8. Singleton service does not failover or uniform service does not stop after local node VIP resource failed or stopped:

$DBHOME/bin/srvctl config service -d racstr
rac_u PREF: racstr1 racstr2 AVAIL:
rac_s PREF: racstr1 AVAIL: racstr2

$DBHOME/bin/srvctl status service -d racstr
Service rac_u is running on instance(s) racstr1, racstr2
Service rac_s is running on instance(s) racstr1

$GRID_HOME/bin/crsctl status res ora.strdt01.vip
NAME=ora.strdt01.vip
TYPE=ora.cluster_vip_net1.type
TARGET=ONLINE
STATE=ONLINE on strdt01

Disable public network on node where instance racstr1 is running, VIP failover to another node:

$GRID_HOME/bin/crsctl status res ora.strdt01.vip
NAME=ora.racha602.vip
TYPE=ora.cluster_vip_net1.type
TARGET=ONLINE
STATE=INTERMEDIATE on racha603  <== Vip failover to other node

$DBHOME/bin/srvctl status service -d racstr
Service rac_u is running on instance(s) racstr1, racstr2  <== Service still running on racstr1
Service rac_s is running on instance(s) racstr1           <== Service did not failover to racstr2

Solution:

Due to unpublished bug 9039498, pre-11.2 database service will not failover or stop if local public network is down. Fix for bug 9039498 is included in 11.2.0.2, 12.1 and applies to GI home.



9. DBCA/srvctl fails to add instance/database with the following error:

PRKO-2010 : Error in adding instance to node: node1
PRKR-1008 : adding of instance dba21 on node node1 to cluster database dba2 failed.
CRS-2518: Invalid directory path '/home/oracle/product/11.1/db/bin/racgwrap'
CRS-0241:  Invalid directory path

Solution:

Due to bug 9767810, if pre-11.2 database is not installed on all nodes of the cluster, srvctl fails to add instance/database to OCR. bug 9767810 is fixed in 11.2.0.1.2 and applies to GI home. The workaround is to copy pre-11.2db/bin/racgwrap to all nodes in the cluster, and make sure it's accessible by pre-11.2 database owner.

10. Pre-11.2.0.2 database fails if any private network fails.

It happens as pre-11.2.0.2 instance isn't aware of Redundant Interconnect feature (multiple active cluster_interconnect in "oifcfg getif"), see note 1210883.1 for more about HAIP

Solution:

As HAIP feature can not be disabled, for environment with GI 11.2.0.2 with pre-11.2.0.2 database/ASM, it's recommended to use OS level bonding solution for private network as in earlier clusterware version.

11. If OCR is located on ASM, pre-10.2.0.5 database can not get cluster interconnect even it's configured in OCR:

$ oifcfg getif
eth3  120.0.0.0  global  public
eth1  10.1.0.0  global  cluster_interconnect

Instance alert.log:

ORA-27504: IPC error creating OSD context
ORA-27300: OS system dependent operation:skgxnqtsz failed with ! status: 0
ORA-27301: OS failure message: Error 0
ORA-27302: failure occurred at: SKGXN not av
clsssinit ret = 21
interconnect information is not available from OCR
 WARNING: No cluster interconnect has been specified. Depending on
          the communication driver configured Oracle cluster traffic
          may be directed to the public interface of this machine.
          Oracle recommends that RAC clustered databases be configured
          with a private interconnect for enhanced security and
          performance.


Solution:

For 11gR2 GI + pre-10.2.0.5 database environment, it's not recommended to place OCR on ASM. Other workaround is to set init.ora parameter cluster_interconnects and ignore the warning.

@ 9865139 closed as duplicate of 5389506 which is only fixed in 10.2.0.5 and above


12. After clusterware upgrade, ocrdump shows key "SYSTEM.ORA_CRS_HOME" is not updated to new clusterware home.

If previous clusterware home is renamed or removed, pre-11gR2 SRVM client (srvctl, DBCA, DBUA etc) fails.

$ srvctl <command> database -d racdb

PRKA-2019 : Error executing command "/ocw/b201/bin/crs_stat". File is missing.

$ ocrdump -stdout -keyname SYSTEM.ORA_CRS_HOME

[SYSTEM.ORA_CRS_HOME]
ORATEXT : /ocw/b201

In this example, /ocw/b201 is pre-upgrade clusterware home which is not updated.

Solution:

Due to bug 10231584, OCR key SYSTEM.ORA_CRS_HOME is not updated during upgrade. The workaround is to execute the following as root on any node:

# ${11.2.0.2GI_HOME}/bin/clscfg -upgrade -lastnode -g <asmadmin>

Note: <asmadmin> is oinstall group typically, but it should be asmadmin in job role separation environment.



13. Service does not stop/failover after stopping corresponding instance

$DB_HOME/log/$HOST/racg/imon_<DB_NAME>.log

2011-03-14 23:22:52.228: [    RACG][1108842816] [14693][1108842816][ora.SD302.SD3021.inst]: CLSR-0521: Event ora.ha.racdb.racdb1.inst.down is rejected by EVM daemon
2011-03-14 23:22:52.228: [    RACG][1108842816] [14693][1108842816][ora.SD302.SD3021.inst]: clsrcepevm: clsrcepostevt status = 17

2011-03-14 23:22:52.228: [    RACG][1108842816] [14693][1108842816][ora.SD302.SD3021.inst]: clsrcep:evm post return 1
2011-03-14 23:22:54.458: [    RACG][1108842816] [14693][1108842816][ora.SD302.SD3021.inst]: CLSR-0521: Event sys.ora.clu.crs.app.trigger is rejected by EVM daemon

2011-03-14 23:23:06.495: [    RACG][1108842816] [14693][1108842816][ora.SD302.SD3021.inst]: clsrcexecut: env _USR_ORA_PFILE=/ocw/grid/racg/tmp/ora.racdb.racdb1.inst.ora

2011-03-14 23:23:06.495: [    RACG][1108842816] [14693][1108842816][ora.SD302.SD3021.inst]: clsrcexecut: cmd = /database/db205/bin/racgeut -e _USR_ORA_DEBUG=0 -e ORACLE_SID=racdb1 540 /database/db205/bin/racgimon stop ora.racdb.racdb1.inst


$GRID_HOME/log/$HOST/evmd/evmd.log

2011-02-16 06:13:55.965: [  EVMAPP][4163668704] EVMD Started
..
2011-02-16 06:13:55.980: [    EVMD][4163668704] Could not open /ocw/grid/evm/admin/conf/evmdaemon.conf Reconfiguration aborted.


Solution:

Due to bug 12340700, the following files has wrong permission:

ls -l $GRID_HOME/evm/admin/conf
-rw------- 1 root root 3032 Feb 19 14:42 evm.auth
-rw------- 1 root root 2318 Feb 19 14:42 evmdaemon.conf
-rw------- 1 root root 4871 Feb 19 14:42 evmlogger.conf

Unpublished bug 12340700 is fixed in 11.2.0.3 and applies to GI home. The workaround is to fix the permission manually with "chmod" command and restart GI. The expected ownership and permission are:

ls -l $GRID_HOME/evm/admin/conf
-rw-r--r-- 1 root root 3032 Feb 19 14:42 evm.auth
-rw-r--r-- 1 root root 2318 Feb 19 14:42 evmdaemon.conf
-rw-r--r-- 1 root root 4871 Feb 19 14:42 evmlogger.conf






14. 10.1 ASM or DB does not start during or after clusterware upgrade

During upgrade, OUI shows the following error:

[INS-20802] Automatic Storage Management Configuration Assistant failed.

Installation log ($ORACLE_BASE/oraInventory/logs/installActions{$TIMESTAMP}.log):

INFO: Read: PRCR-1079 : Failed to start resource ora.asm
INFO: Read: CRS-2672: Attempting to start 'ora.asm' on 'racnode2'
INFO: Read: CRS-5017: The resource action "ora.asm start" encountered the following error:
INFO: Read: ORA-29702: error occurred in Cluster Group Service operation
INFO: Read: . For details refer to "(:CLSN00107:)" in "/ocw/grid/log/racnode2/agent/ohasd/oraagent_oracle/oraagent_oracle.log".
INFO: Read:
INFO: Read: CRS-2674: Start of 'ora.asm' on 'racnode2' failed


Solution:

This is caused by bug 3841387, the solution is to apply patch 9632335 to 10.1 ASM and DB home and retry. Refer to note 969254.1 Section "OUI Plug-in Failure" for retry steps, alternately ASMCA can't be executed manually to upgrade ASM.


15. pre-11.2 instance/database stops after check timed out:

$GRID_HOME/log/<node>/crsd/crsd.log

2011-02-25 15:27:55.803: [    AGFW][1173416256] {0:8:64} Agfw Proxy Server received the message: RESOURCE_STATUS[Proxy] ID 20481:214148
2011-02-25 15:27:55.804: [    AGFW][1173416256] {0:8:64} Received state change for ora.db111.db1111.inst 1 1 [old state = ONLINE, new state = UNKNOWN]
2011-02-25 15:27:55.804: [    AGFW][1173416256] {0:8:64} Received state LABEL change for ora.db111.db1111.inst 1 1 [old label  = , new label = CHECK TIMED OUT]



$GRID_HOME/log/<node>/agent/crsd/application_<dbuser>/application_<dbuser>.log

2011-02-25 15:17:55.791: [ora.db111.db1111.inst][1111284032] {3:47926:62288} [check] Check action called..
2011-02-25 15:17:55.791: [ora.db111.db1111.inst][1111284032] {3:47926:62288} [check] CHECK action called.
2011-02-25 15:17:55.791: [ora.db111.db1111.inst][1111284032] {3:47926:62288} [check] Executing action script: /db/pre11.2/bin/racgwrap[check]
..
2011-02-25 15:27:55.800: [    AGFW][1093445952] {3:47926:62288} Created alert : (:CRSAGF00113:) :  Aborting the command: check for resource: ora.db111.db1111.inst 1 1
2011-02-25 15:27:55.800: [ora.db111.db1111.inst][1093445952] {3:47926:62288} [check] Abort operation called for command: CHECK
2011-02-25 15:27:55.800: [ora.db111.db1111.inst][1093445952] {3:47926:62288} [check] Killing action script: check
2011-02-25 15:27:55.800: [    AGFW][1093445952] {3:47926:62288} Command: check for resource: ora.db111.db1111.inst 1 1 completed with status: TIMEDOUT
2011-02-25 15:27:55.801: [    AGFW][1115486528] {3:47926:62288} ora.db111.db1111.inst 1 1 state changed from: ONLINE to: UNKNOWN
2011-02-25 15:27:55.801: [    AGFW][1115486528] {3:47926:62288} ora.db111.db1111.inst 1 1 would be continued to monitored!
2011-02-25 15:27:55.801: [    AGFW][1115486528] {0:8:64} Generating new Tint for unplanned state change. Original Tint: {3:47926:62288}
2011-02-25 15:27:55.801: [    AGFW][1115486528] {0:8:64} Agent sending message to PE: RESOURCE_STATUS[Proxy] ID 20481:214148
2011-02-25 15:27:55.822: [ora.db111.db1111.inst][1111284032] {3:47926:62288} [stop] Stop action called..
2011-02-25 15:27:55.822: [ora.db111.db1111.inst][1111284032] {3:47926:62288} [stop] STOP action called.
2011-02-25 15:27:55.822: [ora.db111.db1111.inst][1111284032] {3:47926:62288} [stop] Executing action script: /db/pre11.2/bin/racgwrap[stop]


$DB_HOME/log/<node>/racg/imon_<dbname>.log

2011-02-25 15:39:17.866: [    RACG][1141213504] [27191][1141213504][ora.db111.db1111.inst]: timeout: killed the spawned process
..

2011-02-25 15:39:17.867: [    RACG][1141213504] [27191][1141213504][ora.db111.db1111.inst]: clsrcexecut: cmd = /db/pre11.2/bin/racgeut -e _USR_ORA_DEBUG=0 -e ORACLE_SID=db1111 540 /db/pre11.2/bin/racgimon stop ora.db111.db1111.inst
2011-02-25 15:39:17.867: [    RACG][1141213504] [27191][1141213504][ora.db111.db1111.inst]: clsrcexecut: rc = 99, time = 540.680s
2011-02-25 15:39:18.554: [    RACG][1139112256] [27191][1139112256][ora.db111.db1111.inst]: clsrrlbgthr: exiting/shutting down flag is set.



Instance alert.log

..
Shutting down instance: further logons disabled
..
Shutting down instance (immediate)
License high water mark = nnnn



Solution:

Due to bug 11933693, pre-11.2 database/instance will be stopped if agent check times out. The bug will not cause timeout, the cause could be slow/hang OS/DB and needs to be investigated separately.


The bug is fixed in 11.2.0.3 and applies to GI home.


Posted by pat98


If you get the following errors whilst starting up nodeapps on a 10G RAC cluster

bart-> srvctl start nodeapps -n bart
CRS-1028: Dependency analysis failed because of:
CRS-0223: Resource 'ora.bart.gsd' has placement error.
CRS-1028: Dependency analysis failed because of:
CRS-0223: Resource 'ora.bart.ons' has placement error.

Check the status of these resources.

bart-> ./ShowCRS | grep UNK
ora.bart.gsd ONLINE UNKNOWN on bart
ora.bart.ons ONLINE UNKNOWN on bart

To clear the state of the resource you have to use the force option
in the crs_stop command

bart-> crs_stop ora.bart.gsd -f
Attempting to stop `ora.bart.gsd` on member `bart`
Stop of `ora.bart.gsd` on member `bart` succeeded.


bart-> crs_stop ora.bart.ons -f
Attempting to stop `ora.bart.ons` on member `bart`
Stop of `ora.bart.ons` on member `bart` succeeded.

bart-> srvctl start nodeapps -n bart
bart-> ./ShowCRS | grep UNK

bart-> ./ShowCRS | grep bart
ora.bart.gsd ONLINE ONLINE on bart
ora.bart.ons ONLINE ONLINE on bart
ora.bart.vip ONLINE ONLINE on bart

ShowCRS is a shell script which is based on Metalink Note 259301.1

--------------------------- Begin Shell Script -------------------------------

#!/usr/bin/ksh
#
# Sample 10g CRS resource status query script
#
# Description:
#    - Returns formatted version of crs_stat -t, in tabular
#      format, with the complete rsc names and filtering keywords
#   - The argument, $RSC_KEY, is optional and if passed to the script, will
#     limit the output to HA resources whose names match $RSC_KEY.
# Requirements:
#   - $ORA_CRS_HOME should be set in your environment

RSC_KEY=$1
QSTAT=-u
AWK=/usr/xpg4/bin/awk    # if not available use /usr/bin/awk

# Table header:echo ""
$AWK \
  'BEGIN {printf "%-45s %-10s %-18s\n", "HA Resource", "Target", "State";
          printf "%-45s %-10s %-18s\n", "-----------", "------", "-----";}'

# Table body:
$ORA_CRS_HOME/bin/crs_stat $QSTAT | $AWK \
'BEGIN { FS="="; state = 0; }
  $1~/NAME/ && $2~/'$RSC_KEY'/ {appname = $2; state=1};
  state == 0 {next;}
  $1~/TARGET/ && state == 1 {apptarget = $2; state=2;}
  $1~/STATE/ && state == 2 {appstate = $2; state=3;}
  state == 3 {printf "%-45s %-10s %-18s\n", appname, apptarget, appstate; state=0;}'

--------------------------- End Shell Script -------------------------------


Posted by pat98

Metalink (문서 ID 749160.1)

node1을 stop 했는데 VIP가 넘어가지 않을때, 10.2.0.5 로 patch or 11g 사용권장
========================================================================================

Running a 2 node rac with 10.2.0.3. When nodeapps are stopped from one node, the vip-interface

does not failover to existing nodes. Therefore all users can not connect to the database.
When issuing command :
srvctl stop nodeapps -n node1

then VIP-interface of node node1 does not get migrated to node node2.

As you entered the stop command manually this is interpreted by crs not as a failure but as a wish
from your side and so no failover takes place.

The behavior has changed in 11g and will in 10.2.0.5 due to enhancement request 4623630.

 In 11g and 10.2.0.5 you will use the following syntax:

srvctl stop nodeapps -n node1 -r


The -r switch tells the system to relocate the VIP to an available node.

In case this functionality is important, the best is to upgrade to 11g or 10.2.0.5 as it becomes available.


Posted by pat98


1. REDO
 - 모든 변경사항을 REDO로그에 기록한다.
 - REDO로그는 Online REDO와 Archuved REDO로그로 구성된다.
    Online REDO 로그는 2개 이상의 파일로 구성되어 있어,
    현재 사용중인 로그 파일이 꽉 차면 다음 로그 파일로 스위칭
    이때 꽉 차여진 로그 파일을 다른 위치로 백업래 준 파일을 Archived REDO로그 이다.
 - 목적 3가지
  - 데이타 복구 : Archived REDO이용
  - 버퍼캐시복구 : 인스턴스가 비정상적종료시 그떄까지 작업내용이 잃어버리게 됨.
                          재기동 되면 Online REDO로그에 저장된 기록사항을 읽어와 마지막
                          채크포인트와 사고발생직전까지 수행한 트랜젹션을 재현
  - Fast Commit(IO속도 차 극복) : 데이타 버퍼 블록을 디스크에 기록하는 작업은 Random엑세스 방식,
                                              Append방식 Append방식이 상대적으로 빠르게 때문에 우선 변경사항을
                                              Append방식으로 기록하고 동기화는 후에 배치방식으로 일괄수행.
 - REDO로그 버퍼를 REDO로그에 기록하는 시점
  - 3초마다 DBWR프로세스부터 신호흫 받을때
  - 로그 버퍼의 1/3이 차거나 기록된 REDO레코드량이 1MB를 넘을떄
  - 사용자 커킷 또는 롤백 명령이 날릴때
  
2. UNDO
 - 각 트랜지션별로 UNDO세그먼트를 할당해주고 그 트랜지션이 발생시킨 테이블과 인댁스에
    대한 변경사항을 UNDO레코드 단위로 UNDO세그먼트 블록에 기록.
 - 목적 3가지
  - 트랜지션 롤백
  - 트랜지션 리커버리(인스턴스 리커버리시 롤백단계)
  - READ Consistency(읽기 일관성)
  타 DBMS는 Lock를 통해 일기 일관성을 구현하지만,
  오라클에서는 UNDO데이타를 이용해서 읽기 일관성을 구현한다.  
             읽기 일관성이란 Transaction이 진행되는 동안 Database의 다른 사용자는 이 Consistent Read에 의해
             Commit되지 않은 변경 사항을 볼 수 없는 기능 입니다.  
 - UNDO레코드에 기록되는 내용
  - Insert : 추가된 레코드의 rowid
  - Update : 변경되는 컬럼에 대한 before image
  - Delete : 지워지는 로우의 모든 컬럼의 대한 before image
------------------------------------------------------------------------------------
REDO 와 UNDO를 차이점에서 바라본 관점.
------------------------------------------------------------------------------------
REDO 는 UNDO를 포함 합니다.
REDO 는 시스템 장애시 복구를 위해 사용 합니다.
복구시에 UNDO 데이터도 같이 복구하구요. Commit 되지 않은 데이터를 Rollback 하게 됩니다.
UNDO 는 Rollback 시에도 사용 되지만 Read Consistency(읽기 일관성) 을 위해서 도 사용 됩니다.
REDO 는 모든 변경사항(UNDO 포함)을 기록 합니다.
복구는 UNDO 를 통해서 복구를 하게 됩니다. 즉, ROLLBACK을 한다는 말이죠.
시스템 장애가 발생하게 되면 UNDO 데이터도 모두 날아가게 되겠죠.
결국 시스템 장애시 REDO 데이터를 이용해서 마지막 CHECK POINT 부터 장애까지의 DB BUFFER CACHE 를 복구하게 됩니다.
이게 완료가 되면 UNDO DATA 를 이용하여 COMMIT 되지 않은 데이터를 모두 ROLLBACK 함으로써 복구를 완료하게 됩니다.
결국 REDO 가 UNDO 를 복구하고 최종적으로 UNDO가 복구를 하게 됩니다.
UNDO(안한것 처럼)는 되돌리는 것 이라고 보시면 될거에요.
어떤 세션에서 DML을 발생시키면 commit이나 rollback을 날리기 전까지 이전 정보를 저장하기 위해서
UNDO 블럭에 해당 정보를 기록하죠.. 해당 세션이 트렌젝션 중에 비정상적으로 종료가 되었다면,
안한것 처럼(UNDO) 다시 원복해야 되니까요.
REDO(다시 함)는 위에 설명에도 있듯이 인스턴스 실패시(DB가 내려감) 데이터 파일에 쓰여지지 않은 커밋된 데이터를 복구한다고 되어 있습니다.
좀 더 설명을 해드리자면 커밋을 날릴 경우 LGWR가 로그 파일에 변경된 정보를 기록하게 되고
그 다음에 DBWR이 버퍼캐쉬에 있는 변경 사항에 대해서 데이터 파일에 내려 쓰게 됩니다.
이는 복구 정보가 더 중요해서 그런 것이라고 이해하시면 됩니다(복구 정보가 있으면 재적용이나 되돌릴 수 있으므로).
DB가 비정상적으로 내려가면, 데이터 파일에 쓰여지지 않은 정보들이 리두로그에 기록되어 있으므로
리두로그에서 정보를 가져와 다시 적용(REDO)하는 것입니다.
Posted by pat98


Applies to:

Oracle Server - Enterprise Edition - Version: 10.2.0.5 and later   [Release: 10.2 and later ]
Microsoft Windows (32-bit)
Microsoft Windows x64 (64-bit)
Microsoft Windows x64 (64-bit) - OS Version: 7
Microsoft Windows (32-bit) - OS Version: 7
Microsoft Windows x64 (64-bit) - Version: 2008 R2

Goal

To explain how to install 10.2.0.5 Oracle Client software on Windows 7 and Windows 2008 R2.


NOTE:  Oracle 10g R2 (10.2) Client Software was NOT supported on Windows 7 (32/64 bit) or Windows 2008 R2 (32/64 bit) when it was initially released.  These OS's were not available at that time.  Oracle decided to start supporting them with the release of the 10.2.0.5.0 patch.  Please review the following Statement of Direction for more details:

Note 1061272.1 Statement of Direction: Oracle Database 10g Release 2 Client (10.2.0.5) with Microsoft Windows 7 and Windows Server 2008 R2

Solution

How To Install 32-bit 10.2.0.5 Oracle Client Software on Windows 7 (32 or 64-bit) or Windows 2008 R2 (32 or 64-bit)

  1. Download the following base software from TechNet:
     
         Oracle Database 10g Client Release 2 (10.2.0.3)
         10203_vista_w2k8_x86_production_client.zip (616,757,461 bytes)

  2. Extract the software.  Browse the extracted files and find setup.exe which is located inside the Disk1 folder.

  3. Open a Command Prompt, change to the location where setup.exe resides and issue the following command:

         C:\Temp\10203\client\Disk1> setup.exe -ignoreSysprereqs -ignorePrereq

    NOTE:  The parameters are CASE-SENSITIVE and must be entered EXACTLY as shown above.

    These flags attempt to suppress any Windows 7/2008 R2 warnings from being raised about the OS not being compatible with the Oracle Universal Installer (OUI).  Even if warnings are raised when running the OUI you will still be able to proceed with the software installation (i.e. click the Next button).

  4. After the software installation completes, you must download the 10.2.0.5 patch and apply it.

    1. Log into My Oracle Support.

    2. Click on the Patches & Updates tab.

    3. In the Patch Search section set Patch Name or Number to 8202632.

    4. Set the Platform to Microsoft Windows (32-bit).

    5. Click Search.

    6. Download the following patch:

            Patch 8202632: 10.2.0.5.0 PATCH SET FOR ORACLE DATABASE SERVER

      NOTE:  Do not be confused by the name of the patch containing the words Oracle Database Server.  This patch is used to upgrade BOTH Oracle Client AND Oracle Database software.

    7. Once the patch has been downloaded, extract the patch.  Browse the extracted files and run setup.exe, which is located inside the Disk1 folder, to install the 10.2.0.5 patch.

      NOTE:  When the OUI asks you to Specify Home Details be sure the choose your EXISTING 10.2.0.3 Oracle Home installed in Step 3.  If you do not see the correct Name and Path to your current Oracle Home, click the drop-down button and choose it from the list.  DO NOT attempt to install this patch into a NEW Oracle Home.


  5. Once the patch has been applied you will be running a supported version of 32-bit 10g R2 client software on a Windows 7/Windows 2008 R2 OS.

How To Install 64-bit 10.2.0.5 Oracle Client Software on Windows 7 (64-bit) or Windows 2008 R2 (64-bit)

  1. Download the following base software from TechNet:
     
         Oracle Database 10g Client Release 2 (10.2.0.4)
         10204_vista_w2k8_x64_production_client.zip (507,934,134 bytes)

  2. Extract the software.  Browse the extracted files and find setup.exe which is located inside the Disk1 folder.

  3. Open a Command Prompt, change to the location where setup.exe resides and issue the following command:

         C:\Temp\10204\client\Disk1> setup.exe -ignoreSysprereqs -ignorePrereq

    NOTE:  The parameters are CASE-SENSITIVE and must be entered EXACTLY as shown above.

    These flags attempt to suppress any Windows 7/2008 R2 warnings from being raised about the OS not being compatible with the Oracle Universal Installer (OUI).  Even if warnings are raised when running the OUI you will still be able to proceed with the software installation (i.e. click the Next button).

  4. After the software installation completes, you must download the 10.2.0.5 patch and apply it.

    1. Log into My Oracle Support.

    2. Click on the Patches & Updates tab.

    3. In the Patch Search section set Patch Name or Number to 8202632.

    4. Set the Platform to Microsoft Windows x64 (64-bit).

    5. Click Search.

    6. Download the following patch:

            Patch 8202632: 10.2.0.5.0 PATCH SET FOR ORACLE DATABASE SERVER

      NOTE:  Do not be confused by the name of the patch containing the words Oracle Database Server.  This patch is used to upgrade BOTH Oracle Client AND Oracle Database software.

    7. Once the patch has been downloaded, extract the patch.  Browse the extracted files and run setup.exe, which is located inside the Disk1 folder, to install the 10.2.0.5 patch.

      NOTE:  When the OUI asks you to Specify Home Details be sure the choose your EXISTING 10.2.0.4 Oracle Home installed in Step 3.  If you do not see the correct Name and Path to your current Oracle Home, click the drop-down button and choose it from the list.  DO NOT attempt to install this patch into a NEW Oracle Home.


  5. Once the patch has been applied you will be running a supported version of 64-bit 10g R2 client software on a Windows 7/Windows 2008 R2 OS.


Posted by pat98



Applies to:

Oracle Server - Enterprise Edition - Version: 11.1.0.6 to 11.2.0.2 - Release: 11.1 to 11.2
Information in this document applies to any platform.
Oracle Server Enterprise Edition - Version: 11.1 to 11.1
***Checked for relevance on 14-Jan-2011***

Goal

The purpose of this document is to provide information about the Copy the asm files between remote instances and local instances using ASMCMD.

Solution

11g new feature,you can use asmcmd to copy files between remote instances
Enables you to copy files between ASM disk groups on local instances and remote instances. You can also use this command to copy files from ASM disk groups to the operating system.

cp -ifr <source file name> <user_name>@<host_name>.<Port Number>.<SID>:<targer path>/<target file name>

user_name@host_name.<Port Number>.<SID>

The user_name, host_name, and SID are required. The default port number is 1521.


Example :-

asmcmd>cp -ifr +DATA/RAC/PARAMETERFILE/spfile.257.678975489 sys@stgrac1.1521.+ASM2:+FRA/RAC/ARCHIVELOG/spfile



Troubleshooting-ASMCMD remote copy

asmcmd remote copy works through listener connection.

ASMCMD remote connection can fail with below generic error.

ASMCMD-08202: internal error: [asmcmdshare_error_msg_05] [8201]

[8201] means unable to connect remote ASM Instance.

It could be due to following reason.

* not able to reach remote host.
* Remote host listener is down.
* Remote ASM Instance is not registered with listener and running non-default port.
* sysasm remote connection does not work.
* Incorrect password given for sys user.
* Remote ASM Instance password file missing.


We need to enable additional tracing for asmcmd connection to get a exact failure message.

++ set DBI_TRACE environment variable for asmcmd perl tracing.

export DBI_TRACE=1

++ Now connect using asmcmd and re-produce the issue.

Example 1:-

asmcmd>cp +data/spfileorcl.ora.289.686235413 sys@stgrac1.1521.+ASM1:+test

-> DBI->connect(dbi:Oracle:host=stgrac1;port=1521;sid=+ASM1, sys, ****, HASH (0x8b2b044))
connect using '(DESCRIPTION=(ADDRESS=(HOST=stgrac1)(PROTOCOL=tcp)(PORT=1521))(CONNECT_DATA=(SID=+ASM1)))'
ERROR: '1031' 'ORA-01031: insufficient privileges

(DBD ERROR: OCISessionBegin)'
<- DESTROY= undef at DBI.pm line 591
DBI connect('host=stgrac1;port=1521;sid=+ASM1','sys',...) failed: ORA-01031: insufficient privileges (DBD ERROR: OCISessionBegin)
KK FROM HERE A
ASMCMD-08202: internal error: [asmcmdshare_error_msg_05] [8201]


Here we can see that asmcmd copy failed due to ORA-01031.

ASMCMD uses SYSASM by default if -a option is not used.

Here the problem is sysasm privelege was not given to sys user on remote ASM Instance.

Given the SYSASM privilege to SYS ( or the user trying to connect ). When you grant a system privilege, the password file is updated.


SQL> grant sysasm to sys;
Grant succeeded.

SQL> select * from v$pwfile_users;
USERNAME SYSDB SYSOP SYSAS
------------------------------ ----- ----- -----
SYS TRUE TRUE TRUE

Now the remote asmcmd copy works fine

For more detals,please go through the below notes

Note.730067.1 - Troubleshooting ORA-1031 Insufficient Privilege
Note.578796.1 - ORA-01031 While Connecting as SYSASM

Example 2:-

ASMCMD> cp -ifr thread_2_seq_5.264.678983423 sys@bderac2-vip.1521.+ASM2:+FRA/RAC/ARCHIVELOG/
Enter password: ***
ASMCMD-08016: copy source->'+FRA/RAC/ARCHIVELOG/2009_02_16/thread_2_seq_5.264.678983423' and target->'+FRA/RAC/ARCHIVELOG/thread_2_seq_5.264.678983423' failed
ORA-17628: Oracle error 19505 returned by remote Oracle server
ORA-06512: at "SYS.X$DBMS_DISKGROUP", line 258
ORA-06512: at line 3 (DBD ERROR: OCIStmtExecute)
ASMCMD>


Solution:-

The cp command failed because the target ASM file name was not specified  or File name should not contain the file number/incarnation.We can not copy OMF form files without specifying file name

cp -ifr thread_2_seq_5.264.678983423 sys@bderac2-vip.1521.+ASM2:+FRA/RAC/ARCHIVELOG/thread_2_seq_5

The file number/incarnation will be created automatically during the copy.

Posted by pat98


In this Document
  Purpose
  Scope and Application
  11gR2 RAC TAF Configuration for Admin and Policy Managed Databases
     Concepts & Overview
     Administrator-managed Database
     Policy-managed Database
  References


Applies to:

Oracle Net Services - Version: 11.2.0.1 to 11.2.0.2 - Release: 11.2 to 11.2
Oracle Server - Enterprise Edition - Version: 11.2.0.1 to 11.2.0.2   [Release: 11.2 to 11.2]
Information in this document applies to any platform.

Purpose


This article discussed about Transparent Application Failover (TAF) and its configuration details for 11gR2 cluster database.

Scope and Application


This article is intended for Oracle Cluster Administrators who would like more details about TAF. Basic understanding of Oracle Clusterware Database is assumed.

11gR2 RAC TAF Configuration for Admin and Policy Managed Databases


Concepts & Overview


TAF allows Oracle clients to reconnect to surviving instance in the event of a failure of the current instance to which it connected. There are two types of TAF available, SESSION and SELECT.

SESSION: Session Failover re-creates the connections and sessions to the surviving instance.
SELECT:  In addition to recreating the session, Select Failover also replays the queries that were in progress.


There are two method in which TAF estabilishes the faiover connection, BASIC and PRECONNECT.

BASIC: The second connection is re-established only after the first connection to the instance failed.
PRECONNECT: Two connections are established when first login to database. A login to database will create two connection at the same time. For this to work clusterware actually starts two service. One main service and another shadow service.


TAF can be configured at the Oracle Client side in tnsnames.ora or at the Database Server side using srvctl. Configuring it at the server is preferred as it is convenient to put the configuration in one place i.e. on the server. This article focuses on configuration at the server using srvctl. TAF configuration is associated with the database service. Creating the service with TAF related options is all about configuring TAF.

1. Create service with srvctl
srvctl defines services in OCR which is accessible across nodes. The service configuration details can be viewed with srvctl config service command.

2. Start the Service with srvctl
When the service is started with srvctl, it actually creates a service in the database i.e. RDBMS. The service name and the configuration details of the service created at the database can be viwed with DBA_SERVICES view. Then the SERVICE_NAMES database parameter is set, which allows PMON to register this service with the listener. Hence connections can be established through listener to the database.

NOTE:
The clusterware automatically starts/stops and manages the services in the database according to the configuration of the service.
manually setting the SERVICE_NAMES parameter in RAC is not allowed as clusterware manages setting this parameter.


Let's have a look at the srvctl syntax for creating service. Databases in 11gR2 can be Administrator-managed or Policy-managed. Certain options can be used only with certain type of database.

srvctl add service -d <db_unique_name> -s <service_name>
-r "<preferred_list>" [-a "<available_list>"] [-P {BASIC | NONE | PRECONNECT}]
-g <server_pool> [-c {UNIFORM | SINGLETON}]
[-k <net_num>]
[-l [PRIMARY][,PHYSICAL_STANDBY][,LOGICAL_STANDBY][,SNAPSHOT_STANDBY]]
[-y {AUTOMATIC | MANUAL}]
[-q {TRUE|FALSE}]
[-x {TRUE|FALSE}]
[-j {SHORT|LONG}]
[-B {NONE|SERVICE_TIME|THROUGHPUT}]
[-e {NONE|SESSION|SELECT}]
[-m {NONE|BASIC}]
[-z <failover_retries>]
[-w <failover_delay>]

Here is the description of the options that we are going to use in configuring TAF.

-r "<preferred_list>" [-a "<available_list>"] [-P {BASIC | NONE | PRECONNECT}]
         This clause is valid only for Administrator-managed database which support PRECONNECT method.
-g <server_pool> [-c {UNIFORM | SINGLETON}]
         This clause is valid only for Policy-managed database where in PRECONNECT method is not available.

The above two options instruct clusterware on how to handle this service.

[-e {NONE|SESSION|SELECT}]
        This defines the type of TAF whether SESSION or SELECT.
[-m {NONE|BASIC}]
        This defines the method of TAF.
[-z <failover_retries>]
        This defines the the number of times to attempt to connect after a failover.
[-w <failover_delay>]
       This defines the amount of time in seconds to wait between connect attempts.

Above four options are passed to database while starting up the service and is viewable in DBA_SERVICES view.


Administrator-managed Database


Basic Method


1. Create Service

Syntax:
          srvctl add service -d <db_unique_name> -s <service_name>
          -r "<preferred_list>" [-a "<available_list>"] [-P {BASIC | NONE | PRECONNECT}]
         [-e {NONE|SESSION|SELECT}]
         [-m {NONE|BASIC}]
         [-z <failover_retries>]
         [-w <failover_delay>]


Example:
         $ srvctl add service -d db112 -s mysrvb -r db1121,db1122 -P basic -e select -m basic -z 10 -w 2

Though this example only uses -r option, you can use both -r and -a option.


2. Start the service
$ srvctl start service -d db112 -s mysrvb

Check if the service is rightly registered with the listener.
$lsnrctl status listener_scan1
. . .
. . .
Service "mysrvb" has 2 instance(s).
  Instance "db1121", status READY, has 1 handler(s) for this service...
  Instance "db1122", status READY, has 1 handler(s) for this service...
. . .
. . .



3. Make a connection.

The client connect string do not require any special TAF parameters. A simple tnsnames.ora entry as shown below will do.
mysrvb =
  (description =
   (address = (protocol=tcp)(host=secrac-scan)(port=1521))
   (connect_data=
    (service_name=mysrvb)
   )
  )


$ sqlplus scott/tiger@mysrvb

Once connectd GV$SESSION view will reflect the connection made is TAF enabled. The FAILOVER_METHOD and FAILOVER_TYPE column reflects it and this confirms the TAF configuration is correct.

SQL> select inst_id,username,service_name,failover_type,failover_method,failed_over from gv$session where username='SCOTT';

INST_ID USERNAME SERVICE_NAME FAILOVER_TYPE FAILOVER_M FAI
------- -------- ------------ ------------- ---------- ---
      1    SCOTT       mysrvb        SELECT      BASIC  NO

Note that this has connected to the instance 1 with BASIC method and SELECT type.


Preconnect Method


1. Create service

Syntax:
           srvctl add service -d <db_unique_name> -s <service_name>
          -r "<preferred_list>" [-a "<available_list>"] [-P {BASIC | NONE | PRECONNECT}]
          [-e {NONE|SESSION|SELECT}]
          [-m {NONE|BASIC}]


Example:
          $ srvctl add service -d db112 -s mysrv -r db1121 -a db1122 -P preconnect

NOTE:
* The -e, -m option should be set to none. Omiting this option actually sets it to none.
* Use both -r and -a option for preconnect, hence a shadow service can be started.


2. Start the service.

$ srvctl start service -d db112 -s mysrv

Preconnect creates an additional shadow service.

$ crsctl stat res -t
. . .
. . .
ora.db112.mysrv.svc
          1 ONLINE ONLINE secrac1
ora.db112.mysrv_preconnect.svc
          1 ONLINE ONLINE secrac3
. . .
. . .


Note that the service 'mysrv' is started on the preferred instance. And the shadow service 'mysrv_preconnect' is started on the available instance. The shadow service is always defined as <servicename>_preconnect.


Check if the service is correctly registered with the listener.
$ lsnrctl status listener_scan1
. . .
. . .
Service "mysrv" has 1 instance(s).
  Instance "db1121", status READY, has 1 handler(s) for this service...
Service "mysrv_preconnect" has 1 instance(s).
  Instance "db1122", status READY, has 1 handler(s) for this service...
. . .
. . .





3. Define TNS entry.

Preconnect requires special parameter at the client side tnsnames.ora file. The TAF Type and the TAF method is actually specified at the client end.

mysrv =
  (description =
   (address = (protocol=tcp)(host=secrac-scan)(port=1521))
   (connect_data=
     (service_name=mysrv)
     (failover_mode=
       (backup=mysrv_preconnect)
       (type=select)
       (method=preconnect)
     )
   )
  )

mysrv_preconnect =
  (description =
   (address = (protocol=tcp)(host=secrac-scan)(port=1521))
   (connect_data=(service_name=mysrv_preconnect))
  )


Connect to the database with this tns entry

$sqlplus scott/tiger@mysrv

SQL> select inst_id,username,service_name,failover_type,failover_method,failed_over from gv$session where username='SCOTT';

INST_ID USERNAME     SERVICE_NAME FAILOVER_TYPE FAILOVER_M FAI
------- -------- ---------------- ------------- ---------- ---
      1    SCOTT            mysrv        SELECT PRECONNECT  NO
      2    SCOTT mysrv_preconnect          NONE       NONE  NO

Note that it has made two connections. The first one is the active main connection with PRECONNECT method and SELECT type. The second one is the shadow connection that will be used when the connection to the first instance fails.


Policy-managed Database


Basic Method


1. Create the service

Syntax:
          srvctl add service -d <db_unique_name> -s <service_name>
          -g <server_pool> [-c {UNIFORM | SINGLETON}]
          [-e {NONE|SESSION|SELECT}]
          [-m {NONE|BASIC}]
          [-z <failover_retries>]
          [-w <failover_delay>]


Example:
         $ srvctl add service -d db1 -s mysrvp -g mysrvpool -c uniform -e select -m basic -z 10 -w 2


2.
Start the service

$ srvctl start service -d db1 -s mysrvp


Check if the service is correctly registered with the listener
$ lsnrctl status listener_scan1
. . .
. . .
Service "mysrvp" has 2 instance(s).
  Instance "db1_1", status READY, has 1 handler(s) for this service...
  Instance "db1_2", status READY, has 1 handler(s) for this service...
. . .
. . .



3. Make a connection.

The client connect string do not require any special TAF parameters. A simple tnsnames.ora entry as shown below will do.

mysrvp =
  (description =
   (address = (protocol=tcp)(host=secrac-scan)(port=1521))
   (connect_data=
     (service_name=mysrvp)
   )
  )

$ sqlplus scott/tiger@mysrvp

SQL> select inst_id,username,service_name,failover_type,failover_method,failed_over from gv$session where username='SCOTT';

INST_ID USERNAME SERVICE_NAME FAILOVER_TYPE FAILOVER_M FAI
------- -------- ------------ ------------- ---------- ---
      1    SCOTT       mysrvp        SELECT      BASIC  NO


Posted by pat98


RAC on HP-UX Highlighted Recommendations


This is a list of the core best practices and references identified by the RAC Assurance Team relevant to HP-UX specific RAC implementations. The rest of this chapter provides additional best practices and references from our team.

  • Hostnames on HP-UX should be 8 characters or less. Refer to Note 469569.1 for more details.
  • Setting streams_sqmax=50000 causes gc cr block lost to be reduced. The default value of the 'streams_sqmax' was 'unlimited' for HP 11.11 or lower version, for HP 11.23 the default was changed to 1000.
  • On HPUX (PA-RISC) and HPUX (Itanium) and RDBMS Versions 11.1.0.6 when the DIAG process is activated to get the call stack for a background process on the instance, under certain situations (corrupt call stack on the target) this can result in the target process for diag being killed. If the target process is a critical background process (such as LMS) this can also result in the instance being terminated.
Workaround: Set the hidden parameter "_ksb_disable_diagpid" = true to prevent diag from taking stack traces from other processes. Note: This may result in insufficient diagnostics to progress a problem should there me an issue needing such information. This is fixed in 11.1.0.7. Reference: Note 6489596.8 and Note:6849769.8
  • Following is for HP-UX (Itanium) only: Implement the workaround mentioned in knowledge document Note 763724.1 RACGMAIN received SIGSEGV on checkresource causing a crash of a resource.
  • For versions 10gR2 and 11gR1, it is a best practice on all platforms to set the CSS diagwait parameter to 13 in order to provide time for dumping diagnostics in case of node evictions. Setting the diagwait above 13 is NOT recommended without explicit instruction from Support. This setting is no longer required in Oracle Clusterware 11g Release 2. Reference Document 559365.1 for more details on diagwait.

=========================
11.2.0.2
=========================

  • NTPQ executable should be in the grid install user's path, if ntp is being used.

RAC on HP-UX Best Practices

Operating System 

  • In order to assure proper Oracle process scheduling priority for HP-UX, review Note:759082.1
  • In HP-UX Itanium, RACGMAIN can cause a crash with a SIGSEGV error - Note:763724.1.  This can occur in 10.2.0.4 and 11.1.0.7. 

Storage

  • Cannot discover Disks in ASM after upgrade to 10.2.0.3 on HP-UX Itanium – Note 433770.1. Apply patch PHCO_35524 or rename /usr/sbin/lvmchk to something else or apply oracle patch 6051728 or upgrade to 10.2.0.4.
  • Shutdown the CRS stack cleanly before trying to unmount the shared filesystem. Most other clustered file systems require you to stop all process holding file descriptors open, before permitting the unmount command to continue. That is not presently the case with HP-UX Veritas Clustered Filesystem. If the clustered filesystem with OCR is unmounted 'cfsumount,' whilst the crs stack is still up and running, the cfsumount kills any holding processes. This terminate the OCSSD process, without warning

Networking

  • Jumbo frames with mtu 9000 cause problems with hp nc7170 drivers.
  • Setting streams_sqmax=50000 causes gc cr block lost to be reduced.  The default value of the 'streams_sqmax' was 'unlimited' for HP 11.11 or lower version, for HP 11.23 the default was changed to 1000.
  • Note: 883801.1 - Ensure Patch PHKL_40208 applied to avoid RAC hangs on HP UX Itanium 

Configuration

  • On HP-UX Itanium platform and bash version 3.2.0(1), the following problem occurs:
    sbin/init.d/init.cssd[995]: 21571 Memory fault
    /sbin/init.d/init.cssd[1001]: 21579 Memory fault
    Workaround:  Use /usr/bin/ksh or bash version 3.0.   Reference Note 734618.1.

Install

Patching


Posted by pat98

2011. 4. 8. 22:50 오라클

oracle pfile/spfile


1. 오라클 환경 : spfile & pfile
  Oracle9i부터 데이타베이스의 Initial Parameter를 저장하는 init<SID>.ora이외에 spfile<SID>.ora이 추가되었다.
  spfile(Server Parameter File)은 "Alter System"을 통해서 운영중인 데이타베이스의 parameter를 수정하고,
  재 가동을 시켜야하는 필요성을 줄였다.
  1) spfile<SID>.ora은 binary 파일이다.
    header에 checksum, meta-data 정보가 저장되어 있어서 텍스트에이터로 수정하면 안된다.
   
  2) spfile<SID>.ora파일이 깨졌다면, 백업한 init<SID>.ora 파일을 읽어서 시작한다.
 
  3) Db Open시 참조한 spfile확인
    sql> show parameter pfile
         NAME              TYPE           VALUE
         --------------------------------------
         spfile            String         $INST/spfile<SID>.ora
 
  4) init<SID>.ora -> spfile<SID>.ora로 변환하기
    spfile을 pfile로 백업을 해서 나중에 다시 복구할 때 사용한다.
    sql> connect system/manager as sysdba
    sql> create pfile='init<SID>.ora' from spfile='spfile<SID>.ora';
    sql> create spfile='<SID>.ora' from pfile='init<SID>.ora';
    파일은 $INST/dbs에 저장된다.
 
  5) Alter system으로 parameter 변경
    spfile은 "alter system"을 통해서 내용을 변경할 수 있다.
    sql> alter system set open_cursors='300' scope=[memory|spfile|both]
    scope = memory : 현재 세션부터 바로 반영
            spfile : 재 가동시 반영
            both   : 둘다
            deferred option 주면 다음 session 들도 적용이 됨.
           
  6) v$parameter와 v$spparameter
    parameter중에는 재 가동을 해야만 반영되는 것들이 있다. 이를 static parameter라고 한다.
   
    sql> select name, issys_modifiable from v$parameter;
    issys_modifiable = false     : static parameter로 재 가동시 반영
                       immediate : 바로 반영
                       deferred  : 다음번 세션부터 반영
    "alter system"을 통해서 parameter를 변경할 때 제대로 반영되었는지 여부는
    "show parameter" or "v$parameter" or "v$spparameter"로 확인한다.
    sql> show parameter open_cursors;
    sql> select name, value from v$parameter where name='open_cursors';
    sql> select name, value from v$spparameter where name='open_cursors';
    v$parameter는 현재 세션의 파라메타정보
    v$spparameter는 재 가동할 때 읽어들이는 파라메타.
    즉, v$spparameter를 읽어서 v$parameter값들을 구성한다.


# 임의의 pfile을 이용해서 startup

SQL> create pfile='/home/oracle/mydevdb.ora' from spfile;
SQL> startup force pfile=/home/oracle/mydevdb.ora


# 임의의 spfile을 이용해서 startup

임의의 pfile을 이용해서 경로를 지정해서는 startup가 되지만 spfile는 그렇게 되지 않는다.

하지만 spfile의 경로를 지정한 파일을 생성하여 해당 파일을 이용한 startup은 가능하다.

 

SQL> create spfile='/home/oracle/spmydevdb.ora' from pfile='/home/oracle/mydevdb.ora';
SQL> startup force spfile=/home/oracle/spmydevdb.ora     --> 에러

SQL> !vi /home/oracle/param.txt

         spfile=/home/oracle/spmydevdb.ora

SQL> startup force pfile=/home/oracle/param.txt

Posted by pat98

02-06 05:43
Flag Counter
Yesterday
Today
Total

글 보관함

최근에 올라온 글

달력

 « |  » 2025.2
1
2 3 4 5 6 7 8
9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25 26 27 28

최근에 달린 댓글