Build Your Own Oracle Infrastructure: Part 11 – Oracle RAC Database: Create, Test & Administer.

We’re now ready to create an Oracle RAC database.
It’s broadly similar to creating a single-instance Oracle database.
The major differences come with how the various infrastructure components are administered.
We’ll get to that in good time.

Task #1: Create Additional ASM Diskgroups.

So far we have the +CRS ASM Diskgroup which provides storage for the Votedisk, the OCR and the GIMR database (_MGMTDB). Our own RAC database will also use ASM storage, so we’ll need to create a couple more ASM Diskgroups before we create the database.

On racnode1, log in as the grid user and fire up the ASM Configuration Assistant (asmca):

[grid@racnode1 ~]$ . oraenv
ORACLE_SID = [grid] ? +ASM1
The Oracle base has been set to /u01/app/grid

Set your DISPLAY environment variable so it points back to your workstation:

[grid@racnode1 ~]$ export DISPLAY=<your-workstation-or-IP>:0.0
[grid@racnode1 ~]$ asmca

Click Create. Enter DATA for the Disk Group Name, select External for the Redundancy level and select ASMDISK02 and ASMDISK03. Then click OK:

Click OK:

Click Create:

Enter FRA for the Disk Group Name, select External for the Redundancy level and select ASMDISK04 and ASMDISK05. Then click OK:

Click OK in the Disk Group: Creation pop-up window:

Then click Exit to close the ASM Configuration Assistant.

Et voila! Two shiny new ASM Diskgroups ready for our RAC database to use. It will be pleased.
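
If you’d rather skip the GUI, the same two Diskgroups could be created from SQL*Plus connected to the local ASM instance as SYSASM. This is just a sketch using the same disks; note that a Diskgroup created this way is only mounted on the local node, so you’d also mount it on racnode2 (asmca does that for you):

[grid@racnode1 ~]$ sqlplus / as sysasm

SQL> create diskgroup DATA external redundancy
       disk '/dev/oracleasm/disks/ASMDISK02',
            '/dev/oracleasm/disks/ASMDISK03';

SQL> create diskgroup FRA external redundancy
       disk '/dev/oracleasm/disks/ASMDISK04',
            '/dev/oracleasm/disks/ASMDISK05';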

Task #2: Create an Oracle 12c RAC Database.

A quick way to set your Oracle environment correctly when no databases yet exist on the server is to edit the /etc/oratab file to include a dummy line:

[oracle@racnode1 ~]$ vi /etc/oratab

dummy:/u01/app/oracle/product/12.1.0/dbhome_1:N

Use the standard oraenv script to set your environment, then unset the ORACLE_SID:

[oracle@racnode1 ~]$ . oraenv
ORACLE_SID = [oracle] ? dummy
The Oracle base has been set to /u01/app/oracle

[oracle@racnode1 ~]$ unset ORACLE_SID

You can tell you’re pointing to the correct set of Oracle executables by using the which command:

[oracle@racnode1 database]$ which dbca
/u01/app/oracle/product/12.1.0/dbhome_1/bin/dbca

Finally, set your DISPLAY variable correctly and fire up the Database Configuration Assistant (dbca):

[oracle@racnode1 ~]$ export DISPLAY=<your-workstation-or-IP>:0.0
[oracle@racnode1 ~]$ dbca

Select Create Database, then click Next:

Select Advanced Mode, then click Next:

Use these values to complete the next screen, then click Next:

Field                Value
Database Type        Oracle Real Application Clusters (RAC) database
Configuration Type   Admin-Managed
Template             General Purpose or Transaction Processing

Enter RAC1DB.mynet.com for the Global Database Name and uncheck Create As Container Database. Then click Next:

Ensure both racnode1 and racnode2 are in the Selected column, then click Next:

Use these values to complete the next screen, then click Next:

Field                                                        Value
Run Cluster Verification Utility (CVU) Checks Periodically   Checked
Configure Enterprise Manager (EM) Database Express           Checked
EM Database Express Port                                     5500
Register with Enterprise Manager (EM) Cloud Control          Checked
OMS Host                                                     oraemcc.mynet.com
OMS Port                                                     4903
EM Admin Username                                            SYSMAN
EM Admin Password                                            (referenced here)

Select Use the Same Administrative Password for All Accounts. Enter the SYSDBA password twice then click Next:

Complete the next screen using the values shown in the screen shot, then click Next:

Create a password for the ASMSNMP ASM instance account, then click OK:

Just click Next:

On the Memory tab, select Use Automatic Memory Management and enter 2048 MB for the Memory Size, then click the Character Sets tab:

On the Character Sets tab, select Use Unicode (AL32UTF8) then click Next:

Ensure Create Database is checked, then click Next:

The Prerequisite Checks screen highlights the swap size failure which we can safely ignore. Click Next:

On the Summary screen, click Finish:

The familiar progress bar tells you how things are going:

Eventually you’ll see the Finish screen. Click Close and you’re done.

You now have an Oracle 12c RAC database up and running with 2 instances.
In the next section we’ll run through a series of tests to find out if all is well.

Task #3: Test Oracle 12c GI and RAC Database.

After you build out your cluster and RAC database, it’s important to test that everything is working properly, especially if it will become a production system. Testing falls into two categories: functionality testing and workload testing. Functionality testing focuses on finding out if the configuration behaves the way it should. Workload testing focuses on finding out if the configuration can handle the expected workload in terms of response time and throughput.

You can view an outline test plan here. It’s a decent start at putting together a comprehensive test plan, but it omits three key areas: workload testing, network failure testing and destructive testing. Let’s get testing and see what happens!

Section #1: Check & Test Clusterware.

The first set of checks and tests focuses on Clusterware. It gets us comfortable with using the various Clusterware administration commands and helps with identifying where certain resources are located.

Test #1B. Check: Resource TARGET and STATE columns.

The crs_stat command is deprecated, but it still works in 12c.

What we’re looking for here is any line where the Target and State column values differ. The Target value is what the status should be and the State value is what the status actually is. So a Target value of ONLINE with a State value of OFFLINE could mean one of two things: either the resource is coming online but hadn’t quite made it when you ran the command, or the resource has a problem and cannot come online. In the latter case, it’s time to dive into the log files. As we can see from this output, all is well.

[grid@racnode1 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.CRS.dg     ora....up.type ONLINE    ONLINE    racnode1
ora.DATA.dg    ora....up.type ONLINE    ONLINE    racnode1
ora.FRA.dg     ora....up.type ONLINE    ONLINE    racnode1
ora....ER.lsnr ora....er.type ONLINE    ONLINE    racnode1
ora....N1.lsnr ora....er.type ONLINE    ONLINE    racnode2
ora....N2.lsnr ora....er.type ONLINE    ONLINE    racnode1
ora....N3.lsnr ora....er.type ONLINE    ONLINE    racnode1
ora.MGMTLSNR   ora....nr.type ONLINE    ONLINE    racnode1
ora.asm        ora.asm.type   ONLINE    ONLINE    racnode1
ora.cvu        ora.cvu.type   ONLINE    ONLINE    racnode1
ora.mgmtdb     ora....db.type ONLINE    ONLINE    racnode1
ora....network ora....rk.type ONLINE    ONLINE    racnode1
ora.oc4j       ora.oc4j.type  ONLINE    ONLINE    racnode1
ora.ons        ora.ons.type   ONLINE    ONLINE    racnode1
ora.rac1db.db  ora....se.type ONLINE    ONLINE    racnode1
ora....SM1.asm application    ONLINE    ONLINE    racnode1
ora....E1.lsnr application    ONLINE    ONLINE    racnode1
ora....de1.ons application    ONLINE    ONLINE    racnode1
ora....de1.vip ora....t1.type ONLINE    ONLINE    racnode1
ora....SM2.asm application    ONLINE    ONLINE    racnode2
ora....E2.lsnr application    ONLINE    ONLINE    racnode2
ora....de2.ons application    ONLINE    ONLINE    racnode2
ora....de2.vip ora....t1.type ONLINE    ONLINE    racnode2
ora.scan1.vip  ora....ip.type ONLINE    ONLINE    racnode2
ora.scan2.vip  ora....ip.type ONLINE    ONLINE    racnode1
ora.scan3.vip  ora....ip.type ONLINE    ONLINE    racnode1

This is the preferred CRS resource status check command:

[grid@racnode1 ~]$ crsctl status resource -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.CRS.dg
               ONLINE ONLINE       racnode1                 STABLE
               ONLINE ONLINE       racnode2                 STABLE
ora.DATA.dg
               ONLINE ONLINE       racnode1                 STABLE
               ONLINE ONLINE       racnode2                 STABLE
ora.FRA.dg
               ONLINE ONLINE       racnode1                 STABLE
               ONLINE ONLINE       racnode2                 STABLE
ora.LISTENER.lsnr
               ONLINE ONLINE       racnode1                 STABLE
               ONLINE ONLINE       racnode2                 STABLE
ora.asm
               ONLINE ONLINE       racnode1                 Started,STABLE
               ONLINE ONLINE       racnode2                 Started,STABLE
ora.net1.network
               ONLINE ONLINE       racnode1                 STABLE
               ONLINE ONLINE       racnode2                 STABLE
ora.ons
               ONLINE ONLINE       racnode1                 STABLE
               ONLINE ONLINE       racnode2                 STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       racnode2                 STABLE
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       racnode1                 STABLE
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE       racnode1                 STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       racnode1                 169.254.182.6 200.20
                                                             0.30.11,STABLE
ora.cvu
      1        ONLINE  ONLINE       racnode1                 STABLE
ora.mgmtdb
      1        ONLINE  ONLINE       racnode1                 Open,STABLE
ora.oc4j
      1        ONLINE  ONLINE       racnode1                 STABLE
ora.rac1db.db
      1        ONLINE  ONLINE       racnode1                 Open,STABLE
      2        ONLINE  ONLINE       racnode2                 Open,STABLE
ora.racnode1.vip
      1        ONLINE  ONLINE       racnode1                 STABLE
ora.racnode2.vip
      1        ONLINE  ONLINE       racnode2                 STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       racnode2                 STABLE
ora.scan2.vip
      1        ONLINE  ONLINE       racnode1                 STABLE
ora.scan3.vip
      1        ONLINE  ONLINE       racnode1                 STABLE
--------------------------------------------------------------------------------

Test #1C. Check: Cluster Verification.

The 3 main services are online on all nodes. Good news.

[grid@racnode1 ~]$ crsctl check cluster -all
**************************************************************
racnode1:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
racnode2:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************

Test #1D. Check: CRS Verification.

Similar to Test 1C, this output shows the HA services are all OK too.

[grid@racnode1 ~]$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

[grid@racnode2 ~]$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

Test #1E. Check: CTSS.

Remember, we’re not using NTP for server time synchronization, so Oracle’s CTSS should be doing the work for us, which it seems to be.

[grid@racnode1 ~]$ crsctl check ctss
CRS-4701: The Cluster Time Synchronization Service is in Active mode.
CRS-4702: Offset (in msec): 0

[grid@racnode2 ~]$ crsctl check ctss
CRS-4701: The Cluster Time Synchronization Service is in Active mode.
CRS-4702: Offset (in msec): 0

Test #1F. Check: DNS.

Remember we configured DNS on the OEM Cloud Control server (oraemcc.mynet.com, 200.200.10.16). It seems to be working for both RAC nodes.

[grid@racnode1 ~]$ crsctl query dns -servers
CRS-10018: the following configuration was found on the system:
CRS-10019: There are 1 domains in search order. They are: mynet.com
CRS-10022: There are 1 name servers. They are: 200.200.10.16
CRS-10020: number of retry attempts for name lookup is: 4
CRS-10021: timeout for each name lookup is: 5

[grid@racnode2 ~]$ crsctl query dns -servers
CRS-10018: the following configuration was found on the system:
CRS-10019: There are 1 domains in search order. They are: mynet.com
CRS-10022: There are 1 name servers. They are: 200.200.10.16
CRS-10020: number of retry attempts for name lookup is: 4
CRS-10021: timeout for each name lookup is: 5

Test #1G. Check: Votedisk(s).

The Votedisk seems fine, but best practice says we should have more than just one and an odd number of them. We might want to fix this later.

[grid@racnode1 ~]$ crsctl query css votedisk
## STATE   File Universal Id                File Name Disk group
-- -----   -----------------                --------- ---------
1. ONLINE  6623c2c7b3a24fc5bffd28b93c053000 (/dev/oracleasm/disks/ASMDISK01) [CRS]
Located 1 voting disk(s).

Test #1H. Check: OCR File(s).

We can have up to 5 OCR files and we currently only have one. We might want to fix that later. Also, to perform the logical corruption check, we need to run this ocrcheck command as root.

[grid@racnode1 ~]$ cat /etc/oracle/ocr.loc
#Device/file +CRS getting replaced by device +CRS/cluster1/OCRFILE/registry.255.898728137
ocrconfig_loc=+CRS/cluster1/OCRFILE/registry.255.898728137

[grid@racnode1 ~]$ ocrcheck -details
Status of Oracle Cluster Registry is as follows :
         Version                  :          4
         Total space (kbytes)     :     409568
         Used space (kbytes)      :       1588
         Available space (kbytes) :     407980
         ID                       : 1641944593
         Device/File Name         : +CRS/cluster1/OCRFILE/registry.255.898728137
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured

         Cluster registry integrity check succeeded
         Logical corruption check bypassed due to non-privileged user
[root@racnode1 ~]# /u01/app/12.1.0/grid/bin/ocrcheck -details
Status of Oracle Cluster Registry is as follows :
         Version                  :          4
         Total space (kbytes)     :     409568
         Used space (kbytes)      :       1588
         Available space (kbytes) :     407980
         ID                       : 1641944593
         Device/File Name         : +CRS/cluster1/OCRFILE/registry.255.898728137
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured

         Cluster registry integrity check succeeded
         Logical corruption check succeeded

Test #1I. Check: OCR Backups.

The OCR is backed up automatically every 4 hours. If the cluster has been up and running long enough you can see the backup files. Running this command on any node should produce the same output.

[grid@racnode1 ~]$ ocrconfig -showbackup
racnode1     2016/01/27 10:15:35   /u01/app/12.1.0/grid/cdata/cluster1/backup00.ocr  0
racnode1     2016/01/27 06:14:48   /u01/app/12.1.0/grid/cdata/cluster1/backup01.ocr  0
racnode1     2016/01/27 02:14:11   /u01/app/12.1.0/grid/cdata/cluster1/backup02.ocr  0
racnode1     2016/01/26 18:13:01   /u01/app/12.1.0/grid/cdata/cluster1/day.ocr  0
racnode1     2016/01/15 04:31:01   /u01/app/12.1.0/grid/cdata/cluster1/week.ocr 0
PROT-25: Manual backups for the Oracle Cluster Registry are not available
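
The PROT-25 message simply means no manual OCR backup has been taken yet. If you want one on demand, before a risky configuration change say, ocrconfig can take a manual backup when run as root. A quick sketch:

[root@racnode1 ~]# /u01/app/12.1.0/grid/bin/ocrconfig -manualbackup
[root@racnode1 ~]# /u01/app/12.1.0/grid/bin/ocrconfig -showbackup manual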

Test #1J. Check: OLR.

The OLR plays an important role in starting the cluster if the OCR is stored in ASM. Better make sure each node’s OLR is alive and well.

[grid@racnode1 ~]$ cat /etc/oracle/olr.loc
olrconfig_loc=/u01/app/12.1.0/grid/cdata/racnode1.olr
crs_home=/u01/app/12.1.0/grid

[grid@racnode1 ~]$ ocrcheck -local -config
Oracle Local Registry configuration is :
Device/File Name         : /u01/app/12.1.0/grid/cdata/racnode1.olr

[grid@racnode1 ~]$ ls -l /u01/app/12.1.0/grid/cdata/racnode1.olr
-rw------- 1 root oinstall 503484416 Feb  6 14:22 /u01/app/12.1.0/grid/cdata/racnode1.olr

[grid@racnode2 ~]$ cat /etc/oracle/olr.loc
olrconfig_loc=/u01/app/12.1.0/grid/cdata/racnode2.olr
crs_home=/u01/app/12.1.0/grid

[grid@racnode2 ~]$ ocrcheck -local -config
Oracle Local Registry configuration is :
         Device/File Name         : /u01/app/12.1.0/grid/cdata/racnode2.olr

[grid@racnode2 ~]$ ls -l /u01/app/12.1.0/grid/cdata/racnode2.olr
-rw------- 1 root oinstall 503484416 Feb  6 18:24 /u01/app/12.1.0/grid/cdata/racnode2.olr
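
As with the OCR, running the check as root gives a fuller picture, including the logical corruption check of the OLR. Something along these lines on each node should do it:

[root@racnode1 ~]# /u01/app/12.1.0/grid/bin/ocrcheck -local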

Test #1K. Check: GPnP.

The Grid Plug ‘n’ Play profile is an XML file which also plays an important role in cluster startup. It contains important node configuration information, including the storage location of some GI resources, the node’s networking configuration and so on. Make sure it’s where it ought to be and eyeball its contents to check they appear accurate. They should be, as they’re maintained by the GPnP daemon (gpnpd.bin).

[grid@racnode1 ~]$ cd /u01/app/12.1.0/grid/gpnp/racnode1/profiles/peer
[grid@racnode1 peer]$ ls -l
-rw-r--r-- 1 grid oinstall 1908 Dec 17 22:44 profile.old
-rw-r--r-- 1 grid oinstall 1850 Dec 17 22:36 profile_orig.xml
-rw-r--r-- 1 grid oinstall 1908 Dec 17 22:44 profile.xml

[grid@racnode2 ~]$ cd /u01/app/12.1.0/grid/gpnp/racnode2/profiles/peer
[grid@racnode2 peer]$ ls -l
-rw-r--r-- 1 grid oinstall 1850 Dec 17 23:05 profile_orig.xml
-rw-r--r-- 1 grid oinstall 1908 Dec 17 23:05 profile.xml
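
Rather than reading the XML by hand, you can also ask the GPnP daemon for the profile it’s currently using via the gpnptool utility shipped in the Grid home. A quick sketch:

[grid@racnode1 ~]$ gpnptool get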

Test #1L. Check: SCAN VIPs.

The SCAN Listeners each need a VIP on which to operate. These commands show us they’re up and running and open for business.

[grid@racnode1 ~]$ srvctl status scan
SCAN VIP scan1 is enabled
SCAN VIP scan1 is running on node racnode2
SCAN VIP scan2 is enabled
SCAN VIP scan2 is running on node racnode2
SCAN VIP scan3 is enabled
SCAN VIP scan3 is running on node racnode1

[grid@racnode1 ~]$ srvctl config scan
SCAN name: cluster1-scan.mynet.com, Network: 1
Subnet IPv4: 200.200.10.0/255.255.255.0/eth0, static
Subnet IPv6: 
SCAN 0 IPv4 VIP: 200.200.10.122
SCAN VIP is enabled.
SCAN VIP is individually enabled on nodes: 
SCAN VIP is individually disabled on nodes: 
SCAN 1 IPv4 VIP: 200.200.10.120
SCAN VIP is enabled.
SCAN VIP is individually enabled on nodes: 
SCAN VIP is individually disabled on nodes: 
SCAN 2 IPv4 VIP: 200.200.10.121
SCAN VIP is enabled.
SCAN VIP is individually enabled on nodes: 
SCAN VIP is individually disabled on nodes: 

[grid@racnode1 ~]$ ps -ef | grep -v grep | grep SCAN
grid      7739     1  0 Jan26 ?        00:00:41 /u01/app/12.1.0/grid/bin/tnslsnr 
   LISTENER_SCAN3 -no_crs_notify -inherit

[grid@racnode2 ~]$ ps -ef | grep -v grep | grep SCAN
grid     16190     1  0 Jan26 ?        00:00:43 /u01/app/12.1.0/grid/bin/tnslsnr 
   LISTENER_SCAN1 -no_crs_notify -inherit
grid      7736     1  0 Jan26 ?        00:00:41 /u01/app/12.1.0/grid/bin/tnslsnr 
   LISTENER_SCAN2 -no_crs_notify -inherit

[grid@racnode1 ~]$ ping 200.200.10.120
PING 200.200.10.120 (200.200.10.120) 56(84) bytes of data.
64 bytes from 200.200.10.120: icmp_seq=1 ttl=64 time=0.288 ms
64 bytes from 200.200.10.120: icmp_seq=2 ttl=64 time=0.152 ms
--- 200.200.10.120 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1362ms
rtt min/avg/max/mdev = 0.152/0.220/0.288/0.068 ms

[grid@racnode1 ~]$ ping 200.200.10.121
PING 200.200.10.121 (200.200.10.121) 56(84) bytes of data.
64 bytes from 200.200.10.121: icmp_seq=1 ttl=64 time=0.072 ms
64 bytes from 200.200.10.121: icmp_seq=2 ttl=64 time=0.028 ms
--- 200.200.10.121 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1311ms
rtt min/avg/max/mdev = 0.028/0.050/0.072/0.022 ms

[grid@racnode1 ~]$ ping 200.200.10.122
PING 200.200.10.122 (200.200.10.122) 56(84) bytes of data.
64 bytes from 200.200.10.122: icmp_seq=1 ttl=64 time=0.310 ms
64 bytes from 200.200.10.122: icmp_seq=2 ttl=64 time=0.186 ms
--- 200.200.10.122 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1314ms
rtt min/avg/max/mdev = 0.186/0.248/0.310/0.062 ms

Test #1M. Check: Node VIPs.

To avoid a lengthy TCP timeout in the event of a node failure, the failed node’s VIP needs to fail over to a surviving node. For that to happen, it needs to be enabled and pingable.

[grid@racnode1 ~]$ srvctl status vip -node racnode1
VIP racnode1-vip.mynet.com is enabled
VIP racnode1-vip.mynet.com is running on node: racnode1

[grid@racnode1 ~]$ srvctl status vip -node racnode2
VIP racnode2-vip.mynet.com is enabled
VIP racnode2-vip.mynet.com is running on node: racnode2

[grid@racnode1 ~]$ ping racnode1-vip.mynet.com
PING racnode1-vip.mynet.com (200.200.10.111) 56(84) bytes of data.
64 bytes from racnode1-vip.mynet.com (200.200.10.111): icmp_seq=1 ttl=64 time=0.024 ms
64 bytes from racnode1-vip.mynet.com (200.200.10.111): icmp_seq=2 ttl=64 time=0.031 ms
64 bytes from racnode1-vip.mynet.com (200.200.10.111): icmp_seq=3 ttl=64 time=0.026 ms
--- racnode1-vip.mynet.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2441ms
rtt min/avg/max/mdev = 0.024/0.027/0.031/0.003 ms

[grid@racnode1 ~]$ ping racnode2-vip.mynet.com
PING racnode2-vip.mynet.com (200.200.10.112) 56(84) bytes of data.
64 bytes from racnode2-vip.mynet.com (200.200.10.112): icmp_seq=1 ttl=64 time=0.308 ms
64 bytes from racnode2-vip.mynet.com (200.200.10.112): icmp_seq=2 ttl=64 time=0.174 ms
64 bytes from racnode2-vip.mynet.com (200.200.10.112): icmp_seq=3 ttl=64 time=0.163 ms
--- racnode2-vip.mynet.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2513ms
rtt min/avg/max/mdev = 0.163/0.215/0.308/0.065 ms

Test #1N. Check: Nodeapps.

Nodeapps comprise a node’s VIP, network and ONS daemon. Check to see if they’re all up and running.

[grid@racnode1 ~]$ srvctl status nodeapps
VIP racnode1-vip.mynet.com is enabled
VIP racnode1-vip.mynet.com is running on node: racnode1
VIP racnode2-vip.mynet.com is enabled
VIP racnode2-vip.mynet.com is running on node: racnode2
Network is enabled
Network is running on node: racnode1
Network is running on node: racnode2
ONS is enabled
ONS daemon is running on node: racnode1
ONS daemon is running on node: racnode2

Test #1O. Check: Node Participation.

You might think you know which nodes make up your cluster, but does the cluster agree? In our case, yes it does!

[grid@racnode1 ~]$ olsnodes -n -i -s -t -a
racnode1        1       racnode1-vip.mynet.com        Active  Hub     Unpinned
racnode2        2       racnode2-vip.mynet.com        Active  Hub     Unpinned

Section #2: Check & Test ASM.

ASM has become the de facto storage standard for Oracle databases, so we’d better check the clustered version is healthy.

Test #2B. Check: ASM Instances.

Ensure the ASM instances are up, running and can accept logins.

[grid@racnode1 ~]$ srvctl status asm
ASM is running on racnode2,racnode1

[grid@racnode1 ~]$ sqlplus / as sysasm
SQL*Plus: Release 12.1.0.2.0 Production on Sat Feb 6 17:14:59 2016
Copyright (c) 1982, 2014, Oracle.  All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
SQL>

[grid@racnode1 ~]$ asmcmd
ASMCMD> ls -l
State    Type    Rebal  Name
MOUNTED  EXTERN  N      CRS/
MOUNTED  EXTERN  N      DATA/
MOUNTED  EXTERN  N      FRA/

Test #2C. Check:  ASM Diskgroups.

Next, check that all the ASM Diskgroups are OK.

[grid@racnode1 ~]$ echo "select name from v\$asm_diskgroup;" | sqlplus -s / as sysasm

NAME
------------------------------
FRA
DATA
CRS

[grid@racnode1 ~]$ for dg in FRA DATA CRS
> do
> srvctl status diskgroup -diskgroup ${dg} -detail
> done
Disk Group FRA is running on racnode2,racnode1
Disk Group FRA is enabled
Disk Group DATA is running on racnode2,racnode1
Disk Group DATA is enabled
Disk Group CRS is running on racnode2,racnode1
Disk Group CRS is enabled

Test #2D. Check: ASM Diskgroup Metadata.

Check the ASM Diskgroup metadata is present and correct. No error messages means it is.

[grid@racnode1 ~]$ sqlplus / as sysasm

SQL> alter diskgroup crs check all;
Diskgroup altered.

SQL> alter diskgroup data check all;
Diskgroup altered.

[grid@racnode1 ~]$ asmcmd
ASMCMD> chkdg FRA
Diskgroup altered.

Test #2E. Check: ASM Disks.

Ensure all the ASM Disks you have configured are visible.

[grid@racnode1 ~]$ sqlplus / as sysasm

SQL> select dg.name DG,
            d.name Disk,
            decode(d.GROUP_NUMBER,
                   0,'Unallocated',
                   'In Use') State,
            d.path
     from   v$asm_disk d
            left outer join
            v$asm_diskgroup dg
            on dg.group_number = d.group_number
     order by
            dg.name,
            d.path;

DG         DISK       STATE       PATH
---------- ---------- ----------- ------------------------------
CRS        CRS_0000   In Use      /dev/oracleasm/disks/ASMDISK01
DATA       DATA_0000  In Use      /dev/oracleasm/disks/ASMDISK02
DATA       DATA_0001  In Use      /dev/oracleasm/disks/ASMDISK03
DATA       DATA_0002  In Use      /dev/oracleasm/disks/ASMDISK06
FRA        FRA_0000   In Use      /dev/oracleasm/disks/ASMDISK04
FRA        FRA_0001   In Use      /dev/oracleasm/disks/ASMDISK05
                      Unallocated /dev/oracleasm/disks/ASMDISK07
                      Unallocated /dev/oracleasm/disks/ASMDISK08
                      Unallocated /dev/oracleasm/disks/ASMDISK09
                      Unallocated /dev/oracleasm/disks/ASMDISK10

ASMCMD> lsdsk --discovery
Path
/dev/oracleasm/disks/ASMDISK01
/dev/oracleasm/disks/ASMDISK02
/dev/oracleasm/disks/ASMDISK03
/dev/oracleasm/disks/ASMDISK04
/dev/oracleasm/disks/ASMDISK05
/dev/oracleasm/disks/ASMDISK06
/dev/oracleasm/disks/ASMDISK07
/dev/oracleasm/disks/ASMDISK08
/dev/oracleasm/disks/ASMDISK09
/dev/oracleasm/disks/ASMDISK10

Test #2F. Check: ASM Clients.

Verify what is connected to each ASM instance.

[grid@racnode1 ~]$ sqlplus / as sysasm

SQL> select dg.NAME,
            c.INSTANCE_NAME,
            c.DB_NAME,
            c.CLUSTER_NAME,
            c.STATUS
     from   v$asm_client c,
            v$asm_diskgroup dg
     where  c.GROUP_NUMBER = dg.GROUP_NUMBER
     order by
            c.DB_NAME;

NAME       INSTANCE_N  DB_NAME  CLUSTER_NAME STATUS
---------- ----------  -------- ------------ ------------
DATA       +ASM1       +ASM     cluster1     CONNECTED
CRS        +ASM1       +ASM     cluster1     CONNECTED
FRA        RAC1DB1     RAC1DB   cluster1     CONNECTED
DATA       RAC1DB1     RAC1DB   cluster1     CONNECTED
CRS        -MGMTDB     _mgmtdb  cluster1     CONNECTED

Running the same query against the +ASM2 instance on racnode2:

[grid@racnode2 ~]$ sqlplus / as sysasm

NAME       INSTANCE_N  DB_NAME  CLUSTER_NAME STATUS
---------- ----------  -------- ------------ ------------
CRS        +ASM2       +ASM     cluster1     CONNECTED
DATA       +ASM2       +ASM     cluster1     CONNECTED
DATA       RAC1DB2     RAC1DB   cluster1     CONNECTED
FRA        RAC1DB2     RAC1DB   cluster1     CONNECTED

Section #3: Check & Test Databases & Instances.

Test #3C. Check: User RAC Database & Instance.

Confirm where the database instances are running.

[oracle@racnode1 ~]$ srvctl status database -d RAC1DB
Instance RAC1DB1 is running on node racnode1
Instance RAC1DB2 is running on node racnode2

Test #3D. Check: User RAC Database DBVERIFY.

For the sake of brevity, we’ll run the DBVERIFY executable (dbv) for only one datafile.

[oracle@racnode1 ~]$ sqlplus / as sysdba

SQL> select name from v$datafile;

NAME
--------------------------------------------------------------------------------
+DATA/RAC1DB/DATAFILE/system.258.898779945
+DATA/RAC1DB/DATAFILE/sysaux.257.898779761
+DATA/RAC1DB/DATAFILE/undotbs1.260.898780207
+DATA/RAC1DB/DATAFILE/undotbs2.265.898783783
+DATA/RAC1DB/DATAFILE/users.259.898780201

[oracle@racnode1 scripts]$ cat dbv_parfile.txt
FILE='+DATA/RAC1DB/DATAFILE/users.259.898780201'
LOGFILE='users_datafile.out'
FEEDBACK=10
USERID=system/<removed>

[oracle@racnode1 scripts]$ dbv parfile=dbv_parfile.txt
DBVERIFY: Release 12.1.0.2.0 - Production on Mon Feb 15 11:54:52 2016
Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.
[oracle@racnode1 scripts]$ cat users_datafile.out
DBVERIFY: Release 12.1.0.2.0 - Production on Mon Feb 15 11:54:52 2016
Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = +DATA/RAC1DB/DATAFILE/users.259.898780201
DBVERIFY - Verification complete

Total Pages Examined         : 640
Total Pages Processed (Data) : 30
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 5
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 588
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 17
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 0 (0.0)

Test #3H. Test: Restart User RAC Database.

Stop the entire database with srvctl, confirm both instances show OFFLINE with an Instance Shutdown state, then start the database and confirm both instances return to Open.

[oracle@racnode1 ~]$ srvctl stop database -d RAC1DB

[grid@racnode2 ~]$ crsctl status resource -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.CRS.dg
               ONLINE  ONLINE       racnode1                 STABLE
               ONLINE  ONLINE       racnode2                 STABLE
ora.DATA.dg
               ONLINE  ONLINE       racnode1                 STABLE
               ONLINE  ONLINE       racnode2                 STABLE
ora.FRA.dg
               ONLINE  ONLINE       racnode1                 STABLE
               ONLINE  ONLINE       racnode2                 STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       racnode1                 STABLE
               ONLINE  ONLINE       racnode2                 STABLE
ora.asm
               ONLINE  ONLINE       racnode1                 Started,STABLE
               ONLINE  ONLINE       racnode2                 Started,STABLE
ora.net1.network
               ONLINE  ONLINE       racnode1                 STABLE
               ONLINE  ONLINE       racnode2                 STABLE
ora.ons
               ONLINE  ONLINE       racnode1                 STABLE
               ONLINE  ONLINE       racnode2                 STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       racnode2                 STABLE
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       racnode2                 STABLE
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE       racnode1                 STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       racnode1                 169.254.182.6 200.20
                                                             0.30.11,STABLE
ora.cvu
      1        OFFLINE OFFLINE                               STABLE
ora.mgmtdb
      1        ONLINE  ONLINE       racnode1                 Open,STABLE
ora.oc4j
      1        ONLINE  ONLINE       racnode1                 STABLE
ora.rac1db.db
      1        OFFLINE OFFLINE                               Instance Shutdown,ST
                                                             ABLE
      2        OFFLINE OFFLINE                               Instance Shutdown,ST
                                                             ABLE
ora.racnode1.vip
      1        ONLINE  ONLINE       racnode1                 STABLE
ora.racnode2.vip
      1        ONLINE  ONLINE       racnode2                 STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       racnode2                 STABLE
ora.scan2.vip
      1        ONLINE  ONLINE       racnode2                 STABLE
ora.scan3.vip
      1        ONLINE  ONLINE       racnode1                 STABLE
--------------------------------------------------------------------------------

[oracle@racnode1 ~]$ srvctl start database -d RAC1DB

[grid@racnode2 ~]$ crsctl status resource -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.CRS.dg
               ONLINE  ONLINE       racnode1                 STABLE
               ONLINE  ONLINE       racnode2                 STABLE
ora.DATA.dg
               ONLINE  ONLINE       racnode1                 STABLE
               ONLINE  ONLINE       racnode2                 STABLE
ora.FRA.dg
               ONLINE  ONLINE       racnode1                 STABLE
               ONLINE  ONLINE       racnode2                 STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       racnode1                 STABLE
               ONLINE  ONLINE       racnode2                 STABLE
ora.asm
               ONLINE  ONLINE       racnode1                 Started,STABLE
               ONLINE  ONLINE       racnode2                 Started,STABLE
ora.net1.network
               ONLINE  ONLINE       racnode1                 STABLE
               ONLINE  ONLINE       racnode2                 STABLE
ora.ons
               ONLINE  ONLINE       racnode1                 STABLE
               ONLINE  ONLINE       racnode2                 STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       racnode2                 STABLE
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       racnode2                 STABLE
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE       racnode1                 STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       racnode1                 169.254.182.6 200.20
                                                             0.30.11,STABLE
ora.cvu
      1        OFFLINE OFFLINE                               STABLE
ora.mgmtdb
      1        ONLINE  ONLINE       racnode1                 Open,STABLE
ora.oc4j
      1        ONLINE  ONLINE       racnode1                 STABLE
ora.rac1db.db
      1        ONLINE  ONLINE       racnode1                 Open,STABLE
      2        ONLINE  ONLINE       racnode2                 Open,STABLE
ora.racnode1.vip
      1        ONLINE  ONLINE       racnode1                 STABLE
ora.racnode2.vip
      1        ONLINE  ONLINE       racnode2                 STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       racnode2                 STABLE
ora.scan2.vip
      1        ONLINE  ONLINE       racnode2                 STABLE
ora.scan3.vip
      1        ONLINE  ONLINE       racnode1                 STABLE
--------------------------------------------------------------------------------

Test #3I. Test: Planned User Instance Shutdown.

Stop a single instance (RAC1DB2) with srvctl and confirm the remaining instance stays open while instance 2 shows Instance Shutdown.

[oracle@racnode1 ~]$ srvctl stop instance -d RAC1DB -i RAC1DB2

[grid@racnode2 ~]$ crsctl status resource -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.CRS.dg
               ONLINE  ONLINE       racnode1                 STABLE
               ONLINE  ONLINE       racnode2                 STABLE
ora.DATA.dg
               ONLINE  ONLINE       racnode1                 STABLE
               ONLINE  ONLINE       racnode2                 STABLE
ora.FRA.dg
               ONLINE  ONLINE       racnode1                 STABLE
               ONLINE  ONLINE       racnode2                 STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       racnode1                 STABLE
               ONLINE  ONLINE       racnode2                 STABLE
ora.asm
               ONLINE  ONLINE       racnode1                 Started,STABLE
               ONLINE  ONLINE       racnode2                 Started,STABLE
ora.net1.network
               ONLINE  ONLINE       racnode1                 STABLE
               ONLINE  ONLINE       racnode2                 STABLE
ora.ons
               ONLINE  ONLINE       racnode1                 STABLE
               ONLINE  ONLINE       racnode2                 STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       racnode2                 STABLE
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       racnode2                 STABLE
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE       racnode1                 STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       racnode1                 169.254.182.6 200.20
                                                             0.30.11,STABLE
ora.cvu
      1        OFFLINE OFFLINE                               STABLE
ora.mgmtdb
      1        ONLINE  ONLINE       racnode1                 Open,STABLE
ora.oc4j
      1        ONLINE  ONLINE       racnode1                 STABLE
ora.rac1db.db
      1        ONLINE  ONLINE       racnode1                 Open,STABLE
      2        OFFLINE OFFLINE                               Instance Shutdown,ST
                                                             ABLE
ora.racnode1.vip
      1        ONLINE  ONLINE       racnode1                 STABLE
ora.racnode2.vip
      1        ONLINE  ONLINE       racnode2                 STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       racnode2                 STABLE
ora.scan2.vip
      1        ONLINE  ONLINE       racnode2                 STABLE
ora.scan3.vip
      1        ONLINE  ONLINE       racnode1                 STABLE
--------------------------------------------------------------------------------

Test #3J. Test: User Session TAF.

This test verifies Transparent Application Failover (TAF). Using a TNS alias with FAILOVER_MODE configured, connect via the SCAN from the OEM Cloud Control server, note which instance the session lands on, stop that instance, then re-run the query to watch the session fail over to the surviving instance.

[oracle@oraemcc ~]$ cat $ORACLE_HOME/network/admin/tnsnames.ora
RAC1DB =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = cluster1-scan.mynet.com)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = RAC1DB.mynet.com)
      (FAILOVER_MODE = 
        (TYPE=SESSION)
        (METHOD=BASIC)
        (RETRIES=10)
        (DELAY=10)
      )
    )
  )
[oracle@oraemcc ~]$ sqlplus system@rac1db

SQL> select instance_name from v$instance;

INSTANCE_NAME
----------------
RAC1DB1

[oracle@racnode1 ~]$ srvctl stop instance -d RAC1DB -i RAC1DB1

[grid@racnode2 ~]$ crsctl status resource -t
...
ora.rac1db.db
      1        OFFLINE OFFLINE                               Instance Shutdown,ST
                                                             ABLE
      2        ONLINE  ONLINE       racnode2                 Open,STABLE
...
SQL> select instance_name from v$instance;
select instance_name from v$instance
*
ERROR at line 1:
ORA-25408: can not safely replay call
SQL> /

INSTANCE_NAME
----------------
RAC1DB2

Section #4: Test System & Cluster.

Test #4C. Test: Cluster Restart.

Shut down Clusterware on all nodes with crsctl, confirm the cluster is no longer reachable, then start it back up.

[root@racnode1 ~]# cd /u01/app/12.1.0/grid/bin
[root@racnode1 bin]# ./crsctl stop cluster -all

The output from this command is quite verbose, but it can be viewed here.

[root@racnode1 bin]# ./crsctl check cluster -all
**************************************************************
racnode1:
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager
**************************************************************
racnode2:
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager
**************************************************************

[grid@racnode2 ~]$ crsctl status resource -t
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.

[root@racnode1 bin]# ./crsctl start cluster -all

The output from the start cluster command is also a little verbose. It can be viewed here.

Note, despite the cluster having been shut down and restarted, the instance RAC1DB1 on racnode1 remained shut down. This is because that instance was shut down in Test #3J. Restarting the cluster does NOT automatically restart cluster resources which were themselves previously shut down. This is expected behavior.

[grid@racnode2 ~]$ crsctl status resource -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.CRS.dg
               ONLINE  ONLINE       racnode1                 STABLE
               ONLINE  ONLINE       racnode2                 STABLE
ora.DATA.dg
               ONLINE  ONLINE       racnode1                 STABLE
               ONLINE  ONLINE       racnode2                 STABLE
ora.FRA.dg
               ONLINE  ONLINE       racnode1                 STABLE
               ONLINE  ONLINE       racnode2                 STABLE
ora.LISTENER.lsnr
               ONLINE  ONLINE       racnode1                 STABLE
               ONLINE  ONLINE       racnode2                 STABLE
ora.asm
               ONLINE  ONLINE       racnode1                 Started,STABLE
               ONLINE  ONLINE       racnode2                 Started,STABLE
ora.net1.network
               ONLINE  ONLINE       racnode1                 STABLE
               ONLINE  ONLINE       racnode2                 STABLE
ora.ons
               ONLINE  ONLINE       racnode1                 STABLE
               ONLINE  ONLINE       racnode2                 STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       racnode2                 STABLE
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       racnode2                 STABLE
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE       racnode1                 STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       racnode1                 169.254.182.6 200.20
                                                             0.30.11,STABLE
ora.cvu
      1        OFFLINE OFFLINE                               STABLE
ora.mgmtdb
      1        ONLINE  ONLINE       racnode1                 Open,STABLE
ora.oc4j
      1        ONLINE  ONLINE       racnode1                 STABLE
ora.rac1db.db
      1        OFFLINE OFFLINE                               Instance Shutdown,ST
                                                             ABLE
      2        ONLINE  ONLINE       racnode2                 Open,STABLE
ora.racnode1.vip
      1        ONLINE  ONLINE       racnode1                 STABLE
ora.racnode2.vip
      1        ONLINE  ONLINE       racnode2                 STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       racnode2                 STABLE
ora.scan2.vip
      1        ONLINE  ONLINE       racnode2                 STABLE
ora.scan3.vip
      1        ONLINE  ONLINE       racnode1                 STABLE
--------------------------------------------------------------------------------
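
To bring RAC1DB1 back online, start it explicitly, mirroring the command used to stop it in Test #3J:

[oracle@racnode1 ~]$ srvctl start instance -d RAC1DB -i RAC1DB1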

Section #5: Health Checks & ORAchk.

Test #5A. Check: Cluster Health Check.

There is a -html option for the cluvfy command which writes its output in HTML format to this location by default:

/u01/app/grid/crsdata/@global/cvu/report/html/cvucheckreport.html

The HTML version is very useful because some of the references it contains are hyperlinks to Oracle’s technical support documentation. How convenient is that?

Note, running the cluvfy command with the -html option will overwrite any existing cvucheckreport.html file.

[grid@racnode1 ~]$ cluvfy comp healthcheck -collect cluster -bestpractice > cluster_hc.txt

An example of this command’s output can be viewed here. Looks like we got busted for not using Jumbo Frames!
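
To generate the HTML version of the same report, add the -html option to the command, something like this:

[grid@racnode1 ~]$ cluvfy comp healthcheck -collect cluster -bestpractice -html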

Test #5B. Check: RAC Database Health Check.

As with the cluster health check, the -html option writes the report to the same default location (/u01/app/grid/crsdata/@global/cvu/report/html/cvucheckreport.html), so running it will overwrite any existing cvucheckreport.html file. The HTML version is again handy for its hyperlinks to Oracle’s technical support documentation.

[grid@racnode1 ~]$ cluvfy comp healthcheck -collect database -bestpractice > database_hc.txt

An example of this command’s output can be seen here. Looks like a couple of things could do with some attention.

Test #5C. Check: Run ORAchk.

ORAchk is a utility provided by Oracle Support which analyzes your operating system setup, Oracle software installation(s) and configurations and compares them with Oracle best practices, recommendations and available patches. The objective is to identify known problem areas before they cause disruption to database services. ORAchk can only be downloaded via MOS, Doc ID 1268927.2. As the oracle software installation user (usually oracle), you unzip the ORAchk zip file, then run the orachk command with the required parameters. For example:

[oracle@racnode1 ORAchk]$ ./orachk -a -o verbose

The great thing about ORAchk is it only analyzes the things it needs to or that you want it to. Its output is saved to a time-stamped directory, the main component being a large HTML report file. When opened in a browser, the report shows an overall System Health Score out of 100. For each component checked, the report provides individual test feedback in the form of a Pass, Fail, Warning or Info message. You can view the details of each test, which provide recommendations and even links to technical support documentation showing how to fix the problem. In some cases, the details contain actual SQL code which will fix the problem.

The not so great thing about ORAchk is that it is quite fussy and, as a result, does not always generate correct or accurate results. For example, some perfectly valid unalias commands in the grid user’s .bash_profile caused ORAchk to fail multiple tests and even time out running others. Timeouts are problematic enough that there are two environment variables you can set to allow longer timeout periods. These are documented in the User Guide. The values I used, which (mostly) worked, are:

RAT_ROOT_TIMEOUT=800
RAT_TIMEOUT=280
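
Set them in the shell before launching ORAchk, for example:

[oracle@racnode1 ORAchk]$ export RAT_ROOT_TIMEOUT=800
[oracle@racnode1 ORAchk]$ export RAT_TIMEOUT=280
[oracle@racnode1 ORAchk]$ ./orachk -a -o verbose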

Suffice it to say, ORAchk isn’t perfect but is still worth trying. Just make sure you verify its results.

Task #4: Common Administration Tasks.

This task will see us perform a few corrective actions based upon the output of the health checks and ORAchk. We’ll also work through some admin of the kind you’d typically do to a cluster and RAC database. We having fun yet? 🙂

We have 9 common tasks to complete.

Task #4.1: Enable Archiving for the RAC Database.

ORAchk highlighted our RAC database is not running in archivelog mode. So let’s fix that.

Task #4.1a: Review Parameters.

The normal parameters used to control the archiving of the Online Redo Logs have no effect when using ASM and a Fast Recovery Area. The parameters which do matter are these:

SQL> show parameter db_recovery_file_dest

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_recovery_file_dest                string      +FRA
db_recovery_file_dest_size           big integer 4785M

The parameter db_recovery_file_dest effectively trumps the traditional log_archive_dest_1 parameter and ensures the archived Online Redo Logs are stored in the Fast Recovery Area. In our case, that happens to be implemented as an ASM Diskgroup called +FRA. Also, the traditional parameter log_archive_format, which defaults to %t_%s_%r.dbf, has no effect since the archived Online Redo Log file names will follow the OMF standard naming convention.
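
In our case dbca set these values for us, but both parameters are dynamic, so they could be adjusted later from any instance. A sketch using the values shown above:

SQL> alter system set db_recovery_file_dest_size=4785m scope=both sid='*';
SQL> alter system set db_recovery_file_dest='+FRA' scope=both sid='*';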

Task #4.1b: Shutdown the RAC Database.

[oracle@racnode1 ~]$ srvctl stop database -d RAC1DB

Task #4.1c: Start One Instance.

[oracle@racnode1 ~]$ . oraenv
ORACLE_SID = [RAC1DB1] ? 
The Oracle base remains unchanged with value /u01/app/oracle

[oracle@racnode1 ~]$ sqlplus / as sysdba
SQL*Plus: Release 12.1.0.2.0 Production on Tue Feb 16 21:02:46 2016
Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup mount
ORACLE instance started.

Total System Global Area 2147483648 bytes
Fixed Size                  2926472 bytes
Variable Size            1275070584 bytes
Database Buffers          855638016 bytes
Redo Buffers               13848576 bytes
Database mounted.

Task #4.1d: Enable Archiving & Shutdown the Instance.

SQL> alter database archivelog;
Database altered.

SQL> shutdown immediate 
ORA-01109: database not open

Database dismounted.
ORACLE instance shut down.
SQL> exit

Task #4.1e: Restart the RAC Database.

[oracle@racnode1 ~]$ srvctl start database -d RAC1DB

Task #4.1f: Verify Archivelog Mode & Generate Archived Online Redo Logs.

From racnode1 (Thread #1 – RAC1DB1):

SQL> archive log list
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence     316
Next log sequence to archive   317
Current log sequence           317

SQL> alter system switch logfile;
System altered.

SQL> select name from v$archived_log;

NAME
--------------------------------------------------------------------------------
+FRA/RAC1DB/ARCHIVELOG/2016_02_16/thread_1_seq_317.261.903993279

From racnode2 (Thread #2 – RAC1DB2):

SQL> archive log list
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence     265
Next log sequence to archive   265
Current log sequence           266

SQL> alter system switch logfile;
System altered.

SQL> select name from v$archived_log;

NAME
--------------------------------------------------------------------------------
+FRA/RAC1DB/ARCHIVELOG/2016_02_16/thread_1_seq_317.261.903993279
+FRA/RAC1DB/ARCHIVELOG/2016_02_16/thread_2_seq_265.262.903993427

We can also see the archived Online Redo Logs via ASMCMD:

ASMCMD> pwd
+FRA/RAC1DB/ARCHIVELOG/2016_02_16

ASMCMD> ls -l
Type        Redund  Striped  Time             Sys  Name
ARCHIVELOG  UNPROT  COARSE   FEB 16 21:00:00  Y    thread_1_seq_317.261.903993279
ARCHIVELOG  UNPROT  COARSE   FEB 16 21:00:00  Y    thread_2_seq_265.262.903993427

Task #4.2: Add an Additional OCR File.

We can have up to 5 OCR files and the fact Oracle allows for that suggests we ought to have more than just one. So let’s fix that.

Task #4.2a: Review Current OCR Configuration.

[grid@racnode2 ~]$ ocrcheck 
Status of Oracle Cluster Registry is as follows :
         Version                  :          4
         Total space (kbytes)     :     409568
         Used space (kbytes)      :       1588
         Available space (kbytes) :     407980
         ID                       : 1641944593
         Device/File Name         :       +CRS
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
         Cluster registry integrity check succeeded
         Logical corruption check bypassed due to non-privileged user

So, we know we have just one OCR file located in the +CRS ASM Diskgroup. Let’s take a look at that file using ASMCMD:

ASMCMD> pwd
+crs/cluster1/OCRFILE

ASMCMD> ls -l
Type     Redund  Striped  Time             Sys  Name
OCRFILE  UNPROT  COARSE   FEB 16 22:00:00  Y    REGISTRY.255.898728137

Task #4.2b: Add an OCR Location.

I created an additional ASM Diskgroup called CRS2. The ocrconfig command must be run as root.
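
The creation of CRS2 itself isn’t shown here. As a sketch, it could be done via asmca or from SQL*Plus as SYSASM along these lines, assuming one of the unallocated disks listed earlier (ASMDISK07 is used purely for illustration), and remembering to mount the new Diskgroup on racnode2 as well:

[grid@racnode1 ~]$ sqlplus / as sysasm

SQL> create diskgroup CRS2 external redundancy
       disk '/dev/oracleasm/disks/ASMDISK07';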

[root@racnode1 ~]# . oraenv
ORACLE_SID = [root] ? +ASM1
The Oracle base has been set to /u01/app/grid

[root@racnode1 ~]# ocrconfig -add +CRS2

Note, initial attempts to run this command repeatedly generated PROT-16 and PROC-23 errors. CRS was running fine on both nodes, the CRS log file contained no errors and the ASM instance alert log did not tell me anything of interest. A reboot of both nodes made this problem go away. Go figure. YMMV.

Task #4.2c: Review Updated OCR Configuration.

[grid@racnode1 ~]$ ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          4
         Total space (kbytes)     :     409568
         Used space (kbytes)      :       1604
         Available space (kbytes) :     407964
         ID                       : 1641944593
         Device/File Name         :       +CRS
                                    Device/File integrity check succeeded
         Device/File Name         :      +CRS2
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
         Cluster registry integrity check succeeded
         Logical corruption check bypassed due to non-privileged user

[grid@racnode1 ~]$ cat /etc/oracle/ocr.loc
#Device/file  getting replaced by device +CRS2/cluster1/OCRFILE/REGISTRY.255.904309547 
ocrconfig_loc=+CRS/cluster1/OCRFILE/registry.255.898728137
ocrmirrorconfig_loc=+CRS2/cluster1/OCRFILE/REGISTRY.255.904309547

ASMCMD> pwd
+crs2/cluster1/OCRFILE

ASMCMD> ls -l
Type     Redund  Striped  Time             Sys  Name
OCRFILE  UNPROT  COARSE   FEB 20 14:00:00  Y    REGISTRY.255.904309547

Task #4.3: Add an Additional Votedisk.

Oracle best practice dictates we should have an odd number of Votedisks. Yes, I know 1 is an odd number but we should probably have more than one. Let’s fix that.

Task #4.3a: Review Current Votedisk Configuration.

[grid@racnode2 ~]$ crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   6623c2c7b3a24fc5bffd28b93c053000 (/dev/oracleasm/disks/ASMDISK01) [CRS]
Located 1 voting disk(s).

Note, there is currently only ONE Votedisk because it resides in the +CRS ASM Diskgroup which is configured with external redundancy and a single ASM Disk. Hence, only ONE Votedisk.

Task #4.3b: Add Additional Votedisks.

Since we’re using ASM to store the Votedisk, we can’t just add more Votedisks to other ASM Diskgroups. It doesn’t work that way. Instead, the Votedisk redundancy (i.e. the number of Votedisks you have) depends upon the redundancy level of the ASM Diskgroup where the Votedisk(s) are stored. However, there is a mismatch between ASM Diskgroup redundancy and Oracle Clusterware Votedisk redundancy.

Redundancy Level   Minimum # of ASM Disks   # of Clusterware Votedisk Files
External           1                        1
Normal             2                        3
High               3                        5

What this means is, if you create an ASM Diskgroup with NORMAL redundancy and 2 ASM Disks, trying to place Votedisks in that ASM Diskgroup would fail. NORMAL redundancy for Votedisks needs 3 ASM Disks, not 2. A NORMAL redundancy ASM Diskgroup with 3 ASM Disks could hold Votedisks, as all 3 disks would be used to store them.

To further make the point, I created a 4 disk NORMAL redundancy ASM Diskgroup called DATA2:

SQL> select dg.name,
            dg.type Redundancy,
            d.name,
            d.path
     from   v$asm_disk d,
            v$asm_diskgroup dg
     where  dg.group_number = d.group_number
     and    dg.name = 'DATA2'
     order by
            d.name;

NAME       REDUNDANCY NAME       PATH
---------- ---------- ---------- ------------------------------
DATA2      NORMAL     DATA2_0000 /dev/oracleasm/disks/ASMDISK08
DATA2      NORMAL     DATA2_0001 /dev/oracleasm/disks/ASMDISK09
DATA2      NORMAL     DATA2_0002 /dev/oracleasm/disks/ASMDISK10
DATA2      NORMAL     DATA2_0003 /dev/oracleasm/disks/ASMDISK06

Now we’ll move the Votedisk from the +CRS ASM Diskgroup to the +DATA2 ASM Diskgroup:

[grid@racnode2 ~]$ crsctl replace votedisk +DATA2
Successful addition of voting disk fd93674940d24f72bfcf07b3e1ac6d14.
Successful addition of voting disk fc9912809b3f4f3fbf64add0b80b34e4.
Successful addition of voting disk 5e840512f2904f41bfbc5dd468b4051b.
Successful deletion of voting disk 6623c2c7b3a24fc5bffd28b93c053000.
Successfully replaced voting disk group with +DATA2.
CRS-4266: Voting file(s) successfully replaced

Task #4.3c: Review Updated Votedisk Configuration.

As you can see, the NORMAL redundancy setting of the ASM Diskgroup created 3 Votedisks:

[grid@racnode2 ~]$ crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   fd93674940d24f72bfcf07b3e1ac6d14 (/dev/oracleasm/disks/ASMDISK08) [DATA2]
 2. ONLINE   fc9912809b3f4f3fbf64add0b80b34e4 (/dev/oracleasm/disks/ASMDISK09) [DATA2]
 3. ONLINE   5e840512f2904f41bfbc5dd468b4051b (/dev/oracleasm/disks/ASMDISK10) [DATA2]
Located 3 voting disk(s).

Note, only 3 of the 4 ASM Disks in the +DATA2 ASM Diskgroup are used for Votedisks.

Task #4.4: Implement oswatcher.

ORAchk highlighted our RAC nodes are not running oswatcher. Let’s fix that.

The oswatcher utility is available via Oracle Support, Doc ID 301137.1. It captures and analyzes performance data from the underlying operating system. The capture aspect is documented in MOS Doc ID 1531223.1 and the analyze aspect is documented in MOS Doc ID 461053.1.

Task #4.4a: Download.

Within MOS, navigate to Doc ID 301137.1, then click the OSWatcher Download link. This will take you to the OSWatcher Download section of the document. There you’ll see another link called:

Click here to download OSWatcher for Solaris, Linux and HP-UX. (AIX users see link below)

Click that link and you’ll be prompted to download a file called oswbb733.tar (where the 733 means version 7.3.3). Once this file is downloaded, copy it to an appropriate location on ALL the nodes in your cluster.
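
For example, assuming the tar file was downloaded to your home directory on racnode1 and ~/media/oswatcher is your chosen staging directory (an arbitrary choice, purely for illustration), scp will do the job for the other node:

[oracle@racnode1 ~]$ ssh racnode2 "mkdir -p ~/media/oswatcher"
[oracle@racnode1 ~]$ scp oswbb733.tar racnode2:~/media/oswatcher/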

Task #4.4b: Install.

Installing OSWatcher is simply a case of untarring the tar file, then changing the file permissions:

[oracle@racnode1 ~]$ cd ./media/oswatcher

[oracle@racnode1 OSWatcher]$ tar -xvf oswbb733.tar

[oracle@racnode1 oswatcher]$ ls -l
drwxr-xr-x 4 oracle oinstall    4096 Feb 27  2015 oswbb
-rw-r--r-- 1 oracle oinstall 6318080 Feb 21 15:10 oswbb733.tar

[oracle@racnode1 oswatcher]$ cd oswbb
[oracle@racnode1 oswbb]$ chmod 744 *
[oracle@racnode1 oswbb]$ ls -l
-rwxr--r-- 1 oracle oinstall     67 Jan 15  2014 call_du.sh
-rwxr--r-- 1 oracle oinstall     68 Oct  7  2013 call_sar.sh
-rwxr--r-- 1 oracle oinstall     71 Jan  7  2014 call_uptime.sh
drwxr--r-- 4 oracle oinstall   4096 Sep  8  2014 docs
-rwxr--r-- 1 oracle oinstall    626 Jan 15  2014 Example_extras.txt
-rwxr--r-- 1 oracle oinstall   1864 Oct  7  2013 Exampleprivate.net
-rwxr--r-- 1 oracle oinstall    772 May  8  2014 ifconfigsub.sh
-rwxr--r-- 1 oracle oinstall    743 Oct  7  2013 iosub.sh
-rwxr--r-- 1 oracle oinstall   1486 Jan  8  2014 ltop.sh
-rwxr--r-- 1 oracle oinstall    542 Oct  7  2013 mpsub.sh
-rwxr--r-- 1 oracle oinstall    740 Oct  7  2013 nfssub.sh
-rwxr--r-- 1 oracle oinstall   5062 Sep 17  2014 OSWatcherFM.sh
-rwxr--r-- 1 oracle oinstall  35108 Feb 27  2015 OSWatcher.sh
-rwxr--r-- 1 oracle oinstall 233897 Feb 27  2015 oswbba.jar
-rwxr--r-- 1 oracle oinstall    414 Oct  7  2013 oswib.sh
-rwxr--r-- 1 oracle oinstall    435 Jan  8  2014 oswnet.sh
-rwxr--r-- 1 oracle oinstall    825 Oct  7  2013 oswrds.sh
-rwxr--r-- 1 oracle oinstall    524 Oct  7  2013 oswsub.sh
-rwxr--r-- 1 oracle oinstall   1445 Oct 17  2013 psmemsub.sh
drwxr--r-- 2 oracle oinstall   4096 May  2  2014 src
-rwxr--r-- 1 oracle oinstall   2574 Feb 26  2015 startOSWbb.sh
-rwxr--r-- 1 oracle oinstall    558 Apr 17  2014 stopOSWbb.sh
-rwxr--r-- 1 oracle oinstall    746 Nov  6  2013 tarupfiles.sh
-rwxr--r-- 1 oracle oinstall   4219 Nov  6  2013 tar_up_partial_archive.sh
-rwxr--r-- 1 oracle oinstall    545 Feb 23  2015 vmsub.sh
-rwxr--r-- 1 oracle oinstall   1486 Feb 26  2015 xtop.sh

User guides and README files are in the docs subdirectory.

Task #4.4c: Configure.

OSWatcher runs a series of OS monitoring commands at a predetermined frequency and stores their output in a predetermined location for a predetermined amount of time.

The defaults are a sample frequency of 30 seconds, with output data kept for the last 48 hours in an archive directory beneath the location where you untarred the software. All of these can be customized.

In addition, Oracle Support recommends you use OSWatcher to monitor Grid Infrastructure’s private interconnect networks. This is done by creating a special file called private.net with the appropriate entries:

[oracle@racnode1 oswbb]$ cat private.net
######################################################################
#Linux Example
######################################################################
echo "zzz ***"`date`
traceroute -r -F racnode1-priv
traceroute -r -F racnode2-priv
######################################################################
# DO NOT DELETE THE FOLLOWING LINE!!!!!!!!!!!!!!!!!!!!!
######################################################################
rm locks/lock.file

Task #4.4d: Run OSWatcher.

The script which starts OSWatcher is called startOSWbb.sh. It can take up to 4 parameters:

Parameter Position Meaning
1 Sample Frequency (Seconds)
2 Archive History (Hours)
3 OS Compression Utility to Use
4 Archive Directory Path

For example, to capture stats once per minute, keep those stats for 12 hours, use gzip to compress the archive data and store the archive data in /nas1/OSWATCHER/racnode1, use this command:

[oracle@racnode1 oswbb]$ ./startOSWbb.sh 60 12 gzip /nas1/OSWATCHER/racnode1
Info...Zip option IS specified. 
Info...OSW will use gzip to compress files.

Testing for discovery of OS Utilities...
VMSTAT found on your system.
IOSTAT found on your system.
MPSTAT found on your system.
IFCONFIG found on your system.
NETSTAT found on your system.
TOP found on your system.
Warning... /proc/slabinfo not found on your system.

Testing for discovery of OS CPU COUNT
oswbb is looking for the CPU COUNT on your system
CPU COUNT will be used by oswbba to automatically look for cpu problems

CPU COUNT found on your system.
CPU COUNT = 1

Discovery completed.

Starting OSWatcher v7.3.3  on Sun Feb 21 17:32:10 CST 2016
With SnapshotInterval = 60
With ArchiveInterval = 12

OSWatcher - Written by Carl Davis, Center of Expertise,
Oracle Corporation
For questions on install/usage please go to MOS (Note:301137.1)
If you need further assistance or have comments or enhancement
requests you can email me Carl.Davis@Oracle.com

Data is stored in directory: /nas1/OSWATCHER/racnode1

Starting Data Collection...

Once OSWatcher is underway, it will echo heartbeat information to your screen at the frequency you specified (in our case every 60 seconds):

oswbb heartbeat:Sun Feb 21 17:40:16 CST 2016

It’s OK to log out of the session which initiated OSWatcher. The processes continue to run in the background:

[oracle@racnode1 ~]$ ps -ef | grep -v grep | grep OSW
oracle   27722     1  0 17:31 ?        00:00:00 /bin/sh ./OSWatcher.sh 60 12 gzip 
 /nas1/OSWATCHER/racnode1
oracle   27906 27722  0 17:32 ?        00:00:00 /bin/sh ./OSWatcherFM.sh 12 
 /nas1/OSWATCHER/racnode1
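
When you no longer need data collection, stop OSWatcher using the stopOSWbb.sh script which ships in the same directory (shown in the listing above); it terminates the OSWatcher background processes:

[oracle@racnode1 oswbb]$ ./stopOSWbb.sh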

Task #4.4e: Analyze.

Analyzing the OSWatcher data might be a topic of a future update. You can’t have everything all in one go.

Task #4.5: Check I/O Calibration.

The RAC Database Health Check reported this failure:

Verification Check       : FILESYSTEMIO_OPTIONS
Verification Description : Checks FILESYSTEMIO_OPTIONS parameter
Verification Result      : WARNING
Verification Summary     : Check for FILESYSTEMIO_OPTIONS failed
Additional Details       : FILESYSTEMIO_OPTIONS=setall supports both direct I/O and
                           asynchronus I/O which helps to achieve optimal performance
                           with database data files

Database(Instance) Status  Expected Value                 Actual Value
---------------------------------------------------------------------------------------
rac1db             FAILED  filesystemio_options = SETALL  filesystemio_options = none

We know our RAC database is using ASM for its storage which should optimize I/O for us. One of the prerequisites of running Oracle’s I/O Calibration is to ensure asynchronous I/O is enabled for all the database datafiles, so let’s check that now:

SQL> SELECT NAME,
            ASYNCH_IO
     FROM   V$DATAFILE F,
            V$IOSTAT_FILE I
     WHERE  F.FILE#=I.FILE_NO
     AND    FILETYPE_NAME='Data File';

NAME                                               ASYNCH_IO
-------------------------------------------------- ---------
+DATA/RAC1DB/DATAFILE/system.258.898779945         ASYNC_ON
+DATA/RAC1DB/DATAFILE/sysaux.257.898779761         ASYNC_ON
+DATA/RAC1DB/DATAFILE/undotbs1.260.898780207       ASYNC_ON
+DATA/RAC1DB/DATAFILE/undotbs2.265.898783783       ASYNC_ON
+DATA/RAC1DB/DATAFILE/users.259.898780201          ASYNC_ON

Asynchronous I/O seems to be in play. We know all these files are in the +DATA ASM Diskgroup, which contains 2 ASM Disks:

SQL> select dg.name DG,
             d.name Disk,
             decode(d.GROUP_NUMBER,
                    0,'Unallocated',
                    'In Use') State,
             d.path
      from   v$asm_disk d left outer join v$asm_diskgroup dg
      on     dg.group_number = d.group_number
      order by
             dg.name,
             d.path;

DG         DISK         STATE       PATH
---------- ------------ ----------- ------------------------------
CRS        CRS_0000     In Use      /dev/oracleasm/disks/ASMDISK01
CRS2       CRS2_0000    In Use      /dev/oracleasm/disks/ASMDISK07
DATA       DATA_0000    In Use      /dev/oracleasm/disks/ASMDISK02
DATA       DATA_0001    In Use      /dev/oracleasm/disks/ASMDISK03
DATA2      DATA2_0003   In Use      /dev/oracleasm/disks/ASMDISK06
DATA2      DATA2_0000   In Use      /dev/oracleasm/disks/ASMDISK08
DATA2      DATA2_0001   In Use      /dev/oracleasm/disks/ASMDISK09
DATA2      DATA2_0002   In Use      /dev/oracleasm/disks/ASMDISK10
FRA        FRA_0000     In Use      /dev/oracleasm/disks/ASMDISK04
FRA        FRA_0001     In Use      /dev/oracleasm/disks/ASMDISK05

I/O calibration is discussed in the Oracle Database Performance Tuning Guide and can be reviewed here. Let’s run the calibration code multiple times to see what results we get with FILESYSTEMIO_OPTIONS set to NONE and SETALL. Here’s the calibration code:

SET SERVEROUTPUT ON
DECLARE
  lat  INTEGER;
  iops INTEGER;
  mbps INTEGER;
BEGIN
-- DBMS_RESOURCE_MANAGER.CALIBRATE_IO (<DISKS>, <MAX_LATENCY>, iops, mbps, lat);
   DBMS_RESOURCE_MANAGER.CALIBRATE_IO (2, 10, iops, mbps, lat);
 
  DBMS_OUTPUT.PUT_LINE ('max_iops = ' || iops);
  DBMS_OUTPUT.PUT_LINE ('latency  = ' || lat);
  dbms_output.put_line ('max_mbps = ' || mbps);
end;
/
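
Incidentally, the results of a calibration run are also recorded in the data dictionary, so you can review them after the fact with a query along these lines:

SQL> SELECT start_time,
            max_iops,
            latency,
            max_mbps
     FROM   dba_rsrc_io_calibrate;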

Having run the code multiple times with FILESYSTEMIO_OPTIONS set to NONE, let’s change it to SETALL. It’s not a dynamic parameter so we’ll need to bounce the database:

SQL> show parameter filesystemio_options           
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
filesystemio_options                 string      none

SQL> alter system set filesystemio_options = SETALL scope=spfile;
System altered.

[oracle@racnode1 racnode1]$ srvctl stop database -d RAC1DB
[oracle@racnode1 racnode1]$ srvctl start database -d RAC1DB
[oracle@racnode1 racnode1]$ srvctl status database -d RAC1DB
Instance RAC1DB1 is running on node racnode1
Instance RAC1DB2 is running on node racnode2

SQL> show parameter filesystemio_options
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
filesystemio_options                 string      SETALL

These are the results averaged for multiple runs of the code:

FILESYSTEMIO_OPTIONS Max IOPS Latency Max MBPS
NONE 9,631 9.67 105.67
SETALL 9,796 8 105.67

As you can see, the Max IOPS improved slightly with SETALL which also performed with slightly lower latency. That said, the two sets of results are so close that it makes little difference. However, things didn’t get worse by setting FILESYSTEMIO_OPTIONS to SETALL, so let’s call that a win. Sort of.

Task #4.6: Apply a Patch (22191349).

ORAchk recommended applying the Grid Infrastructure Patch Set Update 12.1.0.2.160119 (JAN2016). There are multiple steps involved, so let’s get started.

Note, this is a rolling patch. It can be applied to a running system. Clusterware will be shut down and patched as you apply the patch one node at a time.

Task #4.6a: Update opatch.

To apply this patch, the opatch utility must be version 12.1.0.1.7 or later. So let’s check the version in the GI and database $ORACLE_HOME directories:

[grid@racnode1 OPatch]$ ./opatch version
OPatch Version: 12.1.0.1.3
OPatch succeeded.

[oracle@racnode1 OPatch]$ ./opatch version
OPatch Version: 12.1.0.1.3
OPatch succeeded.

We need to update opatch which is done by downloading the latest version from MOS, patch #6880880.

We’ll update the GI and database $ORACLE_HOME directories on racnode1 first, taking a backup of the original OPatch directories as we go.

[grid@racnode1 grid]$ pwd
/u01/app/12.1.0/grid

[grid@racnode1 grid]$ mv OPatch OPatch_12.1.0.1.3
mv: cannot move `OPatch' to `OPatch_12.1.0.1.3': Permission denied

The directory is owned by root:oinstall and group does not have write permission. Let’s fix that:

[root@racnode1 12.1.0]# pwd
/u01/app/12.1.0

[root@racnode1 12.1.0]# ls -la
drwxr-xr-x 3 root oinstall 4096 Dec 17 18:09 .
drwxr-xr-x 6 root oinstall 4096 Dec 17 13:07 ..
drwxr-xr-x 73 root oinstall 4096 Dec 17 22:34 grid

[root@racnode1 12.1.0]# chmod 775 grid

[grid@racnode1 grid]$ mv OPatch OPatch_12.1.0.1.3

Now copy the new opatch zip file into the GI home and unzip it:

[grid@racnode1 grid]$ cp /nas1/PATCHES/6880880/p6880880_121010_Linux-x86-64.zip .
[grid@racnode1 grid]$ unzip p6880880_121010_Linux-x86-64.zip

Check the new opatch version then remove the zip file:

[grid@racnode1 OPatch]$ ./opatch version
OPatch Version: 12.1.0.1.10
OPatch succeeded.

[grid@racnode1 OPatch]$ cd ..
[grid@racnode1 grid]$ ls -l *.zip
-rw-r--r-- 1 grid oinstall 52774823 Feb 29 11:48 p6880880_121010_Linux-x86-64.zip

[grid@racnode1 grid]$ rm -f p6880880_121010_Linux-x86-64.zip

Note, don’t forget to change the GI $ORACLE_HOME permissions back to 755.

Now update opatch in the database $ORACLE_HOME:

[oracle@racnode1 dbhome_1]$ mv OPatch OPatch_12.1.0.1.3
[oracle@racnode1 dbhome_1]$ cp /nas1/PATCHES/6880880/p6880880_121010_Linux-x86-64.zip .
[oracle@racnode1 dbhome_1]$ unzip p6880880_121010_Linux-x86-64.zip
[oracle@racnode1 dbhome_1]$ cd OPatch

[oracle@racnode1 OPatch]$ ./opatch version
OPatch Version: 12.1.0.1.10
OPatch succeeded.

[oracle@racnode1 dbhome_1]$ cd ..
[oracle@racnode1 dbhome_1]$ ls -l *.zip
-rw-r--r-- 1 oracle oinstall 52774823 Feb 29 11:59 p6880880_121010_Linux-x86-64.zip

[oracle@racnode1 dbhome_1]$ rm -f p6880880_121010_Linux-x86-64.zip

Now repeat the same tasks for the GI and database $ORACLE_HOME directories on racnode2.

Task #4.6b: Create an Oracle Configuration Manager (OCM) Response File.

You must have an OCM response file with file permissions of 775.

[grid@racnode1 bin]$ pwd
/u01/app/12.1.0/grid/OPatch/ocm/bin

[grid@racnode1 bin]$ ./emocmrsp -no_banner -output /u01/app/12.1.0/grid/OPatch/ocm/ocm.rsp
Provide your email address to be informed of security issues, install and
initiate Oracle Configuration Manager. Easier for you if you use your My
Oracle Support Email address/User Name.
Visit http://www.oracle.com/support/policies.html for details.
Email address/User Name: 
You have not provided an email address for notification of security issues.
Do you wish to remain uninformed of security issues ([Y]es, [N]o) [N]: Y
The OCM configuration response file (/u01/app/12.1.0/grid/OPatch/ocm/ocm.rsp) was 
successfully created.

[grid@racnode1 bin]$ cd ..
[grid@racnode1 ocm]$ chmod 775 ocm.rsp
[grid@racnode1 ocm]$ ls -l
drwxr-x--- 2 grid oinstall     4096 Nov 13 14:15 bin
drwxr-x--- 2 grid oinstall     4096 Nov 13 14:15 doc
drwxr-x--- 2 grid oinstall     4096 Nov 13 14:15 lib
-rw-r----- 1 grid oinstall     638 Nov 13 14:15 ocm_platforms.txt
-rwxrwxr-x 1 grid oinstall     621 Feb 29 12:44 ocm.rsp
-rwxr-x--- 1 grid oinstall 42560960 Nov 13 14:15 ocm.zip

Repeat this step for the GI $ORACLE_HOME on racnode2.

Task #4.6c: Check the $ORACLE_HOME Inventories.

On racnode1, we need to check the GI and database $ORACLE_HOME inventories.

[grid@racnode1 OPatch]$ ./opatch lsinventory -detail -oh /u01/app/12.1.0/grid

The output is quite verbose, but can be viewed here.

[oracle@racnode1 OPatch]$ ./opatch lsinventory -detail -oh /u01/app/oracle/product/12.1.0/dbhome_1

Again, the output is quite verbose, but it can be viewed here.

Checking the inventories tells us there are no interim patches applied to either the GI or database $ORACLE_HOME directories.

Task #4.6d: Unzip the Patch.

The patch needs to be unzipped and accessible to all nodes in the cluster. It makes it easier that way. Fortunately I have some NAS storage which is mounted on both RAC nodes.

[grid@racnode1 grid]$ cd /nas1/PATCHES/22191349/grid
[grid@racnode1 grid]$ unzip p22191349_121020_Linux-x86-64.zip

Task #4.6e: Check For One-Off Patch Conflicts.

We already know that neither $ORACLE_HOME directory contains any patches at this point, so by definition there can’t be any conflicts. Still, we’ll run the conflict check anyway, which must be done as the root user:

[root@racnode1 grid]# pwd
/nas1/PATCHES/22191349/grid

[root@racnode1 grid]# export PATH=$PATH:/u01/app/12.1.0/grid/OPatch
[root@racnode1 grid]# which opatchauto
/u01/app/12.1.0/grid/OPatch/opatchauto

[root@racnode1 grid]# /u01/app/12.1.0/grid/OPatch/opatchauto apply /nas1/PATCHES/22191349/grid/22191349 -analyze -ocmrf /u01/app/12.1.0/grid/OPatch/ocm/ocm.rsp
OPatch Automation Tool
Copyright (c)2014, Oracle Corporation. All rights reserved.
OPatchauto Version : 12.1.0.1.10
OUI Version        : 12.1.0.2.0
Running from       : /u01/app/12.1.0/grid
opatchauto log file: /u01/app/12.1.0/grid/cfgtoollogs/opatchauto/22191349/
opatch_gi_2016-02-29_14-02-15_analyze.log

NOTE: opatchauto is running in ANALYZE mode. There will be no change to your system.

OCM RSP file has been ignored in analyze mode.

Parameter Validation: Successful
Configuration Validation: Successful

Patch Location: /nas1/PATCHES/22191349/grid/22191349
Grid Infrastructure Patch(es): 21436941 21948341 21948344 21948354
DB Patch(es): 21948344 21948354

Patch Validation: Successful
Grid Infrastructure home: /u01/app/12.1.0/grid
DB home(s): /u01/app/oracle/product/12.1.0/dbhome_1

Analyzing patch(es) on "/u01/app/oracle/product/12.1.0/dbhome_1" ...
Patch "/nas1/PATCHES/22191349/grid/22191349/21948344" successfully analyzed on 
 "/u01/app/oracle/product/12.1.0/dbhome_1" for apply.
Patch "/nas1/PATCHES/22191349/grid/22191349/21948354" successfully analyzed on 
 "/u01/app/oracle/product/12.1.0/dbhome_1" for apply.
Analyzing patch(es) on "/u01/app/12.1.0/grid" ...
Patch "/nas1/PATCHES/22191349/grid/22191349/21436941" successfully analyzed on 
 "/u01/app/12.1.0/grid" for apply.
Patch "/nas1/PATCHES/22191349/grid/22191349/21948341" successfully analyzed on 
 "/u01/app/12.1.0/grid" for apply.
Patch "/nas1/PATCHES/22191349/grid/22191349/21948344" successfully analyzed on 
 "/u01/app/12.1.0/grid" for apply.
Patch "/nas1/PATCHES/22191349/grid/22191349/21948354" successfully analyzed on 
 "/u01/app/12.1.0/grid" for apply.

SQL changes, if any, are analyzed successfully on the following database(s): RAC1DB

Apply Summary:
Following patch(es) are successfully analyzed:
GI Home: /u01/app/12.1.0/grid: 21436941,21948341,21948344,21948354
DB Home: /u01/app/oracle/product/12.1.0/dbhome_1: 21948344,21948354

opatchauto succeeded.

As we suspected, no conflicts so we can proceed and apply the patch.

Task #4.6f: Apply the Patch.

There are some options for applying the patch. We’ll choose the ‘non-shared GI and database $ORACLE_HOME and ACFS is not configured’ option. Each node has to be patched separately on a running system, i.e. the cluster is up. Let’s patch racnode1 first and enjoy the nightmare together. 🙂

The patch is applied by the root user using this command template:

<GI_HOME>/OPatch/opatchauto apply <UNZIPPED_PATCH_LOCATION>/22191349 -ocmrf <ocm response file>

In our case, that translates into this:

[root@racnode1 ~]# /u01/app/12.1.0/grid/OPatch/opatchauto apply /nas1/PATCHES/22191349/grid/22191349 -ocmrf /u01/app/12.1.0/grid/OPatch/ocm/ocm.rsp

This had the effect of installing patches 21948344 and 21948354 into the database $ORACLE_HOME, but it failed to install any patches into the GI $ORACLE_HOME. The reason was insufficient disk space. The patch log file contained these messages:

...
[Feb 29, 2016 2:27:40 PM]   Running prerequisite checks...
[Feb 29, 2016 2:27:41 PM]   Space Needed : 8180.049MB
[Feb 29, 2016 2:27:41 PM]   Space Usable : 7786.043MB
[Feb 29, 2016 2:27:41 PM]   Required amount of space(8180.049MB) is not available.
[Feb 29, 2016 2:27:41 PM]   Prerequisite check "CheckSystemSpace" failed.
                            The details are:
                            Required amount of space(8180.049MB) is not available.
...

The /u01 file system did not have the 8 GB of free disk space it needed. It would have been nice if the patch conflict check had told me this. Thanks Oracle! As is customary in patch README files, the command to roll back the patch is also documented. However, attempts to roll back the patch also failed because Clusterware wasn’t running and my attempts to re-start it on racnode1 repeatedly failed. A node reboot solved that problem and I tried to roll back the patch a second time. That failed as well. This time the rollback command reported these errors in the log file:

COMMAND EXECUTION FAILURE :
Using configuration parameter file: /u01/app/12.1.0/grid/OPatch/opatchautotemp_racnode1/
patchwork/crs/install/crsconfig_params
Oracle Clusterware active version on the cluster is [12.1.0.2.0]. The cluster upgrade 
state is [ROLLING PATCH]. The cluster active patch level is [0].
PROC-28: Oracle Cluster Registry already in current version
CRS-1153: There was an error setting Oracle Clusterware to rolling patch mode.
CRS-4000: Command Start failed, or completed with errors.
2016/02/29 16:01:23 CLSRSC-430: Failed to start rolling patch mode

ERROR:
Died at /u01/app/12.1.0/grid/OPatch/opatchautotemp_racnode1/patchwork/crs/
install/crspatch.pm line 776.
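
Incidentally, you can check the cluster's upgrade/patch state yourself at any time. The -f flag reports the upgrade state and active patch level:

[root@racnode1 ~]# crsctl query crs activeversion -f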

Clusterware was already in ‘rolling patch mode’, so you’d think trying to set it to that mode wouldn’t be a problem, right? Wrong! Unfortunately, this is an unpublished bug. Fortunately, Oracle already knows about it and has a workaround documented in MOS, Doc ID 1943498.1. It boils down to running this command:

[root@racnode1 ~]# crsctl stop rollingpatch

This command appears to switch Clusterware back into NORMAL mode, so it can be switched into ROLLING PATCH mode when you attempt to roll back or apply the patch. The log file also contains this rather useful tidbit of information:

--------------After fixing the cause of failure you have two options shown below:
Run 'opatchauto resume'
or
Manually run the commands listed below
---------------------------------------------------------------------------------
...

I wasn’t about to trust ‘opatchauto resume’, so I opted for running the 17 fully documented commands in sequential order. Some needed to be run as root, some as oracle and some as grid. Unbelievably, they all worked! We can see the patches which were applied by running opatch lsinventory commands:

[grid@racnode1 OPatch]$ ./opatch lsinventory

The output can be viewed here.

[oracle@racnode1 OPatch]$ ./opatch lsinventory

The output can be viewed here.

Note, opatch stores patch rollback information and files in a sub-directory within each $ORACLE_HOME called .patch_storage. This has the potential to consume a large quantity of disk space and thus reduce the free space available to the filesystem which contains the Oracle code installations. Running out of disk space in this filesystem is bad news. You should NOT simply delete or archive the .patch_storage directory. Instead, Oracle recommends one of 3 courses of action in the event disk space becomes an issue:

  1. Increase the size of the filesystem. (I know – genius right? 😂)
  2. Use symbolic links and relocate .patch_storage somewhere else. (This is what I did and it worked fine – see the sketch after this list.)
  3. Run the command: opatch util cleanup. (Documented in MOS Doc ID 550522.1)
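
Option 2 boils down to something like this for the GI home (a sketch; /nas1/patch_storage/grid_home is just an example destination, and it's safest to do this when no patching is in progress):

[root@racnode1 ~]# mkdir -p /nas1/patch_storage
[root@racnode1 ~]# cd /u01/app/12.1.0/grid
[root@racnode1 grid]# mv .patch_storage /nas1/patch_storage/grid_home
[root@racnode1 grid]# ln -s /nas1/patch_storage/grid_home .patch_storage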

So, with the $ORACLE_HOME directories patched on racnode1, it’s time to attempt applying the patch on racnode2. Yes, I did check I had more than 8 GB of free space in the /u01 filesystem before I started. Thanks for asking. When applying the patch actually works as intended, it looks like this:

[root@racnode2 ~]# . oraenv
ORACLE_SID = [+ASM2] ?
The Oracle base remains unchanged with value /u01/app/grid

[root@racnode2 trace]# export PATH=$PATH:/u01/app/12.1.0/grid/OPatch
[root@racnode2 trace]# which opatchauto
/u01/app/12.1.0/grid/OPatch/opatchauto

[root@racnode2 ~]# /u01/app/12.1.0/grid/OPatch/opatchauto apply /nas1/PATCHES/22191349/grid/22191349 -ocmrf /u01/app/12.1.0/grid/OPatch/ocm/ocm.rsp

OPatch Automation Tool
Copyright (c)2014, Oracle Corporation. All rights reserved.
OPatchauto Version  : 12.1.0.1.10
OUI Version         : 12.1.0.2.0
Running from        : /u01/app/12.1.0/grid
opatchauto log file : /u01/app/12.1.0/grid/cfgtoollogs/opatchauto/22191349/
                       opatch_gi_2016-03-01_13-17-42_deploy.log

Parameter Validation: Successful
Configuration Validation: Successful

Patch Location: /nas1/PATCHES/22191349/grid/22191349
Grid Infrastructure Patch(es): 21436941 21948341 21948344 21948354
DB Patch(es): 21948344 21948354

Patch Validation: Successful

Grid Infrastructure home: /u01/app/12.1.0/grid
DB home(s): /u01/app/oracle/product/12.1.0/dbhome_1

Performing prepatch operations on CRS Home... Successful

Applying patch(es) to "/u01/app/oracle/product/12.1.0/dbhome_1" ...
Patch "/nas1/PATCHES/22191349/grid/22191349/21948344" successfully applied 
 to "/u01/app/oracle/product/12.1.0/dbhome_1".
Patch "/nas1/PATCHES/22191349/grid/22191349/21948354" successfully applied 
 to "/u01/app/oracle/product/12.1.0/dbhome_1".

Applying patch(es) to "/u01/app/12.1.0/grid" ...
Patch "/nas1/PATCHES/22191349/grid/22191349/21436941" successfully applied 
 to "/u01/app/12.1.0/grid".
Patch "/nas1/PATCHES/22191349/grid/22191349/21948341" successfully applied 
 to "/u01/app/12.1.0/grid".
Patch "/nas1/PATCHES/22191349/grid/22191349/21948344" successfully applied 
 to "/u01/app/12.1.0/grid".
Patch "/nas1/PATCHES/22191349/grid/22191349/21948354" successfully applied 
 to "/u01/app/12.1.0/grid".

Performing postpatch operations on CRS Home... Successful

SQL changes, if any, are applied successfully on the following database(s): RAC1DB

Apply Summary:
Following patch(es) are successfully installed:
GI Home: /u01/app/12.1.0/grid: 21436941,21948341,21948344,21948354
DB Home: /u01/app/oracle/product/12.1.0/dbhome_1: 21948344,21948354

opatchauto succeeded.

Again, running the opatch lsinventory command in each $ORACLE_HOME will verify the patches have been applied.

Task #4.6g: Run datapatch.

The final step involves running a script called datapatch which is located in the OPatch directory under $ORACLE_HOME. The documentation says for a RAC environment, run this from only one node. We’re left to assume we do this after all the nodes have been patched. There are instructions for both a “Standalone DB” and “Single/Multitenant (CDB/PDB) DB”. We know our user RAC database is not a CDB, but the GIMR database (_MGMTDB) is. Since the GI $ORACLE_HOME was patched and that codeset runs the _MGMTDB database, again we have to assume we need to run the datapatch script for that database as well. It would be nice if Oracle’s documentation made that clear.

The RAC1DB instances are already up and running, so we’ll dive straight in and run the datapatch script from racnode1:

[oracle@racnode1 OPatch]$ pwd
/u01/app/oracle/product/12.1.0/dbhome_1/OPatch

[oracle@racnode1 OPatch]$ ./datapatch -verbose
SQL Patching tool version 12.1.0.2.0 on Wed Mar 2 10:16:57 2016
Copyright (c) 2015, Oracle. All rights reserved.

Log file for this invocation: /u01/app/oracle/cfgtoollogs/sqlpatch/
  sqlpatch_2324_2016_03_02_10_16_57/sqlpatch_invocation.log

Connecting to database...OK
Bootstrapping registry and package to current versions...done
Determining current state...done

Current state of SQL patches:
Bundle series PSU:
ID 160119 in the binary registry and ID 160119 in the SQL registry

Adding patches to installation queue and performing prereq checks...
Installation queue:
Nothing to roll back
Nothing to apply

SQL Patching tool complete on Wed Mar 2 10:19:11 2016

Finally, we need to verify everything worked:

SQL> select patch_id, action, status from dba_registry_sqlpatch;

PATCH_ID  ACTION          STATUS
--------- --------------- ---------------
21948354  APPLY           SUCCESS 

Next, we’ll run the datapatch script in the _MGMTDB database. The script requires that all pluggable databases be open, so let’s double check that before we run the script:

[grid@racnode1 ~]$ . oraenv
ORACLE_SID = [-MGMTDB] ?
The Oracle base remains unchanged with value /u01/app/grid

[grid@racnode1 ~]$ sqlplus / as sysdba

SQL> select name, open_mode from v$pdbs;

NAME                           OPEN_MODE
------------------------------ ----------
PDB$SEED                       READ ONLY
CLUSTER1                       READ WRITE

[grid@racnode1 ~]$ cd $ORACLE_HOME/OPatch
[grid@racnode1 OPatch]$ pwd
/u01/app/12.1.0/grid/OPatch

[grid@racnode1 OPatch]$ ./datapatch -verbose
SQL Patching tool version 12.1.0.2.0 on Wed Mar 2 10:48:36 2016
Copyright (c) 2015, Oracle. All rights reserved.
Log file for this invocation: /u01/app/grid/cfgtoollogs/sqlpatch/
  sqlpatch_14626_2016_03_02_10_48_36/sqlpatch_invocation.log

Connecting to database...OK

Note: Datapatch will only apply or rollback SQL fixes for PDBs
      that are in an open state, no patches will be applied to closed PDBs.
      Please refer to Note: Datapatch: Database 12c Post Patch SQL Automation
      (Doc ID 1585822.1)

Bootstrapping registry and package to current versions...done
Determining current state...done

Current state of SQL patches:
Bundle series PSU:
ID 160119 in the binary registry and ID 160119 in PDB CDB$ROOT, ID 160119 
in PDB PDB$SEED, ID 160119 in PDB CLUSTER1

Adding patches to installation queue and performing prereq checks...

Installation queue:

For the following PDBs: CDB$ROOT PDB$SEED CLUSTER1
   Nothing to roll back
   Nothing to apply

SQL Patching tool complete on Wed Mar 2 10:50:57 2016

Finally, we need to verify everything worked:

SQL> select patch_id, action, status from dba_registry_sqlpatch;

PATCH_ID  ACTION          STATUS
--------  --------------- ---------------
21948354  APPLY           SUCCESS

All appears to be well. Job done!

Task #4.7: Apply a PSU (January 2016 PSU – Patch 22191676).

During the writing of Part 11, the January 2016 PSUs were released by Oracle. These are cumulative and bundle together many patches which were released previously. As always, we read the README file which comes with the patch very carefully to see what we’re getting into.

Note, this is a non-rolling patch and requires downtime to apply it.

The January 2016 PSU contains these patches:

Patch # Description Applicable Homes
21948354 DB PSU 12.1.0.2.160119 Both DB Homes and Grid Home
21948344 OCW PSU 12.1.0.2.160119 Both DB Homes and Grid Home
22139226 OJVM 12.1.0.2.160119 Only DB Homes
21948341 ACFS PSU 12.1.0.2.160119 Only Grid Home
21436941 DBWLM PSU 12.1.0.2.5 Only Grid Home

Since we’ve already applied Patch #22191349, we can check to see if there’s any overlap. These are the results of running opatch lsinventory for the GI and database $ORACLE_HOME directories:

Patch # Applied to Homes
21948354 DB & GI
21948344 DB & GI
21948341 GI
21436941 GI
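
Incidentally, a quick way to list just the patches applied to each home is opatch lspatches, run from the OPatch directory of the relevant $ORACLE_HOME:

[grid@racnode1 OPatch]$ ./opatch lspatches
[oracle@racnode1 OPatch]$ ./opatch lspatches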

So there is a significant overlap with only Patch #22139226 adding anything new. We could apply just that one patch or we could test the application of the whole patch to make sure it’s smart enough to skip already applied patches and apply just the one we want (#22139226). So let’s do that. We’ll follow the same approach as before, even though we know some of the ground work has already been done.

Task #4.7a: Update opatch.

We must use opatch version 12.1.0.1.7 or later. We already have version 12.1.0.1.10.

Task #4.7b: Create an Oracle Configuration Manager (OCM) Response File.

Since things have changed, we’ll go ahead and re-generate the ocm.rsp file. Just for fun, I ran the diff command to compare the old ocm.rsp file and the new one and it reported they were different. So we’ll use a new one just to be on the safe side.

[grid@racnode1 bin]$ pwd
/u01/app/12.1.0/grid/OPatch/ocm/bin 

[grid@racnode1 bin]$ ./emocmrsp -no_banner -output /u01/app/12.1.0/grid/OPatch/ocm/ocm.rsp
Provide your email address to be informed of security issues, install and
initiate Oracle Configuration Manager. Easier for you if you use your My
Oracle Support Email address/User Name.
Visit http://www.oracle.com/support/policies.html for details.
Email address/User Name: <just press Enter here>

You have not provided an email address for notification of security issues.
Do you wish to remain uninformed of security issues ([Y]es, [N]o) [N]:  Y
The OCM configuration response file (/u01/app/12.1.0/grid/OPatch/ocm/ocm.rsp) was 
successfully created.

[grid@racnode1 bin]$ cd ..
[grid@racnode1 ocm]$ chmod 775 ocm.rsp

Repeat this step on racnode2.

Task #4.7c: Check the $ORACLE_HOME Inventories.

As before, the point of this step is to document what the inventories looked like before the patch was applied. The documentation also says that if this command fails you should contact Oracle Support, so presumably a successful run means the inventories are healthy.

Task #4.7d: Unzip the Patch.

[grid@racnode1 ~]$ cd /nas1/PATCHES/22191676/grid
[grid@racnode1 grid]$ unzip p22191676_121020_Linux-x86-64.zip

Task #4.7e: Check For One-Off Patch Conflicts.

This is where the patch software should detect the presence of the existing patches and report that they’ll be skipped, leaving only Patch #22139226 to be applied to the database $ORACLE_HOME. Here’s the command to check for conflicts:

[root@racnode1 ~]# /u01/app/12.1.0/grid/OPatch/opatchauto apply /nas1/PATCHES/22191676/grid/22191676 -analyze -ocmrf /u01/app/12.1.0/grid/OPatch/ocm/ocm.rsp

So what did this command report? Well, the main opatchauto log file reported this:

...
The following patch(es) are duplicate patches with patches installed in the 
Oracle Home.
 [ 21948354]
You have already installed same patch(es) with same UPI(s) or same version(s).
These patch(es) will be skipped.

Opatchauto skipped installing the above patch in /u01/app/12.1.0/grid
...

The GI $ORACLE_HOME log file (referenced in the main opatchauto log file) reported this:

...
The following patch(es) are duplicate patches with patches installed in the 
Oracle Home.
[ 21436941 21948341 21948344]
You have already installed same patch(es) with same UPI(s) or same version(s).
These patch(es) will be skipped.
...

So, for the GI $ORACLE_HOME, Patches #21948354, #21948344, #21948341 and #21436941 will all be skipped. This is correct!

The main opatchauto log file also reported this:

...
The following patch(es) are duplicate patches with patches installed in the 
Oracle Home.
 [ 21948354]
You have already installed same patch(es) with same UPI(s) or same version(s).
These patch(es) will be skipped.

Opatchauto skipped installing the above patch in 
 /u01/app/oracle/product/12.1.0/dbhome_1
...

The database $ORACLE_HOME log file (referenced in the main opatchauto log file) reported this:

...
Patches to apply -> [ 21948344 22139226  ]
Duplicate patches to filter -> [ 21948344  ]
The following patches are duplicate and are skipped:
[ 21948344  ]
OPatch found that the following patches are not required.
All of the constituents of the composite patch [ 21948354 ] are already installed 
in the Oracle Home.
...

So, for the database $ORACLE_HOME, Patches #21948354 and #21948344 will be skipped, leaving Patch #22139226 to be applied. Again, this is correct!

Task #4.7f: Apply the Patch.

Since we know there are no patches to apply to the GI $ORACLE_HOME, we can choose the option to only apply the patch to the database $ORACLE_HOME. The command template for that is:

opatchauto apply <UNZIPPED_PATCH_LOCATION>/22191676 -oh <oracle_home1_path>,<oracle_home2_path> -ocmrf <ocm response file>

Since we’ll be using the opatchauto command from the database $ORACLE_HOME, we should probably generate an OCM response file there:

[oracle@racnode1 bin]$ pwd
/u01/app/oracle/product/12.1.0/dbhome_1/OPatch/ocm/bin

[oracle@racnode1 bin]$ ./emocmrsp -no_banner -output /u01/app/oracle/product/12.1.0/dbhome_1/OPatch/ocm/ocm.rsp
Provide your email address to be informed of security issues, install and
initiate Oracle Configuration Manager. Easier for you if you use your My
Oracle Support Email address/User Name.
Visit http://www.oracle.com/support/policies.html for details.
Email address/User Name: <just press Enter here>

You have not provided an email address for notification of security issues.
Do you wish to remain uninformed of security issues ([Y]es, [N]o) [N]:  Y
The OCM configuration response file (/u01/app/oracle/product/12.1.0/dbhome_1/
 OPatch/ocm/ocm.rsp) was successfully created.

[oracle@racnode1 bin]$ cd ..
[oracle@racnode1 ocm]$ chmod 775 ocm.rsp

First, we need to stop the RAC database:

[oracle@racnode1 ~]$ . oraenv
ORACLE_SID = [RAC1DB1] ? 
The Oracle base remains unchanged with value /u01/app/oracle

[oracle@racnode1 ~]$ srvctl stop database -d RAC1DB

Next, stop Clusterware on the remote node (racnode2):

[root@racnode2 ~]# . oraenv
ORACLE_SID = [+ASM2] ? 
The Oracle base remains unchanged with value /u01/app/grid

[root@racnode2 ~]# crsctl stop crs

Next, ensure Clusterware is still running on the local node (racnode1):

[root@racnode1 ~]# . oraenv
ORACLE_SID = [+ASM1] ? 
The Oracle base remains unchanged with value /u01/app/grid

[root@racnode1 ~]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

Next, run the opatchauto command according to the template referenced earlier:

[root@racnode1 ~]# opatchauto apply /nas1/PATCHES/22191676/grid/22191676 -oh /u01/app/oracle/product/12.1.0/dbhome_1 -ocmrf /u01/app/oracle/product/12.1.0/dbhome_1/OPatch/ocm/ocm.rsp

The screen output can be viewed here. Overall, it says Patch #22139226 was applied, but that some SQL was not run because the RAC database was down. The good news is the SQL commands/scripts which were not executed are documented in the log file. All they do is check for the existence of the SPFILE, set CLUSTER_DATABASE=FALSE, start an instance in upgrade mode, run the datapatch script (in the $ORACLE_HOME/OPatch directory), then set CLUSTER_DATABASE=TRUE. We’ll apply the patch to racnode2, then run these SQL commands/scripts.

Following the same sequence of steps for racnode2, first generate a new OCM response file:

[oracle@racnode2 bin]$ pwd
/u01/app/oracle/product/12.1.0/dbhome_1/OPatch/ocm/bin

[oracle@racnode2 bin]$ ./emocmrsp -no_banner -output /u01/app/oracle/product/12.1.0/dbhome_1/OPatch/ocm/ocm.rsp
Provide your email address to be informed of security issues, install and
initiate Oracle Configuration Manager. Easier for you if you use your My
Oracle Support Email address/User Name.
Visit http://www.oracle.com/support/policies.html for details.
Email address/User Name: <just press Enter here>

You have not provided an email address for notification of security issues.
Do you wish to remain uninformed of security issues ([Y]es, [N]o) [N]:  Y
The OCM configuration response file (/u01/app/oracle/product/12.1.0/dbhome_1/
  OPatch/ocm/ocm.rsp) was successfully created.

[oracle@racnode2 bin]$ cd ..
[oracle@racnode2 ocm]$ chmod 775 ocm.rsp

Next, re-start Clusterware on racnode2:

[root@racnode2 22191676]# . oraenv
ORACLE_SID = [+ASM2] ? 
The Oracle base remains unchanged with value /u01/app/grid
[root@racnode2 22191676]# crsctl start crs

After a few moments, check the Clusterware software stack has re-started:

[root@racnode2 22191676]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

Next, shutdown Clusterware on the remote node (racnode1):

[root@racnode1 ~]# . oraenv
ORACLE_SID = [RAC1DB1] ? +ASM1
The Oracle base has been changed from /u01/app/oracle to /u01/app/grid

[root@racnode1 ~]# crsctl stop crs

Next, set your environment and run the opatchauto command:

[root@racnode2 ~]# . oraenv
ORACLE_SID = [root] ? RAC1DB2
The Oracle base has been set to /u01/app/oracle

[root@racnode2 ~]# export PATH=$PATH:$ORACLE_HOME/OPatch
[root@racnode2 ~]# which opatchauto
/u01/app/oracle/product/12.1.0/dbhome_1/OPatch/opatchauto

Next, run the opatchauto command:

[root@racnode2 ~]# opatchauto apply /nas1/PATCHES/22191676/grid/22191676 -oh /u01/app/oracle/product/12.1.0/dbhome_1 -ocmrf /u01/app/oracle/product/12.1.0/dbhome_1/OPatch/ocm/ocm.rsp

The screen output was the same as the opatchauto run on racnode1 and Patch #22139226 was successfully applied. Now we’ll run the SQL commands which were missed due to the RAC database not being up and running.

The first SQL script just checks for the existence of an SPFILE, so we’ll skip that.

The second SQL script sets the CLUSTER_DATABASE parameter to FALSE in the SPFILE:

[oracle@racnode2 ~]$ srvctl start instance -d RAC1DB -n racnode2
[oracle@racnode2 ~]$ sqlplus / as sysdba
SQL> @/u01/app/oracle/product/12.1.0/dbhome_1/cfgtoollogs/oplan/dataPatchfileForSetClusterFalse.sql
System altered.

The third SQL script starts up the instance in upgrade mode:

[oracle@racnode2 ~]$ srvctl stop instance -d RAC1DB -n racnode2
[oracle@racnode2 ~]$ sqlplus / as sysdba
Connected to an idle instance.
SQL> @/u01/app/oracle/product/12.1.0/dbhome_1/cfgtoollogs/oplan/dataPatchfileForStartupUpgrade.sql
ORACLE instance started.

Total System Global Area 2147483648 bytes
Fixed Size                  2926472 bytes
Variable Size            1207961720 bytes
Database Buffers          922746880 bytes
Redo Buffers               13848576 bytes
Database mounted.
Database opened.

The fourth script runs the datapatch script:

[oracle@racnode2 ~]$ /bin/bash -c 'ORACLE_HOME=/u01/app/oracle/product/12.1.0/dbhome_1 ORACLE_SID=RAC1DB2 /u01/app/oracle/product/12.1.0/dbhome_1/OPatch/datapatch'
SQL Patching tool version 12.1.0.2.0 on Thu Mar  3 19:30:35 2016
Copyright (c) 2015, Oracle.  All rights reserved.

Connecting to database...OK
Determining current state...done
Adding patches to installation queue and performing prereq checks...done
Installation queue:
  Nothing to roll back
  The following patches will be applied:
    22139226 (Database PSU 12.1.0.2.160119, Oracle JavaVM Component (Jan2016))

Installing patches...
Patch installation complete.  Total patches installed: 1

Validating logfiles...done
SQL Patching tool complete on Thu Mar  3 19:34:52 2016

The fifth SQL script sets the CLUSTER_DATABASE parameter to TRUE in the SPFILE and shuts down the instance:

[oracle@racnode2 ~]$ sqlplus / as sysdba
SQL> @/u01/app/oracle/product/12.1.0/dbhome_1/cfgtoollogs/oplan/dataPatchfileForSetClusterTrue.sql

System altered.

Database closed.
Database dismounted.
ORACLE instance shut down.

We can now re-start the Clusterware on racnode1:

[root@racnode1 ~]# crsctl start crs
CRS-4123: Oracle High Availability Services has been started.

After waiting for a few moments, we can check the status of CRS:

[root@racnode1 ~]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

Finally, we can re-start the RAC database:

[oracle@racnode2 ~]$ srvctl start database -d RAC1DB
[oracle@racnode2 ~]$ srvctl status database -d RAC1DB
Instance RAC1DB1 is running on node racnode1
Instance RAC1DB2 is running on node racnode2

Note:

  1. Always read and understand the patch README file before doing anything else.
  2. The README file will not always tell you exactly what to do. Sometimes you have to use common sense and your DBA gene to figure stuff out.
  3. Always test applying a patch to a sandbox or test system for 3 main reasons:
    • To develop your own procedure so running it in production becomes a mechanical process with little to no thinking required.
    • To verify that it actually works.
    • To determine how long it takes so you can schedule an appropriate production outage if necessary.

Task #4.8: Add a Node to the Cluster.

We have a 2 node RAC cluster called cluster1 comprised of servers racnode1 and racnode2. Now let’s expand that cluster to 3 nodes by adding racnode3. After starting up racnode3 using Oracle VM Manager and connecting to the console, there are multiple steps to prepare racnode3 to join the cluster. This mostly involves changes to the OS, networking and shared storage.

Task #4.8a: Install the Cloud Control Management Agent.

Let’s get this out of the way first. Installation instructions for the Management Agent can be found here.

Task #4.8b: Fix the Networking.

Remember that racnode3 is a clone of racnode1, so its networking will be identical to racnode1’s. We need to change that so racnode3 has its own networking setup. An example of how to do that is documented here.

Task #4.8c: Remove IPv6.

If you’re not using IPv6 addressing, its presence on the vNICs will only confuse matters. So it’s best to remove it. Instructions on how to do that can be found here.

Task #4.8d: Fix the Hostname.

When racnode3 powers up, you’ll notice it’s called racnode1. We need to change its identity to racnode3. The procedure to do that is documented here.

Task #4.8e: Edit /etc/hosts.

On racnode3, update the /etc/hosts file so it’s identical to the other RAC nodes. You know it makes sense. A copy of this file can be found here. We will be configuring shared storage on the Openfiler server a little later, so in preparation for that add this entry to the /etc/hosts file on openfiler:

200.200.20.13    racnode3-storage

Task #4.8f: Edit /etc/resolv.conf.

Copy the /etc/resolv.conf file from either of the other 2 RAC nodes. It should contain just these entries:

# Generated by NetworkManager
search mynet.com
nameserver 200.200.10.16

Task #4.8g: Edit /etc/security/limits.conf.

Ensure this file has these entries:

####################################
# for oracle user
####################################
oracle   soft   nofile    8192
oracle   hard   nofile    65536
oracle   soft   nproc     2048
oracle   hard   nproc     16384
oracle   soft   stack     10240
oracle   hard   stack     32768
oracle   soft   core      unlimited
oracle   hard   core      unlimited
oracle   hard   memlock   5500631
####################################
# for grid user
####################################
grid    soft    nofile    8192
grid    hard    nofile    65536
grid    soft    nproc     2048
grid    hard    nproc     16384
grid    soft    stack     10240
grid    hard    stack     32768
grid    soft    core      unlimited
grid    hard    core      unlimited

Task #4.8h: Rename /etc/ntp.conf.

We’re using Oracle’s Cluster Time Synchronization Service rather than NTP to keep time within the cluster. However, the mere presence of this file can cause problems, so we need to rename it. Here are the steps to do that:

[root@racnode3 ~]# ls -l /etc/ntp.conf
-rw-r--r--. 1 root root 1770 Oct 26 10:38 /etc/ntp.conf

[root@racnode3 ~]# service ntpd status
ntpd is stopped

[root@racnode3 ~]# chkconfig ntpd off
[root@racnode3 ~]# mv /etc/ntp.conf /etc/ntp.conf.ORIG

Task #4.8i: Configure User Equivalency.

User equivalency or passwordless SSH needs to be setup for the racnode3 users oracle and grid. Instructions for configuring SSH can be found here.
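
Once it's set up, a quick sanity check is to confirm both users can run a command on racnode3 from an existing node without being prompted for a password, for example:

[grid@racnode1 ~]$ ssh racnode3 date
[oracle@racnode1 ~]$ ssh racnode3 date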

Task #4.8j: Configure Shared Storage Within Openfiler.

First, we need to add racnode3-storage to the Network Access Configuration section. Open up the Openfiler management console, click on System, then add the entry for racnode3-storage using these values:

Name Network/Host Netmask Type
racnode3-storage 200.200.20.13 255.255.255.255 Share

The screen should look similar to this:

Next, Click Volumes. The screen should change to this:

Then click iSCSI Targets over on the right hand side. The screen should change to this:

The following procedure should be repeated for iSCSI targets iqn.2006-01.com.openfiler:c1vg-vol01 through iqn.2006-01.com.openfiler:c1vg-vol10.

Use the Select iSCSI Target pull down menu to select iqn.2006-01.com.openfiler:c1vg-vol01 (it should already be selected by default), then click the Change button. Click on Network ACL and use the Access pull down menu to select Allow for racnode3-storage. The screen should look similar to this:

Click the Update button, then click on Target Configuration and select the next iSCSI target.

Task #4.8k: Configure Shared Storage at the OS Level.

First, verify the iSCSI client rpm is installed, then configure the iSCSI client to start on boot up of the node:

[root@racnode3 ~]# rpm -qa | grep iscsi
iscsi-initiator-utils-6.2.0.873-14.0.1.el6.x86_64

[root@racnode3 ~]# chkconfig iscsid on
[root@racnode3 ~]# chkconfig iscsi on

Next, discover the iSCSI Targets coming from the openfiler server:

[root@racnode3 ~]# iscsiadm -m discovery -t sendtargets -p openfiler-storage
Starting iscsid:                                           [ OK ]
200.200.20.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol10
200.200.10.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol10
200.200.20.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol09
200.200.10.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol09
200.200.20.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol08
200.200.10.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol08
200.200.20.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol07
200.200.10.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol07
200.200.20.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol06
200.200.10.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol06
200.200.20.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol05
200.200.10.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol05
200.200.20.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol04
200.200.10.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol04
200.200.20.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol03
200.200.10.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol03
200.200.20.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol02
200.200.10.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol02
200.200.20.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol01
200.200.10.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol01

Next, log into each of the iSCSI targets using the openfiler’s storage network address on eth1 (200.200.20.6):

[root@racnode3 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol01 -p 200.200.20.6 -l
Logging in to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol01, 
  portal: 200.200.20.6,3260] (multiple)
Login to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol01, 
  portal: 200.200.20.6,3260] successful.

[root@racnode3 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol02 -p 200.200.20.6 -l
Logging in to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol02, 
  portal: 200.200.20.6,3260] (multiple)
Login to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol02, 
  portal: 200.200.20.6,3260] successful.

[root@racnode3 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol03 -p 200.200.20.6 -l
Logging in to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol03, 
  portal: 200.200.20.6,3260] (multiple)
Login to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol03, 
  portal: 200.200.20.6,3260] successful.

[root@racnode3 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol04 -p 200.200.20.6 -l
Logging in to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol04, 
  portal: 200.200.20.6,3260] (multiple)
Login to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol04, 
  portal: 200.200.20.6,3260] successful.

[root@racnode3 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol05 -p 200.200.20.6 -l
Logging in to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol05, 
  portal: 200.200.20.6,3260] (multiple)
Login to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol05, 
  portal: 200.200.20.6,3260] successful.

[root@racnode3 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol06 -p 200.200.20.6 -l
Logging in to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol06, 
  portal: 200.200.20.6,3260] (multiple)
Login to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol06, 
  portal: 200.200.20.6,3260] successful.

[root@racnode3 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol07 -p 200.200.20.6 -l
Logging in to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol07, 
  portal: 200.200.20.6,3260] (multiple)
Login to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol07, 
  portal: 200.200.20.6,3260] successful.

[root@racnode3 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol08 -p 200.200.20.6 -l
Logging in to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol08, 
  portal: 200.200.20.6,3260] (multiple)
Login to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol08, 
  portal: 200.200.20.6,3260] successful.

[root@racnode3 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol09 -p 200.200.20.6 -l
Logging in to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol09, 
  portal: 200.200.20.6,3260] (multiple)
Login to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol09, 
  portal: 200.200.20.6,3260] successful.

[root@racnode3 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol10 -p 200.200.20.6 -l
Logging in to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol10, 
  portal: 200.200.20.6,3260] (multiple)
Login to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol10, 
  portal: 200.200.20.6,3260] successful.

Next, configure automatic iSCSI client login:

[root@racnode3 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol01 -p 200.200.20.6 --op update -n node.startup -v automatic
[root@racnode3 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol02 -p 200.200.20.6 --op update -n node.startup -v automatic
[root@racnode3 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol03 -p 200.200.20.6 --op update -n node.startup -v automatic
[root@racnode3 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol04 -p 200.200.20.6 --op update -n node.startup -v automatic
[root@racnode3 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol05 -p 200.200.20.6 --op update -n node.startup -v automatic
[root@racnode3 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol06 -p 200.200.20.6 --op update -n node.startup -v automatic
[root@racnode3 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol07 -p 200.200.20.6 --op update -n node.startup -v automatic
[root@racnode3 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol08 -p 200.200.20.6 --op update -n node.startup -v automatic
[root@racnode3 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol09 -p 200.200.20.6 --op update -n node.startup -v automatic
[root@racnode3 ~]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol10 -p 200.200.20.6 --op update -n node.startup -v automatic
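
If you prefer, the ten logins and node.startup updates can be scripted with a simple loop (a sketch, assuming the same target naming pattern), and iscsiadm -m session will confirm all the sessions are established:

[root@racnode3 ~]# for i in $(seq -w 1 10); do
>   iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol${i} -p 200.200.20.6 -l
>   iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol${i} -p 200.200.20.6 --op update -n node.startup -v automatic
> done

[root@racnode3 ~]# iscsiadm -m session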

Task #4.8l: Configure Shared Storage at the ASM Level.

First, check the Oracle ASMLib RPMs are installed:

[root@racnode3 ~]# rpm -qa | grep oracleasm
oracleasmlib-2.0.12-1.el6.x86_64
oracleasm-support-2.1.8-1.el6.x86_64

Next, configure Oracle ASM:

[root@racnode3 ~]# oracleasm configure -i
Configuring the Oracle ASM library driver.
This will configure the on-boot properties of the Oracle ASM library
driver. The following questions will determine whether the driver is
loaded on boot and what permissions it will have. The current values
will be shown in brackets ('[]'). Hitting <ENTER> without typing an
answer will keep that current value. Ctrl-C will abort.

Default user to own the driver interface []: grid
Default group to own the driver interface []: asmadmin
Start Oracle ASM library driver on boot (y/n) [n]: y
Scan for Oracle ASM disks on boot (y/n) [y]: y

Writing Oracle ASM library driver configuration: done

Next, initialize Oracle ASM:

[root@racnode3 ~]# oracleasm init
Creating /dev/oracleasm mount point: /dev/oracleasm
Loading module "oracleasm": oracleasm
Configuring "oracleasm" to use device physical block size
Mounting ASMlib driver filesystem: /dev/oracleasm

Finally, instantiate the ASM Disks:

[root@racnode3 ~]# oracleasm scandisks
Reloading disk partitions: done
Cleaning any stale ASM disks...
Scanning system for ASM disks...
Instantiating disk "ASMDISK01"
Instantiating disk "ASMDISK02"
Instantiating disk "ASMDISK03"
Instantiating disk "ASMDISK04"
Instantiating disk "ASMDISK05"
Instantiating disk "ASMDISK06"
Instantiating disk "ASMDISK07"
Instantiating disk "ASMDISK08"
Instantiating disk "ASMDISK09"
Instantiating disk "ASMDISK10"
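
As a quick sanity check, you can also ask ASMLib to list the disk labels it can now see. It should report ASMDISK01 through ASMDISK10:

[root@racnode3 ~]# oracleasm listdisks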

Task #4.8m: Run the cluvfy Utility.

As the grid user, run the following cluvfy utility command from the original software installation node (racnode1):

[grid@racnode1 ~]$ cluvfy stage -pre nodeadd -n racnode3 -fixup -verbose

The output is very extensive and showed all the tests passed except one: the swap space check. We can safely ignore that:

Check: Swap space
  Node Name     Available                 Required                  Status
  ------------  ------------------------  ------------------------  ----------
  racnode3      4GB (4194300.0KB)         5.8306GB (6113860.0KB)    failed
  racnode1      4GB (4194300.0KB)         5.8306GB (6113860.0KB)    failed
Result: Swap space check failed

Task #4.8n: Run the GI addnode.sh.

As the grid user on racnode1, set the GI environment and run the addnode.sh script:

[grid@racnode1 ~]$ . oraenv
ORACLE_SID = [+ASM1] ? 
The Oracle base remains unchanged with value /u01/app/grid

[grid@racnode1 ~]$ cd $ORACLE_HOME/addnode
[grid@racnode1 addnode]$ pwd
/u01/app/12.1.0/grid/addnode

[grid@racnode1 addnode]$ ./addnode.sh

This invokes the Cluster Add Node GUI:

Click the Add button and enter these values for the Public Hostname and Virtual Hostname:

Click OK, then click Next:

Predictably it complained about swap space:

Click the Ignore All check box, then click Next. Click Yes:

Click Install:

The installation starts and the familiar progress bar is displayed:

After a few minutes, the run root script box appears:

In a root window on racnode3, run the root.sh script:

[root@racnode3 grid]# ./root.sh

The output can be viewed here.

When the root.sh script has completed, click OK:

Click Close and you’re done. Now let’s check that Grid Infrastructure really is up and running on racnode3.

First, run the post node add option of the cluvfy command:

[grid@racnode3 ~]$ cluvfy stage -post nodeadd -n racnode3

The output should not produce any errors.

Next, verify racnode3 is a member of the cluster:

[grid@racnode3 ~]$ olsnodes -n -i -s -t -a
racnode1       1       racnode1-vip.mynet.com       Active Hub     Unpinned
racnode2       2       racnode2-vip.mynet.com       Active Hub     Unpinned
racnode3       3       racnode3-vip.mynet.com       Active Hub     Unpinned

Next, check CRS is up and running:

[grid@racnode3 ~]$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

Next, check the status of the SCAN VIPs:

[grid@racnode3 ~]$ srvctl status scan
SCAN VIP scan1 is enabled
SCAN VIP scan1 is running on node racnode1
SCAN VIP scan2 is enabled
SCAN VIP scan2 is running on node racnode3
SCAN VIP scan3 is enabled
SCAN VIP scan3 is running on node racnode2

Next, check the status of the nodeapps:

[grid@racnode3 ~]$ srvctl status nodeapps
VIP racnode1-vip.mynet.com is enabled
VIP racnode1-vip.mynet.com is running on node: racnode1
VIP racnode2-vip.mynet.com is enabled
VIP racnode2-vip.mynet.com is running on node: racnode2
VIP racnode3-vip.mynet.com is enabled
VIP racnode3-vip.mynet.com is running on node: racnode3
Network is enabled
Network is running on node: racnode1
Network is running on node: racnode2
Network is running on node: racnode3
ONS is enabled
ONS daemon is running on node: racnode1
ONS daemon is running on node: racnode2
ONS daemon is running on node: racnode3

Finally, check the status of ASM:

[grid@racnode3 ~]$ srvctl status asm
ASM is running on racnode2,racnode3,racnode1

Task #4.8o: Run the Database addnode.sh.

With racnode3 now officially a member of the cluster, we need to make it a database server. This is done in two stages: first, clone the database software from an existing cluster node, then add a RAC database instance to racnode3.

[oracle@racnode1 addnode]$ pwd
/u01/app/oracle/product/12.1.0/dbhome_1/addnode

[oracle@racnode1 addnode]$ ./addnode.sh "CLUSTER_NEW_NODES={racnode3}"

Note, a non-domain qualified hostname is sufficient, i.e. racnode3 rather than racnode3.mynet.com.

The addnode script invokes a database installer GUI. Click the checkbox for racnode3:

Click Next:

As usual, it’s complaining about swap space which we know we can ignore. Click the Ignore All checkbox, then click Next:

Click Yes:

On the Summary screen, click Install:

The database $ORACLE_HOME on racnode1 is cloned to racnode3:

Towards the end of the cloning process, you’ll be asked to run a root.sh script on racnode3:

[root@racnode3 dbhome_1]# pwd
/u01/app/oracle/product/12.1.0/dbhome_1

[root@racnode3 dbhome_1]# ./root.sh
Performing root user operation.

The following environment variables are set as:
   ORACLE_OWNER= oracle
   ORACLE_HOME= /u01/app/oracle/product/12.1.0/dbhome_1

Enter the full pathname of the local bin directory: [/usr/local/bin]:
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.

Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.

Now product-specific root actions will be performed.

Once the root.sh script completes, return to the GUI, click OK to confirm the completion of the root.sh script, then click Close:

Task #4.8p: Run the Database Configuration Assistant (dbca).

All that’s happened so far is the Oracle Database software has been installed onto racnode3. To extend the RAC database to a third instance, we need to run the Database Configuration Assistant from a node where an instance is already up and running, e.g. racnode1:

[oracle@racnode1 ~]$ . oraenv
ORACLE_SID = [RAC1DB1] ?
The Oracle base remains unchanged with value /u01/app/oracle

[oracle@racnode1 ~]$ which dbca
/u01/app/oracle/product/12.1.0/dbhome_1/bin/dbca

[oracle@racnode1 ~]$ dbca

Select Instance Management, then click Next:

Select Add an Instance, then click Next:

Select RAC1DB, then click Next:

Click Next:

Accept the default instance name of RAC1DB3 and click Next:

On the Summary screen, click Finish:

The GUI does its thing:

Finally, click Close:
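
Note, dbca can also add the instance silently from the command line if you'd rather skip the GUI. This is only a sketch: double-check the exact option names against dbca -help for your release and substitute your own SYS password:

[oracle@racnode1 ~]$ dbca -silent -addInstance -nodeList racnode3 -gdbName RAC1DB.mynet.com -instanceName RAC1DB3 -sysDBAUserName sys -sysDBAPassword <your-sys-password>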

We should now have a new instance up and running on racnode3. Let’s find out:

First, add a RAC1DB3 entry to the local /etc/oratab file.
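The entry is just the instance name plus the database home, along these lines (assuming the same home path used throughout this series):

RAC1DB3:/u01/app/oracle/product/12.1.0/dbhome_1:N

Then set the environment and connect: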

[oracle@racnode3 dbhome_1]$ . oraenv
ORACLE_SID = [oracle] ? RAC1DB3
The Oracle base has been set to /u01/app/oracle

[oracle@racnode3 dbhome_1]$ sqlplus / as sysdba

SQL> select instance_name from v$instance;

INSTANCE_NAME
----------------
RAC1DB3

We know we’re using ASM, but we had no direct control over where the new instance’s files were created. So let’s make sure they exist in the correct place:

SQL> select lf.member 
     from v$logfile lf
     where lf.group# in 
     (
     select l.group# from v$log l where l.thread# = 3
     );

MEMBER
--------------------------------------------------------------------------------
+DATA/RAC1DB/ONLINELOG/group_6.270.905859405
+FRA/RAC1DB/ONLINELOG/group_6.481.905859405
+DATA/RAC1DB/ONLINELOG/group_5.271.905859405
+FRA/RAC1DB/ONLINELOG/group_5.482.905859407

SQL> select file_name          
     from dba_data_files
     where tablespace_name = 'UNDOTBS3';

FILE_NAME
--------------------------------------------------------------------------------
+DATA/RAC1DB/DATAFILE/undotbs3.269.905859401

Finally, let’s make sure the cluster agrees the RAC1DB3 instance is part of our setup:

[oracle@racnode3 ~]$ srvctl status database -d RAC1DB
Instance RAC1DB1 is running on node racnode1
Instance RAC1DB2 is running on node racnode2
Instance RAC1DB3 is running on node racnode3

And for those of you who are paranoid like myself, let’s just see the process running:

[oracle@racnode3 dbhome_1]$ ps -ef | grep -v grep | grep pmon
oracle   17052     1 0 11:37 ?       00:00:00 ora_pmon_RAC1DB3
grid     23605     1 0 Mar05 ?       00:00:11 asm_pmon_+ASM3

Task #4.9: Delete a Node From the Cluster.

Note, before we dive into deleting the node we just added, there are a few items worth mentioning so we’re all on the same page:

  1. Our RAC database is admin-managed and we will be removing the instance and the Oracle Database software. The procedure is different for a policy-managed database.
  2. Before removing an instance, ensure any services associated with that instance are relocated to other instances.
  3. Ensure the services are configured such that the deleted instance is neither a preferred nor an available instance (see the sketch after this list).
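
For items 2 and 3, something along these lines would do the job. This is only a sketch: the service name RAC1DB_SVC is hypothetical (we haven't created any services in this series) and it uses the legacy single-letter srvctl options seen elsewhere in this post. The first command moves the running service off RAC1DB3; the second redefines the preferred instance list so RAC1DB3 no longer features in it:

[oracle@racnode1 ~]$ srvctl relocate service -d RAC1DB -s RAC1DB_SVC -i RAC1DB3 -t RAC1DB1
[oracle@racnode1 ~]$ srvctl modify service -d RAC1DB -s RAC1DB_SVC -n -i RAC1DB1,RAC1DB2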

Task #4.9a: Run the Database Configuration Assistant (dbca).

From a node other than the one which runs the instance you wish to delete, fire up dbca:

[oracle@racnode1 ~]$ . oraenv
ORACLE_SID = [RAC1DB1] ? 
The Oracle base remains unchanged with value /u01/app/oracle

[oracle@racnode1 ~]$ dbca

Select Instance Management, then click Next:

Select Delete an Instance, then click Next:

Select RAC1DB, then click Next:

Select RAC1DB3, then click Next:

Click Finish:

Click OK to proceed:

The familiar progress bar tracks progress, then rather bizarrely a dialog box appears containing a red cross, the digit 1 and an OK button. Clicking OK closes dbca. The question is, did the removal of the RAC1DB3 instance succeed? Let’s find out:

The instance isn’t running on racnode3:

[oracle@racnode3 ~]$ ps -ef | grep -v grep | grep pmon
grid     23605     1 0 Mar05 ?       00:00:13 asm_pmon_+ASM3

Let’s check the configuration of the RAC database:

[oracle@racnode3 ~]$ srvctl config database -d RAC1DB
Database unique name: RAC1DB
Database name: RAC1DB
Oracle home: /u01/app/oracle/product/12.1.0/dbhome_1
Oracle user: oracle
Spfile: +DATA/RAC1DB/PARAMETERFILE/spfile.268.898784645
Password file: +DATA/RAC1DB/PASSWORD/pwdrac1db.256.898779707
Domain: mynet.com
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: 
Disk Groups: FRA,DATA
Mount point paths: 
Services: 
Type: RAC
Start concurrency: 
Stop concurrency: 
OSDBA group: dba
OSOPER group: oper
Database instances: RAC1DB1,RAC1DB2
Configured nodes: racnode1,racnode2
Database is administrator managed

The cluster doesn’t see the RAC1DB3 instance as a known resource:

[oracle@racnode3 ~]$ srvctl status database -d RAC1DB
Instance RAC1DB1 is running on node racnode1
Instance RAC1DB2 is running on node racnode2

The Online Redo Logs allocated to the instance have been deleted from the database:

SQL> select lf.member 
     from v$logfile lf
     where lf.group# in 
     (
     select l.group# from v$log l where l.thread# = 3
     );

no rows selected

Let’s double check thread #3 has really gone:

SQL> select inst_id, group#, thread#, members from gv$log;

   INST_ID     GROUP#    THREAD#    MEMBERS
---------- ---------- ---------- ----------
         1          1          1          2
         1          2          1          2
         1          3          2          2
         1          4          2          2
         2          1          1          2
         2          2          1          2
         2          3          2          2
         2          4          2          2

8 rows selected.

The UNDO tablespace has also been removed:

SQL> select tablespace_name 
     from   dba_tablespaces
     where  tablespace_name like 'UNDO%';

TABLESPACE_NAME
------------------------------
UNDOTBS1
UNDOTBS2

Looks like the instance RAC1DB3 has been deleted.

The Oracle Database software is still installed and the racnode3 node is still part of the cluster:

[oracle@racnode3 dbhome_1]$ pwd
/u01/app/oracle/product/12.1.0/dbhome_1

[grid@racnode3 ~]$ olsnodes -n -i -s -t -a
racnode1       1       racnode1-vip.mynet.com       Active Hub     Unpinned
racnode2       2       racnode2-vip.mynet.com       Active Hub     Unpinned
racnode3       3       racnode3-vip.mynet.com       Active Hub     Unpinned

Task #4.9b: Deinstall the Oracle Database Software.

To update the Oracle Inventory on racnode3, run this:

[oracle@racnode3 bin]$ pwd
/u01/app/oracle/product/12.1.0/dbhome_1/oui/bin

[oracle@racnode3 bin]$ ./runInstaller -updateNodeList ORACLE_HOME=/u01/app/oracle/product/12.1.0/dbhome_1 "CLUSTER_NODES={racnode3}" -local
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB.   Actual 4044 MB    Passed
The inventory pointer is located at /etc/oraInst.loc
'UpdateNodeList' was successful.

To delete the database $ORACLE_HOME on racnode3, run this:

[oracle@racnode3 deinstall]$ pwd
/u01/app/oracle/product/12.1.0/dbhome_1/deinstall

[oracle@racnode3 deinstall]$ ./deinstall -local

The output is quite verbose and can be viewed here. There was one failure. The script could not delete the directory /u01/app/oracle because another session was in that directory. Lesson learned! Apart from that, the Oracle Database software was completely removed from racnode3.

Finally, update the Oracle Inventories on the remaining nodes by running this:

[oracle@racnode1 bin]$ pwd
/u01/app/oracle/product/12.1.0/dbhome_1/oui/bin

[oracle@racnode1 bin]$ ./runInstaller -updateNodeList ORACLE_HOME=/u01/app/oracle/product/12.1.0/dbhome_1 "CLUSTER_NODES={racnode1,racnode2}" LOCAL_NODE=racnode1
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB.   Actual 4032 MB    Passed
The inventory pointer is located at /etc/oraInst.loc
'UpdateNodeList' was successful.

The Oracle Database software installation has been deleted, but racnode3 is still seen as part of the cluster:

[oracle@racnode3 oracle]$ pwd
/u01/app/oracle

[oracle@racnode3 oracle]$ ls -la
drwxrwxr-x 2 oracle oinstall 4096 Mar  7 21:09 .
drwxr-xr-x 6 root   oinstall 4096 Mar  4 10:59 ..

[grid@racnode1 ~]$ olsnodes -n -i -s -t -a
racnode1        1       racnode1-vip.mynet.com        Active  Hub     Unpinned
racnode2        2       racnode2-vip.mynet.com        Active  Hub     Unpinned
racnode3        3       racnode3-vip.mynet.com        Active  Hub     Unpinned

Task #4.9c: Deinstall the Oracle Grid Infrastructure Software.

To update the Oracle Inventory on racnode3, run this:

[grid@racnode3 bin]$ pwd
/u01/app/12.1.0/grid/oui/bin

[grid@racnode3 bin]$ ./runInstaller -updateNodeList ORACLE_HOME=/u01/app/12.1.0/grid "CLUSTER_NODES={racnode3}" CRS=TRUE -silent -local
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB.   Actual 4044 MB    Passed
The inventory pointer is located at /etc/oraInst.loc
'UpdateNodeList' was successful.

To delete the GI $ORACLE_HOME on racnode3, run the following command.

Note, this will shut down Clusterware and ASM on racnode3 and relocate the LISTENER_SCANn to another node.

[grid@racnode3 ~]$ /u01/app/12.1.0/grid/deinstall/deinstall -local

The output is quite verbose and can be viewed here. In addition, part way through a generated script needs to be run as the root user. Its output can be viewed here.

To update the Oracle Inventories of the remaining nodes, run these commands:

[grid@racnode1 bin]$ pwd
/u01/app/12.1.0/grid/oui/bin

[grid@racnode1 bin]$ ./runInstaller -updateNodeList ORACLE_HOME=/u01/app/12.1.0/grid "CLUSTER_NODES={racnode1,racnode2}" CRS=TRUE -silent
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB.   Actual 4028 MB    Passed
The inventory pointer is located at /etc/oraInst.loc
'UpdateNodeList' was successful.

[oracle@racnode1 bin]$ pwd
/u01/app/oracle/product/12.1.0/dbhome_1/oui/bin

[oracle@racnode1 bin]$ ./runInstaller -updateNodeList ORACLE_HOME=/u01/app/oracle/product/12.1.0/dbhome_1 "CLUSTER_NODES={racnode1,racnode2}"
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB.   Actual 4028 MB    Passed
The inventory pointer is located at /etc/oraInst.loc
'UpdateNodeList' was successful.

Then, as the root user on one of the remaining nodes, delete racnode3 from the cluster:

[root@racnode1 ~]# crsctl delete node -n racnode3
CRS-4661: Node racnode3 successfully deleted.

Finally, let’s verify racnode3 is no longer part of the cluster:

[grid@racnode1 ~]$ cluvfy stage -post nodedel -n racnode3 -verbose
Performing post-checks for node removal 

Checking CRS integrity...
The Oracle Clusterware is healthy on node "racnode1"

CRS integrity check passed
Clusterware version consistency passed.

Result: Node removal check passed
Post-check for node removal was successful.

[grid@racnode1 ~]$ olsnodes -n -i -s -t -a
racnode1        1       racnode1-vip.mynet.com        Active  Hub     Unpinned
racnode2        2       racnode2-vip.mynet.com        Active  Hub     Unpinned

So there it is, racnode3 successfully removed from the cluster.

That about wraps it up for Part 11. In Part 12 we tackle Oracle Recovery Manager. You never know, you might want to back something up and restore it again. See you next time.

If you have any comments or questions about this post, please use the Contact form here.

Oracle VM Tips

 

Oracle VM can be a little tricky to drive, so I use this post to document some gotchas and what I did to resolve them.

Use the following links to jump to the item you need.

Index:

Error Starting Up VMs:

When starting up VMs from OVM Manager, you may encounter this error:

Error: Device 1 (vif) could not be connected. Hotplug scripts not working

Note, the Device could be 0, 2, etc. depending upon the number of vNICs you have configured.

What’s happening is the vNICs are not being initialized by the Hotplug scripts within the specified time limit. It could also be a side effect of a Xen memory leak, which is a known bug. If the VM can be started without any vNICs, then you’ll know you have this issue. Here are three things you can try if you’re in this situation.

Solution #1: Add the vNICs one at a time.

As an example, RACNODE2_VM has an ID of 0004fb00000600000703eab3e0c76af5.

We know from creating the VM storage repositories earlier that the VM_Filesystems_Repo has an ID of 0004fb000003000059416081b6e25e36. If we log onto the Oracle VM server itself, we can locate this VM’s configuration file, vm.cfg:

[root@ovmsvr]# cd /
[root@ovmsvr]# cd OVS
[root@ovmsvr]# cd Repositories
[root@ovmsvr]# cd 0004fb000003000059416081b6e25e36
[root@ovmsvr]# cd VirtualMachines
[root@ovmsvr]# cd 0004fb00000600000703eab3e0c76af5
[root@ovmsvr]# pwd
/OVS/Repositories/0004fb000003000059416081b6e25e36/VirtualMachines/
0004fb00000600000703eab3e0c76af5

[root@ovmsvr]# ls -l
total 0
-rw------- 1 root root 981 Dec 17 22:03 vm.cfg

The vm.cfg file contains the VM configuration data. The first line in vm.cfg specifies the vNIC MAC addresses. For example:

vif = ['mac=00:21:f6:d2:45:a0,bridge=c0a80000', 'mac=00:21:f6:e6:f6:71,
bridge=101393ebe0', 'mac=00:21:f6:2f:de:c5,bridge=10a42e83d3']

Make a copy of this line and comment it out so you have a record of what it looked like originally. Then edit the original line, removing all but the first mac and bridge pair. Then start the VM. If that works, stop the VM, edit this line to include the first 2 mac and bridge pairs, then start the VM again. If that works, try starting the VM with this line containing all 3 mac and bridge pairs (i.e. as the line looked originally). This has worked for me, but not always.
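
For example, the first pass through the vm.cfg edit might look like this, using the MAC and bridge values shown above (yours will differ):

# Original vif line, kept as a commented-out backup:
#vif = ['mac=00:21:f6:d2:45:a0,bridge=c0a80000', 'mac=00:21:f6:e6:f6:71,bridge=101393ebe0', 'mac=00:21:f6:2f:de:c5,bridge=10a42e83d3']

# First attempt: just the first mac/bridge pair.
vif = ['mac=00:21:f6:d2:45:a0,bridge=c0a80000']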

Solution #2: Restart the ovs-agent service.

The problem may be caused by a memory leak which is a known bug. This is documented on MOS, Doc ID 2077214.1. What that note effectively tells you to do is restart the ovs-agent service:

[root@ovmsvr ~]# service ovs-agent restart
Stopping Oracle VM Agent:                                 [ OK ]
Starting Oracle VM Agent:                                 [ OK ]

This is a quick and easy workaround and has worked for me.

Solution #3: Edit and re-start.

The problem could be the value of DEVICE_CREATE_TIMEOUT which defaults to 100. According to MOS, Doc ID 1089604.1, this timeout value can be increased. Thus giving the Hotplug scripts more time to complete their tasks. This is done by editing this file, /etc/xen/xend-config.sxp.

Find this line:

#(device-create-timeout 100)

and change it to this:

(device-create-timeout 150)
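
If you prefer a one-liner, something like this should make the same edit, assuming the commented default appears exactly as shown above:

[root@ovmsvr ~]# sed -i 's/^#(device-create-timeout 100)/(device-create-timeout 150)/' /etc/xen/xend-config.sxp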

I have had some success with this solution, but VMs have refused to start after making this change. I got around that by using solution #2.

Note, making this change effective requires a restart of the OVM Server, which takes ALL your VMs down!

 

Error Logging Into Oracle VM Manager:

If you’ve not logged into OVM Manager for a while, you may find it’s taken an extended vacation. This can take the form of having to wait a really long time after you’ve entered your username and password, then it throws an “unexpected” error with this accompanying text:

java.lang.NullPointerException

I know, right? How unusual for Java to have had a problem. I mean it’s so rock solid and stable!

Now, you’re probably wanting to log in because you have something to do and don’t have a week to trawl through countless log files trying to find out why Java has had a hissy fit. So the quickest and simplest way to get logged into OVM Manager after seeing this error is to stop and re-start it:

[root@ovmmgr ~]# service ovmm stop
Stopping Oracle VM Manager                                 [  OK  ]
[root@ovmmgr ~]# service ovmm start
Starting Oracle VM Manager                                 [  OK  ]

Once OVM Manager is up and running, try logging in again.

 

VM Stuck on ‘Stopping’:

Every now and then stopping a VM puts the VM into a ‘Stopping’ state in OVM Manager and that’s where it stays. Aborting the stop or restart and trying again has no effect other than to wind you up. Brilliant! Fortunately there’s a back door method to kick the VM to death, then resurrect it safely. For this you will need the ID for the VM. This can be obtained via OVM Manager.

Click the arrowhead to the left of the VM name. This opens up the Configuration tab. Make a note of the long VM ID string.

Next, log into the OVM server as root and locate the directory where the VM configuration file is located (vm.cfg). The path will be something like this:

/OVS/Repositories/<ID directory name>/VirtualMachines/<VM ID>/vm.cfg
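
Note, you can also see which domains are actually running directly on the OVM server. For an Oracle VM guest, the domain name that xm reports should be this same long VM ID string:

[root@ovmsvr ~]# xm list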

Next, run two xm commands: one to destroy the VM (it doesn’t actually destroy it; rather, it kills the processes running the VM) and the other to re-create the VM from its configuration file. Here’s an example (where the string ending in 8f4 is the ID of the VM):

[root@ovmsvr ~ ]# xm destroy 0004fb0000060000129f6b1374e4c8f4

[root@ovmsvr ~ ]# xm create -c /OVS/Repositories/0004fb000003000059416081b6e25e36/VirtualMachines/0004fb0000060000129f6b1374e4c8f4/vm.cfg

Whenever I’ve tried this, the command line has hung until I stopped and started the VM again via OVM Manager, at which point I got the cursor back. YMMV.

Build Your Own Oracle Infrastructure: Part 10 – Install Oracle GI & Database Software

After building and configuring the hardware infrastructure, it’s time to install some Oracle software.
If all the prior tasks have been completed successfully, the software installations should be easy.

In Part 10, we’ll install the OEM Management Agent, Oracle 12c Grid Infrastructure and the database software.

Quick links to all the tasks:

Task #1: Install the OEM Cloud Control Management Agent.

To avoid the hassle of adding targets manually to Cloud Control later, we’ll install the Management Agent now so it’ll just pick up targets as they materialize. Just 4 steps should do it.

Task #1a: Create a Management Agent Software Directory.

We need to create a base directory for the Management Agent software. It’s required that the base directory and all its parent directories are owned by the installing user (oracle). Therefore, we can’t install the Management Agent into /u01/app/oracle because /u01/app is owned by the grid user. So, on both racnode1 and racnode2, create a directory for the Management Agent software installation using these commands:

[root@racnode1 ~]# mkdir -p /u01/oracle/agent
[root@racnode1 ~]# chown -R oracle:oinstall /u01/oracle
[root@racnode1 ~]# chmod 755 /u01/oracle/agent

Task #1b: Manually Create an Oracle Inventory.

Since the Management Agent software will be the first Oracle software installed on the RAC server nodes, there won’t be a pre-existing Oracle Inventory it can write to. That causes a problem when Cloud Control tries to deploy the Management Agent. To get around that, do this:

[root@racnode1 ~]# touch /etc/oraInst.loc
[root@racnode1 ~]# chmod 644 /etc/oraInst.loc
[root@racnode1 ~]# mkdir /u01/app/oraInventory
[root@racnode1 ~]# chown grid:oinstall /u01/app/oraInventory
[root@racnode1 ~]# chmod 775 /u01/app/oraInventory

Then edit /etc/oraInst.loc to include these lines:

inventory_loc=/u01/app/oraInventory
inst_group=oinstall

Task #1c: Edit /etc/hosts on the OEM Cloud Control Server.

If you haven’t already, add these lines to the /etc/hosts file on the OEM Cloud Control server, oraemcc.mynet.com:

200.200.10.11    racnode1.mynet.com            racnode1
200.200.10.12    racnode2.mynet.com            racnode2

A fully documented /etc/hosts file can be viewed here.

Task #1d: Use Cloud Control.

For this walkthrough we’ll use racnode2 as the example. You could deploy Management Agents to both racnode servers at the same time or install them one at a time. Your call.

Log into Cloud Control, click on Setup ⇒ Add Target ⇒ Add Targets Manually:

Click Add Host:

Click on the Add menu, choose Manually then enter hostname (racnode2.mynet.com) and platform (Linux x86-64):

Click Next, then use these values to populate the screen:

Field Value
Installation Base Directory /u01/oracle/agent
Instance Directory /u01/oracle/agent/agent_inst

In addition, create a new Named Credential for the oracle OS user then click OK:

With all the fields populated, the screen should look similar to this:

Click Next:

Click Deploy Agent:

The deploy proceeds until it starts whining about running a root script. This can be ignored for now. Click the Continue menu then choose Continue, All hosts:

Now you can run the root.sh script as specified on the screen (/u01/oracle/agent/core/12.1.0.5.0/root.sh):

[root@racnode2 12.1.0.5.0]# ./root.sh
Finished product-specific root actions.
/etc exist
Creating /etc/oragchomelist file...

Then click the Done button and you’re done. Management Agent Deployed!

You can check all is well with the Management Agent by running these commands as the oracle user:

[oracle@racnode2 ~]$ cd /u01/oracle/agent/agent_inst/bin

[oracle@racnode2 bin]$ ./emctl status agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
Agent Version          : 12.1.0.5.0
OMS Version            : 12.1.0.5.0
Protocol Version       : 12.1.0.1.0
Agent Home             : /u01/oracle/agent/agent_inst
Agent Log Directory    : /u01/oracle/agent/agent_inst/sysman/log
Agent Binaries         : /u01/oracle/agent/core/12.1.0.5.0
Agent Process ID       : 4519
Parent Process ID      : 4258
Agent URL              : https://racnode2.mynet.com:3872/emd/main/
Local Agent URL in NAT : https://racnode2.mynet.com:3872/emd/main/
Repository URL         : https://oraemcc.mynet.com:4903/empbs/upload
Started at             : 2016-01-26 13:55:19
Started by user        : oracle
Operating System       : Linux version 3.8.13-118.2.2.el6uek.x86_64 (amd64)
Last Reload            : (none)
Last successful upload                       : 2016-02-02 21:50:25
Last attempted upload                        : 2016-02-02 21:50:25
Total Megabytes of XML files uploaded so far : 8.58
Number of XML files pending upload           : 0
Size of XML files pending upload(MB)         : 0
Available disk space on upload filesystem    : 32.51%
Collection Status                            : Collections enabled
Heartbeat Status                             : Ok
Last attempted heartbeat to OMS              : 2016-02-02 21:52:44
Last successful heartbeat to OMS             : 2016-02-02 21:52:44
Next scheduled heartbeat to OMS              : 2016-02-02 21:53:44
---------------------------------------------------------------
Agent is Running and Ready

Task #2: Run the Grid Infrastructure Installer.

With the Grid Infrastructure software copied to racnode1 and unzipped, ensure your DISPLAY environment variable is set to the IP address or hostname of your workstation:

Note, the Grid Infrastructure software is installed by the grid user.

[grid@racnode1 grid]$ export DISPLAY=<your-workstation-or-IP>:0.0

Test this is working by running an X Windows program, making sure the display opens on your workstation. Most people tend to use xclock, as do I from time to time. I prefer xeyes though, because it’s funnier:

[grid@racnode1 grid]$ xeyes
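
If nothing appears on your workstation, the usual culprit is the workstation's X server refusing remote connections. Assuming you run a local X server, adding racnode1's public IP address to its access list is usually all it takes (run this on the workstation, not on racnode1):

workstation$ xhost +200.200.10.11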

With the display set correctly, invoke the Grid Infrastructure installer:

[grid@racnode1 grid]$ ./runInstaller

Select Install and Configure Oracle Grid Infrastructure for a Cluster, then click Next:

Select Configure a Standard cluster, then click Next:

Select Advanced Installation, then click Next:

Use these values to populate the next screen, then click Next:

Field Value
Cluster Name cluster1
SCAN Name cluster1-scan.mynet.com
SCAN Port 1521
Configure GNS Uncheck this box

Use these values to populate the next screen, then click SSH connectivity:

Public Hostname Virtual Hostname
racnode1.mynet.com racnode1-vip.mynet.com
racnode2.mynet.com racnode2-vip.mynet.com

After clicking SSH connectivity, enter the OS password of the grid user, then click Setup. Let the installer re-do the SSH setup. It uses RSA keys. Again, let Oracle take care of this. If the installation goes wrong due to a user equivalency problem, it won’t be your fault! Once SSH is re-configured, click Next.

Use these values to populate the next screen, then click Next:

Interface Name Subnet Use for
eth0 200.200.10.0 Public
eth1 200.200.20.0 Do Not Use
eth2 200.200.30.0 Private

Note, although we are using eth1 to access shared storage, Oracle Grid Infrastructure has no business on this network so it must be flagged as “Do Not Use”.

Select Use Standard ASM for storage, then click Next:

Click the Change Discovery Path button:

Change the Disk Discovery Path to /dev/oracleasm/disks/* then click OK:

Use these values to populate the next screen, then click Next:

Field Value
Disk group name CRS
Redundancy External
Disk Path Check /dev/oracleasm/disks/ASMDISK01

Click Use same password for these accounts, enter a password for SYSASM twice, then click Next:

Select Do not use Intelligent Platform Management Interface (IPMI):

Click the check box to Register with Enterprise Manager (EM) Cloud Control. Use these values to populate the screen, then click Next:

Field Value
OMS host oraemcc.mynet.com
OMS port 4903
EM Admin User Name SYSMAN
EM Admin Password (referenced here)

The 3 OS groups should be pre-populated with these values. Click Next:

OS Group OS Group Name
Oracle ASM Administrator (OSASM) Group asmadmin
Oracle ASM DBA (OSDBA for ASM) Group asmdba
Oracle ASM Operator (OSOPER for ASM) Group (Optional) asmoper

Use these values to populate the next screen, then click Next:

Field Value
Oracle base /u01/app/grid
Software location /u01/app/12.1.0/grid

Make sure the Automatically run configuration scripts check box is unchecked. We’ll be running root scripts as the root user from the command line later. Click Next:

This next screen shows the results of the prerequisite checks. As you can see, the swap space issue came up again, which we know we can ignore. The screen also shows a Device Checks for ASM failure. What this boils down to is the installer’s inability to verify the shared storage on both RAC nodes.

Here is the full text of the failure:

Device Checks for ASM - This is a prerequisite check to verify that the specified 
devices meet the requirements for ASM.

Check Failed on Nodes: [racnode2,  racnode1]
Verification result of failed node: racnode2
Expected Value : cvuqdisk-1.0.9-1
Actual Value   : cvuqdisk-1.0.9-1
 Details:
 - 
PRVF-9802 : Attempt to get 'udev' information from node "racnode2" failed No UDEV rule
found for device(s) specified  - Cause:  An attempt to read the ''udev'' permission or
rule file failed, or the permission or rule file did not contain any rules for the 
specified device or devices.  
- Action:  Make sure that the ''udev'' permission or rule directory is created, the 
''udev'' permission or rule file is available and accessible by the user running the 
check and that the permission or rule file contains the correct ''udev'' rule for the 
device or devices being checked.

Verification result of failed node: racnode1
Expected Value : cvuqdisk-1.0.9-1
Actual Value   : cvuqdisk-1.0.9-1
 Details:
 - 
PRVF-9802 : Attempt to get 'udev' information from node "racnode1" failed No UDEV rule
found for device(s) specified  - Cause:  An attempt to read the ''udev'' permission or
rule file failed, or the permission or rule file did not contain any rules for the 
specified device or devices.  
- Action:  Make sure that the ''udev'' permission or rule directory is created, the 
''udev'' permission or rule file is available and accessible by the user running the 
check and that the permission or rule file contains the correct ''udev'' rule for the 
device or devices being checked.

What this essentially means is a check was made for the presence of the CVUQDISK package on both racnode1 and racnode2 (remember, we installed that here). The installer is looking for version 1.0.9-1 and it found version 1.0.9-1, but apparently that wasn’t good enough. Further checks were made for udev rules covering the shared disk selected earlier (ASMDISK01) and the installer couldn’t find those either.

I actually tried this installation both with and without udev rules and it made no difference to the reporting of this ‘failure’. I even tried re-installing the CVUQDISK package on both nodes and that also made no difference. My thought was that if this were a significant problem, we’d find out soon enough because the GI installation would fail. Hint: it doesn’t fail, so ignore this ‘failure’ if you see it.

Click Yes:

On the Summary screen, click Install:

The installation thunders along:

Until a window pops up inviting you to run a root.sh script on both nodes.

Note, it is important that you run this script to completion in the order of the nodes listed, i.e. racnode1 first and racnode2 second.

The root.sh scripts take quite a while to run and their output is quite verbose. You can see what it looks like here. Once the root.sh script completes on racnode2, click OK. The installation will continue. Again, this part of the installation takes some time:

When the installation completes, you’ll see this screen:

That’s Grid Infrastructure installed. Next, install the database software.

Task #3: Run the Database Installer.

Using FileZilla or your favorite file copy utility, copy the 2 Oracle Database software zip files to racnode1 and unzip them.

Note, the database software is installed by the oracle user.

[oracle@racnode1 ~]$ mkdir -p ./media/db_12.1.0.2
[oracle@racnode1 ~]$ cd ./media/db_12.1.0.2

[oracle@racnode1 db_12.1.0.2]$ unzip linuxamd64_12102_database_1of2.zip
[oracle@racnode1 db_12.1.0.2]$ unzip linuxamd64_12102_database_2of2.zip

[oracle@racnode1 db_12.1.0.2]$ cd database

Ensure your DISPLAY environment variable is set to the IP address or hostname of your workstation:

[oracle@racnode1 database]$ export DISPLAY=<your-workstation-or-IP>:0.0

With the display setup correctly, invoke the database installer:

[oracle@racnode1 database]$ ./runInstaller

Uncheck the I wish to receive security updates via My Oracle Support option, then click Next:

Choose to remain uninformed by clicking Yes:

Select the Install database software only option, then click Next:

Select the Oracle Real Application Clusters database installation option, then click Next:

Make sure both nodes are selected. If you want the installer to re-do the SSH (user equivalency) setup, click the SSH connectivity button. When complete, click Next:

Select your preferred language:

Select Enterprise Edition, then click Next:

Use the following values to populate the next screen, then click Next:

Field Value
Oracle base /u01/app/oracle
Software location /u01/app/oracle/product/12.1.0/dbhome_1

The next screen should pre-populate with the correct OS groups. If so, click Next:

OS Group Privilege OS Group
Database Administrator (OSDBA) group dba
Database Operator (OSOPER) group (Optional) oper
Database Backup and Recovery (OSBACKUPDBA) group backupdba
Data Guard administrative (OSDGDBA) group dgdba
Encryption Key Management administrative (OSKMDBA) group kmdba

As with the Grid Infrastructure installation, the installer complains about swap space. We can safely ignore this prerequisite check failure by checking Ignore All and clicking Next:

Click Yes:

On the Summary screen, click Install:

The familiar progress bar screen appears:

After a short while, this window appears inviting you to run the root.sh script on racnode1 and racnode2. Always run the root.sh script to completion in the order the nodes are listed. When complete, click OK:

The root.sh script output looks like this:

[root@racnode1 ~]# cd /u01/app/oracle/product/12.1.0/dbhome_1

[root@racnode1 dbhome_1]# ./root.sh
Performing root user operation.
The following environment variables are set as:
   ORACLE_OWNER= oracle
   ORACLE_HOME= /u01/app/oracle/product/12.1.0/dbhome_1
Enter the full pathname of the local bin directory: [/usr/local/bin]:
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
[root@racnode2 ~]# cd /u01/app/oracle/product/12.1.0/dbhome_1

[root@racnode2 dbhome_1]# ./root.sh
Performing root user operation.
The following environment variables are set as:
   ORACLE_OWNER= oracle
   ORACLE_HOME= /u01/app/oracle/product/12.1.0/dbhome_1
Enter the full pathname of the local bin directory: [/usr/local/bin]:
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.

Finally, the successful completion screen is displayed:

We now have everything we need in place to create an Oracle RAC database. Obviously it’s important to test and verify that what you have installed is actually up and running correctly. We’ll run a series of tests after we create a RAC database in Part 11. See you there!

If you have any comments or questions about this post, please use the Contact form here.

Build Your Own Oracle Infrastructure: Part 9 – Build Shared Storage Server.


The last major hurdle we need to overcome before installing some Oracle software and building a RAC database is to configure shared storage.

Shared storage is the key to making Oracle RAC work. The same storage concurrently visible to and updatable from multiple servers really is the heart of the system.

Enterprise class dedicated SAN or NAS filers are very cool but very expensive. What we need is a scaled down economical solution which does the same thing. The answer? Openfiler!

Quick links to all the tasks:

Task #1: Install Openfiler.

Download the Openfiler 2.99.1 ISO and burn it to a DVD using ImgBurn. Then use the DVD to boot the shared storage server computer. Installing Openfiler is very simple and similar to installing Oracle Linux. Since we’re not installing Openfiler in a virtual environment, it’s a bit tricky to capture screen shots of a bare metal install. So I’ll have to rely on a text description. Fear not, this install is ridiculously easy and completes in 8 simple steps.

Task #1a: Press Enter to install in graphical mode.

After the server boots from DVD, simply hit the Enter key to get the install started.

Task #1b: Click Next.

Once the GUI comes up, just hit the Enter key to continue.

Task #1c: Select keyboard.

Select your desired keyboard layout from the menu and click Next.

Task #1d: Select installation drive.

The shared storage server has 4 drives in total, identified as sda, sdb, sdc and sdd:

Device Physical Drive Usage
sda Samsung 850 EVO 120GB 2.5-Inch SATA III Internal SSD Openfiler OS
sdb Western Digital 300 GB VelociRaptor SATA 3 Internal HD Data
sdc Western Digital 300 GB VelociRaptor SATA 3 Internal HD Data
sdd Samsung 850 EVO 120GB 2.5-Inch SATA III Internal SSD Data

Select the option to Remove all partitions on selected drives and create default layout.
Leave the first drive, sda, checked and uncheck all the other drives.
Finally, select sda as the boot installation drive and click Next.

Task #1e: Configure network devices.

The shared storage server has two NICs. Click Edit and configure them with these values:

Interface Field Value
eth0 IP Address 200.200.10.6
Prefix (Netmask) 255.255.255.0
eth1 IP Address 200.200.20.6
Prefix (Netmask) 255.255.255.0

Note, do not forget to disable IPv6 support and ensure both NICs are Active on Boot.

Set the remaining networking fields manually with these values, then click Next:

Field Value
Hostname openfiler.mynet.com
Gateway 200.200.10.1
Primary DNS 200.200.10.1
Secondary DNS 8.8.8.8

Task #1f: Choose timezone.

Select your preferred timezone from the menu. Leave System clock uses UTC checked. Click Next.

Task #1g: Choose root password.

Enter a root password twice. Write it down somewhere safe. Now! Don’t argue, just do it. Click Next.

Task #1h: Install the software.

A familiar looking left to right progress bar shows the installation’s progress. It only takes about a minute, then you’ll see a congratulations screen. Click Reboot.

Note, this reboot sometimes sticks at a completely blank screen. Don’t be alarmed. If this happens hard power off the machine, then power it back on and all will be well.

When the machine comes back up, it will display the URL for the Openfiler console, which in our case will be, https://200.200.10.6:446/. The default username is “openfiler” and the default password is, wait for it, “password”. No quotes of course.

Once you get logged in, you’ll see a status screen similar to this:

Told you it was easy! 😀

Task #2: Configure Openfiler.

Installing Openfiler was quite easy. Configuring the storage it manages is a little more involved. There are 8 steps to this and maximum attention to detail is needed here!

Task #2a: Edit /etc/hosts.

Add the following to the openfiler server’s /etc/hosts file:

200.200.20.11 racnode1-storage
200.200.20.12 racnode2-storage

Then test network connectivity both ways from each racnode server:

[root@racnode1 ~]# ping -I eth1 openfiler-storage
[root@racnode2 ~]# ping -I eth1 openfiler-storage

[root@openfiler ~]# ping -I eth1 racnode1-storage
[root@openfiler ~]# ping -I eth1 racnode2-storage

Task #2b: Enable iSCSI.

On the status screen, click Services to go to the Manage Services page:

For iSCSI target, click Enable and Start. For iSCSI Initiator, click Enable and start. The screen should now look like this:

Task #2c: Network Access Configuration.

Click on System to display the networking configuration. Use the following values to populate the Network Access Configuration section:

Name Network/Host Netmask Type
racnode1-storage 200.200.20.11 255.255.255.255 Share
racnode2-storage 200.200.20.12 255.255.255.255 Share

The screen should now look similar to this:

Note, it’s tempting to put the hostname in the Network/Host column especially if you’ve added the correct entries in the Openfiler server’s /etc/hosts file. However, don’t! It doesn’t work when you come to discover the storage from the Oracle RAC nodes. You MUST use the IP address!

Task #2d: Partition the Disk.

Click Volumes to display the Volume Group Management screen which currently shows no volume groups having been created:

Click Block Devices which shows the physical disks attached to the server:

The /dev/sda device is where Openfiler is installed. For the first Oracle RAC cluster, we’ll be using the other Samsung SSD, /dev/sdd. Click on /dev/sdd:

In the Create a partition in /dev/sdd section, use these values then click Create:

Field Value
Mode Primary
Partition Type Physical Volume
Starting Cylinder 1
Ending Cylinder 14593

The screen should now look like this:

Note, despite the values you used, Openfiler insists on keeping 5% of the disk free.

Task #2e: Create a Volume Group.

Click Volume Groups. Enter a new volume group name, cluster1-vg and check the box to select the /dev/sdd1 partition:

Then click Add volume group and the screen display changes to this:

Task #2f: Create Logical Volumes.

The logical volumes we create will eventually end up as ASM disks visible to the Oracle RAC node servers. Click Add Volume.

We can now use this screen to create volumes within the cluster1-vg volume group. Use the following values to create 10 x 10 GB volumes:

Volume Name Volume Description Required Space (MB) Filesystem / Volume type
c1vg-vol01 Cluster1 VG Volume #1 10240 block (iSCSI,FC,etc)
c1vg-vol02 Cluster1 VG Volume #2 10240 block (iSCSI,FC,etc)
c1vg-vol03 Cluster1 VG Volume #3 10240 block (iSCSI,FC,etc)
c1vg-vol04 Cluster1 VG Volume #4 10240 block (iSCSI,FC,etc)
c1vg-vol05 Cluster1 VG Volume #5 10240 block (iSCSI,FC,etc)
c1vg-vol06 Cluster1 VG Volume #6 10240 block (iSCSI,FC,etc)
c1vg-vol07 Cluster1 VG Volume #7 10240 block (iSCSI,FC,etc)
c1vg-vol08 Cluster1 VG Volume #8 10240 block (iSCSI,FC,etc)
c1vg-vol09 Cluster1 VG Volume #9 10240 block (iSCSI,FC,etc)
c1vg-vol10 Cluster1 VG Volume #10 10240 block (iSCSI,FC,etc)

When you’re done, your screen should look similar to this:

Task #2g: Create iSCSI Targets for Logical Volumes.

An iSCSI target is a way of identifying the logical volumes. Since we have 10 logical volumes, we’ll have 10 iSCSI targets. An iSCSI target is actually identified by a Target IQN or iSCSI Qualified Name. The following procedure will need to be repeated for each of the 10 logical volumes.

Click Volumes, then iSCSI Targets.

The default Target IQNs are not very meaningful, so we’ll add some meaning by changing their names according to the values below:

Logical Volume Name Target IQN
c1vg-vol01 iqn.2006-01.com.openfiler:c1vg-vol01
c1vg-vol02 iqn.2006-01.com.openfiler:c1vg-vol02
c1vg-vol03 iqn.2006-01.com.openfiler:c1vg-vol03
c1vg-vol04 iqn.2006-01.com.openfiler:c1vg-vol04
c1vg-vol05 iqn.2006-01.com.openfiler:c1vg-vol05
c1vg-vol06 iqn.2006-01.com.openfiler:c1vg-vol06
c1vg-vol07 iqn.2006-01.com.openfiler:c1vg-vol07
c1vg-vol08 iqn.2006-01.com.openfiler:c1vg-vol08
c1vg-vol09 iqn.2006-01.com.openfiler:c1vg-vol09
c1vg-vol10 iqn.2006-01.com.openfiler:c1vg-vol10

After changing the Target IQN, click the Add button. The screen display changes to this:

Ignore the Settings section as we won’t be changing any of those values.

Context is extremely important at this stage and you need to make sure your focus is on the correct object. Therefore, make sure the Target IQN you just created is selected in the Select iSCSI Target menu. If it isn’t, as will be the case after you create the first one, select the Target IQN you need, then click the Change button. Selecting the Target IQN from the menu isn’t sufficient. You MUST click the Change button to make it the ‘active’ Target IQN.

Click LUN Mapping. The screen changes to this:

Click the Map button next to the Logical Volume you wish to map the currently active Target IQN to. First time through, Map New LUN to Target will show “iqn.2006-01.com.openfiler:c1vg-vol01”, so click the Map button on the line beginning Cluster1 VG Volume #1. The screen changes to this:

Next, click Network ACL (Access Control List). Here you define which hosts (Oracle RAC nodes) can have access to this Target IQN. Since Openfiler only knows about two nodes on the storage network (200.200.20.x), racnode1-storage and racnode2-storage are listed.

Use the Access menu to change Deny to Allow. Click Update. The screen will refresh but nothing else happens which can be a little deceiving. As long as you hit Update and the screen refreshes (don’t blink or you’ll miss it), you’re fine. Click Target Configuration to go back to this screen and repeat the procedure for c1vg-vol02 through c1vg-vol10.

Task #2h: Review iSCSI targets.

Once you’re done creating a Target IQN, LUN Mapping and Network ACL for all 10 logical volumes, you can check your handiwork. Return to the home page by clicking Status, then click iSCSI Targets:

Task #3: Configure iSCSI Targets in Oracle Linux.

Our attention now turns to making the iSCSI targets visible and usable on the Oracle RAC node servers. To do that, we use the iSCSI initiator client which we installed back in Part 7. There are 6 steps in all, so here we go.

Task #3a: Verify iSCSI client (ALL nodes).

Verify the iSCSI client is installed on both Oracle RAC nodes:

[root@racnode1 ~]# rpm -qa | grep iscsi
iscsi-initiator-utils-6.2.0.873-14.0.1.el6.x86_64
[root@racnode2 ~]# rpm -qa | grep iscsi
iscsi-initiator-utils-6.2.0.873-14.0.1.el6.x86_64

Then configure the iSCSI client to start on boot up on both Oracle RAC nodes:

[root@racnode1 ~]# chkconfig iscsid on
[root@racnode1 ~]# chkconfig iscsi on
[root@racnode2 ~]# chkconfig iscsid on
[root@racnode2 ~]# chkconfig iscsi on

Trying to start the services has little effect at this stage, but running the iscsiadm discovery command makes everything spring into life. You’ll see.

Perform this step on all nodes in the cluster.

Task #3b: Discover iSCSI targets (ALL nodes).

The iscsiadm discovery command starts the iscsid service and then discovers the iSCSI targets coming from Openfiler.

[root@racnode1 ~]# iscsiadm -m discovery -t sendtargets -p openfiler-storage
Starting iscsid:                                          [ OK ]
200.200.20.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol10
200.200.10.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol10
200.200.20.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol09
200.200.10.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol09
200.200.20.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol08
200.200.10.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol08
200.200.20.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol07
200.200.10.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol07
200.200.20.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol06
200.200.10.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol06
200.200.20.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol05
200.200.10.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol05
200.200.20.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol04
200.200.10.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol04
200.200.20.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol03
200.200.10.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol03
200.200.20.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol02
200.200.10.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol02
200.200.20.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol01
200.200.10.6:3260,1 iqn.2006-01.com.openfiler:c1vg-vol01

Note, we’re using the portal openfiler-storage (the -p option) which is on the storage network (200.200.20.x). However, the discovery output lists targets on the storage network AND the public network (200.200.10.x). I think this is an Openfiler quirk/bug. Needless to say, we’re only interested in the targets on the storage network.

Repeat this step on ALL nodes in the cluster.

Task #3c: Login to iSCSI targets (ALL nodes).

Using the iSCSI target name returned by the discovery command, use the iscsiadm command to log into each iSCSI target, making sure you reference the storage network IP addresses (200.200.20.x):

[root@racnode1 scripts]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol01 -p 200.200.20.6 -l
Logging in to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol01, portal: 200.200.20.6,3260] (multiple)
Login to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol01, portal: 200.200.20.6,3260] successful.

[root@racnode1 scripts]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol02 -p 200.200.20.6 -l
Logging in to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol02, portal: 200.200.20.6,3260] (multiple)
Login to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol02, portal: 200.200.20.6,3260] successful.

[root@racnode1 scripts]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol03 -p 200.200.20.6 -l
Logging in to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol03, portal: 200.200.20.6,3260] (multiple)
Login to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol03, portal: 200.200.20.6,3260] successful.

[root@racnode1 scripts]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol04 -p 200.200.20.6 -l
Logging in to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol04, portal: 200.200.20.6,3260] (multiple)
Login to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol04, portal: 200.200.20.6,3260] successful.

[root@racnode1 scripts]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol05 -p 200.200.20.6 -l
Logging in to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol05, portal: 200.200.20.6,3260] (multiple)
Login to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol05, portal: 200.200.20.6,3260] successful.

[root@racnode1 scripts]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol06 -p 200.200.20.6 -l
Logging in to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol06, portal: 200.200.20.6,3260] (multiple)
Login to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol06, portal: 200.200.20.6,3260] successful.

[root@racnode1 scripts]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol07 -p 200.200.20.6 -l
Logging in to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol07, portal: 200.200.20.6,3260] (multiple)
Login to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol07, portal: 200.200.20.6,3260] successful.

[root@racnode1 scripts]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol08 -p 200.200.20.6 -l
Logging in to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol08, portal: 200.200.20.6,3260] (multiple)
Login to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol08, portal: 200.200.20.6,3260] successful.

[root@racnode1 scripts]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol09 -p 200.200.20.6 -l
Logging in to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol09, portal: 200.200.20.6,3260] (multiple)
Login to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol09, portal: 200.200.20.6,3260] successful.

[root@racnode1 scripts]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol10 -p 200.200.20.6 -l
Logging in to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol10, portal: 200.200.20.6,3260] (multiple)
Login to [iface: default, target: iqn.2006-01.com.openfiler:c1vg-vol10, portal: 200.200.20.6,3260] successful.
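
Rather than typing ten near-identical commands, you could log into all the targets with a short loop. This is just a sketch which assumes the iqn.2006-01.com.openfiler:c1vg-volNN target naming and the 200.200.20.6 portal shown in the discovery output:

# Log into iSCSI targets c1vg-vol01 through c1vg-vol10 (run as root).
for n in $(seq -w 1 10); do
  iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol${n} -p 200.200.20.6 -l
done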

This causes Oracle Linux to recognize each iSCSI target as a disk device and assign it a device name. A partial fdisk -l listing confirms what has happened:

Disk /dev/sda: 10.7 GB, 10737418240 bytes
64 heads, 32 sectors/track, 10240 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdb: 10.7 GB, 10737418240 bytes
64 heads, 32 sectors/track, 10240 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdc: 10.7 GB, 10737418240 bytes
64 heads, 32 sectors/track, 10240 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdd: 10.7 GB, 10737418240 bytes
64 heads, 32 sectors/track, 10240 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sde: 10.7 GB, 10737418240 bytes
64 heads, 32 sectors/track, 10240 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdf: 10.7 GB, 10737418240 bytes
64 heads, 32 sectors/track, 10240 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdg: 10.7 GB, 10737418240 bytes
64 heads, 32 sectors/track, 10240 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdh: 10.7 GB, 10737418240 bytes
64 heads, 32 sectors/track, 10240 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdi: 10.7 GB, 10737418240 bytes
64 heads, 32 sectors/track, 10240 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdj: 10.7 GB, 10737418240 bytes
64 heads, 32 sectors/track, 10240 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

By checking the entries in /dev/disk/by-path, we can determine the mapping between the iSCSI target and the local disk device:

[root@racnode1 by-path]# pwd
/dev/disk/by-path

[root@racnode1 by-path]# ls -l | grep iscsi
lrwxrwxrwx 1 root root 9 Dec 16 15:33 ip-200.200.20.6:3260-iscsi-iqn.2006-01.com.openfiler:c1vg-vol01-lun-0 -> ../../sda
lrwxrwxrwx 1 root root 9 Dec 16 15:33 ip-200.200.20.6:3260-iscsi-iqn.2006-01.com.openfiler:c1vg-vol02-lun-0 -> ../../sdb
lrwxrwxrwx 1 root root 9 Dec 16 15:33 ip-200.200.20.6:3260-iscsi-iqn.2006-01.com.openfiler:c1vg-vol03-lun-0 -> ../../sdc
lrwxrwxrwx 1 root root 9 Dec 16 15:33 ip-200.200.20.6:3260-iscsi-iqn.2006-01.com.openfiler:c1vg-vol04-lun-0 -> ../../sdd
lrwxrwxrwx 1 root root 9 Dec 16 15:34 ip-200.200.20.6:3260-iscsi-iqn.2006-01.com.openfiler:c1vg-vol05-lun-0 -> ../../sde
lrwxrwxrwx 1 root root 9 Dec 16 15:34 ip-200.200.20.6:3260-iscsi-iqn.2006-01.com.openfiler:c1vg-vol06-lun-0 -> ../../sdf
lrwxrwxrwx 1 root root 9 Dec 16 15:34 ip-200.200.20.6:3260-iscsi-iqn.2006-01.com.openfiler:c1vg-vol07-lun-0 -> ../../sdg
lrwxrwxrwx 1 root root 9 Dec 16 15:35 ip-200.200.20.6:3260-iscsi-iqn.2006-01.com.openfiler:c1vg-vol08-lun-0 -> ../../sdh
lrwxrwxrwx 1 root root 9 Dec 16 15:35 ip-200.200.20.6:3260-iscsi-iqn.2006-01.com.openfiler:c1vg-vol09-lun-0 -> ../../sdi
lrwxrwxrwx 1 root root 9 Dec 16 15:35 ip-200.200.20.6:3260-iscsi-iqn.2006-01.com.openfiler:c1vg-vol10-lun-0 -> ../../sdj

Repeat this step on ALL nodes in the cluster.

Task #3d: Configure automatic iscsi client login (ALL nodes).

To ensure the iscsi client on each node logs into the iSCSI targets each time the system is booted or the iscsid service is restarted, automatic login must be configured. To do that, run these commands, again making sure you reference the storage network IP addresses (200.200.20.x):

[root@racnode1 scripts]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol01 -p 200.200.20.6 --op update -n node.startup -v automatic
[root@racnode1 scripts]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol02 -p 200.200.20.6 --op update -n node.startup -v automatic
[root@racnode1 scripts]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol03 -p 200.200.20.6 --op update -n node.startup -v automatic
[root@racnode1 scripts]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol04 -p 200.200.20.6 --op update -n node.startup -v automatic
[root@racnode1 scripts]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol05 -p 200.200.20.6 --op update -n node.startup -v automatic
[root@racnode1 scripts]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol06 -p 200.200.20.6 --op update -n node.startup -v automatic
[root@racnode1 scripts]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol07 -p 200.200.20.6 --op update -n node.startup -v automatic
[root@racnode1 scripts]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol08 -p 200.200.20.6 --op update -n node.startup -v automatic
[root@racnode1 scripts]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol09 -p 200.200.20.6 --op update -n node.startup -v automatic
[root@racnode1 scripts]# iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol10 -p 200.200.20.6 --op update -n node.startup -v automatic
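
Again, a short loop can save some typing here. This is a sketch assuming the same target names and portal as above:

# Set node.startup to automatic for c1vg-vol01 through c1vg-vol10 (run as root).
for n in $(seq -w 1 10); do
  iscsiadm -m node -T iqn.2006-01.com.openfiler:c1vg-vol${n} -p 200.200.20.6 --op update -n node.startup -v automatic
done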

Note, these commands do not produce any output, which is a little unfriendly, but don’t be scared!

Repeat this step on ALL nodes in the cluster.

Task #3e: Partition the disks (ONE node only).

Now that the iSCSI targets coming from the Openfiler shared storage server are visible as disk devices, we need to partition them. Each drive will have a single partition. Repeat the following command for /dev/sda through /dev/sdj:

[root@racnode1 scripts]# fdisk /dev/sda
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel with disk identifier 0xdce55f18.
Changes will remain in memory only, until you decide to write them.
After that, of course, the previous content won't be recoverable.

Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

WARNING: DOS-compatible mode is deprecated. It's strongly recommended to
         switch off the mode (command 'c') and change display units to
         sectors (command 'u').

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p

Partition number (1-4): 1
First cylinder (1-10240, default 1):
Using default value 1
Last cylinder, +cylinders or +size{K,M,G} (1-10240, default 10240):
Using default value 10240

Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.

Note, partitioning the disk devices is only done from ONE node.
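
If you’d rather not answer the fdisk prompts ten times, the same responses can be piped in via a loop. This is only a sketch. On these VMs the local disks are xvd* devices, so /dev/sda through /dev/sdj are all iSCSI LUNs, but double check that on your system before running anything like this:

# Create a single primary partition on each iSCSI LUN (run as root, ONE node only).
for dev in /dev/sd[a-j]; do
  printf 'n\np\n1\n\n\nw\n' | fdisk ${dev}
done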

Task #3f: Configure persistent device names for iSCSI targets (ALL nodes).

The discovery of the iSCSI targets by Oracle Linux just happened to map c1vg-vol01-lun-0 to /dev/sda, c1vg-vol02-lun-0 to /dev/sdb and so on. After the next reboot, these mappings may change. We need to avoid that and instead have a consistent iSCSI target to disk device mapping each time the server starts. This is achieved using the Linux device manager, udev.

Note, Oracle’s ASM library driver (Task #4a & #4b) can also achieve the same thing udev does, i.e. a consistent mapping between iSCSI targets and disk devices.

Thus, configuring udev is, strictly speaking, unnecessary.

That said, you can use udev or ASMlib or both if you want to. This procedure uses both, just to demonstrate both ways of doing the same thing, one with Oracle’s help (ASMlib) and one without (udev).

Fortunately, each iSCSI target has a unique ID which remains consistent across reboots. So we need to find out what that is. Repeat the following command for /dev/sda through /dev/sdj, making a note of the output for each disk device:

[root@racnode1 ~]# scsi_id -g -u -d /dev/sda
14f504e46494c45524941587a31772d686f4f742d7650774a
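
To capture all ten IDs in one pass, a simple loop will do. This is a sketch which assumes the iSCSI LUNs are /dev/sda through /dev/sdj as shown earlier (the local disks on these VMs are xvd* devices, so they won’t be picked up):

# Print the unique SCSI ID for each iSCSI disk device (run as root).
for dev in /dev/sd[a-j]; do
  echo "${dev}: $(scsi_id -g -u -d ${dev})"
done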

Next, create a udev rules file called 99-oracle-iscsi.rules and place it in the /etc/udev/rules.d directory where udev can find it.

Note, the 99 determines the order in which the rules file is processed; lower numbered files are processed first. The “oracle-iscsi” part is just our way of identifying the file as being related to Oracle. The “.rules” extension is what udev expects a rules file to have.

The rules file will contain 10 rules, one for each disk device. Each rule sits on a single line in the file (wrapped here for readability) and will look like this:

KERNEL=="sd?1", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -d /dev/$parent",
RESULT=="14f504e46494c45524941587a31772d686f4f742d7650774a", NAME="iscsi/asm-disk01", 
OWNER="root", GROUP="disk", MODE="0660"

Only two entries need to be changed for each disk device. The RESULT parameter needs to be set to the ID output by the scsi_id command for each disk device. The NAME parameter needs to be set to the naming convention you wish to follow. In the above example, I’m using iscsi/asm-disk01 for /dev/sda. The NAME parameter for /dev/sdb will therefore be, iscsi/asm-disk02 and so on.
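
To avoid hand-editing ten RESULT values, you could also generate the rules file with a loop. This is a sketch only, run as root, and it assumes the /dev/sda through /dev/sdj device names and the iscsi/asm-diskNN naming convention used above:

# Build /etc/udev/rules.d/99-oracle-iscsi.rules, one rule per iSCSI disk.
i=1
for dev in sda sdb sdc sdd sde sdf sdg sdh sdi sdj; do
  id=$(scsi_id -g -u -d /dev/${dev})
  printf 'KERNEL=="sd?1", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -d /dev/$parent", RESULT=="%s", NAME="iscsi/asm-disk%02d", OWNER="root", GROUP="disk", MODE="0660"\n' "${id}" "${i}" >> /etc/udev/rules.d/99-oracle-iscsi.rules
  i=$((i+1))
done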

Once the file is saved, it should have this ownership and permissions:

[root@racnode1 rules.d]# pwd
/etc/udev/rules.d

[root@racnode1 rules.d]# ls -l | grep oracle
-rw-r--r-- 1 root root 2030 Dec 16 17:06 99-oracle-iscsi.rules

To test the udev rules, run this command for each disk device (/block/sda/sda1 through /block/sdj/sdj1):

[root@racnode1 ~]# udevadm test /block/sda/sda1

Then check the /dev/iscsi directory and the persistent iscsi disk devices should be listed:

[root@racnode1 iscsi]# pwd
/dev/iscsi

[root@racnode1 iscsi]# ls -l
brw-rw---- 1 root disk 8,   1 Dec 16 17:12 asm-disk01
brw-rw---- 1 root disk 8,  17 Dec 16 17:16 asm-disk02
brw-rw---- 1 root disk 8,  33 Dec 16 17:19 asm-disk03
brw-rw---- 1 root disk 8,  49 Dec 16 17:20 asm-disk04
brw-rw---- 1 root disk 8,  65 Dec 16 17:20 asm-disk05
brw-rw---- 1 root disk 8,  81 Dec 16 17:20 asm-disk06
brw-rw---- 1 root disk 8,  97 Dec 16 17:20 asm-disk07
brw-rw---- 1 root disk 8, 113 Dec 16 17:21 asm-disk08
brw-rw---- 1 root disk 8, 129 Dec 16 17:21 asm-disk09
brw-rw---- 1 root disk 8, 145 Dec 16 17:21 asm-disk10

Repeat this step for ALL nodes in the cluster.

Now reboot ALL nodes.

Task #4: Configure Oracle ASMlib.

Only a couple of steps to follow here to configure ASMLib.

Task #4a: Configure Oracle ASM library driver (ALL nodes).

Verify the oracleasm packages are installed:

[root@racnode1 scripts]# rpm -qa | grep oracleasm
oracleasmlib-2.0.12-1.el6.x86_64
oracleasm-support-2.1.8-1.el6.x86_64

Then, as the root user, run these commands:

[root@racnode1 ~]# oracleasm configure -i
Configuring the Oracle ASM library driver.
This will configure the on-boot properties of the Oracle ASM library
driver. The following questions will determine whether the driver is
loaded on boot and what permissions it will have. The current values
will be shown in brackets ('[]'). Hitting <ENTER> without typing an
answer will keep that current value. Ctrl-C will abort.

Default user to own the driver interface []: grid
Default group to own the driver interface []: asmadmin
Start Oracle ASM library driver on boot (y/n) [n]: y
Scan for Oracle ASM disks on boot (y/n) [y]: y

Writing Oracle ASM library driver configuration: done

[root@racnode1 ~]# oracleasm init
Creating /dev/oracleasm mount point: /dev/oracleasm
Loading module "oracleasm": oracleasm
Configuring "oracleasm" to use device physical block size
Mounting ASMlib driver filesystem: /dev/oracleasm

Repeat this step for ALL nodes in the cluster.

Task #4b: Create ASM disks (ONE node only).

Using the oracleasm createdisk command, create an ASM disk for each iSCSI device created in Task #3f (/dev/iscsi/asm-disk01 through /dev/iscsi/asm-disk10):

[root@racnode1 disks]# oracleasm createdisk ASMDISK01 /dev/iscsi/asm-disk01
Writing disk header: done
Instantiating disk: done
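
The remaining nine disks follow exactly the same pattern, so a loop can create the lot. This is a sketch assuming the /dev/iscsi/asm-diskNN device names created by the udev rules in Task #3f:

# Create ASMDISK01 through ASMDISK10 from the persistent udev device names (run as root, ONE node only).
for n in $(seq -w 1 10); do
  oracleasm createdisk ASMDISK${n} /dev/iscsi/asm-disk${n}
done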

Note, if you chose not to use udev, then the command to create an ASM disk using a block device would be:

[root@racnode1 disks]# oracleasm createdisk ASMDISK01 /dev/sda1

Note, remnants of an ASM disk header can be left behind even after a drive has been re-partitioned. This can prevent the new ASM disk from being instantiated. When that happens, you will see something like this:

[root@racnode1 disks]# oracleasm createdisk ASMDISK01 /dev/iscsi/asm-disk01
Device "/dev/iscsi/asm-disk01" is already labeled for ASM disk "ASMDISK01"

If you see this, you need to destroy the old disk header. A simple way to do that is documented here. Once the header is wiped, the oracleasm createdisk command will succeed. With all the ASM disks created, they can be listed using this command:

[root@racnode1 disks]# oracleasm listdisks
ASMDISK01
ASMDISK02
ASMDISK03
ASMDISK04
ASMDISK05
ASMDISK06
ASMDISK07
ASMDISK08
ASMDISK09
ASMDISK10

They can also be seen at the OS level.

Note, the user and group ownership is grid:asmadmin:

[root@racnode1 disks]# pwd
/dev/oracleasm/disks

[root@racnode1 disks]# ls -l
brw-rw---- 1 grid asmadmin 8,   1 Dec 16 18:31 ASMDISK01
brw-rw---- 1 grid asmadmin 8,  17 Dec 16 18:23 ASMDISK02
brw-rw---- 1 grid asmadmin 8,  33 Dec 16 18:23 ASMDISK03
brw-rw---- 1 grid asmadmin 8,  49 Dec 16 18:23 ASMDISK04
brw-rw---- 1 grid asmadmin 8,  65 Dec 16 18:24 ASMDISK05
brw-rw---- 1 grid asmadmin 8,  81 Dec 16 18:26 ASMDISK06
brw-rw---- 1 grid asmadmin 8,  97 Dec 16 18:26 ASMDISK07
brw-rw---- 1 grid asmadmin 8, 113 Dec 16 18:26 ASMDISK08
brw-rw---- 1 grid asmadmin 8, 129 Dec 16 18:26 ASMDISK09
brw-rw---- 1 grid asmadmin 8, 145 Dec 16 18:26 ASMDISK10

Note, you can determine the mapping between the ASM disks and the block devices by examining the major and minor device numbers. For example, the major number of ASMDISK01 is 8 and its minor number is 1. Similarly, the major number for ASMDISK02 is also 8 and its minor number is 17. Compare the major and minor numbers with a directory listing of the block devices:

[root@racnode1 disks]# ls -l /dev/sd*1
brw-rw---- 1 root disk 8,   1 Jan 26 13:55 /dev/sda1
brw-rw---- 1 root disk 8,  17 Jan 26 13:55 /dev/sdb1
brw-rw---- 1 root disk 8,  33 Jan 26 13:55 /dev/sdc1
brw-rw---- 1 root disk 8,  49 Jan 26 13:55 /dev/sdd1
brw-rw---- 1 root disk 8,  65 Jan 26 13:55 /dev/sde1
brw-rw---- 1 root disk 8,  81 Jan 26 13:55 /dev/sdf1
brw-rw---- 1 root disk 8,  97 Jan 26 13:55 /dev/sdg1
brw-rw---- 1 root disk 8, 113 Jan 26 13:55 /dev/sdh1
brw-rw---- 1 root disk 8, 129 Jan 31 14:05 /dev/sdi1
brw-rw---- 1 root disk 8, 145 Jan 26 13:55 /dev/sdj1

So we can see that ASMDISK01 maps to /dev/sda1, ASMDISK02 maps to /dev/sdb1 and so on.

The other node will not see the ASM disks immediately. Use this command to instantiate the disks on racnode2:

[root@racnode2 ~]# oracleasm scandisks
Reloading disk partitions: done
Cleaning any stale ASM disks...
Scanning system for ASM disks...
Instantiating disk "ASMDISK08"
Instantiating disk "ASMDISK10"
Instantiating disk "ASMDISK05"
Instantiating disk "ASMDISK09"
Instantiating disk "ASMDISK07"
Instantiating disk "ASMDISK01"
Instantiating disk "ASMDISK04"
Instantiating disk "ASMDISK02"
Instantiating disk "ASMDISK03"
Instantiating disk "ASMDISK06"

Note, for Oracle Database 12c there is a new mechanism called the ASM Filter Driver. Obviously we haven’t used that here. Maybe when we re-visit this section in the future we can migrate to using that.

So there you have it. Shared storage configured and ready to go. Everything is now in place for a successful install of the Grid Infrastructure and database software. See you in Part 10 for that.

If you have any comments or questions about this post, please use the Contact form here.

Build Your Own Oracle Infrastructure: Part 8 – SSH, DNS & CVU.

It’s acronym time, boys and girls! Before we get to the fun part of configuring the shared storage in Part 9, we need to set up a few more slightly less fun items.

Secure Shell or SSH and Domain Name System or DNS.

After that we take the thrill ride of running the Cluster Verification Utility or CVU for the first time. This stuff is pretty simple, so let’s crack on.

Quick links to all the tasks:

Task #1: Setup SSH.

The RAC node from where you install the Grid Infrastructure and Oracle Database software effectively becomes the node from which you perform ALL subsequent installs. Hence, the software only needs to reside on that one server and is copied from that server to all the other nodes in the cluster as part of the installation process. Consequently, this node needs remote access to the same user account on all the other nodes without being prompted for a password.

User equivalency is effectively the setup of identical user accounts on all nodes. Identical means the same username, UID, GIDs and password. The accounts which require user equivalency are oracle and grid.

The installation of the Grid Infrastructure and database software can setup SSH for you, but you’ll need SSH configured before you get to that stage in order for the CVU to weave its magic on remote nodes. That’s why we’re setting it up ahead of the software installs. There are 8 steps in total.

Task #1a: Login as oracle and create an ssh directory (all nodes).

[root@racnode1 ~]# su - oracle
[oracle@racnode1 ~]$ pwd
/home/oracle

[oracle@racnode1 ~]$ mkdir .ssh
[oracle@racnode1 ~]$ chmod 700 .ssh

[oracle@racnode1 ~]$ ls -la | grep ssh 
drwx------ 2 oracle oinstall 4096 Dec 13 15:41 .ssh  

*** Repeat for all nodes in the cluster. ***

Task #1b: Generate RSA Keys (all nodes).

[oracle@racnode1 ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/oracle/.ssh/id_rsa):
Enter passphrase (empty for no passphrase): <just press the Enter key>
Enter same passphrase again: <just press the Enter key>
Your identification has been saved in /home/oracle/.ssh/id_rsa.
Your public key has been saved in /home/oracle/.ssh/id_rsa.pub.
The key fingerprint is:
3a:46:e5:58:c8:c1:f9:de:63:4f:d7:e4:29:d9:aa:b7 oracle@racnode1.mynet.com
The key's randomart image is:
<snipped>

*** Repeat for all nodes in the cluster. ***

Task #1c: Generate DSA Keys (all nodes).

[oracle@racnode1 ~]$ ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/home/oracle/.ssh/id_dsa):
Enter passphrase (empty for no passphrase): <just press the Enter key>
Enter same passphrase again: <just press the Enter key>
Your identification has been saved in /home/oracle/.ssh/id_dsa.
Your public key has been saved in /home/oracle/.ssh/id_dsa.pub.
The key fingerprint is:
32:40:a2:1f:e6:85:23:2f:80:e6:a0:15:9f:5e:01:9e oracle@racnode1.mynet.com
The key's randomart image is:
<snipped>

*** Repeat for all nodes in the cluster. ***

Task #1d: Create the authorized_keys file (all nodes).

[oracle@racnode1 ~]$ touch ~/.ssh/authorized_keys

[oracle@racnode1 ~]$ ls -l .ssh
-rw-r--r-- 1 oracle oinstall    0 Dec 13 15:55 authorized_keys
-rw------- 1 oracle oinstall  668 Dec 13 15:47 id_dsa
-rw-r--r-- 1 oracle oinstall  617 Dec 13 15:47 id_dsa.pub
-rw------- 1 oracle oinstall 1675 Dec 13 15:46 id_rsa
-rw-r--r-- 1 oracle oinstall  409 Dec 13 15:46 id_rsa.pub

*** Repeat for all nodes in the cluster. ***

Task #1e: Capture public keys into the authorized_keys file (first node).

First, ssh into the node you’re logged into and append its public keys to the authorized_keys file.

[oracle@racnode1 ~]$ cd .ssh

[oracle@racnode1 .ssh]$ ssh racnode1 cat /home/oracle/.ssh/id_rsa.pub >> authorized_keys
The authenticity of host 'racnode1 (200.200.10.11)' can't be established.
RSA key fingerprint is 35:a5:78:68:36:c3:c2:42:f5:df:da:5f:2c:56:2b:a7.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'racnode1,200.200.10.11' (RSA) to the list of known hosts.
oracle@racnode1's password: <enter the oracle password>

[oracle@racnode1 .ssh]$ ssh racnode1 cat /home/oracle/.ssh/id_dsa.pub >> authorized_keys

*** Repeat these steps by using ssh to connect to every other node in the cluster, again capturing the output into authorized_keys. ***

Task #1f: Copy the authorized_keys file to every node.

The authorized_keys file on the first node now contains all the data which all the other nodes need. Use scp to copy it to the other nodes.

[oracle@racnode1 .ssh]$ scp authorized_keys racnode2:/home/oracle/.ssh
oracle@racnode2's password: <enter the oracle password>
authorized_keys                          100% 2052     2.0KB/s   00:00

*** Repeat the scp to each node in the cluster. ***

Task #1g: Secure the authorized_keys file (all nodes).

[oracle@racnode1 .ssh]$ chmod 600 authorized_keys

*** Repeat for all nodes in the cluster. ***

Task #1h: Test passwordless connectivity both ways between all nodes.

[oracle@racnode1 .ssh]$ ssh racnode2 date
Sun Dec 13 16:30:13 CST 2015
[oracle@racnode2 .ssh]$ ssh racnode1 date
The authenticity of host 'racnode1 (200.200.10.11)' can't be established.
RSA key fingerprint is 35:a5:78:68:36:c3:c2:42:f5:df:da:5f:2c:56:2b:a7.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'racnode1,200.200.10.11' (RSA) to the list of known hosts.
Sun Dec 13 16:30:32 CST 2015

[oracle@racnode2 .ssh]$ ssh racnode1 date
Sun Dec 13 16:30:35 CST 2015
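
A quick way to sanity check every direction is to run a small loop from each node in turn, first as the oracle user and later as the grid user. None of the ssh calls should prompt for a password. A sketch, assuming the two node names used here:

# Run on each node; every call should return immediately without a password prompt.
for node in racnode1 racnode2; do
  ssh ${node} "hostname; date"
done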

Once you have SSH setup (also known as “user equivalency”) for the oracle user, repeat the same 8 steps for the grid user.

Task #2: Configure DNS.

For the most part you can get away with using /etc/hosts files to resolve hostnames. However, with the introduction of the Single Client Access Name (SCAN) in Oracle Database 11g Release 2, things got a little more complicated. A cluster SCAN needs to resolve to 3 IP addresses, so an /etc/hosts file won’t help.

There is a ‘smoke and mirrors’ way to get a tweaked nslookup script to return 3 IP addresses, but why not just configure DNS properly? In a production environment, there would already be a dedicated DNS server, but in our environment using a whole VM just for DNS would be excessive to say the least. The smart way to go would be to use a server that’s used for something else and is always on. Top of the list is our Oracle Enterprise Manager Cloud Control server, oraemcc.mynet.com (IP: 200.200.10.16). It won’t mind and DNS won’t interfere with its other duties.

Whenever I’ve asked a production Network Administrator to set this up for me, I’ve sometimes (not always) got the eye roll and look of exasperation. Like I’m asking them to split the atom or something. In truth, setting up DNS isn’t that difficult, but it is fiddly. One semicolon, bracket or period out of place and the whole thing doesn’t work and, more infuriatingly, won’t tell you why. So, with your keenest eye on the prize, let’s configure DNS using these 7 steps.

Task #2a: Install the bind package.

DNS functionality is provided by installing the bind package. This can be done using yum:

[root@oraemcc ~]# yum install bind-libs bind bind-utils

Task #2b: Edit /etc/named.conf.

The installation of bind will create the file /etc/named.conf. It will contain multiple lines, most of which you will not need to edit. Before you start to edit this file, take a copy of it in case you need to start over. You will also need to have several pieces of information to hand to correctly edit this and subsequent files. They are:

Information Value
Hostname of your DNS Server oraemcc.mynet.com
IP Address of your DNS Server 200.200.10.16
IP Address of your ISP’s DNS Servers 8.8.8.8 & 8.8.4.4
Internal network domain name mynet.com

The items which need to be edited are shown in red in this example /etc/named.conf file. Copy this file to your own system and make any necessary changes. When you edit your file, be VERY CAREFUL with the double quotes and periods. Missing something seemingly so innocent will make you wish you hadn’t. Three important points about this file:

  1. The forwarders line is there so server names which are not on your internal network can still be resolved. These servers would likely be on the internet, so your ISP’s DNS servers handle those lookups instead.
  2. The zone “10.200.200.in-addr.arpa.” line is there to facilitate reverse lookups. The 10.200.200 part describes the first three octets of the IP address your DNS server is on, but in reverse order. Our DNS server has the IP address 200.200.10.16, hence 10.200.200.
  3. The ownership and file permissions are crucial. Get these wrong and DNS won’t start up.

[root@oraemcc ~]# ls -l /etc/named.conf
-rw-r----- 1 root named 1114 Aug 19 17:15 /etc/named.conf

Notice the file is owned by root, with named as the group. The file permissions are 640. If your file does not match this, then run these commands:

[root@oraemcc ~]# chgrp named /etc/named.conf
[root@oraemcc ~]# chmod 640 /etc/named.conf
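
For reference, the zone and forwarder related parts of /etc/named.conf end up looking something like this. This is only a minimal sketch built from the values in the table above; treat the example file linked earlier as the definitive version:

options {
        listen-on port 53       { 127.0.0.1; 200.200.10.16; };
        directory               "/var/named";
        recursion               yes;
        allow-query             { localhost; 200.200.0.0/16; };
        forwarders              { 8.8.8.8; 8.8.4.4; };
};

zone "mynet.com." IN {
        type master;
        file "mynet.com.zone";
};

zone "10.200.200.in-addr.arpa." IN {
        type master;
        file "10.200.200.in-addr.arpa";
};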

Task #2c: Create /var/named/mynet.com.zone file.

The next step is to create your /var/named/mynet.com.zone file which will resolve hostnames on your domain, mynet.com.

The items which need to be edited are shown in red in this example /var/named/mynet.com.zone file. Copy this file to your own system and make any necessary changes. Ensure this file has the correct group ownership and file permissions:

[root@oraemcc ~]# chgrp named /var/named/mynet.com.zone
[root@oraemcc ~]# chmod 640 /var/named/mynet.com.zone

[root@oraemcc ~]# ls -l /var/named/mynet.com.zone
-rw-r----- 1 root named 1133 Dec 21 12:44 /var/named/mynet.com.zone
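
Again for reference, the heart of mynet.com.zone is a set of A records, including the three addresses behind the cluster SCAN. This is a minimal sketch using only the hostnames and IP addresses which appear in this series; the real file will also carry entries such as the node VIPs:

$TTL 86400
@               IN SOA  oraemcc.mynet.com. root.mynet.com. (
                        2015122101 ; serial
                        3600       ; refresh
                        1800       ; retry
                        604800     ; expire
                        86400 )    ; minimum
                IN NS   oraemcc.mynet.com.
oraemcc         IN A    200.200.10.16
racnode1        IN A    200.200.10.11
racnode2        IN A    200.200.10.12
cluster1-scan   IN A    200.200.10.120
cluster1-scan   IN A    200.200.10.121
cluster1-scan   IN A    200.200.10.122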

Task #2d: Create /var/named/10.200.200.in-addr.arpa file.

This step creates the /var/named/10.200.200.in-addr.arpa file which does reverse lookups for addresses on the 200.200.10.x network. It’s not strictly necessary to implement reverse lookups, but since we’re in the neighborhood we may as well.

The items which need to be edited are shown in red in this example /var/named/10.200.200.in-addr.arpa file. Copy this file to your own system and make any necessary changes. Again, ensure the file has the correct group and file permissions.

[root@oraemcc ~]# chgrp named /var/named/10.200.200.in-addr.arpa
[root@oraemcc ~]# chmod 640 /var/named/10.200.200.in-addr.arpa

[root@oraemcc ~]# ls -l /var/named/10.200.200.in-addr.arpa 
-rw-r----- 1 root  named  916 Dec 21 12:26 10.200.200.in-addr.arpa  

Task #2e: Edit /etc/resolv.conf.

Create or edit a /etc/resolv.conf file on each server which will use your DNS setup. These are the only lines which should be in this file:

search mynet.com
nameserver 200.200.10.16

It is important that this file be identical across all the RAC nodes, as it’s one of the things the CVU checks for. This is also the file which Network Manager trashes each time networking starts up on a server, so it’s important that it remains intact and consistent.

Task #2f: Start the DNS service.

[root@oraemcc ~]# service named start
Starting named:                                            [  OK  ]

Checking the status of DNS generates some interesting output:

[root@oraemcc ~]# service named status
version: 9.8.2rc1-RedHat-9.8.2-0.37.rc1.el6_7.5
CPUs found: 2
worker threads: 2
number of zones: 21
debug level: 0
xfers running: 0
xfers deferred: 0
soa queries in progress: 0
query logging is OFF
recursive clients: 0/0/1000
tcp clients: 0/100
server is up and running
named (pid  9931) is running...

Task #2g: Test DNS is working.

We have configured DNS and have started the service, so now’s the time to make sure it actually works.

First, test a hostname lookup:

[root@racnode1 ~]# nslookup racnode2
Server:         200.200.10.16
Address:        200.200.10.16#53

Name:   racnode2.mynet.com
Address: 200.200.10.12

Next, test the lookup of the cluster SCAN:

[root@racnode1 ~]# nslookup cluster1-scan
Server:        200.200.10.16
Address:       200.200.10.16#53

Name:   cluster1-scan.mynet.com
Address: 200.200.10.122
Name:   cluster1-scan.mynet.com
Address: 200.200.10.120
Name:   cluster1-scan.mynet.com
Address: 200.200.10.121

Finally, test a reverse lookup on an IP address:

[root@racnode1 ~]# nslookup 200.200.10.120
Server:        200.200.10.16
Address:       200.200.10.16#53
120.10.200.200.in-addr.arpa     name = cluster1-scan.mynet.com.

Bingo! It works! 🎆

Task #3: Run the CVU.

The Cluster Verification Utility (CVU) script, runcluvfy.sh, is included with the 12c Grid Infrastructure software. You should have already downloaded and unzipped these files. If not, refer to this prior step.

Login to racnode1 as the grid user, locate the CVU script, then run the script with the parameters shown:

[grid@racnode1 ~]$ cd ./media/gi_12102/grid
[grid@racnode1 grid]$ ./runcluvfy.sh stage -pre crsinst -n racnode1,racnode2 -verbose 2>&1|tee cvu.out

This will run the CVU, echo its output to the screen and copy the output to a file called cvu.out.

The vast majority of the checks passed. Therefore, it only really makes sense to highlight the three which failed and more importantly what to do about them.

Failure #1: IPv6 Addressing.

The first error has to do with IPv6 addressing. Since we’re not using IPv6 (one day perhaps), we could ignore this error. However, IPv6 interface information will come up in the Grid Infrastructure installation which only serves to confuse matters. I therefore prefer to remove the IPv6 references. These are the errors:

Check: TCP connectivity of subnet "2002:4b49:1933::"

Source                          Destination                     Connected?
------------------------------ ------------------------------ ----------------
racnode1 : 2002:4b49:1933:0:221:f6ff:fe04:4298 racnode1 : 2002:4b49:1933:0:221:f6ff:fe04:4298 passed
racnode2 : 2002:4b49:1933:0:221:f6ff:fed2:45a0 racnode1 : 2002:4b49:1933:0:221:f6ff:fe04:4298 failed 
ERROR:
PRVG-11850 : The system call "connect" failed with error "13" while executing exectask on node 
"racnode2" Permission denied
racnode1 : 2002:4b49:1933:0:221:f6ff:fe04:4298 racnode2 : 2002:4b49:1933:0:221:f6ff:fed2:45a0 failed
ERROR:
PRVG-11850 : The system call "connect" failed with error "13" while executing exectask on node 
"racnode1" Permission denied
racnode2 : 2002:4b49:1933:0:221:f6ff:fed2:45a0 racnode2 : 2002:4b49:1933:0:221:f6ff:fed2:45a0 passed
Result: TCP connectivity check failed for subnet "2002:4b49:1933::"

You can confirm the presence of IPv6 addressing by checking the output of the ifconfig command. For example:

[root@racnode1 ~]# ifconfig eth0
eth0     Link encap:Ethernet HWaddr 00:21:F6:04:42:98
         inet addr:200.200.10.11 Bcast:200.200.10.255 Mask:255.255.255.0
         inet6 addr: fe80::221:f6ff:fe04:4298/64 Scope:Link
         inet6 addr: 2002:4b49:1933:0:221:f6ff:fe04:4298/64 Scope:Global
         UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
         RX packets:529808 errors:0 dropped:46361 overruns:0 frame:0
         TX packets:43797 errors:0 dropped:0 overruns:0 carrier:0
         collisions:0 txqueuelen:1000
         RX bytes:79241367 (75.5 MiB) TX bytes:18208947 (17.3 MiB)

The clues are the lines beginning with “inet6”. To disable IPv6 across the board, add the following line to the /etc/sysctl.conf file:

[root@racnode1 ~]# vi /etc/sysctl.conf

# disable IPv6 support on all network interfaces:
net.ipv6.conf.all.disable_ipv6 = 1

If you only wanted to disable IPv6 support for a specific interface, for example eth0, then the entry in /etc/sysctl.conf would look like this:

# disable IPv6 support on the eth0 network interface:
net.ipv6.conf.eth0.disable_ipv6 = 1

To have this change take effect, either reboot or run this command:

[root@racnode1 ~]# sysctl -p /etc/sysctl.conf

To confirm the change has taken effect, re-run the ifconfig command:

[root@racnode1 ~]# ifconfig eth0
eth0     Link encap:Ethernet HWaddr 00:21:F6:04:42:98
         inet addr:200.200.10.11 Bcast:200.200.10.255 Mask:255.255.255.0
         UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
         RX packets:6481 errors:0 dropped:456 overruns:0 frame:0
         TX packets:464 errors:0 dropped:0 overruns:0 carrier:0
         collisions:0 txqueuelen:1000
         RX bytes:1277621 (1.2 MiB) TX bytes:92893 (90.7 KiB)

Failure #2: Network Time Protocol (NTP).

The second failure reported in the CVU output had to do with Network Time Protocol (NTP). You will recall from the installation of Oracle Linux on racnode1, we chose not to synchronize the date and time over the network. That’s because I prefer to use Oracle’s Cluster Time Synchronization Service instead. These are the NTP errors reported by the CVU:

Starting Clock synchronization checks using Network Time Protocol(NTP)...
Checking existence of NTP configuration file "/etc/ntp.conf" across nodes
Node Name                             File exists?
------------------------------------ ------------------------
racnode2                             yes
racnode1                             yes

The NTP configuration file "/etc/ntp.conf" is available on all nodes
NTP configuration file "/etc/ntp.conf" existence check passed
No NTP Daemons or Services were found to be running
PRVF-5507 : NTP daemon or service is not running on any node but NTP 
configuration file exists on the following node(s):
racnode2,racnode1
Result: Clock synchronization check using Network Time Protocol(NTP) failed

When Grid Infrastructure is installed, it will detect if NTP is running. If it is, then Cluster Time Synchronization Service is started in “observer mode”. If NTP is not running, then Cluster Time Synchronization Service is started in “active mode” and that’s what we want. However, if NTP is down but the installer sees the NTP configuration file /etc/ntp.conf, then it assumes NTP will spontaneously start by itself and the world will end as we know it. To avoid that from happening, we need to ensure NTP is down, stays down and can’t find its configuration file. We do that by renaming it:

[root@racnode1 ~]# service ntpd status
ntpd is stopped

[root@racnode1 ~]# chkconfig ntpd off

[root@racnode1 ~]# mv /etc/ntp.conf /etc/ntp.conf.ORIG

Thus the world is saved. Hurrah! ☑

Failure #3: Insufficient Swap Space.

I originally built the racnodes with 4 GB of RAM, but things ran a bit slowly, so I upped the RAM to 6 GB. One of the great things about virtualization is the ability to add more resources to a VM through software, assuming the resources are physically present and available.

Note, Oracle VM cannot over-allocate physical server memory. You can only allocate what is physically available in the Oracle VM Server.

Since I originally had 4 GB of RAM allocated, I also configured 4 GB of swap space. Increasing the amount of RAM means you should also increase the amount of swap space. Here’s the official Oracle sizing table:

RAM Swap Space
Between 1 GB & 2 GB 1.5x RAM
Between 2 GB & 16 GB Same size as RAM
Greater than 16 GB 16 GB

Here is what the CVU reported regarding memory and swap space:

Check: Total memory
Node Name     Available                 Required                 Status  
------------ ------------------------ ------------------------ ----------
racnode2     5.8369GB (6120432.0KB)   4GB (4194304.0KB)         passed  
racnode1     5.8369GB (6120432.0KB)   4GB (4194304.0KB)         passed  
Result: Total memory check passed

Check: Swap space
Node Name     Available                 Required                 Status  
------------ ------------------------ ------------------------ ----------
racnode2     4GB (4194300.0KB)         5.8369GB (6120432.0KB)   failed  
racnode1     4GB (4194300.0KB)         5.8369GB (6120432.0KB)   failed  
Result: Swap space check failed

There are a few ways to increase the amount of swap space on a Linux server. If you have sufficient free disk you can add an additional swap partition. If the current swap space is in a logical volume, you could potentially re-size it to increase the amount of swap space. The quickest and simplest way to increase the available swap space is to add a swap file. The commands to do that are here. Since this is a non-production system, swapping won’t be a problem so we can safely ignore this failure. If it were a production system, we’d obviously do the right thing….blame the Systems Administrator! 👺
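
For the record, the swap file approach boils down to a handful of commands. This is a sketch which adds a 2 GB swap file; adjust the size and path to taste and run it as root on each node that needs it:

# Create, format and enable a 2 GB swap file, then make it permanent.
dd if=/dev/zero of=/swapfile bs=1M count=2048
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile
echo "/swapfile  swap  swap  defaults  0 0" >> /etc/fstab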

Well, that about wraps it up for SSH, DNS and CVU.

See you next time for all things shared storage in Part 9.

If you have any comments or questions about this post, please use the Contact form here.

Build Your Own Oracle Infrastructure: Part 7 – Build Oracle RAC Servers.

Now it’s time to start the process of building our first Oracle RAC system. 
To do that, we will need two Oracle Database 12c server nodes running Oracle Linux.
This installment shows how to build that rather efficiently using a cool capability of Oracle VM.

Quick links to all the tasks:

Task #1: Install Oracle Linux on RACNODE1_VM.

In Oracle VM Manager, click on the Servers and VMs tab, highlight RACNODE1_VM, then click on the green arrow head Start icon to start the VM:

This will start the Oracle Linux 6 installation because this VM has the V52218-01.iso file loaded in its virtual CD/DVD drive and that device is listed first in the Boot Order. With the Install or upgrade an existing system option highlighted, press Enter:

Skip the media test:

Click Next:

Choose your language, then click Next:

Choose your keyboard layout, then click Next:

Choose Basic Storage Devices, then click Next:

Click the ‘Yes, discard any data’ button, then click Next:

Enter the hostname “racnode1.mynet.com”, then click Configure Network:

If you’re following this series, then you should be getting good at configuring the network settings. So just use these values to configure the 3 network interfaces:

Field eth0 (Public) eth1 (Storage) eth2 (Private)
IP Address 200.200.10.11 200.200.20.11 200.200.30.11
Netmask (auto-populates) 24 24 24
Gateway 200.200.10.1 200.200.20.1 N/A
Primary DNS 200.200.10.1 N/A N/A
Secondary DNS 8.8.8.8 N/A N/A

With the network configured, choose your timezone then click Next:

Choose a root password, then click Next:

Choose Use All Space, check Review and modify partitioning layout, then click Next:

Using the right pointing arrow key, move the Xen Virtual Block Device (40960 MB) over to the Install Target Devices pane, then click Next:

Edit the following screen using these values, then click Next:

Logical Volume Size (MB)
lv_root 34312
lv_swap 6144

Click Format:

Click Write changes to disk:

Check Install boot loader on /dev/xvdb. Select Oracle Linux Server 6 (Label) and /dev/mapper/vg_racnode1-lv_root (Device). Click Next:

Select Database Server and Customize now, then click Next:

Use the following to complete the next screen, then click Next:

Category Option
Base System Leave the default options
Servers Leave the default options
Web Services Leave the default options
Databases Uncheck everything
System Management Leave the default options
Virtualization Leave the default options
Desktops Check everything
Applications Check Internet Browser
Development Leave the default options
UEK3 kernel repo Leave the default options
Languages Leave the default options

The Oracle Linux installation will start installing 1249 packages:

Eventually the installation will complete and you’ll see a rather splendid congratulations screen. Do NOT click Reboot yet:

Go back to Oracle VM Manager. Click on the Servers and VMs tab, highlight RACNODE1_VM, then click on the pencil icon to edit RACNODE1_VM. Click on the Disks tab. Click on the Eject a CDROM icon:

This will remove the V52218-01.iso from the CD/DVD drive, so when you reboot the Oracle Linux installation won’t start over. Click OK. Then return to the ‘congratulations’ screen and click Reboot. You may need to close the console session and open a new one. When the system comes back up, you’ll see this ‘welcome’ screen. Click Forward:

Accept the License Agreement and click Forward:

Choose the option to register at a later time and click Forward:

Click the ‘No thanks’ button, then click Forward:

Rather annoyingly, the harassment continues. Ignore the scaremongering and click Forward:

Don’t create a user here. We’ll be doing that later. Click Forward:

Set the system date and time if necessary. Uncheck the option to synchronize date and time over the network. We’ll be using Oracle’s Cluster Time Synchronization Service later on. If the time is wrong, then it’ll be Oracle’s fault not NTP’s fault. See what I did there? 🙂 Click Forward:

Uncheck Enable kdump, then click Finish:

Click the Yes to agree to a reboot:

Click OK to start the system reboot:

When the system comes back up, you’ll see the familiar login screen:

That’s Oracle Linux 6.6 installed. Time to configure it for Oracle Database 12c RAC duties.

Task #2: Configure Oracle Linux on racnode1.

For reference, the official Oracle Database 12c Release 1 RAC installation guide can be found here.
Our installation has been simplified into the following 12 steps. That’s right, our very own 12 step programme. 🍺

Task #2a: Update Oracle Linux.

Let’s grab the latest and greatest packages and update the Linux kernel. To do that we need to run a yum update. To do that, we need to create a yum repository first. With the yum repository in place, run the update:

[root@racnode1 ~]# yum update

This will take a while. It will ask you to confirm progress in a couple of places, so don’t go to lunch or it’ll be waiting for you when you get back.

Note, this update will upgrade the version of Oracle Linux from 6.6 to 6.7.

Task #2b: Install the Oracle 12c Release 1 Pre-Installation Package.

This package makes a number of changes to Linux which aids in setting things up the way you want them. It’s actually on the Oracle Linux 6 DVD in /server/Packages and can be installed using this command:

[root@racnode1 Packages]# rpm -i oracle-rdbms-server-12cR1-preinstall-1.0-12.el6.x86_64.rpm

Alternatively, you could install it using yum with this command:

[root@racnode1 ~]# yum install oracle-rdbms-server-12cR1-preinstall.x86_64

Either way, check it’s installed using this:

[root@racnode1 ~]# rpm -qa | grep oracle-rdbms-server
oracle-rdbms-server-12cR1-preinstall-1.0-14.el6.x86_64

A number of changes are made to the /etc/passwd and /etc/group files. The UIDs and GIDs it creates may not be what you want. After you have installed the package, you can use this script to correct the setup and add the additional users and groups you’ll need later.

As a final post package installation step, add these entries to the /etc/security/limits.conf file:

####################################
# for oracle user
####################################
oracle   soft   nofile    8192
oracle   hard   nofile    65536
oracle   soft   nproc     2048
oracle   hard   nproc     16384
oracle   soft   stack     10240
oracle   hard   stack     32768
oracle   soft   core      unlimited
oracle   hard   core      unlimited
oracle   hard   memlock   5500631
####################################
# for grid user
####################################
grid    soft    nofile    8192
grid    hard    nofile    65536
grid    soft    nproc     2048
grid    hard    nproc     16384
grid    soft    stack     10240
grid    hard    stack     32768
grid    soft    core      unlimited
grid    hard    core      unlimited

Now that the oracle and grid users exist, change their passwords.

Task #2c: Edit User Profiles.

Add this line to the .bash_profile for both oracle and grid:

umask 022

This next change is a personal preference thing, but I always unalias the ls and vi commands. I don’t need vim in my life and I find the different colors for different types of file completely unnecessary. Sanity can be returned to your default bash shell by adding the following lines to the .bash_profile for both oracle and grid:

unalias ls
unalias vi

Task #2d: Disable SELinux.

SELinux can cause havoc when trying to instantiate ASM disks and it needs to be disabled. Instructions for doing so can be found here.

Task #2e: Disable the Linux Firewall.

Generally speaking, firewalls are a good thing.

In a production environment, you’d want to work with your Network Administrator and have them open up the ports necessary for everything to work.

However, in our environment we can trust that we won’t hack ourselves and we’ll go ahead and disable the Linux firewall.

Task #2f: Disable Network Manager.

Network Manager has an annoying habit of jumping all over your /etc/resolv.conf file, thus destroying your neatly crafted configuration. You need to disable Network Manager to prevent this from happening.

Task #2g: Install the iSCSI Initiator.

This can be selected as an installation option or you can add it now using this yum command:

[root@racnode1 ~]# yum install iscsi-initiator-utils

[root@racnode1 ~]# rpm -qa | grep iscsi
iscsi-initiator-utils-6.2.0.873-14.0.1.el6.x86_64

Task #2h: Install ASM Support.

Installing ASM support is done in two stages. First, install the ASMLib package referenced here. Second, use yum to install ASM Support:

[root@racnode1 ~]# rpm -i oracleasmlib-2.0.12-1.el6.x86_64.rpm
[root@racnode1 ~]# yum install oracleasm-support

[root@racnode1 ~]# rpm -qa | grep oracleasm
oracleasmlib-2.0.12-1.el6.x86_64
oracleasm-support-2.1.8-1.el6.x86_64

Task #2i: Install cvuqdisk.

Installing cvuqdisk will enable the Cluster Verification Utility to detect shared storage. Allegedly. It’s included in the 12c Grid Infrastructure downloads referenced here.

You’ll need to copy the two Grid Infrastructure zip files to racnode1. FileZilla does the job. I located these files within the home directory of the grid user:

[root@racnode1 ~]# cd ~grid/media/gi_12102

[root@racnode1 gi_12102]# ls -l
drwxr-xr-x 7 grid oinstall       4096 Dec 16 22:10 grid
-rw-r--r-- 1 grid oinstall 1747043545 Dec 12 15:33 linuxamd64_12102_grid_1of2.zip
-rw-r--r-- 1 grid oinstall  646972897 Dec 12 15:32 linuxamd64_12102_grid_2of2.zip

Unzipping these files created the grid directory. Within the grid directory is an rpm directory and that’s where you’ll find the cvuqdisk rpm:

[root@racnode1 gi_12102]# cd grid/rpm

[root@racnode1 rpm]# ls -l
-rwxr-xr-x 1 grid oinstall 8976 Jun 30  2014 cvuqdisk-1.0.9-1.rpm

Install the rpm using these commands:

[root@racnode1 rpm]# CVUQDISK_GRP=oinstall; export CVUQDISK_GRP

[root@racnode1 rpm]# rpm -iv cvuqdisk-1.0.9-1.rpm
Preparing packages for installation...
cvuqdisk-1.0.9-1

Task #2j: Build the /u01 file system.

RACNODE1_VM had 2 virtual disks allocated to it. The /dev/xvdb disk was used for the installation of Oracle Linux. The /dev/xvdc disk will be used to build the /u01 file system. That’s where all the Oracle software will be installed. The following series of commands will partition the /dev/xvdc disk, build and mount the /u01 file system:

[root@racnode1 ~]# fdisk /dev/xvdc

WARNING: DOS-compatible mode is deprecated. It's strongly recommended to
         switch off the mode (command 'c') and change display units to
         sectors (command 'u').

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p

Partition number (1-4): 1
First cylinder (1-3916, default 1):
Using default value 1
Last cylinder, +cylinders or +size{K,M,G} (1-3916, default 3916):
Using default value 3916

Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.

[root@racnode1 ~]# mkfs -t ext4 -m 0 /dev/xvdc1

[root@racnode1 ~]# cd /
[root@racnode1 /]# mkdir /u01
[root@racnode1 /]# mount /dev/xvdc1 /u01

[root@racnode1 /]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_racnode1-lv_root
                       35G   11G   23G  31% /
tmpfs                 2.0G   72K  2.0G   1% /dev/shm
/dev/xvdb1            477M  132M  316M  30% /boot
/dev/xvdc1             30G   44M   30G   1% /u01

Finally, edit the /etc/fstab file so the /u01 file system will be mounted each time the system restarts:

[root@racnode1 ~]# vi /etc/fstab

Add this line:

/dev/xvdc1              /u01                    ext4    defaults        0 0

Task #2k: Add Oracle software directories.

Now the /u01 file system is mounted, we can create the directories we’ll use to install the 12c Grid Infrastructure and Oracle Database 12c software:

[root@racnode1 ~]# mkdir -p /u01/app/12.1.0/grid
[root@racnode1 ~]# mkdir -p /u01/app/grid
[root@racnode1 ~]# mkdir -p /u01/app/oracle
[root@racnode1 ~]# chown -R grid:oinstall /u01
[root@racnode1 ~]# chown oracle:oinstall /u01/app/oracle
[root@racnode1 ~]# chmod -R 775 /u01/

Task #2l: Edit the /etc/hosts file.

Although we will be using DNS to resolve most things within our various networks, it’s very helpful to have all the addressing documented in each RAC node’s /etc/hosts file. All the entries required for every server’s /etc/hosts file can be found here.

Task #3: Clone RACNODE1_VM.

Now that we have the racnode1 server fully configured, we need a second server configured in a similar way. The hard way would be to create another VM then repeat all the steps up to this point to configure a second server. The easy way would be to clone RACNODE1_VM. Let’s go that route!

In Oracle VM Manager, click on the Servers and VMs tab, highlight RACNODE1_VM, then click the Stop icon (the red square). This will shutdown racnode1 and stop the VM.

Highlight RACNODE1_VM, right click then select the Clone or Move option:

Select Create a clone of this VM, then click Next:

Use the following values to complete the next screen, then click OK:

Field Value
Clone to a Virtual Machine
Clone Count 4
Name Index 2
Clone Name RACNODE_VM
Target Server Pool ServerPool1

This has the effect of creating 4 more VMs starting with RACNODE_VM.2 and ending with RACNODE_VM.5, all identical to RACNODE1_VM:

Note, the screen shows a 4096 MB memory allocation. This was increased to 6144 MB on a subsequent re-build. Click here for further details.

Highlight RACNODE_VM.2 and click the pencil icon to edit the VM. Change the Name to RACNODE2_VM:

Click the Disks tab and change the names of the 2 virtual disks to RACNODE2_OS and RACNODE2_u01, then click OK:

Click the arrow head to the left of RACNODE2_VM to display its properties:

Task #4: Modify racnode2.

At this point, it’s time to start the VMs. In Oracle VM Manager, click on the Servers and VMs tab, highlight RACNODE1_VM, then click on the green arrow head Start icon to start the VM. Once RACNODE1_VM is up and running, repeat this procedure to start RACNODE2_VM. Connect to the console of RACNODE2_VM by clicking on the Launch Console icon.

Note, you may run into problems starting VMs. Click here for potential solutions/workarounds if you see an error similar to this:

Error: Device 1 (vif) could not be connected. Hotplug scripts not working

After RACNODE2_VM has started up, the first thing you’ll notice is the login screen announces you’re looking at racnode1.mynet.com. RACNODE2_VM is an identical clone of RACNODE1_VM, so it’s hardly surprising the server running inside RACNODE2_VM thinks it’s racnode1. Fortunately this is easy to fix by following the next couple of steps.

Task #4a: Fix the networking.

When cloning RACNODE1_VM, Oracle VM Manager was smart enough to allocate 3 new MAC addresses to the 3 vNICs allocated to each clone. However, the NIC configuration files in Linux don’t know that and will have to be edited manually.

Login to the ‘fake’ racnode1 and locate the NIC configuration files:

[root@racnode1 ~]# cd /etc/sysconfig/network-scripts

[root@racnode1 network-scripts]# ls -l ifcfg-eth*
-rw-r--r--. 3 root root 320 Dec 11 17:36 ifcfg-eth0
-rw-r--r--  3 root root 269 Dec 13 16:13 ifcfg-eth1
-rw-r--r--  3 root root 248 Dec 13 16:13 ifcfg-eth2

  • ifcfg-eth0 is the configuration file for the public network interface.
  • ifcfg-eth1 is the configuration file for the storage network interface.
  • ifcfg-eth2 is the configuration file for the private interconnect network interface.

Each file contains the exact same entries as the equivalent file on the real racnode1. Some of these parameters are consistent across all the servers (e.g. gateways and DNS) and some are specific to an individual server. We need to change the ones which are specific to an individual server. Those parameters are UUID, IPADDR and HWADDR.

The UUID is a Universal Unique IDentifier. It’s a number that Linux gives to each NIC. The entry in the ifcfg-eth* files looks like this:

UUID=eaba99ea-c88e-4cf2-b990-bee55e752e91

To get a new UUID, use this command:

[root@racnode1 ~]# uuidgen eth0

Substitute eth0 for eth1 and eth2 to generate new UUIDs for those interfaces. Once you have 3 new UUIDs, one for each NIC, update the UUID entry in the ifcfg-eth* files.

The IPADDR is the IP address assigned to a given NIC. The entry in the ifcfg-eth* files looks like this:

IPADDR=200.200.10.11

You can get the correct IP addresses for racnode2’s eth0, eth1 and eth2 interfaces from the /etc/hosts file. If clicking the link is too challenging, here’s the cheat sheet:

Server NIC IP Address
racnode2 eth0 200.200.10.12
racnode2 eth1 200.200.20.12
racnode2 eth2 200.200.30.12

Using these IP addresses, update the IPADDR entry in the ifcfg-eth* files.

The HWADDR is basically the MAC address which Oracle VM Manager allocates to the VM’s vNICs. You can see the MAC addresses allocated to RACNODE2_VM in a few different ways.

The vm.cfg contains the MAC addresses for each vNIC in eth0, eth1 and eth2 order. The location of the vm.cfg file is referenced here.

Alternatively, go back to this screen in Oracle VM Manager and click on the Networks tab:

Once again, the Ethernet Network name maps to vNICs like this:

Ethernet Network Linux NIC MAC Address
Management_Public eth0 00:21:F6:D2:45:A0
Shared_Storage_Public eth1 00:21:F6:E6:F6:71
GI_Interconnect_Private eth2 00:21:F6:2F:DE:C5

Using the MAC addresses you obtain from your environment, update the HWADDR entry in the ifcfg-eth* files.
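
Pulling those three changes together, racnode2’s ifcfg-eth0 ends up looking something like this. This is a sketch built from the MAC address, IP address, gateway and DNS values shown in this series; the UUID is a placeholder for the one you generate, and your file may well contain additional parameters:

DEVICE=eth0
TYPE=Ethernet
ONBOOT=yes
BOOTPROTO=none
HWADDR=00:21:F6:D2:45:A0
UUID=<paste-the-output-of-uuidgen-here>
IPADDR=200.200.10.12
NETMASK=255.255.255.0
GATEWAY=200.200.10.1
DNS1=200.200.10.1
DNS2=8.8.8.8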

To bring up the network interfaces using the updated and correct values for UUID, IPADDR and HWADDR, each interface must be stopped and re-started. Using eth0 as an example, stopping and re-starting the interface is done with these commands:

[root@racnode1 ~]# ifdown eth0
[root@racnode1 ~]# ifup eth0

Once all 3 interfaces have been re-started, you can view their status using this command:

[root@racnode1 ~]# ifconfig -a

Task #4b: Change the hostname.

The method for changing an Oracle Linux hostname is documented here.

Once the networking and hostname issues have been fixed, I strongly recommend a reboot. If the system comes back up with the correct name (racnode2.mynet.com) and the correct network configuration, you’re ready to move onto the dark arts discussed in Part 8 – SSH, DNS and CVU. See you there!

If you have any comments or questions about this post, please use the Contact form here.

Oracle Data Pump

Sometimes you just need a list of the available Data Pump utility parameters. Here they are.

Use the following links to jump to the item you need.

Index:

 

Oracle Database 12c Data Pump EXPORT Parameters:

  • ACCESS_METHOD
  • ATTACH
  • CLUSTER
  • COMPRESSION
  • COMPRESSION_ALGORITHM
  • CONTENT
  • DATA_OPTIONS
  • DIRECTORY
  • DUMPFILE
  • ENCRYPTION
  • ENCRYPTION_ALGORITHM
  • ENCRYPTION_MODE
  • ENCRYPTION_PASSWORD
  • ESTIMATE
  • ESTIMATE_ONLY
  • EXCLUDE
  • FILESIZE
  • FLASHBACK_SCN
  • FLASHBACK_TIME
  • FULL
  • HELP
  • INCLUDE
  • JOB_NAME
  • KEEP_MASTER
  • LOGFILE
  • LOGTIME
  • METRICS
  • NETWORK_LINK
  • NOLOGFILE
  • PARALLEL
  • PARFILE
  • QUERY
  • REMAP_DATA
  • REUSE_DUMPFILES
  • SAMPLE
  • SCHEMAS
  • STATUS
  • TABLES
  • TABLESPACES
  • TRANSPORT_FULL_CHECK
  • TRANSPORT_TABLESPACES
  • TRANSPORTABLE
  • VERSION

 

Oracle Database 12c Data Pump IMPORT Parameters:

  • ACCESS_METHOD
  • ATTACH
  • CLUSTER
  • CONTENT
  • DATA_OPTIONS
  • DIRECTORY
  • DUMPFILE
  • ENCRYPTION_PASSWORD
  • ESTIMATE
  • EXCLUDE
  • FLASHBACK_SCN
  • FLASHBACK_TIME
  • FULL
  • HELP
  • INCLUDE
  • JOB_NAME
  • KEEP_MASTER
  • LOGFILE
  • LOGTIME
  • METRICS
  • NETWORK_LINK
  • NOLOGFILE
  • PARALLEL
  • PARFILE
  • PARTITION_OPTIONS
  • QUERY
  • REMAP_DATA
  • REMAP_DATAFILE
  • REMAP_SCHEMA
  • REMAP_TABLE
  • REMAP_TABLESPACE
  • REUSE_DATAFILES
  • SCHEMAS
  • SKIP_UNUSABLE_INDEXES
  • SQLFILE
  • STATUS
  • STREAMS_CONFIGURATION
  • TABLE_EXISTS_ACTION
  • TABLES
  • TABLESPACES
  • TRANSFORM
  • TRANSPORT_DATAFILES
  • TRANSPORT_FULL_CHECK
  • TRANSPORT_TABLESPACES
  • TRANSPORTABLE
  • VERSION
  • VIEWS_AS_TABLES
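
And a matching hypothetical import of that dump, remapping the HR schema into a test schema along the way (again, all the names are made up for illustration):

impdp system/<password>@RAC1DB \
  DIRECTORY=dp_dir \
  DUMPFILE=hr_%U.dmp \
  LOGFILE=hr_imp.log \
  REMAP_SCHEMA=hr:hr_test \
  TABLE_EXISTS_ACTION=REPLACE \
  PARALLEL=2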

Build Your Own Oracle Infrastructure: Part 6 – Build OEM Cloud Control Server.

OEM-12c

Now that the basic building blocks of the Oracle Database infrastructure are in place, it’s time to start building out the servers themselves.

A management console helps enormously with the administration and maintenance of servers and that’s what we’re going to build first.

Although hardly the last word in ease of use, intuitive interface design or economical deployment, Oracle Enterprise Manager (OEM) Cloud Control is, nonetheless, the standard Oracle management interface. So that’s what we’re going to use.

OEM installation and configuration has become progressively more complicated over the years. However, there is a simple installation option available which will get us up and running. Getting from here to a shiny new OEM Cloud Control login screen requires completing these 6 tasks.

Quick links to all the tasks:

Task #1: Install Oracle Linux.

There are 4 steps to complete this task.

Task #1a: Start ORAEMCC_VM.

Log into OVM Manager and navigate to the list of VMs. Highlight ORAEMCC_VM by clicking on it, then click the green arrow head icon to start the VM. The CDROM is first in the VM’s Boot Order. It contains the Oracle Linux 6 ISO image, so starting the VM has the effect of booting from the ISO. Hence we can start the installation of Oracle Linux. The screen should look similar to this:

Once the VM’s status shows Running, click on the Launch Console icon.

Task #1b: Follow the Oracle Linux Installation Screens.

Press Enter to Install or Upgrade an existing system:

Skip the media test:

Click Next.

Choose your preferred language. Click Next.

Choose your preferred keyboard layout. Click Next.

Select Basic Storage Devices. Click Next. Note the warm and fuzzy we’re all in this together phrase, “If you’re not sure which option is right for you, this is probably it”. Ahhh. I wish this person would write more of Oracle’s documentation.

Click Yes, discard any data. Click Next.

Enter the hostname, oraemcc.mynet.com. Your screen should look similar to this:

Click Configure Network. Highlight System eth0, then click Edit.

Change the Connection name to just eth0 (i.e. remove the word “System” – it’s not needed). Check both boxes for Connect automatically and Available to all users. Click the IPv4 Settings tab.

Use these values to populate this screen:

Field Value
Method Manual
Address 200.200.10.16
Netmask 24 (this auto-populates)
Gateway 200.200.10.1
DNS servers 200.200.10.1,8.8.8.8,8.8.4.4

Your screen should look similar to this:

Click Apply. Click Close. Click Next. Choose your timezone. Click Next.

Choose a root password. Click Next.

Choose Use All Space. Check Review and modify partitioning layout. Click Next.

Using the right pointing arrow key, move the Xen Virtual Block Device (51200 MB) over to the Install Target Devices pane. Leave the two 30 GB disk devices where they are. We’ll use those later. Click Next.

Click to highlight the lv_root logical volume, then click Edit to reduce its size to 34312 MB. Click to highlight the lv_swap logical volume, then click Edit to increase its size to 16384 MB (16 GB). Click Next.

Click Format.

Click Write changes to disk.

Check Install boot loader on /dev/xvdb. Select Oracle Linux Server 6 (Label) and /dev/mapper/vg_oraemcc-lv_root (Device). Click Next.

Select Database Server and Customize now. Click Next.

Use the following to complete this next screen:

Category Option
Base System Leave the default options
Servers Leave the default options
Web Services Leave the default options
Databases Uncheck everything
System Management Leave the default options
Virtualization Leave the default options
Desktops Check everything
Applications Check Internet Browser
Development Leave the default options
UEK3 kernel repo Leave the default options
Languages Leave the default options

The installation will progress, installing 1249 packages in total.

After a while, the package installation will complete and you’ll see this screen:

Task #1c: Eject the Oracle Linux 6 ISO.

Log into OVM Manager and navigate to the list of VMs. Highlight ORAEMCC_VM then click the pencil icon to edit the VM. Click the Disks tab. The screen should now look similar to this:

For the CD/DVD (Slot 0), click the Eject a CDROM icon. The screen should now look similar to this:

Task #1d: Complete Post Oracle Linux Installation.

Return to the ORAEMCC_VM console session. Click Reboot.

You may need to close the console session and open a new one. Review Task 1a for opening a console session from OVM Manager.

Select Boot from local drive and press Enter.

The system boots up and displays a Welcome screen. Click Forward.

Select Yes, I agree to the License Agreement. Click Forward.

Select No, I prefer to register at a later time. Get ready for an argument which you’ll win.

Click No thanks, I’ll connect later. Click Forward.

Ignore the chastisement about your system not being setup for software updates and click Forward.

We will not be creating a user account at this time, so ignore this screen and click Forward.

Check Synchronize date and time over the network. Click Forward.

Uncheck Enable kdump? Click Finish.

Kdump tells you it needs to reboot. Click Yes. The system reboots and you may have to close and re-open a console screen from OVM Manager.

Once the system comes back up, you should see a login screen similar to this:

Task #2: Configure Oracle Linux.

Once Oracle Linux is installed, it needs to be configured to ensure the installation of Oracle Database 12c and OEM Cloud Control 12c Release 5 succeed. The OEM Cloud Control documentation is verbose to say the least, so the following 8 steps will guide you through the bare essentials.

Task #2a: Check Package Requirements for Oracle Management Service.

This step is more informational than anything. According to the Oracle Enterprise Manager Cloud Control Basic Installation Guide, you are warned to install only these package versions to support the Oracle Management Service (OMS):

Oracle Linux Version Package Version
Oracle Linux 6.x make-3.81
Oracle Linux 6.x binutils-2.20.51.0.2-5.11
Oracle Linux 6.x gcc-4.4.4
Oracle Linux 6.x libaio-0.3.107
Oracle Linux 6.x glibc-common-2.12-1
Oracle Linux 6.x libstdc++-4.4.4
Oracle Linux 6.x libXtst-1.0.99.2-3.el6.x86_64.rpm
Oracle Linux 6.x sysstat-9.0.4
Oracle Linux 6.x glibc-devel-2.12-1.7.el6.i686 (This is a 32-bit package)
Oracle Linux 6.x glibc-devel-2.12-1.7.el6.x86_64 (This is a 64-bit package)
Oracle Linux 6.2 glibc-2.12-1.47.0.2

A vanilla install of Oracle Linux 6.6 does not install all of these packages at exactly these versions. For example, the binutils package is installed as binutils-2.20.51.0.2-5.42.el6.x86_64, which is different from the required version. It’s tempting to remove the installed version and replace it with the exact version listed in the table. However, doing that would remove hundreds of dependent packages, so I wouldn’t recommend it.

The warning in the installation guide says having different versions of these packages, “might cause the installation to fail”. Trust me on this, it won’t as you’ll see later on.

Task #2b: Check Package Requirements for Oracle Management Agent.

Again, this step is informational and is taken from the Oracle Enterprise Manager Cloud Control Basic Installation Guide:

Oracle Linux Version Package Version
Oracle Linux 6.x make-3.81
Oracle Linux 6.x binutils-2.20.51.0.2-5.11
Oracle Linux 6.x gcc-4.4.4
Oracle Linux 6.x libaio-0.3.107
Oracle Linux 6.x glibc-common-2.12-1.7
Oracle Linux 6.x libstdc++-4.4.4
Oracle Linux 6.x sysstat-9.0.4

It is a supported configuration to have a single server run the full OEM Cloud Control 12c Release 5 software stack, including a Management Agent for the Cloud Control server itself. That being the case, you’d think the package requirements for the OMS and Management Agent would be identical, but according to Oracle’s own documentation, they’re not! It’s either a documentation typo, which seems unlikely given the documentation’s longevity, or it doesn’t matter. Guess which one I’m going with. 😀

Task #2c: Update Oracle Linux.

Let’s grab the latest and greatest packages and update the Linux kernel. To do that we need to run a yum update, which in turn needs a yum repository to be in place first.
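
If you haven’t already got one, the quickest route is to drop a repo definition into /etc/yum.repos.d. This sketch uses the Oracle public yum server as it was at the time of writing; Oracle has since moved things around, so treat the URL as an assumption and check it before you rely on it:

[root@oraemcc ~]# cd /etc/yum.repos.d
[root@oraemcc yum.repos.d]# wget http://public-yum.oracle.com/public-yum-ol6.repo

With the yum repository in place, run the update: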

[root@oraemcc ~]# yum update

This will take a while. It will ask you to confirm progress in a couple of places, so don’t go to lunch or it’ll be waiting for you when you get back.

Note, this update will upgrade the version of Oracle Linux from 6.6 to 6.7.

Since we’re using a 64 bit version of Oracle Linux, we don’t get the 32 bit version of the glibc-devel package required for the OMS. To install that, run this command:

[root@oraemcc ~]# yum install glibc-devel.i686

Task #2d: Install the Oracle 12c Release 1 Pre-Installation Package.

Since we’re using a single server for the entire OEM software stack, the server will also need to run an Oracle 12c database. Consequently, the operating system needs to be configured to support that as well as the OEM components. The easiest way to introduce multiple configuration changes which benefit an Oracle Database 12c installation is to install Oracle’s 12cR1 pre-installation package. This can be done in a couple of ways.

The package is on the Oracle Linux 6 Update 6 DVD you created with ImgBurn. It’s located in /Server/Packages and is called oracle-rdbms-server-12cR1-preinstall-1.0-12.el6.x86_64.rpm. To install it, run this command:

[root@oraemcc Packages]# rpm -i oracle-rdbms-server-12cR1-preinstall-1.0-12.el6.x86_64.rpm

Or you could just run a yum install command which is easier:

[root@oraemcc ~]# yum install oracle-rdbms-server-12cR1-preinstall.x86_64

The installation of this package makes a number of changes to /etc/passwd and /etc/group. The UIDs and GIDs it creates may not be what you want. You can use a script to correct them and add some additional standard users and groups (a rough sketch follows). Using the Oracle pre-install rpm followed by this script will guarantee UID and GID consistency across the entire server infrastructure. There will be much rejoicing and the Oracle gods will smile upon thee. 😀
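
If you don’t have such a script to hand, the gist of it looks something like this. The UID and GID values are arbitrary examples, not necessarily what the script referenced above uses; the point is to pick a set of values and use the same ones on every server (groupadd will complain if a group already exists, in which case use groupmod instead):

# Rough sketch only - choose your own UID/GID values and keep them consistent.
groupmod -g 5001 oinstall
groupmod -g 5002 dba
groupadd -g 5003 oper
groupadd -g 5004 backupdba
groupadd -g 5005 dgdba
groupadd -g 5006 kmdba
usermod -u 5001 -g oinstall -G dba,oper,backupdba,dgdba,kmdba oracle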

Add these entries to the end of the /etc/security/limits.conf file:

oracle   soft   nofile   8192
oracle   hard   nofile   65536
oracle   soft   nproc    2048
oracle   hard   nproc    16384
oracle   soft   stack    10240
oracle   hard   stack    32768
oracle   soft   core     unlimited
oracle   hard   core     unlimited

Despite the correct changes having been made to the system file /etc/security/limits.conf, I had to add the following line to the oracle user’s .bash_profile. Without it, the OEM installer failed a prerequisite check!

ulimit -n 4096

Now that the oracle user exists, set a password:

[root@oraemcc ~]# passwd oracle

Task #2e: Disable SELinux.

SELinux is enabled by default when you install Oracle Linux 6.6. It can have the nasty side effect of causing ASM disk instantiation failures. Hence, I always disable SELinux.
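
In case you need a reminder, disabling it amounts to editing /etc/selinux/config and rebooting. Running setenforce 0 drops SELinux to permissive mode for the current session, but the full disable only takes effect after a reboot:

[root@oraemcc ~]# sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
[root@oraemcc ~]# setenforce 0
[root@oraemcc ~]# getenforce
Permissive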

Task #2f: Disable Linux Firewall.

Generally speaking, firewalls are a good thing.

In a production environment, you’d want to work with your Network Administrator and have them open up the ports necessary for OEM Cloud Control to work. Information on those ports can be found here.

However, in our environment we can trust that we won’t hack ourselves and we’ll go ahead and disable the Linux firewall.
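
On Oracle Linux 6 that boils down to stopping iptables (and ip6tables for good measure) and making sure they stay off across reboots:

[root@oraemcc ~]# service iptables stop
[root@oraemcc ~]# chkconfig iptables off
[root@oraemcc ~]# service ip6tables stop
[root@oraemcc ~]# chkconfig ip6tables off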

Task #2g: Create Additional File Systems.

The ORAEMCC_VM had 3 disk resources allocated to it. That means 3 disk devices should have shown up after we installed Oracle Linux. We already used /dev/xvdb for the Oracle Linux installation, so we need to use the other two disks to create the /u01 and /u02 filesystems.

This is the relevant output from the fdisk command showing the other 2 disk devices:

[root@oraemcc ~]# fdisk -l

Disk /dev/xvdc: 32.2 GB, 32212254720 bytes
255 heads, 63 sectors/track, 3916 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0005a6bc

Disk /dev/xvdd: 32.2 GB, 32212254720 bytes
255 heads, 63 sectors/track, 3916 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00083a9b

First, let’s partition /dev/xvdc:

[root@oraemcc ~]# fdisk /dev/xvdc

WARNING: DOS-compatible mode is deprecated. It's strongly recommended to
         switch off the mode (command 'c') and change display units to
         sectors (command 'u').


Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p

Partition number (1-4): 1

First cylinder (1-3916, default 1):
Using default value 1
Last cylinder, +cylinders or +size{K,M,G} (1-3916, default 3916):
Using default value 3916

Command (m for help): w

The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.

Perform the same operation for disk device, /dev/xvdd.

Next, create an ext4 filesystem in the primary partition of each disk:

[root@oraemcc ~]# mkfs -t ext4 -m 0 /dev/xvdc1

mke2fs 1.43-WIP (20-Jun-2013)
Discarding device blocks: done
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
1966080 inodes, 7863809 blocks
0 blocks (0.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
240 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
       32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
       4096000
Allocating group tables: done
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

Perform the same operation for disk partition, /dev/xvdd1.

Next create the file system mount points and mount the file systems:

[root@oraemcc /]# cd /
[root@oraemcc /]# mkdir /u01 /u02
[root@oraemcc /]# mount /dev/xvdc1 /u01
[root@oraemcc /]# mount /dev/xvdd1 /u02

[root@oraemcc /]# df -h

Filesystem           Size Used  Avail Use% Mounted on
/dev/mapper/vg_oraemcc-lv_root
                      33G   15G   17G  46% /
tmpfs                7.9G   72K  7.9G   1% /dev/shm
/dev/xvdb1           477M  123M  325M  28% /boot
/dev/xvdc1            30G   44M   30G   1% /u01
/dev/xvdd1            30G   44M   30G   1% /u02

Next, add the following entries to the /etc/fstab file so the mounted filesystems survive system restarts:

/dev/xvdc1             /u01                 ext4    defaults       0 0
/dev/xvdd1             /u02                 ext4    defaults       0 0
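
It’s worth proving the new fstab entries are valid before the next reboot catches you out. Unmount the filesystems and let mount -a pick them up again:

[root@oraemcc /]# umount /u01 /u02
[root@oraemcc /]# mount -a
[root@oraemcc /]# df -h /u01 /u02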

Finally, create the directories which we’ll need later for the software installation and database files:

[root@oraemcc /]# cd /u01

[root@oraemcc u01]# mkdir app
[root@oraemcc u01]# chmod 755 app
[root@oraemcc u01]# chown oracle:oinstall app

[root@oraemcc u01]# cd /u02
[root@oraemcc u02]# mkdir oradata
[root@oraemcc u02]# chmod 755 oradata
[root@oraemcc u02]# chown oracle:oinstall oradata

[root@oraemcc u02]# mkdir FRA
[root@oraemcc u02]# chmod 755 FRA
[root@oraemcc u02]# chown oracle:oinstall FRA

Task #2h: Edit /etc/hosts file.

Add this line to the local /etc/hosts file. If you don’t, the OEM installer has a hissy fit and you don’t want that:

200.200.10.16   oraemcc.mynet.com     oraemcc

Task #3: Install Oracle Database 12c.

OEM Cloud Control uses its own repository which happens to be an Oracle Database. In order to have an Oracle Database up and running on this server, we need to install the Oracle Database 12c Release 1 (12.1.0.2) software. This is nice and easy. Nice and easy does it every time. (Sinatra reference 🎶 )

With X-Win32 installed and up and running on your workstation, fire up an X term connected to oraemcc.mynet.com as the oracle user and point your DISPLAY variable back to your workstation’s IP address:

[oracle@oraemcc ~]$ export DISPLAY=<your-workstation-IP>:0.0

Locate the Oracle Database software you copied to the server and ensure it’s unzipped. Then simply run the installer script as the oracle user:

[oracle@oraemcc ~]$ pwd
/home/oracle

[oracle@oraemcc ~]$ cd media/rdbms_12cR1_12.1.0.2/database
[oracle@oraemcc database]$ ./runInstaller

The Oracle Database 12c Release 1 Installer opens up. Simply follow the screens.

Uncheck the option, I wish to receive security updates via My Oracle Support, then click Next:

Click Yes to confirm that you wish to remain uninformed. Love the wording. Condescending much?

Select the option, Install database software only, then click Next:

Select the option, Single instance database installation, then click Next:

Select your preferred language, then click Next:

Select the option, Enterprise Edition (6.4 GB), then click Next:

Choose the standard locations for Oracle base and Software location, then click Next:

Field Value
Oracle base /u01/app/oracle
Software location /u01/app/oracle/product/12.1.0/dbhome_1

Choose the standard Inventory Directory (/u01/app/oraInventory) and the standard oraInventory Group Name (oinstall), then click Next:

Choose the standard OS groups then click Next:

Field Value
Database Administrator (OSDBA) group dba
Database Operator (OSOPER) group (Optional) oper
Database backup and Recovery (OSBACKUPDBA) group backupdba
Data Guard administrative (OSDGDBA) group dgdba
Encryption Key Management administrative (OSKMDBA) group kmdba

The installer then runs its prerequisite checks which should all pass without incident. You should then see this summary screen. Click Install to start the installation.

You’ll be looking at this screen for a little while:

After several minutes you’ll be prompted to run some scripts as the root user:

Running the scripts looks like this:

[root@oraemcc ~]# cd /u01/app/oraInventory

[root@oraemcc oraInventory]# ./orainstRoot.sh
Changing permissions of /u01/app/oraInventory.
Adding read,write permissions for group.
Removing read,write,execute permissions for world.
Changing groupname of /u01/app/oraInventory to oinstall.
The execution of the script is complete.
[root@oraemcc oraInventory]# cd /u01/app/oracle/product/12.1.0/dbhome_1

[root@oraemcc dbhome_1]# ./root.sh
Performing root user operation.

The following environment variables are set as:
   ORACLE_OWNER= oracle
   ORACLE_HOME= /u01/app/oracle/product/12.1.0/dbhome_1

Enter the full pathname of the local bin directory: [/usr/local/bin]:
   Copying dbhome to /usr/local/bin ...
   Copying oraenv to /usr/local/bin ...
   Copying coraenv to /usr/local/bin ...

Creating /etc/oratab file...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created

Finished running generic part of root script.
Now product-specific root actions will be performed.

After the scripts have been run, return to the Execute Configuration Scripts dialog and click OK. You’ll then see this final screen and your work here is done. For now.

Task #4: Create a Repository Database.

In the good old days, you could create an empty database and have the installation of OEM populate it with all the necessary repository objects. Seems that’s beyond our capabilities now as Oracle wants you to use a pre-configured database template instead. Hopefully you have downloaded the Oracle Enterprise Manager Repository Database Template referenced in Part 2 of this series.

Note, this template is highly version specific. It is designed to work with Oracle Database 12c Release 1 (12.1.0.2) and Oracle Enterprise Manager Cloud Control 12c Release 5.

This is how you use it in 3 easy steps:

Task #4a: Unzip the Template.

Copy the template to $ORACLE_HOME/assistants/dbca/templates:

[oracle@oraemcc ~]$ cd /u01/app/oracle/product/12.1.0/dbhome_1/assistants/dbca/templates
[oracle@oraemcc templates]$ cp ~/media/oemcc_12cR5/12.1.0.2.0_Database_Template_for_EM12_1_0_5_0_Linux_x64.zip .

Unzip the template zip file to the templates directory:

[oracle@oraemcc templates]$ unzip 12.1.0.2.0_Database_Template_for_EM12_1_0_5_0_Linux_x64.zip

Task #4b: Set the Oracle Environment.

Add the following line to the /etc/oratab file:

dummy:/u01/app/oracle/product/12.1.0/dbhome_1:N

Then run the oraenv script to set the Oracle environment variables:

[oracle@oraemcc templates]$ . oraenv
ORACLE_SID = [oracle] ? dummy
The Oracle base has been set to /u01/app/oracle

Then unset the ORACLE_SID environment variable since we won’t be needing that:

[oracle@oraemcc templates]$ unset ORACLE_SID

Task #4c: Run the Database Configuration Assistant.

With your oracle environment set, invoke the Database Configuration Assistant:

[oracle@oraemcc templates]$ dbca

Select the option, Create Database, then click Next:

Select the option, Advanced Mode, then click Next:

Select the option, 12.1.0.2.0_Database_Template_for_EM12_1_0_5_0_Small_deployment.dbc, then click Next:

Use EMCC.mynet.com for the Global Database Name. The SID defaults to EMCC and your screen should look similar to this:

Uncheck the option, Configure Enterprise Manager (EM) Database Express. The option, Register with Enterprise Manager (EM) Cloud Control, should also be unchecked. Click Next:

Select the option, Use the Same Administrative Password for All Accounts. Enter a password twice (write it down somewhere safe – do it now – seriously!), then click Next:

Check the box for Create a New Listener. Don’t be one of those people who uses the default name of LISTENER for the name of the listener. If you have to use the default port (1521) I won’t tell anyone. Click Next:

Complete this next screen with these values, then click Next:

Field Value
Database files Storage Type File System
Use Common Location for All Database Files /u02/oradata
Recovery files Storage Type File System
Fast Recovery Area /u02/FRA
Fast Recovery Area Size 5 GB

Ignore the next Pythonesque retort, “should at least be thrice the database size”. It’ll be telling you to fetch it a shrubbery next. 🌲 Click Yes to move on.

This next screen requires some help. The Oracle documentation says:

When you run Oracle Database Configuration Assistant to create the database, on the Database Content screen, by default the Run the following scripts option is selected to run the shpool script. The script sets the shared pool in the database. Make sure you RETAIN THE SELECTION. Do not deselect it.

There is no “Database Content” screen. Strike #1! 💥

There is no script called “shpool”. Strike #2! ⚡

It’s not selected anywhere, so you can’t deselect it. Strike #3! 🔥

Here’s what you do. Click the Browse button and locate a file called:

shpool_12.1.0.2.0_Database_SQL_for_EM12_1_0_5_0.sql

You’ll find it in the same place you unzipped the template file:

/u01/app/oracle/product/12.1.0/dbhome_1/assistants/dbca/templates

Your screen should end up looking like this. Click Next:

Leave the Memory settings at their pre-determined values unless you have a need to change anything. Click the Character Sets tab:

On the Character Sets tab select the option, Use Unicode (AL32UTF8). Click Next:

Ensure the option, Create Database, is selected and click Next:

DBCA goes through its validation checks which should all be fine. Click Next:

On the Create Database – Summary screen, click Finish:

You’ll be looking at the Progress Page for a while. Make yourself a cup of coffee.

Eventually you’ll see the completion screen and all will be well with the world. Click Close.

Don’t forget to remove the dummy line from /etc/oratab:

dummy:/u01/app/oracle/product/12.1.0/dbhome_1:N
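
A one-liner as the root user does the job if you prefer. It simply deletes any line starting with dummy: from /etc/oratab:

[root@oraemcc ~]# sed -i '/^dummy:/d' /etc/oratab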

Task #5: Install OEM Cloud Control.

Locate the Oracle Enterprise Manager Cloud Control 12c Release 5 software you copied to the server and ensure it’s unzipped. Then, as the oracle user, simply run the installer script:

[oracle@oraemcc ~]$ cd /home/oracle/media/oemcc_12cR5
[oracle@oraemcc oemcc_12cR5]$ ./runInstaller

Uncheck the option, I wish to receive security updates via My Oracle Support. Click Next.

Click Yes to move past the security issue notification warning. Then select the option, Skip, to skip over software updates. Click Next:

The installer will run a series of prerequisite checks to verify everything is in order prior to installing the software. Notice how everything succeeded despite the versions of the installed packages not matching the requirements in Oracle’s documentation. Told you! 😉 Click Next:

Select the option, Create a new Enterprise Manager System, along with the Simple option. Click Next:

Use the following values to populate this next screen, then click Next:

Field Value
Middleware Home Location /u01/app/oracle/product/middleware
Agent Base directory /u01/app/oracle/product/agent
Host Name oraemcc.mynet.com

Part of the basic installation of OEM includes simplified password management. A single Administrator password covers several important accounts. As the screen says, the passwords for these accounts can be changed later if required.

For the Database Connection Details, use these values:

Field Value
Database Host Name oraemcc.mynet.com
Port 1521 (referenced here)
Service/SID emcc (referenced here)
SYS Password (referenced here)

Check the option, Configure Oracle Software Library and leave the location as the default (/u01/app/oracle/product/swlib). Click Next:

The installer will then display a Review screen. Click Install:

The next screen, Installation Progress Details, ticks along for quite some time. Go to lunch, seriously. Nothing much of interest happens other than things hopefully succeeding:

Eventually you’ll be presented with this dialog requesting you run the allroot.sh script as the root user:

[root@oraemcc ~]# cd /u01/app/oracle/product/middleware/oms

[root@oraemcc oms]# ./allroot.sh
Starting to execute allroot.sh .........
Starting to execute /u01/app/oracle/product/middleware/oms/root.sh ......
Running Oracle 11g root.sh script...

The following environment variables are set as:
   ORACLE_OWNER= oracle
   ORACLE_HOME= /u01/app/oracle/product/middleware/oms

Enter the full pathname of the local bin directory: [/usr/local/bin]:
The file "dbhome" already exists in /usr/local/bin. Overwrite it? (y/n)
[n]: n
The file "oraenv" already exists in /usr/local/bin. Overwrite it? (y/n)
[n]: n
The file "coraenv" already exists in /usr/local/bin. Overwrite it? (y/n)
[n]: n

Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root.sh script.


Now product-specific root actions will be performed.
/etc exist

Creating /etc/oragchomelist file...
/u01/app/oracle/product/middleware/oms
Finished execution of /u01/app/oracle/product/middleware/oms/root.sh ......

Starting to execute /u01/app/oracle/product/agent/core/12.1.0.5.0/root.sh ......
Finished product-specific root actions.
/etc exist

Finished execution of /u01/app/oracle/product/agent/core/12.1.0.5.0/root.sh ......

Return to the previous dialog box and click OK. A Finish screen will be displayed and you’re done!

Accessing the URL, https://oraemcc.mynet.com:7802/em, brings up the Cloud Control login screen (after your browser complains about security certificates):

You can use the User Name SYSMAN with the Administrator Password you chose earlier. The first time you login you’ll have to choose some screen preferences and accept the displayed agreement. After that, you’re off to the races:

Task #6: Start and Stop Cloud Control.

Now that we have everything installed and up and running, this is a useful little reminder of how to start and stop the Cloud Control infrastructure.

Click the link you need:

Task 6-1: Start Cloud Control.

There are 4 steps to starting the Cloud Control infrastructure:

Task 6-1a: Start the database listener.

[oracle@oraemcc ~]$ . oraenv
ORACLE_SID = [oracle] ? EMCC
The Oracle base has been set to /u01/app/oracle

[oracle@oraemcc bin]$ lsnrctl start LISTENER_EMCC

Task 6-1b: Start the Repository database.

[oracle@oraemcc bin]$ sqlplus / as sysdba
SQL*Plus: Release 12.1.0.2.0 Production on Tue Jan 26 13:43:41 2016
Copyright (c) 1982, 2014, Oracle. All rights reserved.
Connected to an idle instance.

SQL> startup

Task 6-1c: Start the Oracle Management Server.

[oracle@oraemcc bin]$ pwd
/u01/app/oracle/product/middleware/oms/bin

[oracle@oraemcc bin]$ ./emctl start oms
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
Starting Oracle Management Server...
Starting WebTier...
WebTier Successfully Started
Oracle Management Server Already Started
Oracle Management Server is Up

[oracle@oraemcc bin]$ ./emctl status oms
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
WebTier is Up
Oracle Management Server is Up

Task 6-1d: Start the Management Agent.

[oracle@oraemcc bin]$ pwd
/u01/app/oracle/product/agent/core/12.1.0.5.0/bin

[oracle@oraemcc bin]$ ./emctl start agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
Agent is already running

[oracle@oraemcc bin]$ ./emctl status agent
Oracle Enterprise Manager Cloud Control 12c Release 5
Copyright (c) 1996, 2015 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
Agent Version          : 12.1.0.5.0
OMS Version            : 12.1.0.5.0
Protocol Version       : 12.1.0.1.0
Agent Home             : /u01/app/oracle/product/agent/agent_inst
Agent Log Directory    : /u01/app/oracle/product/agent/agent_inst/sysman/log
Agent Binaries         : /u01/app/oracle/product/agent/core/12.1.0.5.0
Agent Process ID       : 4074
Parent Process ID      : 3936
Agent URL              : https://oraemcc.mynet.com:3872/emd/main/
Local Agent URL in NAT : https://oraemcc.mynet.com:3872/emd/main/
Repository URL         : https://oraemcc.mynet.com:4903/empbs/upload
Started at             : 2016-01-26 13:07:57
Started by user        : oracle
Operating System       : Linux version 3.8.13-118.2.1.el6uek.x86_64 (amd64)
Last Reload            : (none)
Last successful upload                       : 2016-01-26 13:50:29
Last attempted upload                        : 2016-01-26 13:50:29
Total Megabytes of XML files uploaded so far : 0.16
Number of XML files pending upload           : 0
Size of XML files pending upload(MB)         : 0
Available disk space on upload filesystem    : 21.13%
Collection Status                            : Collections enabled
Heartbeat Status                             : Ok
Last attempted heartbeat to OMS              : 2016-01-26 13:49:56
Last successful heartbeat to OMS             : 2016-01-26 13:49:56
Next scheduled heartbeat to OMS              : 2016-01-26 13:50:56
---------------------------------------------------------------
Agent is Running and Ready
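
Tip: if you find yourself starting the stack a lot, the 4 start steps are easy to wrap in a small script. This is only a rough sketch using the paths and names from above, so adjust it for your environment:

#!/bin/bash
# Start the Cloud Control stack in order: listener, repository database, OMS, agent.
export ORACLE_SID=EMCC
export ORACLE_HOME=/u01/app/oracle/product/12.1.0/dbhome_1
export PATH=$ORACLE_HOME/bin:$PATH

lsnrctl start LISTENER_EMCC
echo "startup" | sqlplus / as sysdba
/u01/app/oracle/product/middleware/oms/bin/emctl start oms
/u01/app/oracle/product/agent/core/12.1.0.5.0/bin/emctl start agent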

Task 6-2: Stop Cloud Control.

There are 4 steps to stopping the Cloud Control infrastructure.

Task 6-2a: Stop the Oracle Management Server.

[oracle@oraemcc bin]$ pwd
/u01/app/oracle/product/middleware/oms/bin

[oracle@oraemcc bin]$ ./emctl stop oms -all
Oracle Enterprise Manager Cloud Control 12c Release 5  
Copyright (c) 1996, 2015 Oracle Corporation.  All rights reserved.
Stopping WebTier...
WebTier Successfully Stopped
Stopping Oracle Management Server...
Oracle Management Server Successfully Stopped
AdminServer Successfully Stopped
Oracle Management Server is Down

Task 6-2b: Stop the Management Agent.

[oracle@oraemcc bin]$ pwd
/u01/app/oracle/product/agent/core/12.1.0.5.0/bin

[oracle@oraemcc bin]$ ./emctl stop agent
Oracle Enterprise Manager Cloud Control 12c Release 5  
Copyright (c) 1996, 2015 Oracle Corporation.  All rights reserved.
Stopping agent ..... stopped.

Task 6-2c: Stop the Repository database.

[oracle@oraemcc ~]$ sqlplus / as sysdba
SQL*Plus: Release 12.1.0.2.0 Production on Tue Jan 26 15:25:33 2016
Copyright (c) 1982, 2014, Oracle.  All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.

Task 6-2d: Stop the database listener.

[oracle@oraemcc ~]$ lsnrctl stop LISTENER_EMCC
LSNRCTL for Linux: Version 12.1.0.2.0 - Production on 26-JAN-2016 15:26:16
Copyright (c) 1991, 2014, Oracle.  All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=oraemcc.mynet.com)(PORT=1521)))
The command completed successfully

Wow! That was quite the marathon. Many steps and not an insignificant amount of time have hopefully got Cloud Control up and running. In Part 7, we’ll build out the Oracle RAC server nodes. Stay tuned for updates.

If you have any comments or questions about this post, please use the Contact form here.

Build Your Own Oracle Infrastructure: Part 5 – Create VMs.

Create-VMs-2

With the hard work of configuring the OVM Server behind us, we can move onto the relatively simple task of creating VMs.

Just as with a physical computer, a virtual computer is made up of various resources. These include CPU, memory, disk, network interfaces and an operating system.

The trick here is to decide how to distribute the available physical resources across the multiple VMs. Let’s start by reviewing what those physical resources are. The OVM Server, ovmsvr.mynet.com, running on the HP ProLiant DL380 has the following hardware resources:

Resource Quantity
CPU 2 x Quad Core Intel Xeon E5640 (8 cores, 16 threads total)
RAM 72 GB (now 144 GB)
Disk 838 GB (VM_Filesystems_Repo)
NIC 2 x Dual Port, 4 total (eth0, eth1, eth2, eth3)

Allowing some resources for the OVM Server itself, we will distribute the following hardware across a total of 8 Virtual Machines:

VM# VM Name CPU Cores RAM Total Disk Usage
1 ORAEMCC_VM 2 16 GB 110 GB Oracle EM Cloud Control Server
2 ORASVR01_VM 2 8 GB 100 GB Oracle Database Server
3 ORASVR02_VM 2 8 GB 100 GB Oracle Database Server
4 RACNODE1_VM 1 6 GB 70 GB Oracle RAC Node Server
5 RACNODE2_VM 1 6 GB 70 GB Oracle RAC Node Server
6 RACNODE3_VM 1 6 GB 70 GB Oracle RAC Node Server
7 RACNODE4_VM 1 6 GB 70 GB Oracle RAC Node Server
8 RACNODE5_VM 1 6 GB 70 GB Oracle RAC Node Server
TOTAL 11 62 GB 660 GB

This resource allocation still gives us some wiggle room in case we need another server or two. Not likely, but when was the last time you thought you had everything you’ll ever need? That said, 8 VMs is quite a few so what are they all for?

The ORAEMCC_VM will be used to run the complete stack of Oracle Enterprise Manager Cloud Control 12c and an RMAN Catalog database.

ORASVR01_VM and ORASVR02_VM will be used for some Data Guard and GoldenGate data replication testing.

RACNODE1, RACNODE2, RACNODE4 and RACNODE5 will be used to build two Oracle RAC clusters with 2 nodes each. These will be used to test cluster to cluster data replication and failover. RACNODE3 will be used to test adding and removing an additional node from a running cluster.

Quick links to all the tasks:

Task #1: Create ORAEMCC_VM.

Creating the ORAEMCC_VM is a 4 step process.

Task #1a: Add CPU and Memory.

Open up Oracle VM Manager. Use the arrow head icon to expand Server Pools. Then expand ServerPool1, then right click on ovmsvr.mynet.com and select the option, Create Virtual Machine. You could also click on the Create Virtual Machine icon:

Create-VM-icon

Your screen should look similar to this:

Click Next. Use the following field values to populate this next screen:

Field Value
Repository VM_Filesystems_Repo
Name ORAEMCC_VM
Description VM for the Oracle Enterprise Manager Cloud Control Server
Operating System Oracle Linux 6
Mouse Device Type OS Default (default)
Keymap en-us (English, United States) (default)
Domain Type XEN_HVM
Start Policy Best Server (default)
Max. Memory (MB) 16384
Memory (MB) 16384
Max. Processors 2
Processors 2
Priority 50 (default)
Processor Cap % 100 (default)

The screen should look similar to this:

Click Next.

Task #1b: Add Networks.

Use the Network pull down menu and the Add VNIC button to add the Management_Public network to the VM. You can leave the option, Dynamically Assign MAC, selected unless you want to provide your own MAC address. In which case, select the option, Specify MAC Address and enter your own value. ORAEMCC_VM only needs one network since it won’t be using shared storage or the private interconnect. Your screen should now look similar to this:

Click Next.

Note, in case you were wondering where a dynamically assigned MAC address comes from, there is a predefined range which OVM Manager uses. You can see and change this range from this screen:

Task #1c: Add Storage.

Since we chose a Domain Type of XEN_HVM, we only get 4 slots to add disk storage to this VM. That’s OK because we only need 4. If you had chosen a Domain Type of XEN_PVM, then you could add up to 103 pieces of disk storage. For this VM, we only need to add a CD/DVD drive and storage for 3 filesystems.

First, we’ll add the CD/DVD drive. This device will be used to store the Oracle Linux ISO which we’ll boot from later. For Slot 0, use the Disk Type pull down menu and select CD/DVD. Your screen should now look similar to this:

Click the magnifying glass icon. This will open up the Select an ISO dialog box. Select the Oracle Linux 6 Update 6 ISO file, V52218-01.iso. Notice the name of the repository it’s coming from, OL6_Repo. Hopefully you’re beginning to see how things are connected in OVM. Isn’t it fun? Your screen should look similar to this:

Click OK to return to the Arrange Disks dialog. For Slot 1, use the Disk Type pull down menu and select Virtual Disk. Your screen should look similar to this:

Click the green + sign to open up the Create Virtual Disk dialog. Use these values:

Field Value
Repository VM_Filesystems_Repo
Virtual Disk Name ORAEMCC_OS
Size (GiB) 50.0
Description Disk for ORAEMCC_VM operating system.
Shareable Leave unchecked (default)
Allocation Type Sparse Allocation (default)

The disk space allocated for the operating system includes space for the OS itself, swap and /home. The Oracle Enterprise Manager Cloud Control installation binaries will be copied here too, hence the generous 50 GB size. Notice that this 50 GB will be allocated from the VM_Filesystems_Repo Storage Repository we created earlier. Your screen should look similar to this:

Click OK to return to the Arrange Disks dialog. Repeat this process to add 2 more Virtual Disks.

For slots 2 and 3 use these values:

Slot# Field Value
Slot 2 Repository VM_Filesystems_Repo
Virtual Disk Name ORAEMCC_u01
Size (GiB) 30.0
Description Disk for ORAEMCC_VM /u01 filesystem.
Shareable Leave unchecked (default)
Allocation Type Sparse Allocation (default)
Slot 3 Repository VM_Filesystems_Repo
Virtual Disk Name ORAEMCC_u02
Size (GiB) 30.0
Description Disk for ORAEMCC_VM /u02 filesystem.
Shareable Leave unchecked (default)
Allocation Type Sparse Allocation (default)

Note, the Oracle software will be installed into the /u01 filesystem. The data files for the Cloud Control repository database will be stored in the /u02 filesystem.

When you have completed adding Virtual Disks for Slots 2 and 3, your screen should look similar to this:

Click Next.

Task #1d: Add Boot Order.

Using the arrow icons, move the boot options from the left pane to the right pane in the order shown in this screen:

It’s important that CDROM is first in the list. When we start the VM for the very first time, it needs to boot from this device so we can install Oracle Linux.

Click Finish and your screen should look similar to this:

That’s all there is to creating a VM. At this point, the VM is effectively just metadata. It’s not a running server just yet. That comes later when we start the VM, boot from the ISO in the virtual CD/DVD drive and install the Oracle Linux operating system. Next we’ll create the first Oracle database server.

Task #2: Create ORASVR01_VM.

Creating an Oracle database server VM follows the same 4 steps.

Task #2a: Add CPU and Memory.

Click the Create Virtual Machine icon:

Create-VM-icon

Your screen should look similar to this:

Click Next and complete this next screen with these values:

Field Value
Repository VM_Filesystems_Repo
Name ORASVR01_VM
Description VM for Oracle Database Server #1.
Operating System Oracle Linux 6
Mouse Device Type OS Default (default)
Keymap en-us (English, United States) (default)
Domain Type XEN_HVM
Start Policy Best Server (default)
Max. Memory (MB) 8192
Memory (MB) 8192
Max. Processors 2
Processors 2
Priority 50 (default)
Processor Cap % 100 (default)

The screen should look similar to this:

Click Next.

Task #2b: Add Networks.

Use the Network pull down menu and the Add VNIC button to add the Management_Public network to the VM. You can leave the option, Dynamically Assign MAC, selected unless you want to provide your own MAC address. In which case, select the option, Specify MAC Address and enter your own value. ORASVR01_VM only needs one network since it won’t be using shared storage or the private interconnect. Your screen should now look similar to this:

Click Next.

Task #2c: Add Storage.

Using the previous Add Storage step as a guide, complete the addition of storage to ORASVR01_VM using these values:

Slot# Disk Type Name Size (GiB)
0 CD/DVD V52218-01.iso 3.59
1 Virtual Disk ORASVR01_OS 40.0
2 Virtual Disk ORASVR01_u01 30.0
3 Virtual Disk ORASVR01_u02 30.0

The /u01 filesystem will be used to install the Oracle Database software. The /u02 filesystem will be used to store the actual database data files.

Your screen should now look similar to this:

Click Next.

Task #2d: Add Boot Order.

Using the arrow icons, move the boot options from the left pane to the right pane in the order shown in this screen:

Click Finish and your screen should look similar to this:

That’s the first Oracle Database server VM created. Next we’ll create the first Oracle RAC node server.

Task #3: Create RACNODE1_VM.

You know the drill by now.

Task #3a: Add CPU and Memory.

Click the Create Virtual Machine icon:

Create-VM-icon

Your screen should look similar to this:

Click Next and complete this next screen with these values:

Field Value
Repository VM_Filesystems_Repo
Name RACNODE1_VM
Description VM for the Oracle Database RAC Node Server #1.
Operating System Oracle Linux 6
Mouse Device Type OS Default (default)
Keymap en-us (English, United States) (default)
Domain Type XEN_HVM
Start Policy Best Server (default)
Max. Memory (MB) 6144
Memory (MB) 6144
Max. Processors 1
Processors 1
Priority 50 (default)
Processor Cap % 100 (default)

The screen should look similar to this:

Click Next.

A few notes about memory allocation with Oracle VM.

  1. When you allocate X MB of RAM to a VM, by the time Linux sees it, it’s actually X – Y MB.
  2. To get a Linux VM to see exactly X MB of RAM is virtually impossible because Y is not consistent across identical VMs nor across reboots. I have no idea why, but that’s been my consistent experience.
  3. For example, a 4096 MB (4 GB) RAM allocation produces about 3.86 GB of RAM visible to Linux. 4224 MB produces about 3.98 GB. 4352 MB produces about 4.11 GB and 16,384 MB produces about 15.67 GB. You can check the exact figure from inside the VM, as shown after this list.
  4. I upped the original memory allocation from 4096 MB (4 GB) to 6144 MB (6 GB) for 3 main reasons:
    1. 4096 MB only yielded 3.86 GB of visible RAM and that’s less than the 4 GB required by Oracle Grid Infrastructure 12c.
    2. With less than the 4 GB required, the Oracle Grid Infrastructure 12c installation (see Part 10) still succeeded, but took an excessively long time.
    3. Knowing the impact < 4 GB had on the Oracle Grid Infrastructure 12c installation, more memory would be needed to run an Oracle RAC database instance.
  5. Oracle VM cannot over-allocate physical server memory. Therefore, the total RAM allocated to all the VMs must be within the physical RAM configured on the server.
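
If you want to see exactly how much memory a guest ends up with, check from inside the VM once Linux is installed and running:

[root@racnode1 ~]# grep MemTotal /proc/meminfo
[root@racnode1 ~]# free -m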

Task #3b: Add Networks.

The Oracle RAC node servers will need to operate on 3 different networks: the public network which you’ll use to log in to the server, the private interconnect network for inter-node traffic and, finally, the shared storage network connected to the Openfiler shared storage server. Using a previous Add Networks step as a guide, complete the Set up Networks screen using these values:

Slot# Network
0 Management_Public
1 Shared_Storage_Public
2 GI_Interconnect_Private

Your screen should now look similar to this:

Click Next.

Task #3c: Add Storage.

Using a previous Add Storage step as a guide, complete the addition of storage to RACNODE1_VM using these values:

Slot# Disk Type Name Size (GiB)
0 CD/DVD V52218-01.iso 3.59
1 Virtual Disk RACNODE1_OS 40.0
2 Virtual Disk RACNODE1_u01 30.0

The /u01 filesystem will be used to install the Oracle Grid Infrastructure and Database software. The database data files will be stored in shared storage on the Openfiler server and accessed via the storage network. Hence, no need for a /u02 filesystem.

Your screen should now look similar to this:

Click Next.

Task #3d: Add Boot Order.

Using the arrow icons, move the boot options from the left pane to the right pane in the order shown in this screen:

Click Finish and your screen should look similar to this:

That’s the last of the 3 VMs created. Nicely done. “What about the other 5 VMs?” I hear you say. You don’t miss a trick do you? We could continue to create more VMs using the procedures described above. However, there’s a smarter, more efficient way to get that job done. All will be revealed in Part 7. I bet you can’t wait!

Next up is building out the OEM Cloud Control Server in Part 6. See you there!

If you have any comments or questions about this post, please use the Contact form here.

Build Your Own Oracle Infrastructure: Part 4 – Configure Oracle VM Server.

Config-OVM


Configuring Oracle VM Server using Oracle VM Manager can be a little tricky.

However, once you know just a handful of concepts and some terminology, you’ll wonder what the fuss was all about.

In Part 4, we’ll show you the steps to configure OVM Server in readiness to create VMs.

Part of the reason Oracle VM is tricky is due to a lack of decent resources explaining it. The official Getting Started guide is OK, but it doesn’t do a great job of explaining what you’re doing or why. It can also be misleading. For example, the Discover Storage section says this:

Your storage can be any of the following:

  • iSCSI: Abstracted LUNs or raw disks accessible over existing Ethernet infrastructure (SAN Servers).
  • FCP: Abstracted LUNs or raw disks accessible over Fibre Channel host bus adapters (SAN Servers).
  • NFS: File-based storage over existing Ethernet infrastructure (NAS or File Servers).

On that basis you’d be forgiven for thinking you need some form of network storage with which to create your VMs. Not true. We’ll be using internal OVM Server storage for the VMs as you’ll see shortly. Hopefully this installment of the Build Your Own Oracle Infrastructure series will set you straight.

Before we launch into OVM Manager, just a cautionary note to pay close attention to the Job Summary section at the foot of the OVM Manager console screen.

When working with OVM Manager it is advisable to do one thing at a time and let it run to completion before starting the next task.


If you allow multiple tasks to overlap, bad things happen. So have patience, Grasshopper!

Quick links to all the tasks:

Task #1: Discover the OVM Server.

The OVM Server is already up and running as a result of following the steps to install OVM Server. To make it easier for the OVM Manager to discover the OVM Server, add the appropriate entry to the local /etc/hosts file on the OVM Manager server:

200.200.10.7     ovmsvr.mynet.com     ovmsvr         # Oracle VM Server

Login to OVM Manager and ensure you are on the Servers and VMs tab. From here, right click Server Pools and select the option, Discover Servers. You could also click the Discover Servers icon. In the Discover Servers dialog box, enter the Oracle VM Agent Password and the name of the Oracle VM Server, which in our example is ovmsvr. The screen should now look similar to this:

Click OK to discover the server. Once the server is discovered, the screen should look similar to this:

ovmsvr.mynet.com should be listed underneath Unassigned Servers. It is unassigned because it does not yet belong to a Server Pool. Also, note the Job Summary section. The Discover Server job completed with a status of Success. Keep an eye on this section and make sure each task completes successfully.

Task #2: Create a Server Pool.

Every OVM Server must belong to a Server Pool. A Server Pool can contain one or more OVM Servers. So the next task is to create a Server Pool and allocate ovmsvr.mynet.com to that Server Pool.

In the Server and VMs tab, right click Server Pools and select the option, Create Server Pool. You could also click the Create Server Pool icon. The Create a Server Pool dialog box will appear. Use the following values:

Field Value
Server Pool Name ServerPool1
Virtual IP Address for the Pool 200.200.10.19
VM Console Keymap Leave the default
VM Start Policy Leave the default
Secure VM Migrate Leave blank
Clustered Server Pool Leave blank
Timeout for Cluster Leave the default
Storage for Server Pool Leave blank
Storage Location Leave blank
Description Default server pool

The screen should look similar to this:

Click Next to continue. Then move the ovmsvr.mynet.com server from the Available Server(s) pane to the Selected Server(s) pane using the arrow icons. The screen should now look similar to this:

Click the Finish button and you’re done. ServerPool1 is now created and contains the server, ovmsvr.mynet.com. The screen should look similar to this:

Task #3: Create Storage Repositories.

Oracle VM Storage Repositories are simply disk resources used by VMs and they need to be created before a VM can reference them. We’ll create two different Storage Repositories. The first will be used to store the Oracle Linux 6 Update 6 ISO which will be used to boot a VM and install the operating system. The second Storage Repository will be used to provide actual disk storage for VM filesystems.

In OVM Manager, click on the Repositories tab, then click the green + sign to open the Create a Repository dialog box. Use these values:

Field Value
Repository Name OL6_Repo
Repository Location Physical Disk
Server Pool ServerPool1
Description Repository to hold the Oracle Linux 6.6 ISO image

The screen should look similar to this:

Click the magnifying glass icon next to the Physical Disk field. The screen should now look similar to this:

Select the physical disk OVM_SYS_REPO_PART_360… and click OK. This piece of disk is the unused 4th partition of the OVM Server installation drive we previously referred to here. The screen should now look similar to this:

Click Next. OVM Manager takes a moment or several to create the repository. You are then presented with a 2 pane dialog box with Available Server(s) on the left and Present to Server(s) on the right. Use the arrow icons to move ovmsvr.mynet.com over to the Present to Server(s) pane. The screen should then look similar to this:

Click Finish to complete the creation of OL6_Repo Storage Repository which will be visible to the ovmsvr.mynet.com OVM Server.

Next, we’ll create the VM_Filesystems_Repo Storage Repository. Still within the Repositories tab in OVM Manager, click the green + sign to open the Create a Repository dialog box. Use these values:

Field Value
Repository Name VM_Filesystems_Repo
Repository Location Physical Disk
Server Pool ServerPool1
Description Repository for VM filesystems

The screen should look similar to this:

Click the magnifying glass icon next to the Physical Disk field. The screen should now look similar to this:

Select the physical disk starting with 3600… and click OK. The name you see will be slightly different. The screen should now look similar to this:

Just as we did with the OL6_Repo Storage Repository, click Next and use the arrow icons to move ovmsvr.mynet.com from the Available Server(s) pane on the left to the Present to Server(s) pane on the right. Click Finish. The screen should now show both Storage Repositories similar to this:

In the left most pane underneath Repositories, click on OL6_Repo. The screen should look similar to this:

Note the ID of the OL6_Repo Storage Repository. In my case, it is 0004fb0000030000cc1495b85b9bd08a. YMMV. Open up a Putty session on the Oracle VM Server and run these commands:

[root@ovmsvr /]# cd /OVS/Repositories
[root@ovmsvr Repositories]# ls -l
drwxr-xr-x 8 root root 3896 Nov 26 14:37 0004fb000003000059416081b6e25e36
drwxr-xr-x 8 root root 3896 Nov 26 14:33 0004fb0000030000cc1495b85b9bd08a

Note the second directory has the same name as the ID of the OL6_Repo Storage Repository. Oracle VM tends to use these very long numbers to uniquely identify everything. Get used to seeing that convention. Now run these commands:

[root@ovmsvr Repositories]# cd 0004fb0000030000cc1495b85b9bd08a
[root@ovmsvr 0004fb0000030000cc1495b85b9bd08a]# ls -l
drwx------ 2 root root 3896 Nov 26 14:33 Assemblies
drwx------ 2 root root 3896 Nov 26 15:18 ISOs
drwxr-xr-x 2 root root 3896 Nov 26 14:33 lost+found
drwx------ 2 root root 3896 Nov 26 14:33 Templates
drwx------ 2 root root 3896 Nov 26 14:33 VirtualDisks
drwx------ 2 root root 3896 Nov 26 14:33 VirtualMachines

At this point the ISOs directory is empty, but we need to populate it with the Oracle Linux 6 Update 6 ISO we downloaded from Oracle earlier (V52218-01.iso). The simplest way to do that is to use FileZilla.

Fire up FileZilla, login to ovmsvr.mynet.com and navigate to /OVS/Repositories/0004fb0000030000cc1495b85b9bd08a/ISOs. Then locate your copy of V52218-01.iso and copy it to the ISOs directory on ovmsvr.mynet.com. The screen should look similar to this:

When the copy is complete, you can verify it’s there using these commands:

[root@ovmsvr 0004fb0000030000cc1495b85b9bd08a]# cd ISOs
[root@ovmsvr ISOs]# ls -l
-rw-r--r-- 1 root root 3853516800 Nov 26 15:21 V52218-01.iso

You can also verify using OVM Manager. Expand the OL6_Repo Storage Repository in the left pane, then click on the ISOs folder. The right pane will show the OL6.6 ISO is now present.

Et voila! 😀

Task #4: Create VM Networks.

This is a little misleading in the sense we’re not actually creating networks per se. It’s more akin to creating a network resource which may be allocated to a VM, after which it becomes a network interface. Allocating a network resource to a VM just means that VM will be created with a virtual NIC which could be active on a specific network. In the Linux world, if you allocate 2 VM Networks to a VM, the VM will be created with 2 network interfaces, eth0 and eth1. This makes more sense when you create the VM Networks and then allocate them to VMs which we’ll do in Part 5. Bear with me.

By installing OVM Server, you get a VM Network created by default. Its name defaults to an IP address made up of the first two octets of the network it’s on, followed by 0.0. In our case, the default name is 200.200.0.0. You can see this VM Network by clicking on the Networking tab in OVM Manager, then by clicking Networks. The screen should look similar to this:

Note, this VM Network has 3 default uses or Channels: Server Management, Cluster Heartbeat and Live Migrate. Click on the pencil icon so we can change the name of this network to something more meaningful and give it another channel. Change the Name to Management_Public and check the box for Virtual Machine. The screen should look similar to this:

Click on the Ports tab. This is where things get a little weird. If you’re following along using the network addressing described in this series, you should see these values on a screen similar to this:

Parameter Value
Port Name bond0 on ovmsvr.mynet.com
Server ovmsvr.mynet.com
IP Address 200.200.10.7
Bonding Yes

When we installed OVM Server, we picked the HP ProLiant DL380’s eth0 for the network interface and assigned it the IP address 200.200.10.7. Rather miraculously, OVM Server has created a network interface bond (bond0) with a single network interface. Hence the reason you see bond0 as the port for the default network. So, addresses on the public network will be of the form, 200.200.10.x.
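
If you’re curious, you can see this single interface bond from a Putty session on the OVM Server itself:

[root@ovmsvr ~]# cat /proc/net/bonding/bond0
[root@ovmsvr ~]# ifconfig bond0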

We need to create two more VM Networks. One for the shared storage which will be coming from the Openfiler server. Another for the private interconnect so the Oracle RAC nodes can chat.

Click OK to return to the Networking tab, then click the green + sign to start the Create Network dialog. Select the option, Create a Network with Ports/Bond Ports/VLAN interfaces. The screen should look similar to this:

Click Next. In the Name field, enter Shared_Storage_Public and check the box next to Virtual Machine. This signifies that this network will be used as a regular network by a VM. We happen to know it’ll be for the shared storage, but OVM Server doesn’t know that and doesn’t need to care. The screen should now look similar to this:

Note, as tempting as it might seem to check Storage instead, don’t do it. Even the Oracle VM documentation tells you to stay away from it. This from the Create New Network documentation:

“Storage: Reserved for future use and currently has no practical function or application.”

Now, we could have a debate at this point about why this option is even here, but my coffee’s getting cold so let’s move on.

Click Next. This shows the Add Ports to Network dialog. There are 3 more network interfaces available on the DL380 running ovmsvr.mynet.com. Select eth1. The screen should now look similar to this:

Click OK. Then click the green + sign to open the Create Network dialog again. This time select the option, Create a Local Network on a single server. Instead of mapping this VM Network to a physical network interface on the DL380, we will create a virtual network resource that doesn’t use any additional networking hardware. Spooky! We still have 2 spare network interfaces on the DL380 (eth2 & eth3), but we’re not going to use them for now. The great thing about this virtual network resource is that it operates at bus speed which is pretty handy for the interconnect. The screen should now look similar to this:

Click Next. Give the network the name, GI_Interconnect_Private and add a suitable description. The screen now looks similar to this:

Click Next. Select ovmsvr.mynet.com for the Server. The screen should look similar to this:

Click Finish. Your screen should look similar to this:

Just one more change to make. We need to add an IP address to the eth1 network interface on the OVM Server. Click the Shared_Storage_Public network, then click the pencil icon to open the Edit Network dialog:

Click the Ports tab, highlight the eth1 entry then click the pencil icon to open the Edit Port(s) dialog. Use these values to complete this screen:

Field Value
Addressing Static
IP Address 200.200.20.7
Mask 255.255.255.0

The screen should look like this:

Click OK and your work is done!

OVM Server is now configured and ready for you to create VMs.
See you in Part 5 for that. We having fun yet? 😀

If you have any comments or questions about this post, please use the Contact form here.