Wednesday, April 10, 2013

VxHA: set a new monitor timeout

Today I had a customer with a cluster and an Oracle database. The customer called in because his database shut down once a day. First I took a look at the alert log for the database (SID DB):

# less alert_DB.log
Instance terminated by USER, pid = 15764

Obviously the database was shut down manually. Next I investigated engine_A.log and Oracle_A.log. One message repeated itself in both log files:

# less engine_A.log
VCS ERROR V-16-2-13027 (NODE_2) Resource(Oracle_DB) - monitor procedure did not complete within the expected time.
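
To see how often the monitor actually timed out, the log can be filtered by message ID. The following is only a small sketch: the function name count_monitor_timeouts is my own, and it assumes every relevant log line carries the ID V-16-2-13027 and the "Resource(NAME)" part exactly as in the message above.

```shell
#!/bin/sh
# Sketch: count monitor-timeout errors (message ID V-16-2-13027) per resource.
# count_monitor_timeouts is a hypothetical helper, not part of VCS.
# Usage: count_monitor_timeouts /path/to/engine_A.log
count_monitor_timeouts() {
    # Keep only lines with the message ID, extract the name inside
    # "Resource(...)", then count how often each name occurs.
    grep 'V-16-2-13027' "$1" \
        | sed -n 's/.*Resource(\([^)]*\)).*/\1/p' \
        | sort | uniq -c | awk '{print $2, $1}'
}
```

Each output line is a resource name followed by the number of timeouts found for it, which makes it easy to check whether the timeouts line up with the daily shutdowns.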

So I decided to raise the timeouts for the Oracle resource. The first thing I needed to know was the type of the resource:

# hares -display Oracle_DB
Oracle_DB Type             global     Oracle
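
Since hares -display prints many attributes, a small filter helps to pick out just the type. This is a sketch under the assumption that the output has the four columns shown above (resource, attribute, system, value); the function name resource_type_from_display is made up for illustration.

```shell
#!/bin/sh
# Sketch: read "hares -display" output on stdin and print the value of the
# Type attribute. Assumes the columns: resource, attribute, system, value.
# resource_type_from_display is a hypothetical helper, not part of VCS.
resource_type_from_display() {
    awk '$2 == "Type" { print $4 }'
}
```

It would be used as: hares -display Oracle_DB | resource_type_from_display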

The type is Oracle, and this type holds the timeout values:

# hatype -display Oracle
Oracle       MonitorTimeout          60
Oracle       MonitorInterval         60

There is no sense in raising only the timeout; the interval should be raised as well. To update the values, the config must be in rw mode (-makerw):

# haconf -makerw

Once the config is in rw mode, the new values for the timeout and the interval can be set:

# hatype -modify Oracle MonitorTimeout 180
# hatype -modify Oracle MonitorInterval 180

After that, the config must be spread across all nodes (-dump) and set back to ro mode (-makero):

# haconf -dump -makero

Finally, make sure that the values are set:

# hatype -value Oracle MonitorTimeout
# hatype -value Oracle MonitorInterval

The values for the monitor timeout and the monitor interval were successfully raised from 60 to 180 seconds.
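
For repeated use, the steps above could be wrapped in a small script. This is a sketch, not a tested production tool: the names run and set_monitor_timing and the DRY_RUN variable are my own inventions, and only the haconf and hatype calls are taken from the commands shown above. By default it only prints what it would do.

```shell
#!/bin/sh
# Sketch: raise MonitorTimeout and MonitorInterval for a VCS resource type.
# DRY_RUN, run and set_monitor_timing are hypothetical names for illustration.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage: set_monitor_timing TYPE SECONDS
set_monitor_timing() {
    type_name="$1"
    seconds="$2"
    # Reject anything that is not a whole number of seconds.
    case "$seconds" in
        ''|*[!0-9]*) echo "seconds must be a whole number" >&2; return 1 ;;
    esac
    # Open the config for writing; stop if that already fails.
    run haconf -makerw || return 1
    status=0
    run hatype -modify "$type_name" MonitorTimeout "$seconds" || status=1
    run hatype -modify "$type_name" MonitorInterval "$seconds" || status=1
    # Always dump and close the config again, even if a modify failed.
    run haconf -dump -makero || status=1
    return "$status"
}
```

Calling set_monitor_timing Oracle 180 in dry-run mode lists the four commands from this post; setting DRY_RUN=0 before the call would actually execute them, after which hatype -value can be used to check the result.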