Today I had a customer with a cluster and an Oracle database. The customer called in because his database shuts down once a day. First I took a look at the alert log for the database (SID DB):
# less alert_DB.log
...
Instance terminated by USER, pid = 15764
...
Obviously the database was shut down manually. The next logs I investigated were engine_A.log and Oracle_A.log. One message repeated itself in both log files:
# less engine_A.log
...
VCS ERROR V-16-2-13027 (NODE_2) Resource(Oracle_DB) - monitor procedure did not complete within the expected time.
...
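To get a feeling for how often the monitor timed out, the engine log can be filtered for that message ID. This is only a small sketch: the function name monitor_timeouts is made up for this post, and it assumes the log lines look like the one shown above.

```shell
# Hypothetical helper: list monitor timeout messages for one resource.
# $1 = path to the engine log, $2 = resource name
monitor_timeouts() {
    log=$1
    res=$2
    # V-16-2-13027 is the message ID from the log excerpt above.
    grep -F 'V-16-2-13027' "$log" | grep -F "Resource($res)"
}
```

Calling `monitor_timeouts engine_A.log Oracle_DB | wc -l` then gives the number of timeouts recorded for the resource.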
So I decided to raise the timeouts for the Oracle resource. The first thing I needed to know was the type of the resource:
# hares -display Oracle_DB
...
Oracle_DB Type global Oracle
...
The type is Oracle and this type holds the values for the timeout:
# hatype -display Oracle
...
Oracle MonitorTimeout 60
...
Oracle MonitorInterval 60
...
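Since the display output is long, it can help to filter it down to the two attributes of interest. The following is a sketch under the assumption that the output has the three columns shown above (type, attribute, value); monitor_attrs is a name I made up, not a cluster command.

```shell
# Hypothetical helper: read "hatype -display <type>" output on stdin and
# print only the monitor timing attributes as "name value".
# Assumes three whitespace-separated columns: type, attribute, value.
monitor_attrs() {
    awk '$2 == "MonitorTimeout" || $2 == "MonitorInterval" { print $2, $3 }'
}
```

It would be used as `hatype -display Oracle | monitor_attrs`.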
Raising only the timeout makes little sense; the interval should be raised as well. To update the values, the config must be in rw mode (-makerw):
# haconf -makerw
Once the config is in rw mode, the new values for the timeout and the interval can be set:
# hatype -modify Oracle MonitorTimeout 180
# hatype -modify Oracle MonitorInterval 180
After that, the config must be saved (-dump), which spreads it across all nodes, and set back to ro mode (-makero):
# haconf -dump -makero
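If the change has to be repeated on other clusters, the commands above can be wrapped in a small function. This is a rough sketch, not a tested production script: set_monitor_timing, _run and the DRY_RUN variable are names I made up, and only the haconf and hatype calls already shown in this post are used.

```shell
# Hypothetical helper: print the command if DRY_RUN=1, otherwise execute it.
_run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper around the steps above.
# $1 = resource type, $2 = MonitorTimeout, $3 = MonitorInterval (seconds)
set_monitor_timing() {
    rtype=$1
    timeout=$2
    interval=$3
    _run haconf -makerw || return 1
    _run hatype -modify "$rtype" MonitorTimeout "$timeout" &&
        _run hatype -modify "$rtype" MonitorInterval "$interval"
    rc=$?
    # Save and close the config again even if one of the modifies failed.
    _run haconf -dump -makero || return 1
    return $rc
}
```

Setting `DRY_RUN=1` before calling `set_monitor_timing Oracle 180 180` only prints the four commands, which is a cheap way to check the arguments before touching the cluster config.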
Finally make sure that the values are set:
# hatype -value Oracle MonitorTimeout
180
# hatype -value Oracle MonitorInterval
180
The values for the monitor timeout and the monitor interval were successfully set from 60 to 180 seconds.