Checklist for Monitoring NetApp OCI and OCI DWH Infrastructure Servers

Here is a simple checklist for recording your monitoring of your NetApp OCI and OCI DWH infrastructure servers. This isn’t about whether OCI is monitoring your environment correctly; it is about checking that the OCI infrastructure itself is good and healthy.

This is based on OCI 7.3.8. Later versions of OCI (such as 7.3.9 and 7.3.10, both already available as of June 2020) may have more monitoring/alerting options and different services.


Image: Checklist for Monitoring the NetApp OCI Server

Image: Checklist for Monitoring the NetApp OCI DWH Server

APPENDIX: Text version of the above tables.

The output is not pretty, which is why I posted the images above.

Server Name # Server Type # Location # Thing to Check # How to Check # Checkbox
 # OCI # Remote Monitoring # Is the server pingable? # Server monitoring (e.g. SCOM?) #
 # OCI # Remote Monitoring # Is the web page available? # Server monitoring (e.g. SCOM?) #
 # OCI # Remote Monitoring # CPU # Server monitoring (e.g. SCOM?) / Admin > Health #
 # OCI # Remote Monitoring # Memory # Server monitoring (e.g. SCOM?) / Admin > Health #
 # OCI # Remote Monitoring # Disk Space # Server monitoring (e.g. SCOM?) / Admin > Health #
 # OCI # Remote Monitoring # Disk Latency # Server monitoring (e.g. SCOM?) #
 # OCI # services.msc # MySQL (Start 1st) # Server monitoring (e.g. SCOM?) / Admin > Health #
 # OCI # services.msc # Elasticsearch (Start 2nd) # Server monitoring (e.g. SCOM?) / Admin > Health #
 # OCI # services.msc # SANscreen Server (Start 3rd) # Server monitoring (e.g. SCOM?) #
 # OCI # services.msc # SANscreen Acq (Start 4th) # Server monitoring (e.g. SCOM?) #
 # OCI # Admin > Data sources # Check Status = All successful # WebUI (manual) / REST API #
 # OCI # Admin > Acquisition Units # Check Status = OK # WebUI (manual) / REST API #
 # OCI # Admin > Health # Check Details = OK # WebUI (manual) / REST API #
 # OCI # Admin > Setup > Licenses # Check Status = OK # WebUI (manual) / REST API #
 # OCI # Admin > Setup > Backup & Archive # Weekly backup enabled? # WebUI (manual) / REST API #
 # OCI # Admin > Health > Server > Weekly Backup # Backup working? # WebUI (manual) / REST API #
 # OCI # Admin > Setup > ASUP & Proxy # ASUP enabled? # WebUI (manual) / REST API #
 # OCI # Admin > Health > Server > ASUP # ASUP working? # WebUI (manual) / REST API #
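
If you do not have a monitoring product such as SCOM to hand, the first two "Remote Monitoring" rows can be approximated with a short script. The following is only a minimal sketch using the Python standard library, not anything shipped with OCI. A true ICMP ping normally needs elevated privileges, so a TCP connection attempt is used here as a stand-in for "is the server pingable?". The function names, timeouts and the `verify_tls` switch are my own assumptions; supply your own server's host name and URL.

```python
import http.client
import socket
import ssl
import urllib.error
import urllib.request


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    This is a stand-in for a ping; it does not send ICMP packets.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def web_page_available(url: str, timeout: float = 5.0,
                       verify_tls: bool = True) -> bool:
    """Return True if an HTTP GET on url is answered with a status below 400."""
    context = ssl.create_default_context()
    if not verify_tls:
        # Assumption: internal servers often use self-signed certificates.
        # Only switch verification off if you accept that risk.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=timeout,
                                    context=context) as response:
            return response.status < 400
    except (urllib.error.URLError, http.client.HTTPException,
            OSError, ValueError):
        # Covers refused connections, timeouts, HTTP error codes,
        # malformed responses and malformed URLs.
        return False
```

For example, `tcp_port_open("oci-server.example.com", 443)` followed by `web_page_available("https://oci-server.example.com/")` would cover both rows, where `oci-server.example.com` is a placeholder for your own server.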

Server Name # Server Type # Location # Thing to Check # How to Check # Checkbox
# DWH # Remote Monitoring # Is the server pingable? # Server monitoring (e.g. SCOM?) #
# DWH # Remote Monitoring # Are the web pages available? # Server monitoring (e.g. SCOM?) #
# DWH # Remote Monitoring # CPU # Server monitoring (e.g. SCOM?) #
# DWH # Remote Monitoring # Memory # Server monitoring (e.g. SCOM?) #
# DWH # Remote Monitoring # Disk Space # Server monitoring (e.g. SCOM?) #
# DWH # Remote Monitoring # Disk Latency # Server monitoring (e.g. SCOM?) #
# DWH # services.msc # MySQL (Start 1st) # Service Monitoring (e.g. via SCOM) #
# DWH # services.msc # SANscreen Server (Start 2nd) # Service Monitoring (e.g. via SCOM) #
# DWH # services.msc # ApacheDS - cognos (Start 3rd) # Service Monitoring (e.g. via SCOM) #
# DWH # services.msc # Informix IDS Message Service (Start 4th) # Service Monitoring (e.g. via SCOM) #
# DWH # services.msc # Informix IDS - ol_cognoscm (Start 5th) # Service Monitoring (e.g. via SCOM) #
# DWH # services.msc # IBM Cognos (Start 6th) # Service Monitoring (e.g. via SCOM) #
# DWH # /dwh -> Connectors # State = COMPLETED (no FAILED) # WebUI (manual) / via OCI #
# DWH # /dwh -> Jobs # Status = COMPLETED # WebUI (manual) / via OCI (partial check) / Email Notification #
# DWH # /dwh -> Schedule # Build Schedule enabled? # WebUI (manual) #
# DWH # /dwh -> Jobs # Build working? # WebUI (manual) #
# DWH # /dwh -> Schedule # Backup Schedule enabled? # WebUI (manual) #
# DWH # /dwh -> Jobs # Backup working? # WebUI (manual) #
# DWH # /dwh -> Email Notification # Configured? # WebUI (manual) #
# DWH # /dwh -> Troubleshooting # OCI ASUP Enabled? # WebUI (manual) #
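
The "Start 1st ... Start 6th" notes in the services rows imply an order, and a small helper can turn a set of service states into "which service should I start next?". This is a rough sketch under stated assumptions: the names in `DWH_START_ORDER` are copied from the table as they appear in services.msc and may not match the underlying Windows service names, and the state strings ("Running", "Stopped") are assumed to come from whatever tool you use to collect them. The helper names are hypothetical and the collection step itself is deliberately left out.

```python
from typing import List, Mapping, Optional, Sequence

# Start order copied from the DWH checklist. Assumption: these are the
# names as listed there; the real Windows service names may differ.
DWH_START_ORDER = [
    "MySQL",
    "SANscreen Server",
    "ApacheDS - cognos",
    "Informix IDS Message Service",
    "Informix IDS - ol_cognoscm",
    "IBM Cognos",
]


def is_running(state: str) -> bool:
    """Treat any casing of 'running' as up; everything else as down."""
    return state.strip().upper() == "RUNNING"


def stopped_services(order: Sequence[str],
                     states: Mapping[str, str]) -> List[str]:
    """Return the services that are not running, in start order.

    A service missing from states is treated as not running.
    """
    return [name for name in order if not is_running(states.get(name, ""))]


def first_service_to_start(order: Sequence[str],
                           states: Mapping[str, str]) -> Optional[str]:
    """Return the earliest service in the start order that is down, or None."""
    down = stopped_services(order, states)
    return down[0] if down else None
```

The same helpers work for the OCI server if you pass its four services in their own start order instead of `DWH_START_ORDER`.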
