High Availability Operating Scenarios

What are the operating scenarios of the CEMView High Availability system?

The following sections describe operating scenarios with possible failure modes and how each is handled by the CEMView High Availability system.


Case 1: The secondary computer fails and the primary computer is running.


When the primary CEMView Server application determines, from a watchdog handshake timeout, that the secondary system has failed, it will generate an alarm that is visible to all CEMView Client applications.  The primary CEMView Server will continue to operate as normal.
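The watchdog check described above can be sketched roughly as follows. This is only an illustration, not CEMView code: the class name Watchdog, the timeout_s setting, and the alarm text are hypothetical, and the real handshake interval is not given in this topic.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Watchdog:
    """Tracks the last handshake from the peer and reports a timeout."""
    timeout_s: float                 # hypothetical setting, not a CEMView parameter
    last_handshake: float = 0.0
    peer_failed: bool = False
    alarms: List[str] = field(default_factory=list)

    def handshake(self, now: float) -> None:
        """Record a handshake received from the peer at time `now`."""
        self.last_handshake = now
        self.peer_failed = False

    def check(self, now: float) -> bool:
        """Return True if the peer is considered failed at time `now`."""
        if not self.peer_failed and now - self.last_handshake > self.timeout_s:
            self.peer_failed = True
            # The alarm is raised once per failure, not on every check.
            self.alarms.append("peer watchdog timeout")
        return self.peer_failed


wd = Watchdog(timeout_s=5.0)
wd.handshake(now=0.0)
ok_at_3 = wd.check(now=3.0)      # still within the timeout
failed_at_6 = wd.check(now=6.0)  # timeout exceeded, alarm raised
failed_at_7 = wd.check(now=7.0)  # still failed, no second alarm
```

Passing the current time in as an argument keeps the sketch deterministic; a real service would presumably read a monotonic clock instead.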

When the secondary system is fixed and restarted, it will first communicate with the primary system to determine whether the primary system database holds any data that could fill the gap in its own dataset.  It will automatically synchronize its database with the primary system by requesting the missing data.  It will then operate as before.
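As a rough illustration of the gap-filling step, the sketch below copies into the restarted system's dataset every record that only the peer holds. The record layout (a timestamp mapped to a reading) and the function name fill_gaps are assumptions made for this example; the actual CEMView database schema and synchronization protocol are not described in this topic.

```python
from typing import Dict

# Hypothetical record layout: timestamp in seconds -> reading.
Dataset = Dict[int, float]


def fill_gaps(local: Dataset, peer: Dataset) -> int:
    """Copy records the peer has and the local dataset lacks.

    Returns the number of records copied. Existing local records
    are left unchanged.
    """
    missing = sorted(set(peer) - set(local))
    for ts in missing:
        local[ts] = peer[ts]
    return len(missing)


primary = {0: 1.0, 60: 1.1, 120: 1.2, 180: 1.3}
secondary = {0: 1.0, 180: 1.3}   # was down at t=60 and t=120
copied = fill_gaps(secondary, primary)
```

After the call, the secondary dataset in this example contains the two records it missed while it was down.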


Case 2: The primary computer fails and the secondary computer is running.


When the secondary CEMView Server application determines, from a watchdog handshake timeout, that the primary system has failed, it will disconnect from the primary computer's OPC Servers, launch the local OPC Servers, and connect to them.  It will generate an alarm that is visible to all connected CEMView Client applications.  It will then continue to operate as normal and will periodically attempt to reconnect to the primary CEMView Server.  The CEMView Client application on the secondary computer, which was connected to the primary CEMView Server, will also detect the primary CEMView Server failure automatically and connect to the local secondary CEMView Server application.

When the primary system is fixed and restarted, it will first communicate with the secondary system to determine whether the secondary system database holds any data that could fill the gap in its own dataset.  It will automatically synchronize its database with the secondary system by requesting the missing data.  It will then launch its OPC Servers and operate as before.

When the secondary system detects that the primary system is back online, it will disconnect from the local OPC Servers and reconnect to the primary computer's OPC Servers and CEMView Server.  When the connection to the primary CEMView Server is validated by the watchdog logic, the CEMView Client on the secondary computer will disconnect from the local secondary CEMView Server and reconnect to the remote primary CEMView Server.
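The fail-over and fail-back behaviour of the secondary computer in this case can be pictured as a small state machine. The sketch below is a simplified model under stated assumptions: the class SecondaryNode, its method names, and the alarm text are hypothetical and do not correspond to any CEMView API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SecondaryNode:
    """Simplified model of the secondary computer's connections."""
    opc_source: str = "primary"      # whose OPC Servers the secondary server uses
    client_server: str = "primary"   # which CEMView Server the local client uses
    alarms: List[str] = field(default_factory=list)

    def on_primary_timeout(self) -> None:
        """Watchdog timeout: switch to local OPC Servers and local server."""
        if self.opc_source == "local":
            return  # already failed over, do not raise the alarm again
        self.opc_source = "local"
        self.client_server = "secondary"
        self.alarms.append("primary system failure")

    def on_primary_online(self) -> None:
        """Primary detected back online: return to its OPC Servers."""
        self.opc_source = "primary"

    def on_watchdog_validated(self) -> None:
        """Connection validated by the watchdog: move the client back."""
        if self.opc_source == "primary":
            self.client_server = "primary"


node = SecondaryNode()
node.on_primary_timeout()
after_failover = (node.opc_source, node.client_server)
node.on_primary_online()
after_online = (node.opc_source, node.client_server)
node.on_watchdog_validated()
after_validation = (node.opc_source, node.client_server)
```

Fail-back is modelled in two steps to mirror the text: the OPC connection returns to the primary first, and the client moves back only after the watchdog has validated the connection.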


Case 3: Both primary and secondary computers are working, but there is a failure in the Ethernet connection between the two computers.


In this case, each system will treat the other as failed, and both will operate in parallel: the primary system as described in Case 1 and the secondary system as described in Case 2.

When communication is re-established, the secondary system will communicate with the primary system and disconnect from its local OPC Servers, and the CEMView Client will reconnect to the primary CEMView Server.  There may be small differences between the datasets of the two systems; this is natural and expected when two systems operate and scan an instrument in parallel.

The secondary system will automatically synchronize its database with the primary system database to eliminate any differences in the datasets.
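One way to picture this reconciliation is to treat the primary dataset as authoritative and overwrite any secondary record that differs from it. The sketch below assumes a hypothetical record layout (a timestamp mapped to a reading) and a hypothetical function name, reconcile. Records that exist only on the secondary are left untouched here, because this topic does not say how they are handled.

```python
from typing import Dict, List

# Hypothetical record layout: timestamp in seconds -> reading.
Dataset = Dict[int, float]


def reconcile(secondary: Dataset, primary: Dataset) -> List[int]:
    """Make the secondary agree with the primary for every primary record.

    Returns the timestamps that were added or overwritten on the secondary.
    """
    changed = [ts for ts, value in primary.items() if secondary.get(ts) != value]
    for ts in changed:
        secondary[ts] = primary[ts]
    return changed


primary = {0: 1.0, 60: 1.10, 120: 1.2, 180: 1.3}
secondary = {0: 1.0, 60: 1.12, 120: 1.2}   # scanned in parallel, one value differs
changed = reconcile(secondary, primary)
```

In this example the reading at t=60 is overwritten and the record at t=180 is added, after which the two datasets match.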

Copyright 1999-2011 by Nexus Solutions Inc.