Terracotta Alternative?

If I understood it correctly, the Terracotta documentation says that when a TC server fails (in networked active-passive mode), all TC clients disconnect, wait for the election of a new active TC server to finish, then reconnect. If this is the case, then all of the Servoy app servers in the cluster will restart. In the big picture the downtime will be minimal, but this is still a mass service disruption. Are there any other clustering solutions with better failover support than a full cluster restart?

Thanks,
PB

Anyone?

Servoy Application Servers will not restart right away when they are disconnected from the cluster.
If a server stays disconnected for more than 60 seconds, then it will restart. You can tune the timeout (in seconds) by using something like this in servoy.properties:

cluster.timeToForceShutdown=180

This way Servoy Application Servers will wait for the connection to be restored (network problems, active Terracotta server switch, …). The active-passive switch should be fast enough for the default setting. If you experience something different, please file a case. You should be able to test it yourself.

Logging information related to the cluster can be found in two places:

  • in the Terracotta server and client logs that can be found at the location specified in “tc-config.xml”. Server logs show Terracotta Server activity and client logs show Servoy Application Server activity.
  • in the Servoy log file. Terracotta life-cycle related messages can be controlled using
log4j.logger.com.servoy.j2db.terracotta.TerracottaStatusMonitor=...
in the servoy.properties file (where "..." can be ERROR, WARN, INFO, DEBUG, TRACE). INFO is recommended for most cases, as it gives an overview of cluster events that can be useful when diagnosing problems, while the number of messages should stay low.
Changing this log setting to INFO level might help you better understand what's going on when you test things like this.
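
For a failover test, the two settings mentioned above could sit together in servoy.properties roughly like this. The values are only examples (180 is the sample timeout from above, INFO is the recommended log level), not tuned recommendations for your cluster:

```properties
# Seconds a disconnected Servoy Application Server waits before it restarts
# (example value; per the above, a restart happens after 60 seconds by default)
cluster.timeToForceShutdown=180

# Log Terracotta life-cycle events at INFO while testing failover
log4j.logger.com.servoy.j2db.terracotta.TerracottaStatusMonitor=INFO
```

With something like this in place, you can stop the active Terracotta server and check the Servoy log to see whether the app servers reconnect before the timeout expires.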

One more thing - you shouldn’t set cluster.timeToForceShutdown to a very high value.
If only one Servoy App. Server has network problems, for example, it will be dropped from the cluster after a while (so still not right away). Even if the network connection is restored, it will fail to re-establish its connection to the cluster and will need a restart. That’s what this setting is for.