continuent.org JIRA
Issue Details

Key: TREP-353
Type: Bug
Status: Resolved
Resolution: Fixed
Priority: Critical
Assignee: Robert Hodges
Reporter: Scott Martin
Votes: 0
Watchers: 0

Tungsten Replicator

Intermittent issue with master side hang while waiting for heartbeat

Created: 21/Sep/09 11:48 AM   Updated: 12/Oct/09 05:25 PM
Component/s: Core Framework
Affects Version/s: Tungsten Replicator 1.0.3
Fix Version/s: Tungsten Replicator 1.0.4

Original Estimate: Unknown
Remaining Estimate: Unknown
Time Spent: Unknown
Environment: MySQL 5.x


 Description   
The Tungsten Replicator periodically seems to miss heartbeat events. The problem appears to be that the wait call on the master side (JMX API ReplicatorManagerMBean.waitForAppliedSequenceNumber()) fails to note that we have reached the sequence number, causing a prolonged wait.

The workaround in this case is to send traffic through the master, after which the replicator recognizes that the awaited sequence number has been reached and the wait returns.

This looks like a race condition related to heartbeat event processing. We also see it in switch operations from the Tungsten Manager. It needs to be fixed as it blocks convenient failover.
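
To illustrate the kind of race described above, here is a minimal sketch of a wait-for-sequence-number primitive that avoids a lost wakeup. It is not the Tungsten Replicator code: the class SequenceWaiter and its methods advanceTo() and waitFor() are hypothetical names chosen for this example. The assumption is that every applied event, heartbeats included, updates the sequence number and signals waiters under the same lock that the waiter holds while re-checking its condition.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; not the Tungsten Replicator implementation.
final class SequenceWaiter {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition advanced = lock.newCondition();
    private long appliedSeqno = -1; // guarded by lock

    // Called for every applied event, heartbeat events included.
    // Updating and signalling under the lock means a waiter cannot
    // miss the change between its check and its wait.
    void advanceTo(long seqno) {
        lock.lock();
        try {
            if (seqno > appliedSeqno) {
                appliedSeqno = seqno;
                advanced.signalAll();
            }
        } finally {
            lock.unlock();
        }
    }

    // Returns true if seqno was reached within the timeout, false on
    // timeout or interruption (the interrupt flag is restored).
    boolean waitFor(long seqno, long timeoutMillis) {
        long nanosLeft = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        lock.lock();
        try {
            // Re-check the condition in a loop: this covers spurious
            // wakeups and the case where seqno was already reached.
            while (appliedSeqno < seqno) {
                if (nanosLeft <= 0) {
                    return false;
                }
                nanosLeft = advanced.awaitNanos(nanosLeft);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws Exception {
        SequenceWaiter waiter = new SequenceWaiter();
        // Simulates a heartbeat being applied on another thread.
        Thread applier = new Thread(() -> waiter.advanceTo(42));
        applier.start();
        System.out.println("reached 42: " + waiter.waitFor(42, 1000));
        applier.join();
        System.out.println("reached 43: " + waiter.waitFor(43, 50));
    }
}
```

If a heartbeat path updated the sequence number without signalling, or signalled outside the lock before the waiter had started waiting, the waiter would sleep until the next event arrived. That would match the observed workaround of sending traffic through the master, though whether this is the actual cause in the replicator is an assumption of this sketch.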

Comment by Stephane Giron [12/Oct/09 05:25 PM]
New fix committed to SVN