Gossip members that die and get restarted will be starved of access and never join gossip group again #1

@GoogleCodeExporter

What steps will reproduce the problem?
1. Run a gossip group with 3 members, let their heartbeats count up over 20
or so, all successfully gossiping
2. Shut one down
3. Wait for all members in live group to decide the dead member is dead
4. Restart dead member
5. The restarted member will try to seed, but it will be ignored because its
heartbeat is not high enough (Client.java: line 329 rev 9509ef5052).
6. The restarted member will then not receive any membership lists from the
live group and will conclude that the live members are dead too.
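
Step 5 can be illustrated with a minimal sketch. It assumes the acceptance rule is a strict comparison against the heartbeat recorded when the member was declared dead, as the first branch of the snippet further down in this report suggests. `StarvationSketch` and `isAccepted` are hypothetical names for illustration, not code from Client.java.

```java
class StarvationSketch {
    // Assumed rule (not copied from Client.java): gossip about a dead member
    // is accepted only when its heartbeat exceeds the value recorded at death.
    static boolean isAccepted(long heartbeatAtDeath, long incomingHeartbeat) {
        return incomingHeartbeat > heartbeatAtDeath;
    }

    public static void main(String[] args) {
        long heartbeatAtDeath = 25; // member was shut down after about 25 heartbeats
        // After a restart the member counts up from a low value again,
        // so its first gossip messages all fail the comparison.
        for (long incoming = 1; incoming <= 3; incoming++) {
            System.out.println("heartbeat " + incoming + " accepted: "
                    + isAccepted(heartbeatAtDeath, incoming));
        }
    }
}
```

Under this assumption the restarted member would have to gossip more than 25 times before anyone listened, but by then it has already marked the rest of the group as dead.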

What is the expected output? What do you see instead?
A dead member, once restarted, should be able to rejoin the live gossip group.
Instead, it is ignored and never rejoins.

Please provide any additional information below.
This can be fixed by adding a kind of zombie state between dead and alive:
if you notice that a zombie's heartbeat is increasing, the member must have
restarted with its heartbeat set back to zero and counting up from there:

private Map<Member, Long> zombieHeartbeats = new Hashtable<Member, Long>();
...
...
} else if (deadMembers.contains(remoteMember)) {
  Member deadMember = deadMembers.get(deadMembers.indexOf(remoteMember));
  if (remoteMember.getHeartBeat() > deadMember.getHeartBeat()) {
    // Heartbeat has passed the last value we recorded: the member is alive.
    deadMembers.remove(remoteMember);
    healthyMembers.add(deadMember);
    deadMember.setHeartBeat(remoteMember.getHeartBeat());
    deadMember.resetTimeoutTimer();
  } else if (zombieHeartbeats.containsKey(remoteMember) &&
      remoteMember.getHeartBeat() > zombieHeartbeats.get(remoteMember)) {
    // Heartbeat is below the recorded value but increasing: the member restarted.
    deadMembers.remove(remoteMember);
    healthyMembers.add(deadMember);
    deadMember.setHeartBeat(remoteMember.getHeartBeat());
    deadMember.resetTimeoutTimer();
    zombieHeartbeats.remove(remoteMember);
  } else {
    // First sighting since death, or no increase yet: remember this heartbeat.
    zombieHeartbeats.put(remoteMember, remoteMember.getHeartBeat());
  }
}
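
To make the idea easier to try in isolation, here is a self-contained sketch of the same zombie logic. Everything in it is an assumption for illustration: `ZombieSketch`, its nested `Member` class (identified by address only), and `onGossip` are hypothetical stand-ins, not the project's real classes, and the timeout timer is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ZombieSketch {
    // Hypothetical stand-in for the project's Member: equality by address only.
    static final class Member {
        final String address;
        long heartbeat;

        Member(String address, long heartbeat) {
            this.address = address;
            this.heartbeat = heartbeat;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Member && ((Member) o).address.equals(address);
        }

        @Override
        public int hashCode() {
            return address.hashCode();
        }
    }

    final List<Member> healthyMembers = new ArrayList<>();
    final List<Member> deadMembers = new ArrayList<>();
    final Map<Member, Long> zombieHeartbeats = new HashMap<>();

    // Handles gossip about a member; returns true if the member was revived.
    boolean onGossip(Member remote) {
        int index = deadMembers.indexOf(remote);
        if (index < 0) {
            return false; // not a dead member, nothing to do in this sketch
        }
        Member dead = deadMembers.get(index);
        Long lastSeen = zombieHeartbeats.get(remote);
        boolean passedOldValue = remote.heartbeat > dead.heartbeat;
        boolean restarted = lastSeen != null && remote.heartbeat > lastSeen;
        if (passedOldValue || restarted) {
            deadMembers.remove(index);
            dead.heartbeat = remote.heartbeat;
            healthyMembers.add(dead);
            zombieHeartbeats.remove(remote);
            return true;
        }
        // First sighting since death, or no increase yet: remember the value.
        zombieHeartbeats.put(remote, remote.heartbeat);
        return false;
    }
}
```

With a member that died at heartbeat 25, the first post-restart gossip (heartbeat 1) is only recorded, and the second (heartbeat 2) revives it. Note that stale gossip relayed by different peers with slightly different old heartbeats could also look like an increase, so a real implementation may need an extra guard.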

Hope that helps.

Original issue reported on code.google.com by simon.l...@gmail.com on 8 Apr 2010 at 3:54
