#include <db.h>

int DB_ENV->rep_elect(DB_ENV *env, int nsites, int nvotes, int priority, u_int32_t timeout, int *envid, u_int32_t flags);

Description: DB_ENV->rep_elect

The DB_ENV->rep_elect method holds an election for the master of a replication group.

If the election is successful, the new master's ID may be the ID of the previous master, or the ID of the current replication site. The application is responsible for adjusting its relationship to the other database environments in the replication group, including directing all database updates to the newly selected master, in accordance with the results of this election.
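As a minimal sketch of how an application might act on the result, the helper below compares the ID written through envid with the site's own ID. The enum, the function name and the idea that the application tracks its own ID in self_id are assumptions made for illustration; none of them are part of the Berkeley DB API.

```c
/* Hypothetical role names for illustration; not Berkeley DB identifiers. */
enum site_role { SITE_ROLE_CLIENT, SITE_ROLE_MASTER };

/*
 * Decide this site's role after a successful election.
 * self_id:           the ID this application uses for its own environment
 *                    (assumed to be tracked by the application).
 * elected_master_id: the value copied into the memory referenced by envid.
 */
enum site_role
role_after_election(int self_id, int elected_master_id)
{
	return (elected_master_id == self_id ?
	    SITE_ROLE_MASTER : SITE_ROLE_CLIENT);
}
```

A site that finds itself in the client role would then direct its database updates to the site identified by elected_master_id.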

The thread of control that calls the DB_ENV->rep_elect method must not be the thread of control that processes incoming messages; processing the incoming messages is necessary to successfully complete an election.


The envid parameter references memory into which the newly elected master's ID is copied.
The nsites parameter specifies the number of replication sites expected to participate in the election. Once the current site has election information from that many sites, it will short-circuit the election and immediately cast its vote for a new master. The nsites parameter must be a positive integer, no less than nvotes.
The nvotes parameter specifies the minimum number of replication sites from which the current site must have election information, before the current site will cast a vote for a new master. The nvotes parameter must be a positive integer and no greater than nsites, or 0 if the election should use the value ((nsites / 2) + 1) as the nvotes argument.
The priority parameter is the priority of this environment. It must be a positive integer, or 0 if this environment is not permitted to become a master (see Replication environment priorities for more information).
The timeout parameter specifies a timeout period for an election.
The flags parameter is currently unused, and must be set to 0.
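The parameter constraints above can be sketched as a pre-flight check that an application might run before calling the method. The function name check_elect_args and its return convention are hypothetical; the sketch only restates the documented rules and does not call Berkeley DB.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of Berkeley DB.
 * Returns 0 if the arguments satisfy the documented constraints and
 * stores the nvotes value the election would use in *effective_nvotes;
 * returns -1 otherwise and leaves *effective_nvotes untouched.
 */
int
check_elect_args(int nsites, int nvotes, int priority, uint32_t flags,
    int *effective_nvotes)
{
	if (nsites <= 0)			/* nsites must be positive */
		return (-1);
	if (nvotes < 0 || nvotes > nsites)	/* 0, or 1..nsites */
		return (-1);
	if (priority < 0)			/* 0 means "may not be master" */
		return (-1);
	if (flags != 0)				/* flags is currently unused */
		return (-1);

	/* nvotes == 0 selects the default of (nsites / 2) + 1. */
	*effective_nvotes = (nvotes == 0) ? (nsites / 2) + 1 : nvotes;
	return (0);
}
```

For example, nsites of 5 with nvotes of 0 yields an effective nvotes of 3, while nvotes of 6 with nsites of 5 is rejected.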

Elections are done in two phases: in the first, replication sites collect election information from the other replication sites they know about; in the second, replication sites cast their votes for a new master. The second phase is triggered by one of two things: either the replication site gets election information from nsites sites, or the election timeout expires. Once the second phase is triggered, the replication site will cast a vote for the new master of its choice if, and only if, the site has election information from at least nvotes sites. If a site receives nvotes votes for it to become the new master, then it will become the new master.
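The decision rules in the paragraph above can be written as three small predicates. This is only a model of the described behavior, not the library's implementation: the function names are hypothetical, nvotes is assumed to be the effective (non-zero) value, and sites_heard is assumed to count every site from which election information has been received.

```c
#include <stdbool.h>

/* The vote-casting step starts once nsites sites have reported
 * or the election timeout has expired. */
bool
second_phase_triggered(int sites_heard, int nsites, bool timeout_expired)
{
	return (sites_heard >= nsites || timeout_expired);
}

/* A site casts a vote only if the second phase has been triggered
 * and it has election information from at least nvotes sites. */
bool
will_cast_vote(int sites_heard, int nsites, int nvotes, bool timeout_expired)
{
	return (second_phase_triggered(sites_heard, nsites, timeout_expired) &&
	    sites_heard >= nvotes);
}

/* A site becomes the new master once it has received nvotes votes. */
bool
becomes_master(int votes_received, int nvotes)
{
	return (votes_received >= nvotes);
}
```

With nsites of 5 and nvotes of 3, a site that has heard from only 3 sites does not vote until the timeout expires, and a site that has heard from only 2 sites does not vote at all.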

We recommend nvotes be set to at least:

(sites participating in the election / 2) + 1

to ensure there are never more than two masters active at the same time even in the case of a network partition. When a network partitions, the side of the partition with more than half the environments will elect a new master and continue, while the environments communicating with fewer than half of the environments will fail to find a new master, as no site can get nvotes votes.
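The partition argument is simple integer arithmetic, sketched below. The function names are hypothetical, and the sketch assumes nvotes is set to exactly the recommended value.

```c
#include <stdbool.h>

/* Recommended minimum nvotes: (sites participating in the election / 2) + 1,
 * using integer division. */
int
majority(int participating)
{
	return (participating / 2 + 1);
}

/* One side of a partition can elect a master only if it contains
 * at least nvotes sites, since no site can otherwise collect nvotes votes. */
bool
partition_can_elect(int side_size, int participating)
{
	return (side_size >= majority(participating));
}
```

For a group of 5 sites split 3 and 2, the recommended nvotes is 3, so only the 3-site side can elect a master. For a group of 4 sites split 2 and 2, the recommended nvotes is also 3, so neither side can.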

We recommend nsites be set to:

sites participating in the election - 1

This allows replication groups to elect a new master immediately if the current master fails. Setting the nsites value to the number of sites participating in the replication election ensures all sites in the group get the chance to participate in the election, at the cost of potentially slower elections. Setting nsites to lower values can increase the speed of an election, but can also result in election failure, and is usually not recommended.
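The trade-off can be illustrated with a small sketch. It assumes the count of sites heard from includes the current site, which the text above does not state explicitly, and the function names are hypothetical.

```c
#include <stdbool.h>

/* Recommended nsites: sites participating in the election minus one,
 * which accounts for a failed master. */
int
recommended_nsites(int participating)
{
	return (participating - 1);
}

/* If fewer than nsites sites are reachable, the short-circuit can never
 * fire and the second phase has to wait for the election timeout. */
bool
must_wait_for_timeout(int reachable_sites, int nsites)
{
	return (reachable_sites < nsites);
}
```

In a 5-site group whose master has failed, 4 sites remain. With nsites of 4 the election can proceed as soon as all 4 have reported; with nsites of 5 it would have to wait for the timeout.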


The DB_ENV->rep_elect method may fail and return one of the following non-zero errors:

The replication group was unable to elect a master, or was unable to complete the election in the specified timeout period.



See Also

Replication and Related Methods


Copyright (c) 1996-2005 Sleepycat Software, Inc. - All rights reserved.