Context Navigation

← Previous Change
Next Change →

ctdb-tunables.7.html

Timestamp:

Nov 25, 2016, 8:04:54 PM (9 years ago)

Author:

Silvan Scherrer

Message:

Samba Server: update vendor to version 4.4.7

File:

: 1 edited

vendor/current/ctdb/doc/ctdb-tunables.7.html (modified) (3 diffs)

Legend:

: Unmodified
: Added
: Removed

vendor/current/ctdb/doc/ctdb-tunables.7.html

-              r988
+              r989
 <html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><title>ctdb-tunables</title><meta name="generator" content="DocBook XSL Stylesheets V1.78.1"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="refentry"><a name="ctdb-tunables.7"></a><div class="titlepage"></div><div class="refnamediv"><h2>Name</h2><p>ctdb-tunables &#8212; CTDB tunable configuration variables</p></div><div class="refsect1"><a name="idp52032112"></a><h2>DESCRIPTION</h2><p>
+<html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><title>ctdb-tunables</title><meta name="generator" content="DocBook XSL Stylesheets V1.78.1"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="refentry"><a name="ctdb-tunables.7"></a><div class="titlepage"></div><div class="refnamediv"><h2>Name</h2><p>ctdb-tunables &#8212; CTDB tunable configuration variables</p></div><div class="refsect1"><a name="idp51068080"></a><h2>DESCRIPTION</h2><p>
       CTDB's behaviour can be configured by setting run-time tunable
       variables.  This lists and describes all tunables.  See the
 …
       <span class="command"><strong>listvars</strong></span>, <span class="command"><strong>setvar</strong></span> and
       <span class="command"><strong>getvar</strong></span> commands for more details.
+    </p><div class="refsect2"><a name="idp52844128"></a><h3>MaxRedirectCount</h3><p>Default: 3</p><p>
+        If we are not the DMASTER and need to fetch a record across the network
+        we first send the request to the LMASTER after which the record
+        is passed onto the current DMASTER. If the DMASTER changes before
+        the request has reached that node, the request will be passed onto the
+        "next" DMASTER. For very hot records that migrate rapidly across the
+        cluster this can cause a request to "chase" the record for many hops
+        before it catches up with the record.
+        this is how many hops we allow trying to chase the DMASTER before we
+        switch back to the LMASTER again to ask for new directions.
+      </p><p>
+        When chasing a record, this is how many hops we will chase the record
+        for before going back to the LMASTER to ask for new guidance.
+      </p></div><div class="refsect2"><a name="idp52639696"></a><h3>SeqnumInterval</h3><p>Default: 1000</p><p>
+        Some databases have seqnum tracking enabled, so that samba will be able
+        to detect asynchronously when there has been updates to the database.
+        Everytime a database is updated its sequence number is increased.
+      </p><p>
+        This tunable is used to specify in 'ms' how frequently ctdb will
+        send out updates to remote nodes to inform them that the sequence
+        number is increased.
+      </p></div><div class="refsect2"><a name="idp52023488"></a><h3>ControlTimeout</h3><p>Default: 60</p><p>
+        This is the default
+        setting for timeout for when sending a control message to either the
+        local or a remote ctdb daemon.
+      </p></div><div class="refsect2"><a name="idp51243376"></a><h3>TraverseTimeout</h3><p>Default: 20</p><p>
+        This setting controls how long we allow a traverse process to run.
+        After this timeout triggers, the main ctdb daemon will abort the
+        traverse if it has not yet finished.
+      </p></div><div class="refsect2"><a name="idp50157008"></a><h3>KeepaliveInterval</h3><p>Default: 5</p><p>
+        How often in seconds should the nodes send keepalives to eachother.
+      </p></div><div class="refsect2"><a name="idp49234000"></a><h3>KeepaliveLimit</h3><p>Default: 5</p><p>
+        After how many keepalive intervals without any traffic should a node
+        wait until marking the peer as DISCONNECTED.
+      </p><p>
+        If a node has hung, it can thus take KeepaliveInterval*(KeepaliveLimit+1)
+        seconds before we determine that the node is DISCONNECTED and that we
+        require a recovery. This limitshould not be set too high since we want
+        a hung node to be detectec, and expunged from the cluster well before
+        common CIFS timeouts (45-90 seconds) kick in.
+      </p></div><div class="refsect2"><a name="idp53887184"></a><h3>RecoverTimeout</h3><p>Default: 20</p><p>
+        This is the default setting for timeouts for controls when sent from the
+        recovery daemon. We allow longer control timeouts from the recovery daemon
+        than from normal use since the recovery dameon often use controls that
+        can take a lot longer than normal controls.
+      </p></div><div class="refsect2"><a name="idp53889072"></a><h3>RecoverInterval</h3><p>Default: 1</p><p>
+        How frequently in seconds should the recovery daemon perform the
+        consistency checks that determine if we need to perform a recovery or not.
+      </p></div><div class="refsect2"><a name="idp53890832"></a><h3>ElectionTimeout</h3><p>Default: 3</p><p>
+        When electing a new recovery master, this is how many seconds we allow
+        the election to take before we either deem the election finished
+        or we fail the election and start a new one.
+      </p></div><div class="refsect2"><a name="idp53892640"></a><h3>TakeoverTimeout</h3><p>Default: 9</p><p>
+        This is how many seconds we allow controls to take for IP failover events.
+      </p></div><div class="refsect2"><a name="idp53894240"></a><h3>MonitorInterval</h3><p>Default: 15</p><p>
+        How often should ctdb run the event scripts to check for a nodes health.
+      </p></div><div class="refsect2"><a name="idp53895840"></a><h3>TickleUpdateInterval</h3><p>Default: 20</p><p>
+        How often will ctdb record and store the "tickle" information used to
+        kickstart stalled tcp connections after a recovery.
+      </p></div><div class="refsect2"><a name="idp53897584"></a><h3>EventScriptTimeout</h3><p>Default: 30</p><p>
+    </p><p>
+      The tunable variables are listed alphabetically.
+    </p><div class="refsect2"><a name="idp51120048"></a><h3>AllowClientDBAttach</h3><p>Default: 1</p><p>
+        When set to 0, clients are not allowed to attach to any databases.
+        This can be used to temporarily block any new processes from
+        attaching to and accessing the databases.  This is mainly used
+        for detaching a volatile database using 'ctdb detach'.
+      </p></div><div class="refsect2"><a name="idp53889776"></a><h3>AllowUnhealthyDBRead</h3><p>Default: 0</p><p>
+        When set to 1, ctdb allows database traverses to read unhealthy
+        databases.  By default, ctdb does not allow reading records from
+        unhealthy databases.
+      </p></div><div class="refsect2"><a name="idp54131312"></a><h3>ControlTimeout</h3><p>Default: 60</p><p>
+        This is the default setting for timeout for when sending a
+        control message to either the local or a remote ctdb daemon.
+      </p></div><div class="refsect2"><a name="idp51364816"></a><h3>DatabaseHashSize</h3><p>Default: 100001</p><p>
+        Number of the hash chains for the local store of the tdbs that
+        ctdb manages.
+      </p></div><div class="refsect2"><a name="idp53157488"></a><h3>DatabaseMaxDead</h3><p>Default: 5</p><p>
+        Maximum number of dead records per hash chain for the tdb databses
+        managed by ctdb.
+      </p></div><div class="refsect2"><a name="idp50010288"></a><h3>DBRecordCountWarn</h3><p>Default: 100000</p><p>
+        When set to non-zero, ctdb will log a warning during recovery if
+        a database has more than this many records. This will produce a
+        warning if a database grows uncontrollably with orphaned records.
+      </p></div><div class="refsect2"><a name="idp49085760"></a><h3>DBRecordSizeWarn</h3><p>Default: 10000000</p><p>
+        When set to non-zero, ctdb will log a warning during recovery
+        if a single record is bigger than this size. This will produce
+        a warning if a database record grows uncontrollably.
+      </p></div><div class="refsect2"><a name="idp49087568"></a><h3>DBSizeWarn</h3><p>Default: 1000000000</p><p>
+        When set to non-zero, ctdb will log a warning during recovery if
+        a database size is bigger than this. This will produce a warning
+        if a database grows uncontrollably.
+      </p></div><div class="refsect2"><a name="idp49089360"></a><h3>DeferredAttachTO</h3><p>Default: 120</p><p>
+        When databases are frozen we do not allow clients to attach to
+        the databases. Instead of returning an error immediately to the
+        client, the attach request from the client is deferred until
+        the database becomes available again at which stage we respond
+        to the client.
+      </p><p>
+        This timeout controls how long we will defer the request from the
+        client before timing it out and returning an error to the client.
+      </p></div><div class="refsect2"><a name="idp54043296"></a><h3>DeterministicIPs</h3><p>Default: 0</p><p>
+        When set to 1, ctdb will try to keep public IP addresses locked
+        to specific nodes as far as possible. This makes it easier
+        for debugging since you can know that as long as all nodes are
+        healthy public IP X will always be hosted by node Y.
+      </p><p>
+        The cost of using deterministic IP address assignment is that it
+        disables part of the logic where ctdb tries to reduce the number
+        of public IP assignment changes in the cluster. This tunable may
+        increase the number of IP failover/failbacks that are performed
+        on the cluster by a small margin.
+      </p></div><div class="refsect2"><a name="idp54045872"></a><h3>DisableIPFailover</h3><p>Default: 0</p><p>
+        When set to non-zero, ctdb will not perform failover or
+        failback. Even if a node fails while holding public IPs, ctdb
+        will not recover the IPs or assign them to another node.
+      </p><p>
+        When this tunable is enabled, ctdb will no longer attempt
+        to recover the cluster by failing IP addresses over to other
+        nodes. This leads to a service outage until the administrator
+        has manually performed IP failover to replacement nodes using the
+        'ctdb moveip' command.
+      </p></div><div class="refsect2"><a name="idp54048368"></a><h3>ElectionTimeout</h3><p>Default: 3</p><p>
+        The number of seconds to wait for the election of recovery
+        master to complete. If the election is not completed during this
+        interval, then that round of election fails and ctdb starts a
+        new election.
+      </p></div><div class="refsect2"><a name="idp54050192"></a><h3>EnableBans</h3><p>Default: 1</p><p>
+        This parameter allows ctdb to ban a node if the node is misbehaving.
+      </p><p>
+        When set to 0, this disables banning completely in the cluster
+        and thus nodes can not get banned, even it they break. Don't
+        set to 0 unless you know what you are doing.  You should set
+        this to the same value on all nodes to avoid unexpected behaviour.
+      </p></div><div class="refsect2"><a name="idp54052448"></a><h3>EventScriptTimeout</h3><p>Default: 30</p><p>
         Maximum time in seconds to allow an event to run before timing
         out.  This is the total time for all enabled scripts that are
 …
         success.  The logic here is that the callers of these events
         implement their own additional timeout.
+      </p></div><div class="refsect2"><a name="idp53900064"></a><h3>MonitorTimeoutCount</h3><p>Default: 20</p><p>
+        How many monitor events in a row need to timeout before a node
+        is flagged as UNHEALTHY.  This setting is useful if scripts
+        can not be written so that they do not hang for benign
+        reasons.
+      </p></div><div class="refsect2"><a name="idp53901872"></a><h3>RecoveryGracePeriod</h3><p>Default: 120</p><p>
+        During recoveries, if a node has not caused recovery failures during the
+        last grace period, any records of transgressions that the node has caused
+        recovery failures will be forgiven. This resets the ban-counter back to
+        zero for that node.
+      </p></div><div class="refsect2"><a name="idp49113200"></a><h3>RecoveryBanPeriod</h3><p>Default: 300</p><p>
+        If a node becomes banned causing repetitive recovery failures. The node will
+        eventually become banned from the cluster.
+        This controls how long the culprit node will be banned from the cluster
+        before it is allowed to try to join the cluster again.
+        Don't set to small. A node gets banned for a reason and it is usually due
+        to real problems with the node.
+      </p></div><div class="refsect2"><a name="idp49115184"></a><h3>DatabaseHashSize</h3><p>Default: 100001</p><p>
+        Size of the hash chains for the local store of the tdbs that ctdb manages.
+      </p></div><div class="refsect2"><a name="idp49116784"></a><h3>DatabaseMaxDead</h3><p>Default: 5</p><p>
+        How many dead records per hashchain in the TDB database do we allow before
+        the freelist needs to be processed.
+      </p></div><div class="refsect2"><a name="idp49118528"></a><h3>RerecoveryTimeout</h3><p>Default: 10</p><p>
+        Once a recovery has completed, no additional recoveries are permitted
+        until this timeout has expired.
+      </p></div><div class="refsect2"><a name="idp49120256"></a><h3>EnableBans</h3><p>Default: 1</p><p>
+        When set to 0, this disables BANNING completely in the cluster and thus
+        nodes can not get banned, even it they break. Don't set to 0 unless you
+        know what you are doing.  You should set this to the same value on
+        all nodes to avoid unexpected behaviour.
+      </p></div><div class="refsect2"><a name="idp49122128"></a><h3>DeterministicIPs</h3><p>Default: 0</p><p>
+        When enabled, this tunable makes ctdb try to keep public IP addresses
+        locked to specific nodes as far as possible. This makes it easier for
+        debugging since you can know that as long as all nodes are healthy
+        public IP X will always be hosted by node Y.
+      </p><p>
+        The cost of using deterministic IP address assignment is that it
+        disables part of the logic where ctdb tries to reduce the number of
+        public IP assignment changes in the cluster. This tunable may increase
+        the number of IP failover/failbacks that are performed on the cluster
+        by a small margin.
+      </p></div><div class="refsect2"><a name="idp49124720"></a><h3>LCP2PublicIPs</h3><p>Default: 1</p><p>
+        When enabled this switches ctdb to use the LCP2 ip allocation
+        algorithm.
+      </p></div><div class="refsect2"><a name="idp49126320"></a><h3>ReclockPingPeriod</h3><p>Default: x</p><p>
+        Obsolete
+      </p></div><div class="refsect2"><a name="idp49127952"></a><h3>NoIPFailback</h3><p>Default: 0</p><p>
+        When set to 1, ctdb will not perform failback of IP addresses when a node
+        becomes healthy. Ctdb WILL perform failover of public IP addresses when a
+        node becomes UNHEALTHY, but when the node becomes HEALTHY again, ctdb
+        will not fail the addresses back.
+      </p><p>
+        Use with caution! Normally when a node becomes available to the cluster
+        ctdb will try to reassign public IP addresses onto the new node as a way
+        to distribute the workload evenly across the clusternode. Ctdb tries to
+        make sure that all running nodes have approximately the same number of
+        public addresses it hosts.
+      </p><p>
+        When you enable this tunable, CTDB will no longer attempt to rebalance
+        the cluster by failing IP addresses back to the new nodes. An unbalanced
+        cluster will therefore remain unbalanced until there is manual
+        intervention from the administrator. When this parameter is set, you can
+        manually fail public IP addresses over to the new node(s) using the
+        'ctdb moveip' command.
+      </p></div><div class="refsect2"><a name="idp49136144"></a><h3>DisableIPFailover</h3><p>Default: 0</p><p>
+        When enabled, ctdb will not perform failover or failback. Even if a
+        node fails while holding public IPs, ctdb will not recover the IPs or
+        assign them to another node.
+      </p><p>
+        When you enable this tunable, CTDB will no longer attempt to recover
+        the cluster by failing IP addresses over to other nodes. This leads to
+        a service outage until the administrator has manually performed failover
+        to replacement nodes using the 'ctdb moveip' command.
+      </p></div><div class="refsect2"><a name="idp49138608"></a><h3>NoIPTakeover</h3><p>Default: 0</p><p>
+        When set to 1, ctdb will not allow IP addresses to be failed over
+        onto this node. Any IP addresses that the node currently hosts
+        will remain on the node but no new IP addresses can be failed over
+        to the node.
+      </p></div><div class="refsect2"><a name="idp49140448"></a><h3>NoIPHostOnAllDisabled</h3><p>Default: 0</p><p>
+        If no nodes are healthy then by default ctdb will happily host
+      </p></div><div class="refsect2"><a name="idp54054880"></a><h3>FetchCollapse</h3><p>Default: 1</p><p>
+       This parameter is used to avoid multiple migration requests for
+       the same record from a single node. All the record requests for
+       the same record are queued up and processed when the record is
+       migrated to the current node.
+      </p><p>
+        When many clients across many nodes try to access the same record
+        at the same time this can lead to a fetch storm where the record
+        becomes very active and bounces between nodes very fast. This
+        leads to high CPU utilization of the ctdbd daemon, trying to
+        bounce that record around very fast, and poor performance.
+        This can improve performance and reduce CPU utilization for
+        certain workloads.
+      </p></div><div class="refsect2"><a name="idp48966640"></a><h3>HopcountMakeSticky</h3><p>Default: 50</p><p>
+        For database(s) marked STICKY (using 'ctdb setdbsticky'),
+        any record that is migrating so fast that hopcount
+        exceeds this limit is marked as STICKY record for
+        <code class="varname">StickyDuration</code> seconds. This means that
+        after each migration the sticky record will be kept on the node
+        <code class="varname">StickyPindown</code>milliseconds and prevented from
+        being migrated off the node.
+       </p><p>
+        This will improve performance for certain workloads, such as
+        locking.tdb if many clients are opening/closing the same file
+        concurrently.
+      </p></div><div class="refsect2"><a name="idp48969952"></a><h3>KeepaliveInterval</h3><p>Default: 5</p><p>
+        How often in seconds should the nodes send keep-alive packets to
+        each other.
+      </p></div><div class="refsect2"><a name="idp48971552"></a><h3>KeepaliveLimit</h3><p>Default: 5</p><p>
+        After how many keepalive intervals without any traffic should
+        a node wait until marking the peer as DISCONNECTED.
+       </p><p>
+        If a node has hung, it can take
+        <code class="varname">KeepaliveInterval</code> *
+        (<code class="varname">KeepaliveLimit</code> + 1) seconds before
+        ctdb determines that the node is DISCONNECTED and performs
+        a recovery. This limit should not be set too high to enable
+        early detection and avoid any application timeouts (e.g. SMB1)
+        to kick in before the fail over is completed.
+      </p></div><div class="refsect2"><a name="idp48974864"></a><h3>LCP2PublicIPs</h3><p>Default: 1</p><p>
+        When set to 1, ctdb uses the LCP2 ip allocation algorithm.
+      </p></div><div class="refsect2"><a name="idp48976464"></a><h3>LockProcessesPerDB</h3><p>Default: 200</p><p>
+        This is the maximum number of lock helper processes ctdb will
+        create for obtaining record locks.  When ctdb cannot get a record
+        lock without blocking, it creates a helper process that waits
+        for the lock to be obtained.
+      </p></div><div class="refsect2"><a name="idp48978304"></a><h3>LogLatencyMs</h3><p>Default: 0</p><p>
+        When set to non-zero, ctdb will log if certains operations
+        take longer than this value, in milliseconds, to complete.
+        These operations include "process a record request from client",
+        "take a record or database lock", "update a persistent database
+        record" and "vaccum a database".
+      </p></div><div class="refsect2"><a name="idp48980208"></a><h3>MaxQueueDropMsg</h3><p>Default: 1000000</p><p>
+        This is the maximum number of messages to be queued up for
+        a client before ctdb will treat the client as hung and will
+        terminate the client connection.
+      </p></div><div class="refsect2"><a name="idp48981984"></a><h3>MonitorInterval</h3><p>Default: 15</p><p>
+        How often should ctdb run the 'monitor' event in seconds to check
+        for a node's health.
+      </p></div><div class="refsect2"><a name="idp48988480"></a><h3>MonitorTimeoutCount</h3><p>Default: 20</p><p>
+        How many 'monitor' events in a row need to timeout before a node
+        is flagged as UNHEALTHY.  This setting is useful if scripts can
+        not be written so that they do not hang for benign reasons.
+      </p></div><div class="refsect2"><a name="idp48990288"></a><h3>NoIPFailback</h3><p>Default: 0</p><p>
+        When set to 1, ctdb will not perform failback of IP addresses
+        when a node becomes healthy. When a node becomes UNHEALTHY,
+        ctdb WILL perform failover of public IP addresses, but when the
+        node becomes HEALTHY again, ctdb will not fail the addresses back.
+      </p><p>
+        Use with caution! Normally when a node becomes available to the
+        cluster ctdb will try to reassign public IP addresses onto the
+        new node as a way to distribute the workload evenly across the
+        clusternode. Ctdb tries to make sure that all running nodes have
+        approximately the same number of public addresses it hosts.
+      </p><p>
+        When you enable this tunable, ctdb will no longer attempt to
+        rebalance the cluster by failing IP addresses back to the new
+        nodes. An unbalanced cluster will therefore remain unbalanced
+        until there is manual intervention from the administrator. When
+        this parameter is set, you can manually fail public IP addresses
+        over to the new node(s) using the 'ctdb moveip' command.
+      </p></div><div class="refsect2"><a name="idp48993680"></a><h3>NoIPHostOnAllDisabled</h3><p>Default: 0</p><p>
+        If no nodes are HEALTHY then by default ctdb will happily host
         public IPs on disabled (unhealthy or administratively disabled)
         nodes.  This can cause problems, for example if the underlying
+        nodes.  This can cause problems, for example if the underlying
         cluster filesystem is not mounted.  When set to 1 on a node and
         that node is disabled it, any IPs hosted by this node will be
+        that node is disabled, any IPs hosted by this node will be
         released and the node will not takeover any IPs until it is no
         longer disabled.
+      </p></div><div class="refsect2"><a name="idp49142480"></a><h3>DBRecordCountWarn</h3><p>Default: 100000</p><p>
+        When set to non-zero, ctdb will log a warning when we try to recover a
+        database with more than this many records. This will produce a warning
+        if a database grows uncontrollably with orphaned records.
+      </p></div><div class="refsect2"><a name="idp49144304"></a><h3>DBRecordSizeWarn</h3><p>Default: 10000000</p><p>
+        When set to non-zero, ctdb will log a warning when we try to recover a
+        database where a single record is bigger than this. This will produce
+        a warning if a database record grows uncontrollably with orphaned
+        sub-records.
+      </p></div><div class="refsect2"><a name="idp49146144"></a><h3>DBSizeWarn</h3><p>Default: 1000000000</p><p>
+        When set to non-zero, ctdb will log a warning when we try to recover a
+        database bigger than this. This will produce
+        a warning if a database grows uncontrollably.
+      </p></div><div class="refsect2"><a name="idp49147936"></a><h3>VerboseMemoryNames</h3><p>Default: 0</p><p>
+        This feature consumes additional memory. when used the talloc library
+        will create more verbose names for all talloc allocated objects.
+      </p></div><div class="refsect2"><a name="idp49149696"></a><h3>RecdPingTimeout</h3><p>Default: 60</p><p>
+        If the main dameon has not heard a "ping" from the recovery dameon for
+        this many seconds, the main dameon will log a message that the recovery
+        daemon is potentially hung.
+      </p></div><div class="refsect2"><a name="idp49151488"></a><h3>RecdFailCount</h3><p>Default: 10</p><p>
+        If the recovery daemon has failed to ping the main dameon for this many
+        consecutive intervals, the main daemon will consider the recovery daemon
+        as hung and will try to restart it to recover.
+      </p></div><div class="refsect2"><a name="idp49153312"></a><h3>LogLatencyMs</h3><p>Default: 0</p><p>
+        When set to non-zero, this will make the main daemon log any operation that
+        took longer than this value, in 'ms', to complete.
+        These include "how long time a lockwait child process needed",
+        "how long time to write to a persistent database" but also
+        "how long did it take to get a response to a CALL from a remote node".
+      </p></div><div class="refsect2"><a name="idp49155264"></a><h3>RecLockLatencyMs</h3><p>Default: 1000</p><p>
+        When using a reclock file for split brain prevention, if set to non-zero
+        this tunable will make the recovery dameon log a message if the fcntl()
+        call to lock/testlock the recovery file takes longer than this number of
+        ms.
+      </p></div><div class="refsect2"><a name="idp49157120"></a><h3>RecoveryDropAllIPs</h3><p>Default: 120</p><p>
+        If we have been stuck in recovery, or stopped, or banned, mode for
+        this many seconds we will force drop all held public addresses.
+      </p></div><div class="refsect2"><a name="idp55021168"></a><h3>VacuumInterval</h3><p>Default: 10</p><p>
+      </p></div><div class="refsect2"><a name="idp48995696"></a><h3>NoIPTakeover</h3><p>Default: 0</p><p>
+        When set to 1, ctdb will not allow IP addresses to be failed
+        over onto this node. Any IP addresses that the node currently
+        hosts will remain on the node but no new IP addresses can be
+        failed over to the node.
+      </p></div><div class="refsect2"><a name="idp48997536"></a><h3>PullDBPreallocation</h3><p>Default: 10*1024*1024</p><p>
+        This is the size of a record buffer to pre-allocate for sending
+        reply to PULLDB control. Usually record buffer starts with size
+        of the first record and gets reallocated every time a new record
+        is added to the record buffer. For a large number of records,
+        this can be very inefficient to grow the record buffer one record
+        at a time.
+      </p></div><div class="refsect2"><a name="idp48999504"></a><h3>RecBufferSizeLimit</h3><p>Default: 1000000</p><p>
+        This is the limit on the size of the record buffer to be sent
+        in various controls.  This limit is used by new controls used
+        for recovery and controls used in vacuuming.
+      </p></div><div class="refsect2"><a name="idp49001328"></a><h3>RecdFailCount</h3><p>Default: 10</p><p>
+        If the recovery daemon has failed to ping the main dameon for
+        this many consecutive intervals, the main daemon will consider
+        the recovery daemon as hung and will try to restart it to recover.
+      </p></div><div class="refsect2"><a name="idp49003152"></a><h3>RecdPingTimeout</h3><p>Default: 60</p><p>
+        If the main dameon has not heard a "ping" from the recovery dameon
+        for this many seconds, the main dameon will log a message that
+        the recovery daemon is potentially hung.  This also increments a
+        counter which is checked against <code class="varname">RecdFailCount</code>
+        for detection of hung recovery daemon.
+      </p></div><div class="refsect2"><a name="idp49005424"></a><h3>RecLockLatencyMs</h3><p>Default: 1000</p><p>
+        When using a reclock file for split brain prevention, if set
+        to non-zero this tunable will make the recovery dameon log a
+        message if the fcntl() call to lock/testlock the recovery file
+        takes longer than this number of milliseconds.
+      </p></div><div class="refsect2"><a name="idp49007280"></a><h3>RecoverInterval</h3><p>Default: 1</p><p>
+        How frequently in seconds should the recovery daemon perform the
+        consistency checks to determine if it should perform a recovery.
+      </p></div><div class="refsect2"><a name="idp49009040"></a><h3>RecoverPDBBySeqNum</h3><p>Default: 1</p><p>
+        When set to zero, database recovery for persistent databases is
+        record-by-record and recovery process simply collects the most
+        recent version of every individual record.
+      </p><p>
+        When set to non-zero, persistent databases will instead be
+        recovered as a whole db and not by individual records. The
+        node that contains the highest value stored in the record
+        "__db_sequence_number__" is selected and the copy of that nodes
+        database is used as the recovered database.
+      </p><p>
+        By default, recovery of persistent databses is done using
+        __db_sequence_number__ record.
+      </p></div><div class="refsect2"><a name="idp54874960"></a><h3>RecoverTimeout</h3><p>Default: 120</p><p>
+        This is the default setting for timeouts for controls when sent
+        from the recovery daemon. We allow longer control timeouts from
+        the recovery daemon than from normal use since the recovery
+        dameon often use controls that can take a lot longer than normal
+        controls.
+      </p></div><div class="refsect2"><a name="idp54876784"></a><h3>RecoveryBanPeriod</h3><p>Default: 300</p><p>
+       The duration in seconds for which a node is banned if the node
+       fails during recovery.  After this time has elapsed the node will
+       automatically get unbanned and will attempt to rejoin the cluster.
+      </p><p>
+       A node usually gets banned due to real problems with the node.
+       Don't set this value too small.  Otherwise, a problematic node
+       will try to re-join cluster too soon causing unnecessary recoveries.
+      </p></div><div class="refsect2"><a name="idp54879184"></a><h3>RecoveryDropAllIPs</h3><p>Default: 120</p><p>
+        If a node is stuck in recovery, or stopped, or banned, for this
+        many seconds, then ctdb will release all public addresses on
+        that node.
+      </p></div><div class="refsect2"><a name="idp54880880"></a><h3>RecoveryGracePeriod</h3><p>Default: 120</p><p>
+       During recoveries, if a node has not caused recovery failures
+       during the last grace period in seconds, any records of
+       transgressions that the node has caused recovery failures will be
+       forgiven. This resets the ban-counter back to zero for that node.
+      </p></div><div class="refsect2"><a name="idp54882720"></a><h3>RepackLimit</h3><p>Default: 10000</p><p>
+        During vacuuming, if the number of freelist records are more than
+        <code class="varname">RepackLimit</code>, then the database is repacked
+        to get rid of the freelist records to avoid fragmentation.
+      </p><p>
+        Databases are repacked only if both <code class="varname">RepackLimit</code>
+        and <code class="varname">VacuumLimit</code> are exceeded.
+      </p></div><div class="refsect2"><a name="idp54885920"></a><h3>RerecoveryTimeout</h3><p>Default: 10</p><p>
+        Once a recovery has completed, no additional recoveries are
+        permitted until this timeout in seconds has expired.
+      </p></div><div class="refsect2"><a name="idp54887600"></a><h3>Samba3AvoidDeadlocks</h3><p>Default: 0</p><p>
+        If set to non-zero, enable code that prevents deadlocks with Samba
+        (only for Samba 3.x).
+      </p><p>
+        This should be set to 1 only when using Samba version 3.x
+        to enable special code in ctdb to avoid deadlock with Samba
+        version 3.x.  This code is not required for Samba version 4.x
+        and must not be enabled for Samba 4.x.
+      </p></div><div class="refsect2"><a name="idp54889888"></a><h3>SeqnumInterval</h3><p>Default: 1000</p><p>
+        Some databases have seqnum tracking enabled, so that samba will
+        be able to detect asynchronously when there has been updates
+        to the database.  Everytime a database is updated its sequence
+        number is increased.
+      </p><p>
+        This tunable is used to specify in milliseconds how frequently
+        ctdb will send out updates to remote nodes to inform them that
+        the sequence number is increased.
+      </p></div><div class="refsect2"><a name="idp54892240"></a><h3>StatHistoryInterval</h3><p>Default: 1</p><p>
+        Granularity of the statistics collected in the statistics
+        history. This is reported by 'ctdb stats' command.
+      </p></div><div class="refsect2"><a name="idp54893904"></a><h3>StickyDuration</h3><p>Default: 600</p><p>
+        Once a record has been marked STICKY, this is the duration in
+        seconds, the record will be flagged as a STICKY record.
+      </p></div><div class="refsect2"><a name="idp54895584"></a><h3>StickyPindown</h3><p>Default: 200</p><p>
+        Once a STICKY record has been migrated onto a node, it will be
+        pinned down on that node for this number of milliseconds. Any
+        request from other nodes to migrate the record off the node will
+        be deferred.
+      </p></div><div class="refsect2"><a name="idp54897344"></a><h3>TakeoverTimeout</h3><p>Default: 9</p><p>
+        This is the duration in seconds in which ctdb tries to complete IP
+        failover.
+      </p></div><div class="refsect2"><a name="idp54898880"></a><h3>TDBMutexEnabled</h3><p>Default: 0</p><p>
+        This paramter enables TDB_MUTEX_LOCKING feature on volatile
+        databases if the robust mutexes are supported. This optimizes the
+        record locking using robust mutexes and is much more efficient
+        that using posix locks.
+      </p></div><div class="refsect2"><a name="idp54900656"></a><h3>TickleUpdateInterval</h3><p>Default: 20</p><p>
+        Every <code class="varname">TickleUpdateInterval</code> seconds, ctdb
+        synchronizes the client connection information across nodes.
+      </p></div><div class="refsect2"><a name="idp54902576"></a><h3>TraverseTimeout</h3><p>Default: 20</p><p>
+        This is the duration in seconds for which a database traverse
+        is allowed to run.  If the traverse does not complete during
+        this interval, ctdb will abort the traverse.
+      </p></div><div class="refsect2"><a name="idp54904304"></a><h3>VacuumFastPathCount</h3><p>Default: 60</p><p>
+       During a vacuuming run, ctdb usually processes only the records
+       marked for deletion also called the fast path vacuuming. After
+       finishing <code class="varname">VacuumFastPathCount</code> number of fast
+       path vacuuming runs, ctdb will trigger a scan of complete database
+       for any empty records that need to be deleted.
+      </p></div><div class="refsect2"><a name="idp54906560"></a><h3>VacuumInterval</h3><p>Default: 10</p><p>
         Periodic interval in seconds when vacuuming is triggered for
         volatile databases.
+      </p></div><div class="refsect2"><a name="idp55022832"></a><h3>VacuumMaxRunTime</h3><p>Default: 120</p><p>
+      </p></div><div class="refsect2"><a name="idp54908224"></a><h3>VacuumLimit</h3><p>Default: 5000</p><p>
+        During vacuuming, if the number of deleted records are more than
+        <code class="varname">VacuumLimit</code>, then databases are repacked to
+        avoid fragmentation.
+      </p><p>
+        Databases are repacked only if both <code class="varname">RepackLimit</code>
+        and <code class="varname">VacuumLimit</code> are exceeded.
+      </p></div><div class="refsect2"><a name="idp54911392"></a><h3>VacuumMaxRunTime</h3><p>Default: 120</p><p>
         The maximum time in seconds for which the vacuuming process is
         allowed to run.  If vacuuming process takes longer than this
         value, then the vacuuming process is terminated.
+      </p></div><div class="refsect2"><a name="idp55024592"></a><h3>RepackLimit</h3><p>Default: 10000</p><p>
+        During vacuuming, if the number of freelist records are more
+        than <code class="varname">RepackLimit</code>, then databases are
+        repacked to get rid of the freelist records to avoid
+        fragmentation.
+      </p><p>
+        Databases are repacked only if both
+        <code class="varname">RepackLimit</code> and
+        <code class="varname">VacuumLimit</code> are exceeded.
+      </p></div><div class="refsect2"><a name="idp55027792"></a><h3>VacuumLimit</h3><p>Default: 5000</p><p>
+        During vacuuming, if the number of deleted records are more
+        than <code class="varname">VacuumLimit</code>, then databases are
+        repacked to avoid fragmentation.
+      </p><p>
+        Databases are repacked only if both
+        <code class="varname">RepackLimit</code> and
+        <code class="varname">VacuumLimit</code> are exceeded.
+      </p></div><div class="refsect2"><a name="idp55030864"></a><h3>VacuumFastPathCount</h3><p>Default: 60</p><p>
+        When a record is deleted, it is marked for deletion during
+        vacuuming.  Vacuuming process usually processes this list to purge
+        the records from the database.  If the number of records marked
+        for deletion are more than VacuumFastPathCount, then vacuuming
+        process will scan the complete database for empty records instead
+        of using the list of records marked for deletion.
+      </p></div><div class="refsect2"><a name="idp55032832"></a><h3>DeferredAttachTO</h3><p>Default: 120</p><p>
+        When databases are frozen we do not allow clients to attach to the
+        databases. Instead of returning an error immediately to the application
+        the attach request from the client is deferred until the database
+        becomes available again at which stage we respond to the client.
+      </p><p>
+        This timeout controls how long we will defer the request from the client
+        before timing it out and returning an error to the client.
+      </p></div><div class="refsect2"><a name="idp55035216"></a><h3>HopcountMakeSticky</h3><p>Default: 50</p><p>
+        If the database is set to 'STICKY' mode, using the 'ctdb setdbsticky'
+        command, any record that is seen as very hot and migrating so fast that
+        hopcount surpasses 50 is set to become a STICKY record for StickyDuration
+        seconds. This means that after each migration the record will be kept on
+        the node and prevented from being migrated off the node.
+      </p><p>
+        This setting allows one to try to identify such records and stop them from
+        migrating across the cluster so fast. This will improve performance for
+        certain workloads, such as locking.tdb if many clients are opening/closing
+        the same file concurrently.
+      </p></div><div class="refsect2"><a name="idp55037776"></a><h3>StickyDuration</h3><p>Default: 600</p><p>
+        Once a record has been found to be fetch-lock hot and has been flagged to
+        become STICKY, this is for how long, in seconds, the record will be
+        flagged as a STICKY record.
+      </p></div><div class="refsect2"><a name="idp55039504"></a><h3>StickyPindown</h3><p>Default: 200</p><p>
+        Once a STICKY record has been migrated onto a node, it will be pinned down
+        on that node for this number of ms. Any request from other nodes to migrate
+        the record off the node will be deferred until the pindown timer expires.
+      </p></div><div class="refsect2"><a name="idp55041296"></a><h3>StatHistoryInterval</h3><p>Default: 1</p><p>
+        Granularity of the statistics collected in the statistics history.
+      </p></div><div class="refsect2"><a name="idp55042928"></a><h3>AllowClientDBAttach</h3><p>Default: 1</p><p>
+        When set to 0, clients are not allowed to attach to any databases.
+        This can be used to temporarily block any new processes from attaching
+        to and accessing the databases.
+      </p></div><div class="refsect2"><a name="idp55044656"></a><h3>RecoverPDBBySeqNum</h3><p>Default: 1</p><p>
+        When set to zero, database recovery for persistent databases
+        is record-by-record and recovery process simply collects the
+        most recent version of every individual record.
+      </p><p>
+        When set to non-zero, persistent databases will instead be
+        recovered as a whole db and not by individual records. The
+        node that contains the highest value stored in the record
+        "__db_sequence_number__" is selected and the copy of that
+        nodes database is used as the recovered database.
+      </p><p>
+        By default, recovery of persistent databses is done using
+        __db_sequence_number__ record.
+      </p></div><div class="refsect2"><a name="idp55047584"></a><h3>FetchCollapse</h3><p>Default: 1</p><p>
+        When many clients across many nodes try to access the same record at the
+        same time this can lead to a fetch storm where the record becomes very
+        active and bounces between nodes very fast. This leads to high CPU
+        utilization of the ctdbd daemon, trying to bounce that record around
+        very fast, and poor performance.
+      </p><p>
+        This parameter is used to activate a fetch-collapse. A fetch-collapse
+        is when we track which records we have requests in flight so that we only
+        keep one request in flight from a certain node, even if multiple smbd
+        processes are attemtping to fetch the record at the same time. This
+        can improve performance and reduce CPU utilization for certain
+        workloads.
+      </p><p>
+        This timeout controls if we should collapse multiple fetch operations
+        of the same record into a single request and defer all duplicates or not.
+      </p></div><div class="refsect2"><a name="idp55050784"></a><h3>Samba3AvoidDeadlocks</h3><p>Default: 0</p><p>
+        Enable code that prevents deadlocks with Samba (only for Samba 3.x).
+      </p><p>
+        This should be set to 1 when using Samba version 3.x to enable special
+        code in CTDB to avoid deadlock with Samba version 3.x.  This code
+        is not required for Samba version 4.x and must not be enabled for
+        Samba 4.x.
+      </p></div></div><div class="refsect1"><a name="idp55053168"></a><h2>SEE ALSO</h2><p>
+      </p></div><div class="refsect2"><a name="idp54913152"></a><h3>VerboseMemoryNames</h3><p>Default: 0</p><p>
+        When set to non-zero, ctdb assigns verbose names for some of
+        the talloc allocated memory objects.  These names are visible
+        in the talloc memory report generated by 'ctdb dumpmemory'.
+      </p></div></div><div class="refsect1"><a name="idp54915024"></a><h2>SEE ALSO</h2><p>
       <span class="citerefentry"><span class="refentrytitle">ctdb</span>(1)</span>,

Note: See TracChangeset for help on using the changeset viewer.

Context Navigation

Changeset 989 for vendor/current/ctdb/doc/ctdb-tunables.7.html

Legend:

vendor/current/ctdb/doc/ctdb-tunables.7.html

Download in other formats: