Utility & Helper Methods
************************

class curator.utils.TimestringSearch(timestring)

   An object to allow repetitive search against a string, *searchme*,
   without having to repeatedly recreate the regex.

   Parameters:
      **timestring** -- A strftime pattern

   get_epoch(searchme)

      Return the epoch timestamp extracted from the date in *searchme*
      that matches *timestring*.

      Parameters:
         **searchme** -- A string to be searched for a date pattern
         that matches *timestring*

      Return type:
         int
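   The idea can be sketched with the standard library alone. This is a
   hypothetical, simplified re-implementation for illustration only; the
   real class supports the full set of strftime directives Curator
   recognizes:

```python
import calendar
import re
import time

# Hypothetical, simplified re-implementation for illustration only.
class MiniTimestringSearch:
    """Compile the regex for a strftime pattern once and reuse it."""

    DIRECTIVES = {'%Y': r'\d{4}', '%y': r'\d{2}', '%m': r'\d{2}',
                  '%d': r'\d{2}', '%H': r'\d{2}', '%M': r'\d{2}',
                  '%S': r'\d{2}', '%j': r'\d{3}'}

    def __init__(self, timestring):
        self.timestring = timestring
        regex = re.escape(timestring)
        for directive, pattern in self.DIRECTIVES.items():
            regex = regex.replace(re.escape(directive), pattern)
        self.regex = re.compile(regex)

    def get_epoch(self, searchme):
        match = self.regex.search(searchme)
        if not match:
            return None
        # Parse the matched fragment as UTC and convert to epoch seconds.
        return calendar.timegm(time.strptime(match.group(0), self.timestring))

ts = MiniTimestringSearch('%Y.%m.%d')
epoch = ts.get_epoch('logstash-2017.04.01')
```

   Compiling the regex in "__init__" is what makes repeated searches
   cheap: only the "search" call is paid per string.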

curator.utils.absolute_date_range(unit, date_from, date_to, date_from_format=None, date_to_format=None)

   Get the epoch start time and end time of a range of "unit"s bounded
   by "date_from" and "date_to".

   Parameters:
      * **unit** -- One of "hours", "days", "weeks", "months", or
        "years".

      * **date_from** -- The simplified date for the start of the
        range

      * **date_to** -- The simplified date for the end of the range.
        If this value is the same as "date_from", the full value of
        "unit" will be extrapolated for the range.  For example, if
        "unit" is "months", and "date_from" and "date_to" are both
        "2017.01", then the entire month of January 2017 will be the
        absolute date range.

      * **date_from_format** -- The strftime string used to parse
        "date_from"

      * **date_to_format** -- The strftime string used to parse
        "date_to"

   Return type:
      tuple
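   The "months" example above can be reproduced with stdlib date
   arithmetic. This is a sketch of the equivalent calculation, not the
   function itself:

```python
import calendar
from datetime import datetime, timezone

# 'months' case: date_from and date_to both '2017.01' yields all of
# January 2017 as the absolute date range.
date_from = datetime.strptime('2017.01', '%Y.%m').replace(tzinfo=timezone.utc)
days = calendar.monthrange(date_from.year, date_from.month)[1]
start_epoch = int(date_from.timestamp())
end_epoch = start_epoch + days * 86400 - 1  # last second of the month
```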

curator.utils.byte_size(num, suffix='B')

   Return a formatted string indicating the size in bytes, with the
   proper unit, e.g. KB, MB, GB, TB, etc.

   Parameters:
      * **num** -- The number of bytes

      * **suffix** -- An arbitrary suffix, like *Bytes*

   Return type:
      str
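   A sketch of the documented formatting behavior; note the result is a
   string, not a number:

```python
def byte_size(num, suffix='B'):
    """Render a byte count with a human-readable unit prefix (sketch)."""
    for unit in ['', 'K', 'M', 'G', 'T', 'P', 'E', 'Z']:
        if abs(num) < 1024.0:
            return '%3.1f%s%s' % (num, unit, suffix)
        num /= 1024.0
    return '%.1f%s%s' % (num, 'Y', suffix)

print(byte_size(1234567890))  # 1.1GB
```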

curator.utils.check_csv(value)

   Some of the curator methods should not operate against multiple
   indices at once.  This method can be used to check if a list or csv
   has been sent.

   Parameters:
      **value** -- The value to test, if list or csv string

   Return type:
      bool

curator.utils.check_master(client, master_only=False)

   Check if connected client is the elected master node of the
   cluster. If not, cleanly exit with a log message.

   Parameters:
      **client** -- An "elasticsearch.Elasticsearch" client object

   Return type:
      None

curator.utils.check_version(client)

   Verify version is within acceptable range.  Raise an exception if
   it is not.

   Parameters:
      **client** -- An "elasticsearch.Elasticsearch" client object

   Return type:
      None

curator.utils.chunk_index_list(indices)

   This utility chunks very large index lists into 3KB chunks.  It
   measures the size as a csv string, then converts each chunk back
   into a list for the return value.

   Parameters:
      **indices** -- A list of indices to act on.

   Return type:
      list
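   A sketch of the chunking logic, assuming the 3KB limit applies to the
   csv form of each chunk (long index lists become URLs, which can be
   rejected for excessive length):

```python
def chunk_index_list(indices):
    """Sketch: split an index list into chunks whose csv form stays
    near the 3KB limit described above."""
    chunks, chunk = [], ''
    for index in indices:
        if len(chunk) < 3072:
            chunk = chunk + ',' + index if chunk else index
        else:
            chunks.append(chunk.split(','))
            chunk = index
    chunks.append(chunk.split(','))
    return chunks
```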

curator.utils.create_repo_body(repo_type=None, compress=True, chunk_size=None, max_restore_bytes_per_sec=None, max_snapshot_bytes_per_sec=None, location=None, bucket=None, region=None, base_path=None, access_key=None, secret_key=None, **kwargs)

   Build the 'body' portion for use in creating a repository.

   Parameters:
      * **repo_type** -- The type of repository (presently only *fs*
        and *s3*)

      * **compress** -- Turn on compression of the snapshot files.
        Compression is applied only to metadata files (index mapping
        and settings). Data files are not compressed. (Default:
        *True*)

      * **chunk_size** -- The chunk size can be specified in bytes or
        by using size value notation, e.g. 1g, 10m, 5k. Defaults to
        *null* (unlimited chunk size).

      * **max_restore_bytes_per_sec** -- Throttles per node restore
        rate. Defaults to "20mb" per second.

      * **max_snapshot_bytes_per_sec** -- Throttles per node snapshot
        rate. Defaults to "20mb" per second.

      * **location** -- Location of the snapshots. Required.

      * **bucket** -- *S3 only.* The name of the bucket to be used for
        snapshots. Required.

      * **region** -- *S3 only.* The region where bucket is located.
        Defaults to *US Standard*

      * **base_path** -- *S3 only.* Specifies the path within bucket
        to repository data. Defaults to value of
        "repositories.s3.base_path" or to root directory if not set.

      * **access_key** -- *S3 only.* The access key to use for
        authentication. Defaults to value of "cloud.aws.access_key".

      * **secret_key** -- *S3 only.* The secret key to use for
        authentication. Defaults to value of "cloud.aws.secret_key".

   Returns:
      A dictionary suitable for creating a repository from the
      provided arguments.

   Return type:
      dict

curator.utils.create_repository(client, **kwargs)

   Create repository with repository and body settings

   Parameters:
      * **client** -- An "elasticsearch.Elasticsearch" client object

      * **repository** -- The Elasticsearch snapshot repository to use

      * **repo_type** -- The type of repository (presently only *fs*
        and *s3*)

      * **compress** -- Turn on compression of the snapshot files.
        Compression is applied only to metadata files (index mapping
        and settings). Data files are not compressed. (Default:
        *True*)

      * **chunk_size** -- The chunk size can be specified in bytes or
        by using size value notation, e.g. 1g, 10m, 5k. Defaults to
        *null* (unlimited chunk size).

      * **max_restore_bytes_per_sec** -- Throttles per node restore
        rate. Defaults to "20mb" per second.

      * **max_snapshot_bytes_per_sec** -- Throttles per node snapshot
        rate. Defaults to "20mb" per second.

      * **location** -- Location of the snapshots. Required.

      * **bucket** -- *S3 only.* The name of the bucket to be used for
        snapshots. Required.

      * **region** -- *S3 only.* The region where bucket is located.
        Defaults to *US Standard*

      * **base_path** -- *S3 only.* Specifies the path within bucket
        to repository data. Defaults to value of
        "repositories.s3.base_path" or to root directory if not set.

      * **access_key** -- *S3 only.* The access key to use for
        authentication. Defaults to value of "cloud.aws.access_key".

      * **secret_key** -- *S3 only.* The secret key to use for
        authentication. Defaults to value of "cloud.aws.secret_key".

      * **skip_repo_fs_check** -- Skip verifying the repo after
        creation.

   Returns:
      A boolean value indicating success or failure.

   Return type:
      bool

curator.utils.create_snapshot_body(indices, ignore_unavailable=False, include_global_state=True, partial=False)

   Create the request body for creating a snapshot from the provided
   arguments.

   Parameters:
      * **indices** -- A single index, or list of indices to snapshot.

      * **ignore_unavailable** (*bool*) -- Ignore unavailable
        shards/indices. (default: *False*)

      * **include_global_state** (*bool*) -- Store cluster global
        state with snapshot. (default: *True*)

      * **partial** (*bool*) -- Do not fail if primary shard is
        unavailable. (default: *False*)

   Return type:
      dict
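   A sketch of the body this builds, assuming a list of indices is
   joined into the comma-separated form the snapshot API expects:

```python
def create_snapshot_body(indices, ignore_unavailable=False,
                         include_global_state=True, partial=False):
    """Sketch: build the snapshot request body from the arguments."""
    if isinstance(indices, list):
        indices = ','.join(indices)
    return {
        'indices': indices,
        'ignore_unavailable': ignore_unavailable,
        'include_global_state': include_global_state,
        'partial': partial,
    }

body = create_snapshot_body(['index-1', 'index-2'], partial=True)
```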

curator.utils.date_range(unit, range_from, range_to, epoch=None, week_starts_on='sunday')

   Get the epoch start time and end time of a range of "unit"s,
   reckoning the start of the week (if that's the selected unit) based
   on "week_starts_on", which can be either "sunday" or "monday".

   Parameters:
      * **unit** -- One of "hours", "days", "weeks", "months", or
        "years".

      * **range_from** -- How many "unit" (s) in the past/future is
        the origin?

      * **range_to** -- How many "unit" (s) in the past/future is the
        end point?

      * **epoch** -- An epoch timestamp used to establish a point of
        reference for calculations.

      * **week_starts_on** -- Either "sunday" or "monday". Default is
        "sunday"

   Return type:
      tuple

curator.utils.ensure_list(indices)

   Return a list, even if indices is a single value

   Parameters:
      **indices** -- A list of indices to act upon

   Return type:
      list
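   A sketch of the documented behavior:

```python
def ensure_list(indices):
    """Sketch: wrap a lone value in a list; pass lists through unchanged."""
    return indices if isinstance(indices, list) else [indices]
```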

curator.utils.find_snapshot_tasks(client)

   Check if there is snapshot activity in the Tasks API. Return *True*
   if activity is found, or *False*

   Parameters:
      **client** -- An "elasticsearch.Elasticsearch" client object

   Return type:
      bool

curator.utils.fix_epoch(epoch)

   Fix the value of *epoch* to be a whole-second epoch timestamp, which
   should be 10 or fewer digits long.

   Parameters:
      **epoch** -- An epoch timestamp, in seconds, milliseconds,
      microseconds, or even nanoseconds.

   Return type:
      int
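   A sketch of the normalization, assuming the extra precision always
   arrives in factors of 1000 (the real function decides by digit
   count):

```python
def fix_epoch(epoch):
    """Sketch: reduce a milli/micro/nanosecond epoch down to whole
    seconds (10 or fewer digits)."""
    epoch = int(epoch)
    while epoch > 9999999999:  # more than 10 digits: finer than seconds
        epoch //= 1000
    return epoch
```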

curator.utils.get_client(**kwargs)

   NOTE: AWS IAM parameters *aws_sign_request* and *aws_region* are
      provided to facilitate request signing. The credentials will be
      fetched from the local environment as per the AWS documentation:
      http://amzn.to/2fRCGCt

   AWS IAM parameters *aws_key*, *aws_secret_key*, and *aws_region*
   are provided for users that still have their keys included in the
   Curator config file.

   Return an "elasticsearch.Elasticsearch" client object using the
   provided parameters. Any of the keyword arguments the
   "elasticsearch.Elasticsearch" client object can receive are valid,
   such as:

   Parameters:
      * **hosts** (*list*) -- A list of one or more Elasticsearch
        client hostnames or IP addresses to connect to.  Can send a
        single host.

      * **port** (*int*) -- The Elasticsearch client port to connect
        to.

      * **url_prefix** (*str*) -- *Optional* url prefix, if needed to
        reach the Elasticsearch API (i.e., it's not at the root level)

      * **use_ssl** (*bool*) -- Whether to connect to the client via
        SSL/TLS

      * **certificate** -- Path to SSL/TLS certificate

      * **client_cert** -- Path to SSL/TLS client certificate (public
        key)

      * **client_key** -- Path to SSL/TLS private key

      * **aws_key** -- AWS IAM Access Key (Only used if the "requests-
        aws4auth" python module is installed)

      * **aws_secret_key** -- AWS IAM Secret Access Key (Only used if
        the "requests-aws4auth" python module is installed)

      * **aws_region** -- AWS Region (Only used if the "requests-
        aws4auth" python module is installed)

      * **aws_sign_request** -- Sign requests to AWS (Only used if the
        "requests-aws4auth" and "boto3" python modules are installed)

      * **ssl_no_validate** (*bool*) -- If *True*, do not validate the
        certificate chain.  This is an insecure option and you will
        see warnings in the log output.

      * **http_auth** (*str*) -- Authentication credentials in
        *user:pass* format.

      * **timeout** (*int*) -- Number of seconds before the client
        will timeout.

      * **master_only** (*bool*) -- If *True*, the client will *only*
        connect if the endpoint is the elected master node of the
        cluster.  **This option does not work if `hosts` has more than
        one value.**  It will raise an Exception in that case.

      * **skip_version_test** -- If *True*, skip the version check as
        part of the client connection.

   Return type:
      "elasticsearch.Elasticsearch"

curator.utils.get_date_regex(timestring)

   Return a regex string based on a provided strftime timestring.

   Parameters:
      **timestring** -- A strftime pattern

   Return type:
      str
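   A sketch of the translation, using an illustrative mapping of the
   strftime identifiers Curator recognizes to regex fragments:

```python
import re

# Illustrative mapping of strftime identifiers to regex fragments.
DATE_REGEX = {'Y': r'\d{4}', 'y': r'\d{2}', 'm': r'\d{2}', 'W': r'\d{2}',
              'd': r'\d{2}', 'H': r'\d{2}', 'M': r'\d{2}', 'S': r'\d{2}',
              'j': r'\d{3}'}

def get_date_regex(timestring):
    """Translate a strftime pattern into a regex string (sketch)."""
    regex, prev = '', ''
    for char in timestring:
        if prev == '%':
            regex += DATE_REGEX.get(char, re.escape(char))
        elif char != '%':
            regex += re.escape(char)
        prev = char
    return regex
```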

curator.utils.get_datemath(client, datemath, random_element=None)

   Return the parsed index name from "datemath"

curator.utils.get_datetime(index_timestamp, timestring)

   Return the datetime extracted from the index name, which is the
   index creation time.

   Parameters:
      * **index_timestamp** -- The timestamp extracted from an index
        name

      * **timestring** -- A strftime pattern

   Return type:
      "datetime.datetime"

curator.utils.get_indices(client)

   Get the current list of indices from the cluster.

   Parameters:
      **client** -- An "elasticsearch.Elasticsearch" client object

   Return type:
      list

curator.utils.get_point_of_reference(unit, count, epoch=None)

   Get a point-of-reference timestamp in epoch + milliseconds by
   deriving from a *unit* and a *count*, and an optional reference
   timestamp, *epoch*.

   Parameters:
      * **unit** -- One of "seconds", "minutes", "hours", "days",
        "weeks", "months", or "years".

      * **count** -- The number of "unit"s. "count" * "unit" will be
        calculated out to the relative number of seconds.

      * **epoch** -- An epoch timestamp used in conjunction with
        "unit" and "unit_count" to establish a point of reference for
        calculations.

   Return type:
      int
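   A sketch of the calculation; treating months as 30 days and years as
   365 days is an assumption of this sketch:

```python
import time

def get_point_of_reference(unit, count, epoch=None):
    """Sketch: subtract count * unit-in-seconds from the reference
    epoch (months/years approximated as 30/365 days)."""
    seconds_per = {'seconds': 1, 'minutes': 60, 'hours': 3600,
                   'days': 86400, 'weeks': 604800,
                   'months': 2592000, 'years': 31536000}
    if epoch is None:
        epoch = int(time.time())
    return epoch - count * seconds_per[unit]
```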

curator.utils.get_repository(client, repository='')

   Return configuration information for the indicated repository.

   Parameters:
      * **client** -- An "elasticsearch.Elasticsearch" client object

      * **repository** -- The Elasticsearch snapshot repository to use

   Return type:
      dict

curator.utils.get_snapshot(client, repository=None, snapshot='')

   Return information about a snapshot (or a comma-separated list of
   snapshots).  If no snapshot is specified, it will return all
   snapshots.  If none exist, an empty dictionary will be returned.

   Parameters:
      * **client** -- An "elasticsearch.Elasticsearch" client object

      * **repository** -- The Elasticsearch snapshot repository to use

      * **snapshot** -- The snapshot name, or a comma-separated list
        of snapshots

   Return type:
      dict

curator.utils.get_snapshot_data(client, repository=None)

   Get "_all" snapshots from repository and return a list.

   Parameters:
      * **client** -- An "elasticsearch.Elasticsearch" client object

      * **repository** -- The Elasticsearch snapshot repository to use

   Return type:
      list

curator.utils.get_version(client)

   Return the ES version number as a tuple, omitting trailing tags like
   "-dev" or "Beta".

   Parameters:
      **client** -- An "elasticsearch.Elasticsearch" client object

   Return type:
      tuple

curator.utils.get_yaml(path)

   Read the file identified by *path* and import its YAML contents.

   Parameters:
      **path** -- The path to a YAML configuration file.

   Return type:
      dict

curator.utils.health_check(client, **kwargs)

   This function calls client.cluster.health and, based on the args
   provided, will return *True* or *False* depending on whether that
   particular keyword appears in the output, and has the expected
   value. If multiple keys are provided, all must match for a *True*
   response.

   Parameters:
      **client** -- An "elasticsearch.Elasticsearch" client object

curator.utils.is_master_node(client)

   Return *True* if the connected client node is the elected master
   node in the Elasticsearch cluster, otherwise return *False*.

   Parameters:
      **client** -- An "elasticsearch.Elasticsearch" client object

   Return type:
      bool

curator.utils.name_to_node_id(client, name)

   Return the node_id of the node identified by "name"

   Parameters:
      **client** -- An "elasticsearch.Elasticsearch" client object

   Return type:
      str

curator.utils.node_id_to_name(client, node_id)

   Return the name of the node identified by "node_id"

   Parameters:
      **client** -- An "elasticsearch.Elasticsearch" client object

   Return type:
      str

curator.utils.node_roles(client, node_id)

   Return the list of roles assigned to the node identified by
   "node_id"

   Parameters:
      **client** -- An "elasticsearch.Elasticsearch" client object

   Return type:
      list

curator.utils.parse_date_pattern(name)

   Scan and parse *name* for "time.strftime()" strings, replacing them
   with the associated value when found, but otherwise returning
   lowercase values, as uppercase snapshot names are not allowed. It
   will detect if the first character is a *<*, which would indicate
   *name* is going to be using Elasticsearch date math syntax, and
   skip accordingly.

   The "time.strftime()" identifiers that Curator currently recognizes
   as acceptable include:

   * "Y": A 4 digit year

   * "y": A 2 digit year

   * "m": The 2 digit month

   * "W": The 2 digit week of the year

   * "d": The 2 digit day of the month

   * "H": The 2 digit hour of the day, in 24 hour notation

   * "M": The 2 digit minute of the hour

   * "S": The 2 digit second of the minute

   * "j": The 3 digit day of the year

   Parameters:
      **name** -- A name, which can contain "time.strftime()" strings
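   A sketch of the documented behavior.  The "epoch" parameter here is
   an illustrative addition for reproducibility; the real function uses
   the current time:

```python
import time

def parse_date_pattern(name, epoch=None):
    """Sketch: substitute supported strftime directives, lowercase
    everything else, and pass Elasticsearch date math ('<...>')
    through untouched."""
    if name.startswith('<'):
        return name
    now = time.gmtime(epoch)  # epoch=None means "current time"
    result, i = '', 0
    while i < len(name):
        if name[i] == '%' and i + 1 < len(name) and name[i + 1] in 'YymWdHMSj':
            result += time.strftime(name[i:i + 2], now)
            i += 2
        else:
            result += name[i].lower()
            i += 1
    return result
```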

curator.utils.parse_datemath(client, value)

   Check if "value" is datemath. Parse it if it is. Return the bare
   value otherwise.

curator.utils.prune_nones(mydict)

   Remove keys from *mydict* whose values are *None*

   Parameters:
      **mydict** -- The dictionary to act on

   Return type:
      dict
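   A sketch of the documented behavior; note that falsy-but-not-None
   values such as 0 and "" are kept:

```python
def prune_nones(mydict):
    """Sketch: drop every key whose value is None."""
    return {key: value for key, value in mydict.items() if value is not None}
```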

curator.utils.read_file(myfile)

   Read a file and return the resulting data.

   Parameters:
      **myfile** -- A file to read.

   Return type:
      str

curator.utils.relocate_check(client, index)

   This function calls client.cluster.state with a given index to
   check if all of the shards for that index are in the STARTED state.
   It will return *True* if all shards both primary and replica are in
   the STARTED state, and it will return *False* if any primary or
   replica shard is in a different state.

   Parameters:
      * **client** -- An "elasticsearch.Elasticsearch" client object

      * **index** -- The index whose shard states will be checked.

curator.utils.report_failure(exception)

   Raise an *exceptions.FailedExecution* exception and include the
   original error message.

   Parameters:
      **exception** -- The upstream exception.

   Return type:
      None

curator.utils.repository_exists(client, repository=None)

   Verify the existence of a repository

   Parameters:
      * **client** -- An "elasticsearch.Elasticsearch" client object

      * **repository** -- The Elasticsearch snapshot repository to use

   Return type:
      bool

curator.utils.restore_check(client, index_list)

   This function calls client.indices.recovery with the list of
   indices to check for complete recovery.  It will return *True* if
   recovery of those indices is complete, and *False* otherwise.  It
   is designed to fail fast: if a single shard is encountered that is
   still recovering (not in *DONE* stage), it will immediately return
   *False*, rather than complete iterating over the rest of the
   response.

   Parameters:
      * **client** -- An "elasticsearch.Elasticsearch" client object

      * **index_list** -- The list of indices to verify having been
        restored.

curator.utils.rollable_alias(client, alias)

   Ensure that *alias* is an alias, and points to an index that can
   use the "_rollover" API.

   Parameters:
      * **client** -- An "elasticsearch.Elasticsearch" client object

      * **alias** -- An Elasticsearch alias

curator.utils.safe_to_snap(client, repository=None, retry_interval=120, retry_count=3)

   Ensure there are no snapshots in progress.  Pause and retry
   accordingly

   Parameters:
      * **client** -- An "elasticsearch.Elasticsearch" client object

      * **repository** -- The Elasticsearch snapshot repository to use

      * **retry_interval** -- Number of seconds to delay between
        retries. Default: 120 (seconds)

      * **retry_count** -- Number of attempts to make. Default: 3

   Return type:
      bool

curator.utils.show_dry_run(ilo, action, **kwargs)

   Log dry run output with the action which would have been executed.

   Parameters:
      * **ilo** -- A "curator.indexlist.IndexList"

      * **action** -- The *action* to be performed.

      * **kwargs** -- Any other args to show in the log output

curator.utils.single_data_path(client, node_id)

   In order for a shrink to work, it should be on a single filesystem,
   as shards cannot span filesystems.  Return *True* if the node has a
   single filesystem, and *False* otherwise.

   Parameters:
      **client** -- An "elasticsearch.Elasticsearch" client object

   Return type:
      bool

curator.utils.snapshot_check(client, snapshot=None, repository=None)

   This function calls *client.snapshot.get* and tests to see whether
   the snapshot is complete, and if so, with what status.  It will log
   errors according to the result. If the snapshot is still
   *IN_PROGRESS*, it will return *False*.  *SUCCESS* will be an *INFO*
   level message, *PARTIAL* nets a *WARNING* message, *FAILED* is an
   *ERROR* message, and all others will be a *WARNING* level message.

   Parameters:
      * **client** -- An "elasticsearch.Elasticsearch" client object

      * **snapshot** -- The name of the snapshot.

      * **repository** -- The Elasticsearch snapshot repository to use

curator.utils.snapshot_in_progress(client, repository=None, snapshot=None)

   Determine whether the provided snapshot in *repository* is
   "IN_PROGRESS". If no value is provided for *snapshot*, then check
   all of them. Return *snapshot* if it is found to be in progress, or
   *False*

   Parameters:
      * **client** -- An "elasticsearch.Elasticsearch" client object

      * **repository** -- The Elasticsearch snapshot repository to use

      * **snapshot** -- The snapshot name

curator.utils.snapshot_running(client)

   Return *True* if a snapshot is in progress, and *False* if not

   Parameters:
      **client** -- An "elasticsearch.Elasticsearch" client object

   Return type:
      bool

curator.utils.task_check(client, task_id=None)

   This function calls client.tasks.get with the provided *task_id*.
   If the task data contains "'completed': True", then it will return
   *True*.  If the task is not completed, it will log some information
   about the task and return *False*.

   Parameters:
      * **client** -- An "elasticsearch.Elasticsearch" client object

      * **task_id** -- A task_id which ostensibly matches a task
        searchable in the tasks API.

curator.utils.test_client_options(config)

   Test whether SSL/TLS files exist.  Will raise an exception if the
   files cannot be read.

   Parameters:
      **config** -- A client configuration file data dictionary

   Return type:
      None

curator.utils.test_repo_fs(client, repository=None)

   Test whether all nodes have write access to the repository

   Parameters:
      * **client** -- An "elasticsearch.Elasticsearch" client object

      * **repository** -- The Elasticsearch snapshot repository to use

curator.utils.to_csv(indices)

   Return a csv string from a list of indices, or a single value if
   only one value is present

   Parameters:
      **indices** -- A list of indices to act on, or a single value,
      which could be in the format of a csv string already.

   Return type:
      str

curator.utils.validate_actions(data)

   Validate an Action configuration dictionary, as imported from
   actions.yml, for example.

   The method returns a validated and sanitized configuration
   dictionary.

   Parameters:
      **data** -- The configuration dictionary

   Return type:
      dict

curator.utils.validate_filters(action, filters)

   Validate that the filters are appropriate for the action type, e.g.
   no index filters applied to a snapshot list.

   Parameters:
      * **action** -- An action name

      * **filters** -- A list of filters to test.

curator.utils.verify_client_object(test)

   Test if *test* is a proper "elasticsearch.Elasticsearch" client
   object and raise an exception if it is not.

   Parameters:
      **test** -- The variable or object to test

   Return type:
      None

curator.utils.verify_index_list(test)

   Test if *test* is a proper "curator.indexlist.IndexList" object and
   raise an exception if it is not.

   Parameters:
      **test** -- The variable or object to test

   Return type:
      None

curator.utils.verify_snapshot_list(test)

   Test if *test* is a proper "curator.snapshotlist.SnapshotList"
   object and raise an exception if it is not.

   Parameters:
      **test** -- The variable or object to test

   Return type:
      None

curator.utils.wait_for_it(client, action, task_id=None, snapshot=None, repository=None, index=None, index_list=None, wait_interval=9, max_wait=-1)

   This function is the single place for all wait_for_completion-type
   behaviors.

   Parameters:
      * **client** -- An "elasticsearch.Elasticsearch" client object

      * **action** -- The action name that will identify how to wait

      * **task_id** -- If the action provided a task_id, this is where
        it must be declared.

      * **snapshot** -- The name of the snapshot.

      * **repository** -- The Elasticsearch snapshot repository to use

      * **wait_interval** -- How frequently the specified "wait"
        behavior will be polled to check for completion.

      * **max_wait** -- The number of seconds the "wait" behavior will
        persist before giving up and raising an Exception.  The
        default is -1, meaning it will try forever.
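   The core polling loop can be sketched as follows.  Here "check" is a
   hypothetical zero-argument callable that returns *True* once the
   awaited action has completed; the real function derives that test
   from "action", "task_id", "snapshot", and the other arguments:

```python
import time

def wait_for_it(check, wait_interval=9, max_wait=-1):
    """Sketch of the polling loop only; `check` is a hypothetical
    completion test standing in for the real per-action logic."""
    start = time.time()
    while not check():
        if max_wait != -1 and time.time() - start > max_wait:
            raise RuntimeError('Timed out waiting for completion')
        time.sleep(wait_interval)

# Usage: poll a fake completion test that succeeds on the third try.
states = iter([False, False, True])
wait_for_it(lambda: next(states), wait_interval=0, max_wait=60)
```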

class curator.SchemaCheck(config, schema, test_what, location)

   Validate "config" with the provided voluptuous "schema".
   "test_what" and "location" are for reporting the results, in case
   of failure.  If validation is successful, the method returns
   "config" as valid.

   Parameters:
      * **config** (*dict*) -- A configuration dictionary.

      * **schema** ("voluptuous.Schema") -- A voluptuous schema
        definition

      * **test_what** (*str*) -- which configuration block is being
        validated

      * **location** (*str*) -- A string to report which configuration
        sub-block is being tested.
