Descriptor
**********

Package for parsing and processing descriptor data.

**Module Overview:**

   parse_file - Parses the descriptors in a file.
   create_signing_key - Creates a signing key that can be used for creating descriptors.

   Compression - method of descriptor decompression

   Descriptor - Common parent for all descriptor file types.
     | |- content - creates the text of a new descriptor
     | |- create - creates a new descriptor
     | +- from_str - provides a parsed descriptor for the given string
     |
     |- type_annotation - provides our @type annotation
     |- get_path - location of the descriptor on disk if it came from a file
     |- get_archive_path - location of the descriptor within the archive it came from
     |- get_bytes - similar to str(), but provides our original bytes content
     |- get_unrecognized_lines - unparsed descriptor content
     +- __str__ - string that the descriptor was made from

stem.descriptor.__init__.DigestHash(enum)

   New in version 1.8.0.

   Hash function used by tor for descriptor digests.

   +-------------+-------------+
   | DigestHash  | Description |
   +=============+=============+
   | SHA1        | SHA1 hash   |
   +-------------+-------------+
   | SHA256      | SHA256 hash |
   +-------------+-------------+

stem.descriptor.__init__.DigestEncoding(enum)

   New in version 1.8.0.

   Encoding of descriptor digests.

   +-------------------+------------------------------------------------------------------------------------------------------------------------+
   | DigestEncoding    | Description                                                                                                            |
   +===================+========================================================================================================================+
   | RAW               | hash object                                                                                                            |
   +-------------------+------------------------------------------------------------------------------------------------------------------------+
   | HEX               | uppercase hexadecimal encoding                                                                                         |
   +-------------------+------------------------------------------------------------------------------------------------------------------------+
   | BASE64            | base64 encoding without trailing ‘=’ padding                                                                           |
   +-------------------+------------------------------------------------------------------------------------------------------------------------+
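   The encodings in the table above can be illustrated with the standard
   library alone. This is a minimal sketch and not stem's implementation:
   **encode_digest** is a hypothetical helper, and its string arguments
   merely mirror the enum names.

```python
import base64
import hashlib


def encode_digest(content, hash_name='SHA1', encoding='HEX'):
  """Hypothetical helper mirroring the DigestHash / DigestEncoding options."""

  if hash_name == 'SHA1':
    digest = hashlib.sha1(content)
  elif hash_name == 'SHA256':
    digest = hashlib.sha256(content)
  else:
    raise ValueError('unsupported hash: %s' % hash_name)

  if encoding == 'RAW':
    return digest  # the hash object itself
  elif encoding == 'HEX':
    return digest.hexdigest().upper()  # uppercase hexadecimal
  elif encoding == 'BASE64':
    # base64 with the trailing '=' padding stripped
    return base64.b64encode(digest.digest()).decode('ascii').rstrip('=')
  else:
    raise ValueError('unsupported encoding: %s' % encoding)


content = b'router example\n'  # placeholder bytes, not a real descriptor
print(encode_digest(content, 'SHA1', 'HEX'))
print(encode_digest(content, 'SHA256', 'BASE64'))
```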

stem.descriptor.__init__.DocumentHandler(enum)

   Ways in which we can parse a "NetworkStatusDocument".

   Both **ENTRIES** and **BARE_DOCUMENT** have a ‘thin’ document,
   which doesn’t have a populated **routers** attribute. This allows
   for lower memory usage and upfront runtime. However, if read time
   and memory aren’t a concern then **DOCUMENT** can provide you with
   a fully populated document.

   Handlers don’t change the fact that most methods that provide
   descriptors return an iterator. In the case of **DOCUMENT** and
   **BARE_DOCUMENT** that iterator would have just a single item - the
   document itself.

   A simple way to handle this is to call **next()** to get the
   iterator’s one and only value…

      import stem.descriptor.remote
      from stem.descriptor import DocumentHandler

      # iter() guarantees an iterator, which is what next() requires
      consensus = next(iter(stem.descriptor.remote.get_consensus(
        document_handler = DocumentHandler.BARE_DOCUMENT,
      )))

   +---------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   | DocumentHandler     | Description                                                                                                                                                                               |
   +=====================+===========================================================================================================================================================================================+
   | **ENTRIES**         | Iterates over the contained "RouterStatusEntry". Each has a reference to the bare document it came from (through its **document** attribute).                                             |
   +---------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   | **DOCUMENT**        | "NetworkStatusDocument" with the "RouterStatusEntry" it contains (through its **routers** attribute).                                                                                     |
   +---------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   | **BARE_DOCUMENT**   | "NetworkStatusDocument" **without** a reference to its contents (the "RouterStatusEntry" are unread).                                                                                     |
   +---------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
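   To make the difference between the three handlers concrete, here is a
   toy model written with the standard library only. Every name in it
   (**Handler**, **ToyDocument**, **ToyEntry**, **parse_toy**) is
   hypothetical and stands in for the real classes; it only sketches what
   each handler yields, not how stem parses a consensus.

```python
import enum


class Handler(enum.Enum):
  # hypothetical stand-in for DocumentHandler
  ENTRIES = 'ENTRIES'
  DOCUMENT = 'DOCUMENT'
  BARE_DOCUMENT = 'BARE_DOCUMENT'


class ToyDocument(object):
  def __init__(self, header):
    self.header = header
    self.routers = {}  # stays empty for a 'thin' document


class ToyEntry(object):
  def __init__(self, nickname, document):
    self.nickname = nickname
    self.document = document  # reference back to the document


def parse_toy(header, nicknames, handler):
  """Generator, so callers get an iterator whichever handler is picked."""

  document = ToyDocument(header)

  if handler == Handler.ENTRIES:
    for nickname in nicknames:
      yield ToyEntry(nickname, document)
  elif handler == Handler.DOCUMENT:
    document.routers = dict((n, ToyEntry(n, document)) for n in nicknames)
    yield document
  elif handler == Handler.BARE_DOCUMENT:
    yield document  # entries are left unread
  else:
    raise ValueError('unknown handler: %s' % handler)


nicknames = ['relayA', 'relayB']
entries = list(parse_toy('consensus', nicknames, Handler.ENTRIES))
full = next(parse_toy('consensus', nicknames, Handler.DOCUMENT))
bare = next(parse_toy('consensus', nicknames, Handler.BARE_DOCUMENT))

print([entry.nickname for entry in entries])
print(len(full.routers), len(bare.routers))
```

   In this model only the **DOCUMENT** case pays the cost of holding
   every entry at once, which is the memory trade-off described above.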

stem.descriptor.__init__.parse_file(descriptor_file, descriptor_type=None, validate=False, document_handler='ENTRIES', normalize_newlines=None, **kwargs)

   Simple function to read the descriptor contents from a file,
   providing an iterator for its "Descriptor" contents.

   If you don’t provide a **descriptor_type** argument then this
   automatically tries to determine the descriptor type based on the
   following…

   * The @type annotation on the first line. These are generally
     only found in the CollecTor archives.

   * The filename if it matches something from tor’s data directory.
     For instance, tor’s ‘cached-descriptors’ contains server
     descriptors.
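   The detection order above can be sketched as follows. This is an
   illustration under stated assumptions rather than the library's code:
   **guess_descriptor_type** and **FILENAME_TYPES** are hypothetical
   names, and the filename table holds only the one mapping mentioned in
   the text (its 1.0 version is assumed).

```python
import os
import re

# Only the mapping named in the text above; a real table would be longer.
FILENAME_TYPES = {
  'cached-descriptors': ('server-descriptor', 1, 0),
}

TYPE_ANNOTATION = re.compile(r'^@type (\S+) (\d+)\.(\d+)$')


def guess_descriptor_type(first_line, path):
  """Hypothetical helper: returns (name, major, minor) or raises TypeError."""

  # 1. prefer an explicit @type annotation on the first line
  match = TYPE_ANNOTATION.match(first_line.strip())

  if match:
    name, major, minor = match.groups()
    return (name, int(major), int(minor))

  # 2. otherwise fall back to a well known filename
  filename = os.path.basename(path)

  if filename in FILENAME_TYPES:
    return FILENAME_TYPES[filename]

  raise TypeError("unable to determine the descriptor type of '%s'" % path)


print(guess_descriptor_type('@type extra-info 1.0\n', '/tmp/example'))
print(guess_descriptor_type('router example', '/var/lib/tor/cached-descriptors'))
```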

   This is a handy function for simple usage, but if you’re reading
   multiple descriptor files you might want to consider the
   "DescriptorReader".

   Descriptor types include the following, along with further minor
   versions (i.e. if we support 1.1 then we also support everything
   from 1.0 and most things from 1.2, but not 2.0)…

   +-------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
   | Descriptor Type                           | Class                                                                                                                                           |
   +===========================================+=================================================================================================================================================+
   | server-descriptor 1.0                     | "RelayDescriptor"                                                                                                                               |
   +-------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
   | extra-info 1.0                            | "RelayExtraInfoDescriptor"                                                                                                                      |
   +-------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
   | microdescriptor 1.0                       | "Microdescriptor"                                                                                                                               |
   +-------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
   | directory 1.0                             | **unsupported**                                                                                                                                 |
   +-------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
   | network-status-2 1.0                      | "RouterStatusEntryV2" (with a "NetworkStatusDocumentV2")                                                                                        |
   +-------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
   | dir-key-certificate-3 1.0                 | "KeyCertificate"                                                                                                                                |
   +-------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
   | network-status-consensus-3 1.0            | "RouterStatusEntryV3" (with a "NetworkStatusDocumentV3")                                                                                        |
   +-------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
   | network-status-vote-3 1.0                 | "RouterStatusEntryV3" (with a "NetworkStatusDocumentV3")                                                                                        |
   +-------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
   | network-status-microdesc-consensus-3 1.0  | "RouterStatusEntryMicroV3" (with a "NetworkStatusDocumentV3")                                                                                   |
   +-------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
   | bridge-network-status 1.0                 | "RouterStatusEntryV3" (with a "BridgeNetworkStatusDocument")                                                                                    |
   +-------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
   | bridge-server-descriptor 1.0              | "BridgeDescriptor"                                                                                                                              |
   +-------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
   | bridge-extra-info 1.1 or 1.2              | "BridgeExtraInfoDescriptor"                                                                                                                     |
   +-------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
   | torperf 1.0                               | **unsupported**                                                                                                                                 |
   +-------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
   | bridge-pool-assignment 1.0                | **unsupported**                                                                                                                                 |
   +-------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
   | tordnsel 1.0                              | "TorDNSEL"                                                                                                                                      |
   +-------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
   | hidden-service-descriptor 1.0             | "HiddenServiceDescriptorV2"                                                                                                                     |
   +-------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
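   The minor version rule stated before the table amounts to comparing
   major versions. A minimal sketch, assuming versions are plain
   "major.minor" strings; **is_supported** is a hypothetical helper, not
   part of the module.

```python
def is_supported(supported_version, candidate_version):
  """Hypothetical check: a matching major version counts as supported."""

  supported_major = int(supported_version.split('.')[0])
  candidate_major = int(candidate_version.split('.')[0])

  return supported_major == candidate_major


for candidate in ('1.0', '1.2', '2.0'):
  print(candidate, is_supported('1.1', candidate))
```

   A newer minor version such as 1.2 passes this check, but it may carry
   fields the parser does not know about, hence "most things" rather than
   everything.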

   If you’re using **python 3** then beware that the open() function
   defaults to using text mode. **Binary mode** is strongly suggested
   because it’s both faster (about 33x in my testing) and doesn’t do
   universal newline translation, which can make us misparse the
   document.

      my_descriptor_file = open(descriptor_path, 'rb')
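   The newline translation is easy to observe with a throwaway file.
   This sketch uses only the standard library and placeholder content
   (not a real descriptor); it reads the same bytes in binary and in
   text mode under Python 3.

```python
import os
import tempfile

# Write a small placeholder file that uses CRLF line endings.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
  tmp.write(b'line one\r\nline two\r\n')
  descriptor_path = tmp.name

try:
  with open(descriptor_path, 'rb') as binary_file:
    binary_content = binary_file.read()

  with open(descriptor_path, 'r') as text_file:
    text_content = text_file.read()
finally:
  os.remove(descriptor_path)

print(repr(binary_content))  # bytes, line endings left untouched
print(repr(text_content))    # str, line endings translated by text mode
```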

   Parameters:
      * **descriptor_file** (*str**,**file**,**tarfile*) – path or
        opened file with the descriptor contents

      * **descriptor_type** (*str*) – descriptor type, this is
        guessed if not provided

      * **validate** (*bool*) – checks the validity of the
        descriptor’s content if **True**, skips these checks otherwise

      * **document_handler**
        (*stem.descriptor.__init__.DocumentHandler*) – method in which
        to parse the "NetworkStatusDocument"

      * **normalize_newlines** (*bool*) – converts Windows newlines
        (CRLF); this is the default when reading data directories on
        Windows

      * **kwargs** (*dict*) – additional arguments for the
        descriptor constructor

   Returns:
      iterator for "Descriptor" instances in the file

   Raises:
      * **ValueError** if the contents are malformed and validate is
        True

      * **TypeError** if we can’t match the contents of the file to
        a descriptor type

      * **IOError** if unable to read from the descriptor_file

class stem.descriptor.__init__.Descriptor(contents, lazy_load=False)

   Bases: "object"

   Common parent for all types of descriptors.

   TYPE_ANNOTATION_NAME = None

   classmethod from_str(content, **kwargs)

      Provides a "Descriptor" for the given content.

      To parse a descriptor we must know its type. There are three
      ways to convey this…

         # use a descriptor_type argument
         desc = Descriptor.from_str(content, descriptor_type = 'server-descriptor 1.0')

         # prefix the content with a "@type" annotation
         desc = Descriptor.from_str('@type server-descriptor 1.0\n' + content)

         # use this method from a subclass
         desc = stem.descriptor.server_descriptor.RelayDescriptor.from_str(content)

      New in version 1.8.0.

      Parameters:
         * **content** (*str**,**bytes*) – string to construct the
           descriptor from

         * **multiple** (*bool*) – if provided with **True** this
           provides a list of descriptors rather than a single one

         * **kwargs** (*dict*) – additional arguments for
           "parse_file()"

      Returns:
         "Descriptor" subclass for the given content, or a **list** of
         descriptors if **multiple = True** is provided

      Raises:
         * **ValueError** if the content is malformed and validate
           is True

         * **TypeError** if we can’t match the content to a
           descriptor type

         * **IOError** if unable to read from the descriptor_file

   classmethod content(attr=None, exclude=(), sign=False)

      Creates descriptor content with the given attributes. Mandatory
      fields are filled with dummy information unless data is
      supplied. This doesn’t yet create a valid signature.

      New in version 1.6.0.

      Parameters:
         * **attr** (*dict*) – keyword/value mappings to be included
           in the descriptor

         * **exclude** (*list*) – mandatory keywords to exclude from
           the descriptor, this results in an invalid descriptor

         * **sign** (*bool*) – includes cryptographic signatures and
           digests if True

      Returns:
         **str** with the content of a descriptor

      Raises:
         * **ImportError** if cryptography is unavailable and sign
           is True

         * **NotImplementedError** if not implemented for this
           descriptor type

   classmethod create(attr=None, exclude=(), validate=True, sign=False)

      Creates a descriptor with the given attributes. Mandatory fields
      are filled with dummy information unless data is supplied. This
      doesn’t yet create a valid signature.

      New in version 1.6.0.

      Parameters:
         * **attr** (*dict*) – keyword/value mappings to be included
           in the descriptor

         * **exclude** (*list*) – mandatory keywords to exclude from
           the descriptor, this results in an invalid descriptor

         * **validate** (*bool*) – checks the validity of the
           descriptor’s content if **True**, skips these checks
           otherwise

         * **sign** (*bool*) – includes cryptographic signatures and
           digests if True

      Returns:
         "Descriptor" subclass

      Raises:
         * **ValueError** if the contents are malformed and validate
           is True

         * **ImportError** if cryptography is unavailable and sign
           is True

         * **NotImplementedError** if not implemented for this
           descriptor type

   type_annotation()

      Provides the Tor metrics annotation of this descriptor type. For
      example, “@type server-descriptor 1.0” for server descriptors.

      Please note that the version number component is specific to
      CollecTor, and for the moment is hardcoded as 1.0. This may
      change in the future.

      New in version 1.8.0.

      Returns:
         "TypeAnnotation" with our type information

   get_path()

      Provides the absolute path that we loaded this descriptor from.

      Returns:
         **str** with the absolute path of the descriptor source

   get_archive_path()

      If this descriptor came from an archive then provides its path
      within the archive. This is only set if the descriptor came from
      a "DescriptorReader", and is **None** if this descriptor didn’t
      come from an archive.

      Returns:
         **str** with the descriptor’s path within the archive

   get_bytes()

      Provides the ASCII **bytes** of the descriptor. This only
      differs from **str()** if you’re running python 3.x, in which
      case **str()** provides a **unicode** string.

      Returns:
         **bytes** for the descriptor’s contents
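      The distinction between **get_bytes()** and **str()** can be
      pictured with a small stand-in class. **ToyDescriptor** is
      hypothetical and is not stem's **Descriptor**; the ASCII decoding
      in **__str__** is an assumption made for this sketch.

```python
class ToyDescriptor(object):
  """Hypothetical stand-in, not stem's Descriptor class."""

  def __init__(self, raw_contents):
    self._raw_contents = raw_contents  # bytes, exactly as they were read

  def get_bytes(self):
    return self._raw_contents

  def __str__(self):
    # assumption: decode as ASCII, replacing anything that isn't
    return self._raw_contents.decode('ascii', 'replace')


desc = ToyDescriptor(b'router example 127.0.0.1 9001 0 0\n')
print(type(desc.get_bytes()).__name__, type(str(desc)).__name__)
```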

   get_unrecognized_lines()

      Provides a list of lines that were either ignored or had data
      that we did not know how to process. This is most common due to
      new descriptor fields that this library does not yet know how to
      process. Patches welcome!

      Returns:
         **list** of lines of unrecognized content
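      As a rough picture of where unrecognized lines come from, the toy
      parser below keeps any line whose keyword it does not know. The
      keyword list, **parse_toy_descriptor**, and the sample content are
      all hypothetical; stem's real parsers are considerably more
      involved.

```python
KNOWN_KEYWORDS = ('router', 'published')  # deliberately tiny, for illustration


def parse_toy_descriptor(content):
  """Hypothetical parser: returns (recognized fields, unrecognized lines)."""

  fields, unrecognized = {}, []

  for line in content.splitlines():
    if not line.strip():
      continue  # skip blank lines

    keyword, _, value = line.partition(' ')

    if keyword in KNOWN_KEYWORDS:
      fields[keyword] = value
    else:
      unrecognized.append(line)  # kept verbatim for the caller to inspect

  return fields, unrecognized


sample = 'router example\npublished 2020-01-01 00:00:00\nnew-field some value\n'
fields, unrecognized = parse_toy_descriptor(sample)
print(unrecognized)
```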
