ak.to_parquet
-------------

Defined in `awkward.operations.convert <https://github.com/scikit-hep/awkward-1.0/blob/80bbef0738a6b7928333d7c705ee1b359991de5b/src/awkward/operations/convert.py>`__ on `line 2961 <https://github.com/scikit-hep/awkward-1.0/blob/80bbef0738a6b7928333d7c705ee1b359991de5b/src/awkward/operations/convert.py#L2961>`__.

.. py:function:: ak.to_parquet(array, where, explode_records=False, list_to32=False, string_to32=True, bytestring_to32=True)


    :param array: Data to write to a Parquet file.
    :param where: Where to write the Parquet file.
    :type where: str, Path, file-like object
    :param explode_records: If True, lists of records are written as
                        records of lists, so that nested fields become top-level fields
                        (which can be zipped when read back).
    :type explode_records: bool
    :param list_to32: If True, convert Awkward lists into 32-bit Arrow lists
                  if they're small enough, even if it means an extra conversion. Otherwise,
                  signed 32-bit :py:obj:`ak.layout.ListOffsetArray` maps to Arrow ``ListType`` and
                  all others map to Arrow ``LargeListType``.
    :type list_to32: bool
    :param string_to32: Same as the above for Arrow ``string`` and ``large_string``.
    :type string_to32: bool
    :param bytestring_to32: Same as the above for Arrow ``binary`` and ``large_binary``.
    :type bytestring_to32: bool
    :param options: All other options are passed to ``pyarrow.parquet.ParquetWriter``.
                In particular, if no ``schema`` is given, a schema is derived from
                the array type.

Writes an Awkward Array to a Parquet file (through pyarrow).

.. code-block:: python


    >>> array1 = ak.Array([[1, 2, 3], [], [4, 5], [], [], [6, 7, 8, 9]])
    >>> ak.to_parquet(array1, "array1.parquet")

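The effect of ``explode_records`` can be seen by reading the file back with pyarrow directly. The following is a minimal sketch, assuming awkward 1.x and pyarrow are installed; ``records.parquet`` is a hypothetical file name.

.. code-block:: python

    import awkward as ak
    import pyarrow.parquet as pq

    # A list-of-records array; with explode_records=True, the nested
    # fields "x" and "y" are written as top-level Parquet columns.
    records = ak.Array([
        [{"x": 1, "y": 1.1}, {"x": 2, "y": 2.2}],
        [],
        [{"x": 3, "y": 3.3}],
    ])
    ak.to_parquet(records, "records.parquet", explode_records=True)

    table = pq.read_table("records.parquet")
    # table now has one column of lists per record field
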
If the ``array`` does not contain records at top-level, the Arrow table will consist
of one field whose name is ``""``.
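This single ``""``-named field can be observed by opening the file with pyarrow; a minimal sketch, assuming pyarrow is installed, using the ``array1`` from the example above:

.. code-block:: python

    import awkward as ak
    import pyarrow.parquet as pq

    array1 = ak.Array([[1, 2, 3], [], [4, 5], [], [], [6, 7, 8, 9]])
    ak.to_parquet(array1, "array1.parquet")

    table = pq.read_table("array1.parquet")
    # the table has one column whose name is the empty string,
    # and one row per outer list element
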

Parquet files can maintain the distinction between "option-type but no elements are
missing" and "not option-type" at all levels, including the top level. However,
there is no distinction between ``?union[X, Y, Z]`` type and ``union[?X, ?Y, ?Z]`` type.
Be aware of these type distinctions when passing data through Arrow or Parquet.
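As an illustration of option-type surviving a round trip, here is a minimal sketch, assuming awkward 1.x with pyarrow; ``optional.parquet`` is a hypothetical file name.

.. code-block:: python

    import awkward as ak

    # An option-type list of floats: one element is missing (None)
    array = ak.Array([[1.5, 2.5, None], [], [3.5]])
    ak.to_parquet(array, "optional.parquet")

    # Reading it back preserves the missing value and the option type
    readback = ak.from_parquet("optional.parquet")
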

To make a partitioned Parquet dataset, use this function to write each Parquet
file to a directory (as separate invocations, probably in parallel with multiple
processes), then give them common metadata by calling ``ak.to_parquet.dataset``.

.. code-block:: python


    >>> ak.to_parquet(array1, "directory-name/file1.parquet")
    >>> ak.to_parquet(array2, "directory-name/file2.parquet")
    >>> ak.to_parquet(array3, "directory-name/file3.parquet")
    >>> ak.to_parquet.dataset("directory-name")

Then all of the files in the collection can be addressed as one array. For example,

.. code-block:: python


    >>> dataset = ak.from_parquet("directory-name", lazy=True)

(If it is large, you will likely want to load it lazily.)

See also :py:obj:`ak.to_arrow`, which is used as an intermediate step.
See also :py:obj:`ak.from_parquet`.

