Types
=====

All expressions have a type which is known at compile time. Nim
is statically typed. One can declare new types, which is in essence defining
an identifier that can be used to denote this custom type.

These are the major type classes:

* ordinal types (consist of integer, bool, character, enumeration
  (and subranges thereof) types)
* floating point types
* string type
* structured types
* reference (pointer) type
* procedural type
* generic type


Ordinal types
-------------
Ordinal types have the following characteristics:

- Ordinal types are countable and ordered. This property allows
  the operation of functions as ``inc``, ``ord``, ``dec`` on ordinal types to
  be defined.
- Ordinal values have a smallest possible value. Trying to count further
  down than the smallest value gives a checked runtime or static error.
- Ordinal values have a largest possible value. Trying to count further
  than the largest value gives a checked runtime or static error.

Integers, bool, characters and enumeration types (and subranges of these
types) belong to ordinal types. For reasons of simplicity of implementation
the types ``uint`` and ``uint64`` are not ordinal types.


Pre-defined integer types
-------------------------
These integer types are pre-defined:

``int``
  the generic signed integer type; its size is platform dependent and has the
  same size as a pointer. This type should be used in general. An integer
  literal that has no type suffix is of this type.

intXX
  additional signed integer types of XX bits use this naming scheme
  (example: int16 is a 16 bit wide integer).
  The current implementation supports ``int8``, ``int16``, ``int32``, ``int64``.
  Literals of these types have the suffix 'iXX.

``uint``
  the generic `unsigned integer`:idx: type; its size is platform dependent and
  has the same size as a pointer. An integer literal with the type
  suffix ``'u`` is of this type.

uintXX
  additional signed integer types of XX bits use this naming scheme
  (example: uint16 is a 16 bit wide unsigned integer).
  The current implementation supports ``uint8``, ``uint16``, ``uint32``,
  ``uint64``. Literals of these types have the suffix 'uXX.
  Unsigned operations all wrap around; they cannot lead to over- or
  underflow errors.


In addition to the usual arithmetic operators for signed and unsigned integers
(``+ - *`` etc.) there are also operators that formally work on *signed*
integers but treat their arguments as *unsigned*: They are mostly provided
for backwards compatibility with older versions of the language that lacked
unsigned integer types. These unsigned operations for signed integers use
the ``%`` suffix as convention:


======================   ======================================================
operation                meaning
======================   ======================================================
``a +% b``               unsigned integer addition
``a -% b``               unsigned integer subtraction
``a *% b``               unsigned integer multiplication
``a /% b``               unsigned integer division
``a %% b``               unsigned integer modulo operation
``a <% b``               treat ``a`` and ``b`` as unsigned and compare
``a <=% b``              treat ``a`` and ``b`` as unsigned and compare
``ze(a)``                extends the bits of ``a`` with zeros until it has the
                         width of the ``int`` type
``toU8(a)``              treats ``a`` as unsigned and converts it to an
                         unsigned integer of 8 bits (but still the
                         ``int8`` type)
``toU16(a)``             treats ``a`` as unsigned and converts it to an
                         unsigned integer of 16 bits (but still the
                         ``int16`` type)
``toU32(a)``             treats ``a`` as unsigned and converts it to an
                         unsigned integer of 32 bits (but still the
                         ``int32`` type)
======================   ======================================================

`Automatic type conversion`:idx: is performed in expressions where different
kinds of integer types are used: the smaller type is converted to the larger.

A `narrowing type conversion`:idx: converts a larger to a smaller type (for
example ``int32 -> int16``. A `widening type conversion`:idx: converts a
smaller type to a larger type (for example ``int16 -> int32``). In Nim only
widening type conversions are *implicit*:

.. code-block:: nim
  var myInt16 = 5i16
  var myInt: int
  myInt16 + 34     # of type ``int16``
  myInt16 + myInt  # of type ``int``
  myInt16 + 2i32   # of type ``int32``

However, ``int`` literals are implicitly convertible to a smaller integer type
if the literal's value fits this smaller type and such a conversion is less
expensive than other implicit conversions, so ``myInt16 + 34`` produces
an ``int16`` result.

For further details, see `Convertible relation`_.


Subrange types
--------------
A subrange type is a range of values from an ordinal type (the base
type). To define a subrange type, one must specify it's limiting values: the
lowest and highest value of the type:

.. code-block:: nim
  type
    Subrange = range[0..5]


``Subrange`` is a subrange of an integer which can only hold the values 0
to 5. Assigning any other value to a variable of type ``Subrange`` is a
checked runtime error (or static error if it can be statically
determined). Assignments from the base type to one of its subrange types
(and vice versa) are allowed.

A subrange type has the same size as its base type (``int`` in the example).

Nim requires `interval arithmetic`:idx: for subrange types over a set
of built-in operators that involve constants: ``x %% 3`` is of
type ``range[0..2]``. The following built-in operators for integers are
affected by this rule: ``-``, ``+``, ``*``, ``min``, ``max``, ``succ``,
``pred``, ``mod``, ``div``, ``%%``, ``and`` (bitwise ``and``).

Bitwise ``and`` only produces a ``range`` if one of its operands is a
constant *x* so that (x+1) is a number of two.
(Bitwise ``and`` is then a ``%%`` operation.)

This means that the following code is accepted:

.. code-block:: nim
  case (x and 3) + 7
  of 7: echo "A"
  of 8: echo "B"
  of 9: echo "C"
  of 10: echo "D"
  # note: no ``else`` required as (x and 3) + 7 has the type: range[7..10]


Pre-defined floating point types
--------------------------------

The following floating point types are pre-defined:

``float``
  the generic floating point type; its size is platform dependent
  (the compiler chooses the processor's fastest floating point type).
  This type should be used in general.

floatXX
  an implementation may define additional floating point types of XX bits using
  this naming scheme (example: float64 is a 64 bit wide float). The current
  implementation supports ``float32`` and ``float64``. Literals of these types
  have the suffix 'fXX.


Automatic type conversion in expressions with different kinds
of floating point types is performed: See `Convertible relation`_ for further
details. Arithmetic performed on floating point types follows the IEEE
standard. Integer types are not converted to floating point types automatically
and vice versa.

The IEEE standard defines five types of floating-point exceptions:

* Invalid: operations with mathematically invalid operands,
  for example 0.0/0.0, sqrt(-1.0), and log(-37.8).
* Division by zero: divisor is zero and dividend is a finite nonzero number,
  for example 1.0/0.0.
* Overflow: operation produces a result that exceeds the range of the exponent,
  for example MAXDOUBLE+0.0000000000001e308.
* Underflow: operation produces a result that is too small to be represented
  as a normal number, for example, MINDOUBLE * MINDOUBLE.
* Inexact: operation produces a result that cannot be represented with infinite
  precision, for example, 2.0 / 3.0, log(1.1) and 0.1 in input.

The IEEE exceptions are either ignored at runtime or mapped to the
Nim exceptions: `FloatInvalidOpError`:idx:, `FloatDivByZeroError`:idx:,
`FloatOverflowError`:idx:, `FloatUnderflowError`:idx:,
and `FloatInexactError`:idx:.
These exceptions inherit from the `FloatingPointError`:idx: base class.

Nim provides the pragmas `NaNChecks`:idx: and `InfChecks`:idx: to control
whether the IEEE exceptions are ignored or trap a Nim exception:

.. code-block:: nim
  {.NanChecks: on, InfChecks: on.}
  var a = 1.0
  var b = 0.0
  echo b / b # raises FloatInvalidOpError
  echo a / b # raises FloatOverflowError

In the current implementation ``FloatDivByZeroError`` and ``FloatInexactError``
are never raised. ``FloatOverflowError`` is raised instead of
``FloatDivByZeroError``.
There is also a `floatChecks`:idx: pragma that is a short-cut for the
combination of ``NaNChecks`` and ``InfChecks`` pragmas. ``floatChecks`` are
turned off as default.

The only operations that are affected by the ``floatChecks`` pragma are
the ``+``, ``-``, ``*``, ``/`` operators for floating point types.

An implementation should always use the maximum precision available to evaluate
floating pointer values at compile time; this means expressions like
``0.09'f32 + 0.01'f32 == 0.09'f64 + 0.01'f64`` are true.


Boolean type
------------
The boolean type is named `bool`:idx: in Nim and can be one of the two
pre-defined values ``true`` and ``false``. Conditions in while,
if, elif, when statements need to be of type bool.

This condition holds::

  ord(false) == 0 and ord(true) == 1

The operators ``not, and, or, xor, <, <=, >, >=, !=, ==`` are defined
for the bool type. The ``and`` and ``or`` operators perform short-cut
evaluation. Example:

.. code-block:: nim

  while p != nil and p.name != "xyz":
    # p.name is not evaluated if p == nil
    p = p.next


The size of the bool type is one byte.


Character type
--------------
The character type is named ``char`` in Nim. Its size is one byte.
Thus it cannot represent an UTF-8 character, but a part of it.
The reason for this is efficiency: for the overwhelming majority of use-cases,
the resulting programs will still handle UTF-8 properly as UTF-8 was specially
designed for this.
Another reason is that Nim can support ``array[char, int]`` or
``set[char]`` efficiently as many algorithms rely on this feature. The
`Rune` type is used for Unicode characters, it can represent any Unicode
character. ``Rune`` is declared in the `unicode module <unicode.html>`_.




Enumeration types
-----------------
Enumeration types define a new type whose values consist of the ones
specified. The values are ordered. Example:

.. code-block:: nim

  type
    Direction = enum
      north, east, south, west


Now the following holds::

  ord(north) == 0
  ord(east) == 1
  ord(south) == 2
  ord(west) == 3

Thus, north < east < south < west. The comparison operators can be used
with enumeration types.

For better interfacing to other programming languages, the fields of enum
types can be assigned an explicit ordinal value. However, the ordinal values
have to be in ascending order. A field whose ordinal value is not
explicitly given is assigned the value of the previous field + 1.

An explicit ordered enum can have *holes*:

.. code-block:: nim
  type
    TokenType = enum
      a = 2, b = 4, c = 89 # holes are valid

However, it is then not an ordinal anymore, so it is not possible to use these
enums as an index type for arrays. The procedures ``inc``, ``dec``, ``succ``
and ``pred`` are not available for them either.


The compiler supports the built-in stringify operator ``$`` for enumerations.
The stringify's result can be controlled by explicitly giving the string
values to use:

.. code-block:: nim

  type
    MyEnum = enum
      valueA = (0, "my value A"),
      valueB = "value B",
      valueC = 2,
      valueD = (3, "abc")

As can be seen from the example, it is possible to both specify a field's
ordinal value and its string value by using a tuple. It is also
possible to only specify one of them.

An enum can be marked with the ``pure`` pragma so that it's fields are not
added to the current scope, so they always need to be accessed
via ``MyEnum.value``:

.. code-block:: nim

  type
    MyEnum {.pure.} = enum
      valueA, valueB, valueC, valueD

  echo valueA # error: Unknown identifier
  echo MyEnum.valueA # works


String type
-----------
All string literals are of the type ``string``. A string in Nim is very
similar to a sequence of characters. However, strings in Nim are both
zero-terminated and have a length field. One can retrieve the length with the
builtin ``len`` procedure; the length never counts the terminating zero.
The assignment operator for strings always copies the string.
The ``&`` operator concatenates strings.

Most native Nim types support conversion to strings with the special ``$`` proc.
When calling the ``echo`` proc, for example, the built-in stringify operation
for the parameter is called:

.. code-block:: nim

  echo 3 # calls `$` for `int`

Whenever a user creates a specialized object, implementation of this procedure
provides for ``string`` representation.

.. code-block:: nim
  type
    Person = object
      name: string
      age: int

  proc `$`(p: Person): string = # `$` always returns a string
    result = p.name & " is " &
            $p.age & # we *need* the `$` in front of p.age, which
                     # is natively an integer, to convert it to
                     # a string
            " years old."

While ``$p.name`` can also be used, the ``$`` operation on a string does
nothing. Note that we cannot rely on automatic conversion from an ``int`` to
a ``string`` like we can for the ``echo`` proc.

Strings are compared by their lexicographical order. All comparison operators
are available. Strings can be indexed like arrays (lower bound is 0). Unlike
arrays, they can be used in case statements:

.. code-block:: nim

  case paramStr(i)
  of "-v": incl(options, optVerbose)
  of "-h", "-?": incl(options, optHelp)
  else: write(stdout, "invalid command line option!\n")

Per convention, all strings are UTF-8 strings, but this is not enforced. For
example, when reading strings from binary files, they are merely a sequence of
bytes. The index operation ``s[i]`` means the i-th *char* of ``s``, not the
i-th *unichar*. The iterator ``runes`` from the `unicode module
<unicode.html>`_ can be used for iteration over all Unicode characters.


cstring type
------------
The ``cstring`` type represents a pointer to a zero-terminated char array
compatible to the type ``char*`` in Ansi C. Its primary purpose lies in easy
interfacing with C. The index operation ``s[i]`` means the i-th *char* of
``s``; however no bounds checking for ``cstring`` is performed making the
index operation unsafe.

A Nim ``string`` is implicitly convertible
to ``cstring`` for convenience. If a Nim string is passed to a C-style
variadic proc, it is implicitly converted to ``cstring`` too:

.. code-block:: nim
  proc printf(formatstr: cstring) {.importc: "printf", varargs,
                                    header: "<stdio.h>".}

  printf("This works %s", "as expected")

Even though the conversion is implicit, it is not *safe*: The garbage collector
does not consider a ``cstring`` to be a root and may collect the underlying
memory. However in practice this almost never happens as the GC considers
stack roots conservatively. One can use the builtin procs ``GC_ref`` and
``GC_unref`` to keep the string data alive for the rare cases where it does
not work.

A `$` proc is defined for cstrings that returns a string. Thus to get a nim
string from a cstring:

.. code-block:: nim
  var str: string = "Hello!"
  var cstr: cstring = str
  var newstr: string = $cstr


Structured types
----------------
A variable of a structured type can hold multiple values at the same
time. Structured types can be nested to unlimited levels. Arrays, sequences,
tuples, objects and sets belong to the structured types.

Array and sequence types
------------------------
Arrays are a homogeneous type, meaning that each element in the array
has the same type. Arrays always have a fixed length which is specified at
compile time (except for open arrays). They can be indexed by any ordinal type.
A parameter ``A`` may be an *open array*, in which case it is indexed by
integers from 0 to ``len(A)-1``. An array expression may be constructed by the
array constructor ``[]``.

Sequences are similar to arrays but of dynamic length which may change
during runtime (like strings). Sequences are implemented as growable arrays,
allocating pieces of memory as items are added. A sequence ``S`` is always
indexed by integers from 0 to ``len(S)-1`` and its bounds are checked.
Sequences can be constructed by the array constructor ``[]`` in conjunction
with the array to sequence operator ``@``. Another way to allocate space for a
sequence is to call the built-in ``newSeq`` procedure.

A sequence may be passed to a parameter that is of type *open array*.

Example:

.. code-block:: nim

  type
    IntArray = array[0..5, int] # an array that is indexed with 0..5
    IntSeq = seq[int] # a sequence of integers
  var
    x: IntArray
    y: IntSeq
  x = [1, 2, 3, 4, 5, 6]  # [] is the array constructor
  y = @[1, 2, 3, 4, 5, 6] # the @ turns the array into a sequence

The lower bound of an array or sequence may be received by the built-in proc
``low()``, the higher bound by ``high()``. The length may be
received by ``len()``. ``low()`` for a sequence or an open array always returns
0, as this is the first valid index.
One can append elements to a sequence with the ``add()`` proc or the ``&``
operator, and remove (and get) the last element of a sequence with the
``pop()`` proc.

The notation ``x[i]`` can be used to access the i-th element of ``x``.

Arrays are always bounds checked (at compile-time or at runtime). These
checks can be disabled via pragmas or invoking the compiler with the
``--boundChecks:off`` command line switch.


Open arrays
-----------

Often fixed size arrays turn out to be too inflexible; procedures should
be able to deal with arrays of different sizes. The `openarray`:idx: type
allows this; it can only be used for parameters. Openarrays are always
indexed with an ``int`` starting at position 0. The ``len``, ``low``
and ``high`` operations are available for open arrays too. Any array with
a compatible base type can be passed to an openarray parameter, the index
type does not matter. In addition to arrays sequences can also be passed
to an open array parameter.

The openarray type cannot be nested: multidimensional openarrays are not
supported because this is seldom needed and cannot be done efficiently.

.. code-block:: nim
  proc testOpenArray(x: openArray[int]) = echo repr(x)

  testOpenArray([1,2,3])  # array[]
  testOpenArray(@[1,2,3]) # seq[]

Varargs
-------

A ``varargs`` parameter is an openarray parameter that additionally
allows to pass a variable number of arguments to a procedure. The compiler
converts the list of arguments to an array implicitly:

.. code-block:: nim
  proc myWriteln(f: File, a: varargs[string]) =
    for s in items(a):
      write(f, s)
    write(f, "\n")

  myWriteln(stdout, "abc", "def", "xyz")
  # is transformed to:
  myWriteln(stdout, ["abc", "def", "xyz"])

This transformation is only done if the varargs parameter is the
last parameter in the procedure header. It is also possible to perform
type conversions in this context:

.. code-block:: nim
  proc myWriteln(f: File, a: varargs[string, `$`]) =
    for s in items(a):
      write(f, s)
    write(f, "\n")

  myWriteln(stdout, 123, "abc", 4.0)
  # is transformed to:
  myWriteln(stdout, [$123, $"def", $4.0])

In this example ``$`` is applied to any argument that is passed to the
parameter ``a``. (Note that ``$`` applied to strings is a nop.)

Note that an explicit array constructor passed to a ``varargs`` parameter is
not wrapped in another implicit array construction:

.. code-block:: nim
  proc takeV[T](a: varargs[T]) = discard

  takeV([123, 2, 1]) # takeV's T is "int", not "array of int"


``varargs[expr]`` is treated specially: It matches a variable list of arguments
of arbitrary type but *always* constructs an implicit array. This is required
so that the builtin ``echo`` proc does what is expected:

.. code-block:: nim
  proc echo*(x: varargs[expr, `$`]) {...}

  echo @[1, 2, 3]
  # prints "@[1, 2, 3]" and not "123"


Tuples and object types
-----------------------
A variable of a tuple or object type is a heterogeneous storage
container.
A tuple or object defines various named *fields* of a type. A tuple also
defines an *order* of the fields. Tuples are meant for heterogeneous storage
types with no overhead and few abstraction possibilities. The constructor ``()``
can be used to construct tuples. The order of the fields in the constructor
must match the order of the tuple's definition. Different tuple-types are
*equivalent* if they specify the same fields of the same type in the same
order. The *names* of the fields also have to be identical.

The assignment operator for tuples copies each component.
The default assignment operator for objects copies each component. Overloading
of the assignment operator for objects is not possible, but this will change
in future versions of the compiler.

.. code-block:: nim

  type
    Person = tuple[name: string, age: int] # type representing a person:
                                           # a person consists of a name
                                           # and an age
  var
    person: Person
  person = (name: "Peter", age: 30)
  # the same, but less readable:
  person = ("Peter", 30)

The implementation aligns the fields for best access performance. The alignment
is compatible with the way the C compiler does it.

For consistency  with ``object`` declarations, tuples in a ``type`` section
can also be defined with indentation instead of ``[]``:

.. code-block:: nim
  type
    Person = tuple   # type representing a person
      name: string   # a person consists of a name
      age: natural   # and an age

Objects provide many features that tuples do not. Object provide inheritance
and information hiding. Objects have access to their type at runtime, so that
the ``of`` operator can be used to determine the object's type. The ``of`` operator
is similar to the ``instanceof`` operator in Java.

.. code-block:: nim
  type
    Person = object of RootObj
      name*: string   # the * means that `name` is accessible from other modules
      age: int        # no * means that the field is hidden

    Student = ref object of Person # a student is a person
      id: int                  # with an id field

  var
    student: Student
    person: Person
  assert(student of Student) # is true
  assert(student of Person) # also true

Object fields that should be visible from outside the defining module, have to
be marked by ``*``. In contrast to tuples, different object types are
never *equivalent*. Objects that have no ancestor are implicitly ``final``
and thus have no hidden type field. One can use the ``inheritable`` pragma to
introduce new object roots apart from ``system.RootObj``.


Object construction
-------------------

Objects can also be created with an `object construction expression`:idx: that
has the syntax ``T(fieldA: valueA, fieldB: valueB, ...)`` where ``T`` is
an ``object`` type or a ``ref object`` type:

.. code-block:: nim
  var student = Student(name: "Anton", age: 5, id: 3)

Note that, unlike tuples, objects require the field names along with their values.
For a ``ref object`` type ``system.new`` is invoked implicitly.


Object variants
---------------
Often an object hierarchy is overkill in certain situations where simple
variant types are needed.

An example:

.. code-block:: nim

  # This is an example how an abstract syntax tree could be modelled in Nim
  type
    NodeKind = enum  # the different node types
      nkInt,          # a leaf with an integer value
      nkFloat,        # a leaf with a float value
      nkString,       # a leaf with a string value
      nkAdd,          # an addition
      nkSub,          # a subtraction
      nkIf            # an if statement
    Node = ref NodeObj
    NodeObj = object
      case kind: NodeKind  # the ``kind`` field is the discriminator
      of nkInt: intVal: int
      of nkFloat: floatVal: float
      of nkString: strVal: string
      of nkAdd, nkSub:
        leftOp, rightOp: Node
      of nkIf:
        condition, thenPart, elsePart: Node

  # create a new case object:
  var n = Node(kind: nkIf, condition: nil)
  # accessing n.thenPart is valid because the ``nkIf`` branch is active:
  n.thenPart = Node(kind: nkFloat, floatVal: 2.0)

  # the following statement raises an `FieldError` exception, because
  # n.kind's value does not fit and the ``nkString`` branch is not active:
  n.strVal = ""

  # invalid: would change the active object branch:
  n.kind = nkInt

  var x = Node(kind: nkAdd, leftOp: Node(kind: nkInt, intVal: 4),
                            rightOp: Node(kind: nkInt, intVal: 2))
  # valid: does not change the active object branch:
  x.kind = nkSub

As can been seen from the example, an advantage to an object hierarchy is that
no casting between different object types is needed. Yet, access to invalid
object fields raises an exception.

The syntax of ``case`` in an object declaration follows closely the syntax of
the ``case`` statement: The branches in a ``case`` section may be indented too.

In the example the ``kind`` field is called the `discriminator`:idx:\: For
safety its address cannot be taken and assignments to it are restricted: The
new value must not lead to a change of the active object branch. For an object
branch switch ``system.reset`` has to be used.


Set type
--------

.. include:: ../sets_fragment.txt

Reference and pointer types
---------------------------
References (similar to pointers in other programming languages) are a
way to introduce many-to-one relationships. This means different references can
point to and modify the same location in memory (also called `aliasing`:idx:).

Nim distinguishes between `traced`:idx: and `untraced`:idx: references.
Untraced references are also called *pointers*. Traced references point to
objects of a garbage collected heap, untraced references point to
manually allocated objects or to objects somewhere else in memory. Thus
untraced references are *unsafe*. However for certain low-level operations
(accessing the hardware) untraced references are unavoidable.

Traced references are declared with the **ref** keyword, untraced references
are declared with the **ptr** keyword.

An empty subscript ``[]`` notation can be used to derefer a reference,
the ``addr`` procedure returns the address of an item. An address is always
an untraced reference.
Thus the usage of ``addr`` is an *unsafe* feature.

The ``.`` (access a tuple/object field operator)
and ``[]`` (array/string/sequence index operator) operators perform implicit
dereferencing operations for reference types:

.. code-block:: nim

  type
    Node = ref NodeObj
    NodeObj = object
      le, ri: Node
      data: int

  var
    n: Node
  new(n)
  n.data = 9
  # no need to write n[].data; in fact n[].data is highly discouraged!

Automatic dereferencing is also performed for the first argument of a routine
call. But currently this feature has to be only enabled
via ``{.experimental.}``:

.. code-block:: nim
  {.experimental.}

  proc depth(x: NodeObj): int = ...

  var
    n: Node
  new(n)
  echo n.depth
  # no need to write n[].depth either



In order to simplify structural type checking, recursive tuples are not valid:

.. code-block:: nim
  # invalid recursion
  type MyTuple = tuple[a: ref MyTuple]

Likewise ``T = ref T`` is an invalid type.

As a syntactical extension ``object`` types can be anonymous if
declared in a type section via the ``ref object`` or ``ptr object`` notations.
This feature is useful if an object should only gain reference semantics:

.. code-block:: nim

  type
    Node = ref object
      le, ri: Node
      data: int


To allocate a new traced object, the built-in procedure ``new`` has to be used.
To deal with untraced memory, the procedures ``alloc``, ``dealloc`` and
``realloc`` can be used. The documentation of the system module contains
further information.

If a reference points to *nothing*, it has the value ``nil``.

Special care has to be taken if an untraced object contains traced objects like
traced references, strings or sequences: in order to free everything properly,
the built-in procedure ``GCunref`` has to be called before freeing the untraced
memory manually:

.. code-block:: nim
  type
    Data = tuple[x, y: int, s: string]

  # allocate memory for Data on the heap:
  var d = cast[ptr Data](alloc0(sizeof(Data)))

  # create a new string on the garbage collected heap:
  d.s = "abc"

  # tell the GC that the string is not needed anymore:
  GCunref(d.s)

  # free the memory:
  dealloc(d)

Without the ``GCunref`` call the memory allocated for the ``d.s`` string would
never be freed. The example also demonstrates two important features for low
level programming: the ``sizeof`` proc returns the size of a type or value
in bytes. The ``cast`` operator can circumvent the type system: the compiler
is forced to treat the result of the ``alloc0`` call (which returns an untyped
pointer) as if it would have the type ``ptr Data``. Casting should only be
done if it is unavoidable: it breaks type safety and bugs can lead to
mysterious crashes.

**Note**: The example only works because the memory is initialized to zero
(``alloc0`` instead of ``alloc`` does this): ``d.s`` is thus initialized to
``nil`` which the string assignment can handle. One needs to know low level
details like this when mixing garbage collected data with unmanaged memory.

.. XXX finalizers for traced objects


Not nil annotation
------------------

All types for that ``nil`` is a valid value can be annotated to
exclude ``nil`` as a valid value with the ``not nil`` annotation:

.. code-block:: nim
  type
    PObject = ref TObj not nil
    TProc = (proc (x, y: int)) not nil

  proc p(x: PObject) =
    echo "not nil"

  # compiler catches this:
  p(nil)

  # and also this:
  var x: PObject
  p(x)

The compiler ensures that every code path initializes variables which contain
non nilable pointers. The details of this analysis are still to be specified
here.


Memory regions
--------------

The types ``ref`` and ``ptr`` can get an optional ``region`` annotation.
A region has to be an object type.

Regions are very useful to separate user space and kernel memory in the
development of OS kernels:

.. code-block:: nim
  type
    Kernel = object
    Userspace = object

  var a: Kernel ptr Stat
  var b: Userspace ptr Stat

  # the following does not compile as the pointer types are incompatible:
  a = b

As the example shows ``ptr`` can also be used as a binary
operator, ``region ptr T`` is a shortcut for ``ptr[region, T]``.

In order to make generic code easier to write ``ptr T`` is a subtype
of ``ptr[R, T]`` for any ``R``.

Furthermore the subtype relation of the region object types is lifted to
the pointer types: If ``A <: B`` then ``ptr[A, T] <: ptr[B, T]``. This can be
used to model subregions of memory. As a special typing rule ``ptr[R, T]`` is
not compatible to ``pointer`` to prevent the following from compiling:

.. code-block:: nim
  # from system
  proc dealloc(p: pointer)

  # wrap some scripting language
  type
    PythonsHeap = object
    PyObjectHeader = object
      rc: int
      typ: pointer
    PyObject = ptr[PythonsHeap, PyObjectHeader]

  proc createPyObject(): PyObject {.importc: "...".}
  proc destroyPyObject(x: PyObject) {.importc: "...".}

  var foo = createPyObject()
  # type error here, how convenient:
  dealloc(foo)


Future directions:

* Memory regions might become available for  ``string`` and ``seq`` too.
* Builtin regions like ``private``, ``global`` and ``local`` will
  prove very useful for the upcoming OpenCL target.
* Builtin "regions" can model ``lent`` and ``unique`` pointers.
* An assignment operator can be attached to a region so that proper write
  barriers can be generated. This would imply that the GC can be implemented
  completely in user-space.


Procedural type
---------------
A procedural type is internally a pointer to a procedure. ``nil`` is
an allowed value for variables of a procedural type. Nim uses procedural
types to achieve `functional`:idx: programming techniques.

Examples:

.. code-block:: nim

  proc printItem(x: int) = ...

  proc forEach(c: proc (x: int) {.cdecl.}) =
    ...

  forEach(printItem)  # this will NOT compile because calling conventions differ


.. code-block:: nim

  type
    OnMouseMove = proc (x, y: int) {.closure.}

  proc onMouseMove(mouseX, mouseY: int) =
    # has default calling convention
    echo "x: ", mouseX, " y: ", mouseY

  proc setOnMouseMove(mouseMoveEvent: OnMouseMove) = discard

  # ok, 'onMouseMove' has the default calling convention, which is compatible
  # to 'closure':
  setOnMouseMove(onMouseMove)


A subtle issue with procedural types is that the calling convention of the
procedure influences the type compatibility: procedural types are only
compatible if they have the same calling convention. As a special extension,
a procedure of the calling convention ``nimcall`` can be passed to a parameter
that expects a proc of the calling convention ``closure``.

Nim supports these `calling conventions`:idx:\:

`nimcall`:idx:
    is the default convention used for a Nim **proc**. It is the
    same as ``fastcall``, but only for C compilers that support ``fastcall``.

`closure`:idx:
    is the default calling convention for a **procedural type** that lacks
    any pragma annotations. It indicates that the procedure has a hidden
    implicit parameter (an *environment*). Proc vars that have the calling
    convention ``closure`` take up two machine words: One for the proc pointer
    and another one for the pointer to implicitly passed environment.

`stdcall`:idx:
    This the stdcall convention as specified by Microsoft. The generated C
    procedure is declared with the ``__stdcall`` keyword.

`cdecl`:idx:
    The cdecl convention means that a procedure shall use the same convention
    as the C compiler. Under windows the generated C procedure is declared with
    the ``__cdecl`` keyword.

`safecall`:idx:
    This is the safecall convention as specified by Microsoft. The generated C
    procedure is declared with the ``__safecall`` keyword. The word *safe*
    refers to the fact that all hardware registers shall be pushed to the
    hardware stack.

`inline`:idx:
    The inline convention means the the caller should not call the procedure,
    but inline its code directly. Note that Nim does not inline, but leaves
    this to the C compiler; it generates ``__inline`` procedures. This is
    only a hint for the compiler: it may completely ignore it and
    it may inline procedures that are not marked as ``inline``.

`fastcall`:idx:
    Fastcall means different things to different C compilers. One gets whatever
    the C ``__fastcall`` means.

`syscall`:idx:
    The syscall convention is the same as ``__syscall`` in C. It is used for
    interrupts.

`noconv`:idx:
    The generated C code will not have any explicit calling convention and thus
    use the C compiler's default calling convention. This is needed because
    Nim's default calling convention for procedures is ``fastcall`` to
    improve speed.

Most calling conventions exist only for the Windows 32-bit platform.

Assigning/passing a procedure to a procedural variable is only allowed if one
of the following conditions hold:
1) The procedure that is accessed resides in the current module.
2) The procedure is marked with the ``procvar`` pragma (see `procvar pragma <#pragmas-procvar-pragma>`_).
3) The procedure has a calling convention that differs from ``nimcall``.
4) The procedure is anonymous.

The rules' purpose is to prevent the case that extending a non-``procvar``
procedure with default parameters breaks client code.

The default calling convention is ``nimcall``, unless it is an inner proc (a
proc inside of a proc). For an inner proc an analysis is performed whether it
accesses its environment. If it does so, it has the calling convention
``closure``, otherwise it has the calling convention ``nimcall``.


Distinct type
-------------

A ``distinct`` type is new type derived from a `base type`:idx: that is
incompatible with its base type. In particular, it is an essential property
of a distinct type that it **does not** imply a subtype relation between it
and its base type. Explicit type conversions from a distinct type to its
base type and vice versa are allowed.


Modelling currencies
~~~~~~~~~~~~~~~~~~~~

A distinct type can be used to model different physical `units`:idx: with a
numerical base type, for example. The following example models currencies.

Different currencies should not be mixed in monetary calculations. Distinct
types are a perfect tool to model different currencies:

.. code-block:: nim
  type
    Dollar = distinct int
    Euro = distinct int

  var
    d: Dollar
    e: Euro

  echo d + 12
  # Error: cannot add a number with no unit and a ``Dollar``

Unfortunately, ``d + 12.Dollar`` is not allowed either,
because ``+`` is defined for ``int`` (among others), not for ``Dollar``. So
a ``+`` for dollars needs to be defined:

.. code-block::
  proc `+` (x, y: Dollar): Dollar =
    result = Dollar(int(x) + int(y))

It does not make sense to multiply a dollar with a dollar, but with a
number without unit; and the same holds for division:

.. code-block::
  proc `*` (x: Dollar, y: int): Dollar =
    result = Dollar(int(x) * y)

  proc `*` (x: int, y: Dollar): Dollar =
    result = Dollar(x * int(y))

  proc `div` ...

This quickly gets tedious. The implementations are trivial and the compiler
should not generate all this code only to optimize it away later - after all
``+`` for dollars should produce the same binary code as ``+`` for ints.
The pragma `borrow`:idx: has been designed to solve this problem; in principle
it generates the above trivial implementations:

.. code-block:: nim
  proc `*` (x: Dollar, y: int): Dollar {.borrow.}
  proc `*` (x: int, y: Dollar): Dollar {.borrow.}
  proc `div` (x: Dollar, y: int): Dollar {.borrow.}

The ``borrow`` pragma makes the compiler use the same implementation as
the proc that deals with the distinct type's base type, so no code is
generated.

But it seems all this boilerplate code needs to be repeated for the ``Euro``
currency. This can be solved with templates_.

.. code-block:: nim
  template additive(typ: typedesc): stmt =
    proc `+` *(x, y: typ): typ {.borrow.}
    proc `-` *(x, y: typ): typ {.borrow.}

    # unary operators:
    proc `+` *(x: typ): typ {.borrow.}
    proc `-` *(x: typ): typ {.borrow.}

  template multiplicative(typ, base: typedesc): stmt =
    proc `*` *(x: typ, y: base): typ {.borrow.}
    proc `*` *(x: base, y: typ): typ {.borrow.}
    proc `div` *(x: typ, y: base): typ {.borrow.}
    proc `mod` *(x: typ, y: base): typ {.borrow.}

  template comparable(typ: typedesc): stmt =
    proc `<` * (x, y: typ): bool {.borrow.}
    proc `<=` * (x, y: typ): bool {.borrow.}
    proc `==` * (x, y: typ): bool {.borrow.}

  template defineCurrency(typ, base: expr): stmt =
    type
      typ* = distinct base
    additive(typ)
    multiplicative(typ, base)
    comparable(typ)

  defineCurrency(Dollar, int)
  defineCurrency(Euro, int)


The borrow pragma can also be used to annotate the distinct type to allow
certain builtin operations to be lifted:

.. code-block:: nim
  type
    Foo = object
      a, b: int
      s: string

    Bar {.borrow: `.`.} = distinct Foo

  var bb: ref Bar
  new bb
  # field access now valid
  bb.a = 90
  bb.s = "abc"

Currently only the dot accessor can be borrowed in this way.


Avoiding SQL injection attacks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

An SQL statement that is passed from Nim to an SQL database might be
modelled as a string. However, using string templates and filling in the
values is vulnerable to the famous `SQL injection attack`:idx:\:

.. code-block:: nim
  import strutils

  proc query(db: DbHandle, statement: string) = ...

  var
    username: string

  db.query("SELECT FROM users WHERE name = '$1'" % username)
  # Horrible security hole, but the compiler does not mind!

This can be avoided by distinguishing strings that contain SQL from strings
that don't. Distinct types provide a means to introduce a new string type
``SQL`` that is incompatible with ``string``:

.. code-block:: nim
  type
    SQL = distinct string

  proc query(db: DbHandle, statement: SQL) = ...

  var
    username: string

  db.query("SELECT FROM users WHERE name = '$1'" % username)
  # Error at compile time: `query` expects an SQL string!


It is an essential property of abstract types that they **do not** imply a
subtype relation between the abstract type and its base type. Explicit type
conversions from ``string`` to ``SQL`` are allowed:

.. code-block:: nim
  import strutils, sequtils

  proc properQuote(s: string): SQL =
    # quotes a string properly for an SQL statement
    return SQL(s)

  proc `%` (frmt: SQL, values: openarray[string]): SQL =
    # quote each argument:
    let v = values.mapIt(SQL, properQuote(it))
    # we need a temporary type for the type conversion :-(
    type StrSeq = seq[string]
    # call strutils.`%`:
    result = SQL(string(frmt) % StrSeq(v))

  db.query("SELECT FROM users WHERE name = '$1'".SQL % [username])

Now we have compile-time checking against SQL injection attacks.  Since
``"".SQL`` is transformed to ``SQL("")`` no new syntax is needed for nice
looking ``SQL`` string literals. The hypothetical ``SQL`` type actually
exists in the library as the `TSqlQuery type <db_sqlite.html#TSqlQuery>`_ of
modules like `db_sqlite <db_sqlite.html>`_.


Void type
---------

The ``void`` type denotes the absence of any type. Parameters of
type ``void`` are treated as non-existent, ``void`` as a return type means that
the procedure does not return a value:

.. code-block:: nim
  proc nothing(x, y: void): void =
    echo "ha"

  nothing() # writes "ha" to stdout

The ``void`` type is particularly useful for generic code:

.. code-block:: nim
  proc callProc[T](p: proc (x: T), x: T) =
    when T is void:
      p()
    else:
      p(x)

  proc intProc(x: int) = discard
  proc emptyProc() = discard

  callProc[int](intProc, 12)
  callProc[void](emptyProc)

However, a ``void`` type cannot be inferred in generic code:

.. code-block:: nim
  callProc(emptyProc)
  # Error: type mismatch: got (proc ())
  # but expected one of:
  # callProc(p: proc (T), x: T)

The ``void`` type is only valid for parameters and return types; other symbols
cannot have the type ``void``.


Auto type
---------

The ``auto`` type can only be used for return types and parameters. For return
types it causes the compiler to infer the type from the routine body:

.. code-block:: nim
  proc returnsInt(): auto = 1984

For parameters it currently creates implicitly generic routines:

.. code-block:: nim
  proc foo(a, b: auto) = discard

Is the same as:

.. code-block:: nim
  proc foo[T1, T2](a: T1, b: T2) = discard

However later versions of the language might change this to mean "infer the
parameters' types from the body". Then the above ``foo`` would be rejected as
the parameters' types can not be infered from an empty ``discard`` statement.
