5. Types

5.1. What is a type?

A type consists of a set of values and designated operations on the values.

Some languages provide sophisticated types while others do not. Major reasons why we use types are as follows:

  • More informative for human readers.

  • Compilers can detect errors before execution if types are inconsistent.

  • Compilers can generate more efficient code.

Exercise 10

An integer is used for conditional branch (0 is false, otherwise true) in C, while boolean and int are different types in Java. What are merits and demerits of these two methods.

[Answer]

5.2. Type system

A type system is a set of rules that associates syntactic entities (usually expressions) with their types.

The process of verifying consistency of types in a program is called type checking. Type checking at compile time is very useful for debugging.

Dynamic typing

Syntactic entities (e.g. variables) have no type declarations whereas semantic entities (e.g. values) have types. The type of an expression is determined when its value is calculated. Any type of values can be assigned to any variable.

Static typing

Types of all expressions are determined before execution, based on type declarations.

Gradual typing

Some expressions are given types before execution while othes are left untyped and determined dynamically.

  • In Fortran, explicit and implicit variable declarations are possible. Using implicit declaration, a variable is typed according to the initial letter of its name: integer if beginning with I, … , N, and real if beginning with other letters.

  • In Algol-like languages, variables, function parameters and return values need type declarations.

  • In ML and Haskell, a type can be declared for any expression in a program.

  • Recently, some dynamic typing languages such as Python and Ruby introduce type annotations or type hints to provide gradual typing.

5.3. Various types

Primitive types

Hardware-defined or indecomposable types. (e.g. integer, floating point, character)

Structured types

Types composed of some other types. Many languages allow programmers to define new types.

5.3.1. Union type

A union type is a type whose value may be one of several different types.

  • In C, union allows programmers to access the same memory location with more than one types.

  • It is up to a programmer which type is used to read and write the content of the memory.

  • Some programmers intentionally use it for evil purposes.

  • In Pascal, record with variant parts provides more sophisticated way.

  • A tag field indicates the type currently stored in a variant part.

  • An error occurs when an operation mismatches with the type in the tag.

  • Still, some programmers may intentionally change the tag value to avoid type checking.

A record with variant parts in Pacal

{ `shape' is an enumerated type whose value is `circle' or `rectangle' }
type shape = (circle, rectangle);

{ `figure' is a record type to store data of rectangles and circles }
type figure = record
                x : real, y : real;
                case form : shape of      { variant parts whose tag is `form' }
                  circle:    (radius : real);
                  rectangle: (width: real; height: real);
                end;

var myfigure : figure;
begin
  myfigure.x := 2.0;
  myfigure.y := 3.0;
  myfigure.form := circle;
  myfigure.radius := 1.0;
  myfigure.width := 1.0;  {Error}
end
  • In object oriented languages, class inheritance is used for that puropose.

Class inheritance in Java

class Main {
  public static void main(String[] args) {
     Figure myfigure;
     myfigure = new Circle();
     myfigure.x = 2.0;
     myfigure.y = 3.0;
     ((Circle)myfigure).radius = 1.0;
  }
}

class Figure {
  double x, y;
}

class Circle extends Figure {
  double radius;
}

class Rectangle extends Figure {
  double width, height;
}
  • In TypeScript, the union operator combines any two types.

    • An operation is allowed if it is valid for all of original types.

Union class and narrowing in TypeScript

"use strict"

interface circle { kind: "circle", x:number, y:number, radius:number }
interface rectangle { kind: "rectangle", x:number, y:number, width: number, height: number }

type figure = circle | rectangle

function foo(f: figure) {
    console.log(f.x)       // OK
    console.log(f.radius)  // Error - Property 'radius' does not exist on type 'figure'.
}

function bar(f: figure) {
    if (f.kind == "circle") {   // Narrowing - Type of f is now circle
        console.log(f.radius)   // OK
    }
}

5.3.2. Algebraic data type

An algebraic data type is constructed from other types by two methods: a sum and a product.

  • A sum is a choice among types (same as a union type). It contains a constructor to indicate which component type the value belongs.

  • A product is a combination of types (same as a record type).

Single linked list

data IntList = Nil | Cons Int IntList

5.3.3. Reference type

A reference type (or a pointer type) is a type whose value is a reference to another variable. By applying dereference operation, we can get a value of the referred variable.

By using pointers, we can dynamically allocate variables that are not bound to any names. It may cause the following two problems:

Dangling Pointer

If a memory area is released during a pointer is referring to it, the pointer will refer to a garbage or a reused area.

Lost Heap-dynamic Variable

If values of pointers are changed, the memory area referred by the pointers may lose all the references, which means we can never access that area.

  • In case of C, pointers are virtually same as the hardware memory addresses. C has many problems including the above two.

  • In case of Pascal, a pointer can only refer to an area allocated by new. We must explicitly release the allocated area by dispose. Pascal has the above two problems.

  • In case of Java, objects are implicitly referred to by pointers. Automatic garbage collection prevents the above two problems.

  • In case of Rust, the concept of ownership enables the compiler to guarntee the safety of reference handling without garbage collection.

5.3.4. Function type

A function type is composed of types of parameters and return values. It will be very complex for higher-order functions (i.e. parameters or return values are functions).

Function type in Haskell

inc :: Int -> Int
inc x = x + 1

add :: Int -> Int -> Int
add x y = x + y

twice :: (Int -> Int) -> Int
twice f = f(f(1))

twice(inc)   -- returns 3

5.4. Type inference

In languages such as ML and Haskell, every expression in a program should have its type. It is too painful to write types for all expressions, the compiler makes an inference about proper typing if we give minimum declarations.

Type inference in F#

Fully decalared version:

let addone (x:int) : int =
  let result:int = x + 1
  result

We can omit type declaration. Still, a function call addone 1.5 causes an error at compile time. The compiler infers the type of x is int because arguments of + must have the same type.

let addone x =
  let result = x + 1
  result

5.5. Polymorphism

When a single operator or function can deal with values of more than one type, we call it polymorphism.

ad hoc polymorphism

Definitions for each type are independent. Also called overloading.

parametric polymorphism

Definitions include formal parameters that receive types of data currently processed.

+ operator in Java

In Java, + operator has two meanings: addition of numbers and concatenation of strings.

Template in C++

We can write a function max in C++ as follows:

template <class Type>
Type max(Type first, Type second) {
    return first > second ? first : second;
}

This template can be instantiated for int and char as follows:

int a, b, c;
char d, e, f;

c = max(a, b);
f = max(d, e);

Polymorphic function in Haskell

In Haskell, an array type is denoted by a bracket [ ] and its element type. For example, the type of [1, 2, 3] is [Int], and the type of ["abc", "def"] is [[Char]] (because a string is an array of character [Char]).

Although [1, 2, 3] and ["abc", "def"] have different types, the function length can be applied to both of them. For example, length [1, 2, 3] returns 3 and length ["abc", "def"] returns 2.

The type of length is [a]->Int where a is a type variable.

Exercise 11

In Haskell, a function call map f x applies the function f to each element of the array x, and then make an array collecting their results. For example, map abs [-5, 3] returns [5, 3] and map length ["abc", "de"] returns [3, 2]. Write the type of map, using type variables a and b.

[Answer]