Open-source Erlang - White Paper

This is a brief introduction to Erlang for programmers. 


Erlang Overview

Erlang is a programming language which has many features more commonly associated with an operating system than with a programming language: concurrent processes, scheduling, memory management, distribution, networking, etc.

The initial open-source Erlang release contains the implementation of Erlang, as well as a large part of Ericsson's middleware for building distributed high-availability systems.

Erlang is characterized by the following features:

Concurrency - Erlang has extremely lightweight processes whose memory requirements can vary dynamically. Processes have no shared memory and communicate by asynchronous message passing. Erlang supports applications with very large numbers of concurrent processes. No requirements for concurrency are placed on the host operating system.

Distribution - Erlang is designed to be run in a distributed environment. An Erlang virtual machine is called an Erlang node. A distributed Erlang system is a network of Erlang nodes (typically one per processor). An Erlang node can create parallel processes running on other nodes, which perhaps use other operating systems. Processes residing on different nodes communicate in exactly the same was as processes residing on the same node.

Robustness - Erlang has various error detection primitives which can be used to structure fault-tolerant systems. For example, processes can monitor the status and activities of other processes, even if these processes are executing on other nodes. Processes in a distributed system can be configured to fail-over to other nodes in case of failures and automatically migrate back to recovered nodes.

Soft real-time - Erlang supports programming "soft" real-time systems, which require response times in the order of milliseconds. Long garbage collection delays in such systems are unacceptable, so Erlang uses incremental garbage collection techniques.

Hot code upgrade - Many systems cannot be stopped for software maintenance. Erlang allows program code to be changed in a running system. Old code can be phased out and replaced by new code. During the transition, both old code and new code can coexist. It is thus possible to install bug fixes and upgrades in a running system without disturbing its operation.

Incremental code loading - Users can control in detail how code is loaded. In embedded systems, all code is usually loaded at boot time. In development systems, code is loaded when it is needed, even when the system is running. If testing uncovers bugs, only the buggy code need be replaced.

External interfaces - Erlang processes communicate with the outside world using the same message passing mechanism as used between Erlang processes. This mechanism is used for communication with the host operating system and for interaction with programs written in other languages. If required for reasons of efficiency, a special version of this concept allows e.g. C programs to be directly linked into the Erlang runtime system.

Components

Open-source Erlang comes with several standalone components that can be used as building blocks when developing applications. These components understands Erlang's systems messages (load, unload, start, stop, restart, change code).

Inets - HTTP 1.0 server and FTP client.

Mnesia - Distributed real-time database for Erlang. Supports RAM-replication as well as disk storage, allows dynamic schema changes, allows arbitrarily complex data structures to be stored. Mnesia is very fast since it runs in the same address space as the applications that use it - this is possible since both Mnesia and the applications are written in Erlang. Mnesia is a nice example of the power of Erlang: in how many languages could you write a fully-featured industrial-strength distributed DBMS in less than 20,000 lines of code?

Orber - CORBA v2.0 Object Request Broker (ORB).

SNMP - Extensible SNMP v1/v2 agent and MIB compiler.

Tools and Libraries

Open-source Erlang comes with a library of useful tools:

Appmon - Graphical monitoring of process groups (locally and on remote nodes).

ASN.1 - Compile-time and runtime package which supports the ASN.1 Basic Notation and the encoding rules BER, DER and PER (aligned).

Compiler - Erlang compiler.

Debugger - Graphical Erlang debugger.

ERTS - Erlang runtime system, including the virtual machine, the garbage collector, and the port mapper daemon.

GS - Library for writing graphical user interfaces.

IC - Compiler from OMG's Interface Definition Language (IDL) to Erlang and C and Java.

Kernel - C code necessary to run the Erlang system: Erlang built-in functions (BIFs); code, boot and name servers; networking and distribution support; loaders, linkers and loggers; OS and file system interfaces.

Mnemosyne - Optional query language for Mnesia.

Mnesia Session - Foreign languages interface to Mnesia defined in IDL, providing Mnesia access via the IIOP and erl_interface protocols.

OS monitor (OS_MON) - Monitoring of CPU, disk and memory utilization, including SNMP v1/v2 MIBs. Interfaces to Solaris syslogd and Windows NT event log.

Parse tools - LALR-1 parser generator for Erlang (yecc), similar to yacc. Yecc takes a BNF grammar definition as input, and produces Erlang code for a parser as output. Yecc is used to generate the Erlang parser.

PMan - Tool for tracing and viewing the state of Erlang processes (locally or on remote nodes).

SASL - Progress/error/crash report handling, report browsing, release handling, overload regulation.

Stdlib - Libraries for: input/output; incore and disk-based table storage (ETS and DETS); graphs, dictionaries, lists, strings, sets, queues; regular expressions; math. Erlang interpreter, tokenizer, parser, lint and pretty-printer. Generic frameworks for fault-tolerant servers, event handlers, state machines, and process supervisors. Etc, etc.

Table visualizer - Tool for viewing ETS and Mnesia tables.

Toolbar - Simplifies access to the Erlang Tools.

Tools - Coverage analyser, profiler, text-based tracer, Emacs mode, Emacs TAGS file generator, make utility, call graph utility.
 

Erlang in 14 Examples

The following sections describe the basic features of Erlang through a number of small examples. This is hardly enough to learn the language, so check out the links at the end of this document for suggested further reading.

Sequential Erlang

Example 1 - Factorial

Consider the factorial function n! defined by:

n! = n * (n - 1)!,
n! = 1

if n > 0
if n = 0

The Erlang program for computing factorial closely resembles this definition:

    fac(N) when N > 0  -> N * fac(N-1);
    fac(N) when N == 0 -> 1.

Variables, such as N in this example, start with an uppercase letter (non-numerical constants start with a lowercase letter). In contrast to, for example C variables, Erlang variables can only be assigned once - when a variable has been bound to a value, we cannot bind it to a new value.

This program has two clauses, one for each of the cases (N>0 and N=0). When computing, say, fac(5) we call each of the clauses in order - the first to match the call is selected and executed (the first clause in this case, since 5>0).

We can simplify the second clause to get the program:

    fac(N) when N > 0 ->  N * fac(N-1);
    fac(0)            ->  1.

To understand the simplified second clause, we must understand what is meant by matching a function call with a clause. If we have a function call f(X) and a clause

     f(Y) when Test -> ...

then the call matches the clause if X matches Y and the Test is true. The arguments X and Y match if they are identical or can be made identical by binding variables in Y to corresponding values in X. The call fac(5) matches the first clause, since fac(5) is made identical to fac(N) by binding N to 5 and 5 > 0 is true (but fac(5) does not match the second clause, since it is not identical to fac(0)).

All Erlang functions are defined in modules. A module math containing the factorial function can be written:

    -module(math).
    -export([fac/1]).
 
    fac(N) when N > 0 ->  N * fac(N-1);
    fac(0)            ->  1.

The annotation -export([fac/1]) means the function fac with one argument is exported from the module. Only exported functions can be called from outside the module.

Once a module has been compiled and loaded into the system the query evaluator can be used for function evaluation:

 
mymachine> erl
Erlang (JAM) emulator version 4.7.3
 
Eshell V4.7.3  (abort with ^G)
1> c(math).
{ok,math}
2> math:fac(25).
15511210043330985984000000

As we might guess from the answer of this function evaluation, Erlang can handle arbitrarily large integers without overflow.

Example 2 - Last

Erlang uses the following syntax for linked lists (with three elements in this example): [X,Y,Z]. The notation [First|Rest] means that First is the first element of the list (X in our example) and Rest is the remainder of the list ([Y,Z] in our example). The empty list is denoted [].

This program returns the last element of a list:

 last([First]) -> First;
 last([First|Rest]) -> last(Rest).

The program removes elements from the front of the list until only one element remains (this is the last element of the original list).

Example 3 - Append

This program concatenates two lists:

  append([], List) -> List;
  append([First|Rest], List) -> [First | append(Rest,List)].

The first clause says that concatenating the empty list with any list List yields List.

The second clause says that concatenating [First|Rest] with a list List yeilds a list whose first element is First and whose remainder is the list resulting from concatenating Rest with List.

Note here that the notation [...|...] has different meanings in the function head and function body. In the head it denotes list decomposistion, while in the body it denotes list construction.

Example 4 - Sort

This program sorts a list using the Quicksort algorithm:

  sort([]) -> []; 
  sort([Pivot|T]) -> 
      sort([X||X <- T, X < Pivot]) ++ 
      [Pivot] ++
      sort([X||X <- T, X >= Pivot]).

Here ++ is the infix append operator.

The notation [Expr || Qualifier1, Qualifier2, ...] introduces a list comprehension. Here Expression is an arbitrary expression, and each Qualifier is either a generator or a filter.

For example, the list comprehension

[X || X <- [1,2,a,3,4,b,5,6], integer(X), X > 3].

should be read as:

The list of X such that X is taken from the list [1,2,a,3,4,b,5,6] and X is an integer and X is greater than 3.

Here X <- [1,2,a,3,4,b,5,6] is a generator, and integer(X) and X > 3 are filters. This list comprehension evaluates to [4,5,6].

Example 5 - Adder

Function objects are introduced with the syntax:

  fun(...) -> Expr end

As an example we can write:

  > Adder = fun(N,X) ->  X + N end.
  #Fun
  > Adder(10,5).
  15

As another an example we can write:

  > Adder = fun(N) -> fun(X) -> X + N end end.
  #Fun
  > G = Adder(10).
  #Fun
  > G(5).
  15

Example 5 - Binary Tree

This program looks up a numeric key in a dictionary represented as a binary tree. The value stored with the key is returned. The binary tree is represented as a tuple {Key,Val,Left,Right}, where Left and Right are the left and right subtrees. (Tuples store fixed numbers of arguments, which themselves may be tuples or any other data type).

  lookup(Key, {Key1,Val,Left,Right}) when Key == Key1 ->
      Val;              
  lookup(Key,{Key1,Val,Left,Right}) when Key < Key1 ->
      lookup(Key, Left);
  lookup(Key, {Key1,Val,Left,Right}) when Key > Key1 ->
      lookup(Key, Right).

We can simplify the first clause by replacing Key1 with Key in the head, since Key == Key1 for a call to match this clause (the test Key == Key1 is then redundant):

  lookup(Key, {Key,Val,Left,Right}) ->
      Val;

This works as follows: The first occurrence of Key is bound to an integer n in the call; then the second occurrence - which is now bound to n - is compared to the key in the current root node of the tree. If we have a match, the corresponding value is returned.

(Note also that the test Key > Key1 in the third clause is redundant, since clauses are matched in sequential order. However, it is good programming practice to include it for clarity.)

Concurrent Erlang

Example 6 - An area Server

The next program is an "area server" - you can ask the server what the area of a square or rectangle is, or, you can ask it to return the total of all areas that it has been requested to compute (the total is the local state kept by the server). Messages to the server are tuples {Pid,Request}, where Pid is the client's process identifier.

   -module(area_server).
   -export([start/1, loop/1]).
 
    start() ->
        spawn(area_server, loop, [0]).
 
    loop(Tot) ->
        receive
            {Client, {square, X}} ->
                Client ! X*X,
                loop(Tot + X*X);
            {Client, {rectangle,X,Y}} ->
                Client ! X*Y,
                loop(Tot + X*Y);
            {Client, areas} ->
                Client ! Tot,
                loop(Tot)
        end.

The Erlang primitive spawn(Module,Fun,Args) creates a parallel process which evaluates Module:Fun with arguments from the list Args. It returns a process identifier (Pid) which can be used to communicate with the new process.

The expression Pid ! Msg sends the message Msg to the process Pid. Sending a message is a non-blocking operation. Messages are received using the receive ... end construction, which selects what message to receive by matching the message with the receive clauses in sequential order. The receive operation is blocking - it suspends until a matching message is delivered to the process. This is how synchronization is expressed in Erlang.

Example 7 - An area client

Client code which uses the above server can be written (self() returns the client's Pid):

    Server ! {self(), {square, 10}},
    receive
        Area ->
            Area
    end

Here the area is simply returned from the receive statement.

Example 8 - Named Server

In the above examples, the process identifier of the server had to made known to the client. To provide a named service we can instead register the process under a name, which is local to the node:

    Pid = spawn(Module,Fun,Args),
    register(area_server, Pid)

This associates the name area_server with Pid. Now any process evaluating in the node can send a message to area_server:

    area_server ! SomeMessage

Distribution

Example 9 - Spawning on a remote node

We can start Erlang systems (nodes) on several machines and have them talk to each other. Here is a simple example.

    -module(ex9).
    -export([start/0, world/0]).
    
    start() -> 
        Pid = spawn('b@host2,ex9,world,[]),
        Pid ! {self(), 'hello world!'},
        receive
          Msg ->
            io:format('~w~n',[Msg])
        end.
 
    world() ->
        receive
          {Pid,'hello world!'} ->
            io:format(user,'hello world!~n',[]),          
            Pid ! 'hello, little thread'
        end.

Running the example on two machines yields the following:

host1>erl -sname a

host2>erl -sname b

a@host1>c(ex9).

b@host1>c(ex9).

a@host1>ex9:start().

b@host2>

a@host1>

b@host2>hello, world!

a@host1>hello, little thread

b@host2>

Error detection

Erlang is designed for programming "robust" systems, so there are a number of primitives for trapping errors. Error recovery is not automatic. The programmer must design a fault-tolerant architecture which can be implemented using the error detection mechanisms.

Example 10 - Catch

All Erlang programs run within Erlang processes. If a runtime error occurs in a program, it crashes the process it runs within but leaves other processes intact. So, for example, if you evaluate 1/0 within a process, the process crashes with the printout:

 ** exited: {badarith,{erl_eval,eval_op,['/',1,0]}} **

(In fact, the Erlang shell detects the process crash and prints the message; see Example 13.) A computation Expr can also be enclosed within catch/1, that is we run (catch Expr) instead of just Expr. Any error within Expr is then converted to a data structure describing the error and the process does not crash. Thus (catch 1/0) returns the tuple:

{'EXIT',{badarith,{erl_eval,eval_op,['/',1,0]}}}

Example 11 - Catch and throw

Non-local returns can be performed with throw(Expr), which causes Expr to be evaluated and returned by the enclosing catch. Say that f/1 is a function that may throw an exception:

     f(X) when X > 0 ->
        ...
        NormalReturnValue;
     f(X) when X =< 0 ->
        ...
        throw({exception1, ...}).

In some other part of the program we have a call to f/1, which is wrapped by a catch:

     case catch f(X) of
        {exception1, Why} ->
           handle_exception(Why,...);
        NormalReturnValue ->
           handle_normal_case(NormalReturnValue,...)
     end

Example 12 - Links and trapping exits

Processes can be linked together. If a process dies an error message is sent to all processes to which it is linked and which have set the flag trap_exit.

     process_flag(trap_exit, true),
     Pid = spawn_link(Mod, Fun, Args),
     receive
       {'EXIT', Pid, Why} ->
           handle_process_crash(Why,...);
       ...
     end

Here spawn_link(Mod,Fun,Args) creates a parallel process and links the parent with the child.

Example 13 - Process supervision

If the child process crashes then an error message is sent to all processes which the child process is linked to. In our example an error message is sent to the parent process. The parent process can receive the error message and take appropriate action (for example, restart the child process).
 

-module(ex13).
-export([start/0, loop/0]).

start() ->
    process_flag(trap_exit, true),
    parent(Child, 5),
    Child = spawn(ex13, child, []).

parent(Child, K) ->
    receive
        {'EXIT', Child, Why} when K > 0 ->
            io:format('child died with reason ~p~n', [Why]),
            NewChild = spawn_link(ex13, child, []),
            parent(NewChild, K-1);
        {'EXIT', Child, _} ->
            io:format('too many restarts, bye~n', [])
    end.

child() -> exit(died).


Program ex13 implements a simple supervisor, which restarts child processes that die. We can easily make Parent supervise Child remotely: just spawn the processes on two separate nodes. The rest of the program is unchanged.

Hot code replacement

Example 14 - Code replacement

Erlang is designed for "non-stop" systems. We need to be able to replace code and data without stopping the system. This example shows how we can change code in a server, without stopping the server.
 

-module(ex14).
-export([start/0,server/0,client/0]).

start( ) -> Pid = spawn(ex14, server, []), spawn(ex14, client [0,Pid]).

server( ) ->
    receive
        Msg ->
            io:format('received ~p~n', [Msg]),
            ex14:server( )
    end.

client(N, Echo) ->
    Echo ! N,
    client(N+1, Echo).


This example requires that you edit and recompile ex14 while ex14 is running. The recompilation also loads the new code. Once the new module code has been loaded, the next fully qualified call ex14:server() switches to the new version of server().

mymachine> erl
Erlang (JAM) emulator version 4.7.3.3

Eshell V4.7.3.3  (abort with ^G)
> c(ex14).
{ok,ex14}
> ex14:start().        start server() and client(0,Echo_PID)
<0.39.0>
received 0             while ex14 putters along ...
received 1              ... edit ex14 to use "io:format('NEW received ~p~n',[Msg]),"
received 2
received 3
> c(ex14).             compile ex14 again (this loads the new code)
received 4
{ok,ex14}
received 5
NEW received 6         server() has switched to the new code
NEW received 7
NEW received 8
NEW received 9
NEW received 10
>

Applications can also load new code without using the Erlang shell.

Further reading

More links for getting started.

History of this document

Version 1.0 (Joe Armstrong, Bjarne Däcker, Thomas Lindgren, Håkan Millroth): first public version
Version 2.0 (Updates by the Erlang product team at Ericsson)

Erlang Open-Source homepage: http://www.erlang.org