Fundamentals#

Built-in Data Structures#

Boolean#

MoonBit has a built-in boolean type, which has two values: true and false. The boolean type is used in conditional expressions and control structures.

let a = true
let b = false
let c = a && b
let d = a || b
let e = not(a)

Number#

MoonBit has integer types and floating-point types:

type      description                                         example
Int16     16-bit signed integer                               (42 : Int16)
Int       32-bit signed integer                               42
Int64     64-bit signed integer                               1000L
UInt16    16-bit unsigned integer                             (14 : UInt16)
UInt      32-bit unsigned integer                             14U
UInt64    64-bit unsigned integer                             14UL
Double    64-bit floating point, defined by IEEE754           3.14
Float     32-bit floating point                               (3.14 : Float)
BigInt    represents numeric values larger than other types   10000000000000000000000N

MoonBit also supports numeric literals, including decimal, binary, octal, and hexadecimal numbers.

To improve readability, you may place underscores in the middle of numeric literals such as 1_000_000. Note that underscores can be placed anywhere within a number, not just every three digits.

  • Decimal numbers can have underscores between the digits.

    By default, an integer literal is a signed 32-bit number. For unsigned numbers, a postfix U is needed; for 64-bit numbers, a postfix L is needed.

    let a = 1234
    let b : Int = 1_000_000 + a
    let unsigned_num       : UInt   = 4_294_967_295U
    let large_num          : Int64  = 9_223_372_036_854_775_807L
    let unsigned_large_num : UInt64 = 18_446_744_073_709_551_615UL
    
  • A binary number has a leading zero followed by a letter “B”, i.e. 0b/0B. Note that the digits after 0b/0B must be 0 or 1.

    let bin = 0b110010
    let another_bin = 0B110010
    
  • An octal number has a leading zero followed by a letter “O”, i.e. 0o/0O. Note that the digits after 0o/0O must be in the range from 0 through 7:

    let octal = 0o1234
    let another_octal = 0O1234
    
  • A hexadecimal number has a leading zero followed by a letter “X”, i.e. 0x/0X. Note that the digits after the 0x/0X must be in the range 0123456789ABCDEF.

    let hex = 0XA
    let another_hex = 0xA_B_C
    
  • A floating-point literal is a 64-bit floating-point number (Double). To define a Float, a type annotation is needed.

    let double = 3.14 // Double
    let float : Float = 3.14
    let float2 = (3.14 : Float)
    

    A 64-bit floating-point number can also be defined using hexadecimal format:

    let hex_double = 0x1.2P3 // (1.0 + 2 / 16) * 2^(+3) == 9
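
The literal forms above all denote ordinary numbers, so the same value can be written in several ways. The following is a small sketch that checks a few equivalences with assert_eq!; the specific values are chosen only for illustration:

```moonbit
test {
  // Underscores are purely visual separators.
  assert_eq!(1_000_000, 1000000)
  // The same value written in binary, octal, and hexadecimal.
  assert_eq!(0b110010, 50)
  assert_eq!(0o62, 50)
  assert_eq!(0x32, 50)
  // Hexadecimal floating-point: (1 + 2/16) * 2^3
  assert_eq!(0x1.2P3, 9.0)
}
```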
    

Overloaded literal#

When the expected type is known, MoonBit can automatically overload the literal, so there is no need to specify the numeric type via a letter postfix:

let int : Int = 42
let uint : UInt = 42
let int64 : Int64 = 42
let double : Double = 42
let float : Float = 42
let bigint : BigInt = 42

String#

String holds a sequence of UTF-16 code units. You can use double quotes to create a string, or use #| to write a multi-line string.

let a = "兔rabbit"
println(a[0])
println(a[1])
let b =
  #| Hello
  #| MoonBit\n
  #|
println(b)
Output#
'兔'
'r'
 Hello
 MoonBit\n
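
Because a String is a sequence of UTF-16 code units, a character outside the Basic Multilingual Plane occupies two units. The sketch below assumes that String's length() method (from the core library, not shown in this section for strings) reports the number of code units:

```moonbit
test {
  let a = "兔rabbit"
  // '兔' (U+5154) is in the Basic Multilingual Plane: one code unit.
  assert_eq!(a.length(), 7)
  // U+1F600 lies outside the BMP, so it is stored as a surrogate pair.
  let emoji = "\u{1F600}"
  assert_eq!(emoji.length(), 2)
}
```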

In a double-quoted string, a backslash followed by certain special characters forms an escape sequence:

escape sequences     description
\n, \r, \t, \b       New line, Carriage return, Horizontal tab, Backspace
\\                   Backslash
\x41                 Hexadecimal escape sequence
\o102                Octal escape sequence
\u5154, \u{1F600}    Unicode escape sequence
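
As a quick check of the escape sequences listed above, the following sketch compares each escaped form with the character it denotes. The use of length() on a String is an assumption about the core library rather than something shown in this section:

```moonbit
test {
  // \x41 is hexadecimal 41 (decimal 65), the letter A.
  assert_eq!("\x41", "A")
  // \o102 is octal 102 (decimal 66), the letter B.
  assert_eq!("\o102", "B")
  // Both Unicode forms name the same code point.
  assert_eq!("\u5154", "兔")
  assert_eq!("\u{5154}", "兔")
  // An escaped backslash is a single character.
  assert_eq!("\\".length(), 1)
}
```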

MoonBit supports string interpolation. It enables you to substitute variables within interpolated strings. This feature simplifies the process of constructing dynamic strings by directly embedding variable values into the text. Variables used for string interpolation must support the to_string method.

let x = 42
println("The answer is \{x}")

Multi-line strings do not support interpolation by default, but you can enable interpolation for a specific line by changing the leading #| to $|:

let lang = "MoonBit"
let str =
  #| Hello
  #| ---
  $| \{lang}\n
  #| ---
println(str)
Output#
 Hello
 ---
 MoonBit

 ---

Char#

Char represents a Unicode code point.

let a : Char = 'A'
let b = '\x41'
let c = '兔'
let zero = '\u{30}'
let zero_alt = '\u0030'

Byte(s)#

A byte literal in MoonBit is either a single ASCII character or a single escape enclosed in single quotes ', and preceded by the character b. Byte literals are of type Byte. For example:

fn main {
  let b1 : Byte = b'a'
  println(b1.to_int())
  let b2 = b'\xff'
  println(b2.to_int())
}
Output#
97
255

Bytes is a sequence of bytes. Like byte literals, bytes literals have the form b"...". For example:

test {
  let b1 : Bytes = b"abcd"
  let b2 = b"\x61\x62\x63\x64"
  assert_eq!(b1, b2)
}

Tuple#

A tuple is a collection of finite values constructed using round brackets () with the elements separated by commas ,. The order of elements matters; for example, (1,true) and (true,1) have different types. Here’s an example:

fn main {
  fn pack(
    a : Bool,
    b : Int,
    c : String,
    d : Double
  ) -> (Bool, Int, String, Double) {
    (a, b, c, d)
  }

  let quad = pack(false, 100, "text", 3.14)
  let (bool_val, int_val, str, float_val) = quad
  println("\{bool_val} \{int_val} \{str} \{float_val}")
}
Output#
false 100 text 3.14

Tuples can be accessed via pattern matching or index:

test {
  let t = (1, 2)
  let (x1, y1) = t
  let x2 = t.0
  let y2 = t.1
  assert_eq!(x1, x2)
  assert_eq!(y1, y2)
}

Ref#

A Ref[T] is a mutable reference containing a value val of type T.

It can be constructed using { val : x }, and can be accessed using ref.val. See struct for detailed explanation.

let a : Ref[Int] = { val : 100 }

test {
  a.val = 200
  assert_eq!(a.val, 200)
  a.val += 1
  assert_eq!(a.val, 201)
}

Option and Result#

Option and Result are the most common types to represent a possible error or failure in MoonBit.

  • Option[T] represents a possibly missing value of type T. It can be abbreviated as T?.

  • Result[T, E] represents either a value of type T or an error of type E.

See enum for detailed explanation.

test {
  let a : Option[Int] = None
  let b : Option[Int] = Some(42)
  let c : Result[Int, String] = Ok(42)
  let d : Result[Int, String] = Err("error")
  match a {
    Some(_) => assert_true!(false)
    None => assert_true!(true)
  }
  match d {
    Ok(_) => assert_true!(false)
    Err(_) => assert_true!(true)
  }
}
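
To show how these types are typically produced rather than just matched, here is a minimal sketch. The functions safe_div and parse_flag are hypothetical helpers invented for this example:

```moonbit
// Hypothetical helper: division that reports a missing result
// instead of failing on a zero divisor.
fn safe_div(a : Int, b : Int) -> Int? {
  if b == 0 { None } else { Some(a / b) }
}

// Hypothetical helper: parsing that carries an error message on failure.
fn parse_flag(s : String) -> Result[Bool, String] {
  match s {
    "on" => Ok(true)
    "off" => Ok(false)
    _ => Err("unknown flag: \{s}")
  }
}

test {
  inspect!(safe_div(6, 3), content="Some(2)")
  inspect!(safe_div(1, 0), content="None")
  match parse_flag("on") {
    Ok(v) => assert_true!(v)
    Err(_) => assert_true!(false)
  }
  match parse_flag("maybe") {
    Ok(_) => assert_true!(false)
    Err(msg) => assert_eq!(msg, "unknown flag: maybe")
  }
}
```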

Array#

An array is a finite sequence of values constructed using square brackets [], with elements separated by commas ,. For example:

let numbers = [1, 2, 3, 4]

You can use numbers[x] to refer to the xth element. The index starts from zero.

test {
  let numbers = [1, 2, 3, 4]
  let a = numbers[2]
  numbers[3] = 5
  let b = a + numbers[3]
  assert_eq!(b, 8)
}

There are Array[T] and FixedArray[T]:

  • Array[T] can grow in size, while

  • FixedArray[T] has a fixed size, thus it needs to be created with initial value.
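
The difference between the two can be sketched as follows. The push method on Array is taken from the core library and is not otherwise introduced in this section, so treat it as an assumption:

```moonbit
test {
  // Array[T] grows on demand.
  let xs : Array[Int] = [1, 2]
  xs.push(3)
  assert_eq!(xs.length(), 3)
  assert_eq!(xs, [1, 2, 3])
  // FixedArray[T] is allocated once with an initial value; cells can be
  // overwritten, but the length never changes.
  let cells : FixedArray[Int] = FixedArray::make(3, 0)
  cells[1] = 7
  assert_eq!(cells.length(), 3)
  assert_eq!(cells[1], 7)
}
```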

Warning

A common pitfall is creating FixedArray with the same initial value:

test {
  let two_dimension_array = FixedArray::make(10, FixedArray::make(10, 0))
  two_dimension_array[0][5] = 10
  assert_eq!(two_dimension_array[5][5], 10)
}

This is because all the cells refer to the same object (the FixedArray[Int] in this case). Use FixedArray::makei() instead, which creates a new object for each index.

test {
  let two_dimension_array = FixedArray::makei(
    10, 
    fn (_i) { FixedArray::make(10, 0) }
  )
  two_dimension_array[0][5] = 10
  assert_eq!(two_dimension_array[5][5], 0)
}

When the expected type is known, MoonBit can automatically overload the array literal; otherwise an Array[T] is created:

let fixed_array_1 : FixedArray[Int] = [1, 2, 3]
let fixed_array_2 = ([1, 2, 3] : FixedArray[Int])
let array_3 = [1, 2, 3] // Array[Int]

ArrayView#

Analogous to slice in other languages, the view is a reference to a specific segment of collections. You can use data[start:end] to create a view of array data, referencing elements from start to end (exclusive). Both start and end indices can be omitted.

test {
  let xs = [0, 1, 2, 3, 4, 5]
  let s1 : ArrayView[Int] = xs[2:]
  inspect!(s1, content="[2, 3, 4, 5]")
  inspect!(xs[:4], content="[0, 1, 2, 3]")
  inspect!(xs[2:5], content="[2, 3, 4]")
  inspect!(xs[:], content="[0, 1, 2, 3, 4, 5]")
}

Map#

MoonBit provides a hash map data structure that preserves insertion order called Map in its standard library. Maps can be created via a convenient literal syntax:

let map : Map[String, Int] = { "x": 1, "y": 2, "z": 3 }

Currently, keys in map literal syntax must be constant. Maps can also be destructured elegantly with pattern matching; see Map Pattern.
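
A short sketch of working with a Map after construction, including a check that traversal follows insertion order. The get method, index assignment, and Array push are assumed from the core library and are not described in this section:

```moonbit
test {
  let map : Map[String, Int] = { "x": 1, "y": 2 }
  // Add a key after construction.
  map["z"] = 3
  // Lookup via `get` returns an Option, so a missing key is not an error.
  inspect!(map.get("y"), content="Some(2)")
  inspect!(map.get("missing"), content="None")
  // Traversal follows insertion order.
  let keys : Array[String] = []
  for k, _ in map {
    keys.push(k)
  }
  assert_eq!(keys, ["x", "y", "z"])
}
```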

Json literal#

MoonBit supports convenient JSON handling by overloading literals. When the expected type of an expression is Json, number, string, array and map literals can be used directly to create JSON data:

let moon_pkg_json_example : Json = {
  "import": ["moonbitlang/core/builtin", "moonbitlang/core/coverage"],
  "test-import": ["moonbitlang/core/random"],
}

Json values can be pattern matched too, see Json Pattern.

Functions#

Functions take arguments and produce a result. In MoonBit, functions are first-class, which means that functions can be arguments or return values of other functions. MoonBit’s naming convention requires that function names not begin with uppercase letters (A-Z). Compare this with constructors in the enum section below.

Top-Level Functions#

Functions can be defined as top-level or local. We can use the fn keyword to define a top-level function that sums three integers and returns the result, as follows:

fn add3(x : Int, y : Int, z : Int) -> Int {
  x + y + z
}

Note that the arguments and return value of top-level functions require explicit type annotations.
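
Since functions are first-class, a top-level function can take a function as a parameter or return one. The names apply_twice and add_n below are hypothetical, chosen only to illustrate the idea:

```moonbit
// Takes a function as an argument and applies it two times.
fn apply_twice(f : (Int) -> Int, x : Int) -> Int {
  f(f(x))
}

// Returns a function that remembers `n`.
fn add_n(n : Int) -> (Int) -> Int {
  fn(x) { x + n }
}

test {
  assert_eq!(apply_twice(fn(x) { x + 3 }, 1), 7)
  let add5 = add_n(5)
  assert_eq!(apply_twice(add5, 0), 10)
}
```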

Local Functions#

Local functions can be named or anonymous. Type annotations can be omitted for local function definitions: they can be automatically inferred in most cases. For example:

fn local_1() -> Int {
  fn inc(x) { // named as `inc`
    x + 1
  }
  // anonymous, instantly applied to integer literal 6
  (fn(x) { x + inc(2) })(6)
}

test {
  assert_eq!(local_1(), 9)
}

There’s also a form called matrix function that makes use of pattern matching:

let extract : (Int?, Int) -> Int = fn {
  Some(x), _ => x
  None, default => default
}

Functions, whether named or anonymous, are lexical closures: any identifiers without a local binding must refer to bindings from a surrounding lexical scope. For example:

let global_y = 3

fn local_2(x : Int) -> (Int, Int) {
  fn inc() {
    x + 1
  }

  fn four() {
    global_y + 1
  }

  (inc(), four())
}

test {
  assert_eq!(local_2(3), (4, 4))
}

Function Applications#

A function can be applied to a list of arguments in parentheses:

add3(1, 2, 7)

This works whether add3 is a function defined with a name (as in the previous example), or a variable bound to a function value, as shown below:

test {
  let add3 = fn(x, y, z) { x + y + z }
  assert_eq!(add3(1, 2, 7), 10)
}

The expression add3(1, 2, 7) returns 10. Any expression that evaluates to a function value is applicable:

test {
  let f = fn(x) { x + 1 }
  let g = fn(x) { x + 2 }
  let w = (if true { f } else { g })(3)
  assert_eq!(w, 4)
}

Labelled arguments#

Top-level functions can declare labelled arguments with the syntax label~ : Type. The label also serves as the parameter name inside the function body:

fn labelled_1(arg1~ : Int, arg2~ : Int) -> Int {
  arg1 + arg2
}

Labelled arguments can be supplied via the syntax label=arg. label=label can be abbreviated as label~:

test {
  let arg1 = 1
  assert_eq!(labelled_1(arg2=2, arg1~), 3)
}

Labelled arguments can be supplied in any order. The evaluation order of arguments is the same as the order of parameters in the function declaration.

Optional arguments#

A labelled argument can be made optional by supplying a default expression with the syntax label~ : Type = default_expr. If this argument is not supplied at call site, the default expression will be used:

fn optional(opt~ : Int = 42) -> Int {
  opt
}

test {
  assert_eq!(optional(), 42)
  assert_eq!(optional(opt=0), 0)
}

The default expression is evaluated every time it is used, and any side effects in the default expression are triggered each time as well. For example:

fn incr(counter~ : Ref[Int] = { val: 0 }) -> Ref[Int] {
  counter.val = counter.val + 1
  counter
}

test {
  inspect!(incr(), content="{val: 1}")
  inspect!(incr(), content="{val: 1}")
  let counter : Ref[Int] = { val: 0 }
  inspect!(incr(counter~), content="{val: 1}")
  inspect!(incr(counter~), content="{val: 2}")
}

If you want to share the result of default expression between different function calls, you can lift the default expression to a toplevel let declaration:

let default_counter : Ref[Int] = { val: 0 }

fn incr_2(counter~ : Ref[Int] = default_counter) -> Int {
  counter.val = counter.val + 1
  counter.val
}

test {
  assert_eq!(incr_2(), 1)
  assert_eq!(incr_2(), 2)
}

A default expression can depend on the values of previous arguments. For example:

fn sub_array[X](
  xs : Array[X],
  offset~ : Int,
  len~ : Int = xs.length() - offset
) -> Array[X] {
  xs[offset:offset + len].iter().to_array()
}

test {
  assert_eq!(sub_array([1, 2, 3], offset=1), [2, 3])
  assert_eq!(sub_array([1, 2, 3], offset=1, len=1), [2])
}

Automatically insert Some when supplying optional arguments#

Quite often, optional arguments have type T? with None as the default value. In this case, passing the argument explicitly requires wrapping it in Some, which is ugly:

fn ugly_constructor(width~ : Int? = None, height~ : Int? = None) -> Image {
  ...
}

let img : Image = ugly_constructor(width=Some(1920), height=Some(1080))

Fortunately, MoonBit provides a special kind of optional argument to solve this problem. Optional arguments declared with label? : T have type T? and None as the default value. When supplying this kind of optional argument directly, MoonBit will automatically insert a Some:

fn nice_constructor(width? : Int, height? : Int) -> Image {
  ...
}

let img2 : Image = nice_constructor(width=1920, height=1080)

Sometimes, it is also useful to pass a value of type T? directly, for example when forwarding optional argument. MoonBit provides a syntax label?=value for this, with label? being an abbreviation of label?=label:

fn image(width? : Int, height? : Int) -> Image {
  ...
}

fn fixed_width_image(height? : Int) -> Image {
  image(width=1920, height?)
}
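
The examples above elide their bodies, so here is a self-contained sketch of the same pattern that can be run as a test. The functions dimension, describe, and describe_wide are hypothetical stand-ins for the Image constructors:

```moonbit
// Renders an optional dimension; `None` means the caller did not supply it.
fn dimension(value : Int?) -> String {
  match value {
    Some(v) => "\{v}px"
    None => "auto"
  }
}

// Inside the body, `width` and `height` have type Int?.
fn describe(width? : Int, height? : Int) -> String {
  let w = dimension(width)
  let h = dimension(height)
  "\{w} x \{h}"
}

// Forwards its own optional argument unchanged with `height?`.
fn describe_wide(height? : Int) -> String {
  describe(width=1920, height?)
}

test {
  assert_eq!(describe(), "auto x auto")
  assert_eq!(describe(width=800), "800px x auto")
  assert_eq!(describe_wide(height=1080), "1920px x 1080px")
  assert_eq!(describe_wide(), "1920px x auto")
}
```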

Autofill arguments#

MoonBit supports filling specific types of arguments automatically at each call site, such as the source location of a function call. To declare an autofill argument, simply declare an optional argument with _ as the default value. If the argument is not explicitly supplied, MoonBit will automatically fill it in at the call site.

Currently MoonBit supports two types of autofill arguments: SourceLoc, which is the source location of the whole function call, and ArgsLoc, which is an array containing the source location of each argument, if any:

fn f(_x : Int, loc~ : SourceLoc = _, args_loc~ : ArgsLoc = _) -> String {
  $|loc of whole function call: \{loc}
  $|loc of arguments: \{args_loc}
  // loc of whole function call: <filename>:7:3-7:10
  // loc of arguments: [Some(<filename>:7:5-7:6), Some(<filename>:7:8-7:9), None, None]
}

Autofill arguments are very useful for writing debugging and testing utilities.

Control Structures#

Conditional Expressions#

A conditional expression consists of a condition, a consequent, and an optional else clause or else if clause.

if x == y {
  expr1
} else if x == z {
  expr2
} else {
  expr3
}

The curly brackets around the consequent are required.

Note that a conditional expression always returns a value in MoonBit, and the return values of the consequent and the else clause must be of the same type. Here is an example:

let initial = if size < 1 { 1 } else { size }

The else clause can only be omitted if the return value has type Unit.
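
The following sketch puts both rules in one place: an if used as a value, and an if without else whose body has type Unit. The function name clamp_to_one is hypothetical:

```moonbit
fn clamp_to_one(size : Int) -> Int {
  // Both branches yield an Int, so the whole `if` is an Int expression.
  if size < 1 { 1 } else { size }
}

test {
  assert_eq!(clamp_to_one(-3), 1)
  assert_eq!(clamp_to_one(8), 8)
  // Without `else`, the body must evaluate to Unit.
  let mut hits = 0
  if clamp_to_one(0) == 1 {
    hits += 1
  }
  assert_eq!(hits, 1)
}
```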

Match Expression#

The match expression is similar to a conditional expression, but it uses pattern matching to decide which consequent to evaluate, extracting variables at the same time.

fn decide_sport(weather : String, humidity : Int) -> String {
  match weather {
    "sunny" => "tennis"
    "rainy" => if humidity > 80 { "swimming" } else { "football" }
    _ => "unknown"
  }
}

test {
  assert_eq!(decide_sport("sunny", 0), "tennis")
}

If a possible case is omitted, the compiler will issue a warning, and the program will terminate if that case is reached.

Guard Statement#

The guard statement is used to check a specified invariant. If the condition of the invariant is satisfied, the program continues executing the subsequent statements and returns. If the condition is not satisfied (i.e., false), the code in the else block is executed and its evaluation result is returned (the subsequent statements are skipped).

fn guarded_get(array : Array[Int], index : Int) -> Int? {
  guard index >= 0 && index < array.length() else { None }
  Some(array[index])
}

test {
  inspect!(guarded_get([1, 2, 3], -1), content="None")
}

Guarded Let#

The let statement can be used with pattern matching. However, a let statement can only handle one case; guard let solves this issue.

In the following example, getProcessedText assumes that the input path points to resources that are all plain text, and it uses the guard statement to ensure this invariant. Compared to using a match statement, the subsequent processing of text can have one less level of indentation.

enum Resource {
  Folder(Array[String])
  PlainText(String)
  JsonConfig(Json)
}

fn getProcessedText(
  resources : Map[String, Resource],
  path : String
) -> String!Error {
  guard let Some(PlainText(text)) = resources[path] else {
    None => fail!("\{path} not found")
    Some(Folder(_)) => fail!("\{path} is a folder")
    Some(JsonConfig(_)) => fail!("\{path} is a json config")
  }
  process(text)
}

When the else part is omitted, the program terminates if the condition specified in the guard statement is not true or cannot be matched.

guard condition  // <=> guard condition else { panic() }
guard let Some(x) = expr
// <=> guard let Some(x) = expr else { _ => panic() }

While loop#

In MoonBit, a while loop executes a block of code repeatedly as long as a condition is true. The loop is defined using the while keyword, followed by a condition and the loop body, which is a sequence of statements. The condition is evaluated before each execution of the body.

fn main {
  let mut i = 5
  while i > 0 {
    println(i)
    i = i - 1
  }
}
Output#
5
4
3
2
1

The loop body supports break and continue. Using break allows you to exit the current loop, while using continue skips the remaining part of the current iteration and proceeds to the next iteration.

fn main {
  let mut i = 5
  while i > 0 {
    i = i - 1
    if i == 4 {
      continue
    }
    if i == 1 {
      break
    }
    println(i)
  }
}
Output#
3
2

The while loop also supports an optional else clause. When the loop condition becomes false, the else clause will be executed, and then the loop will end.

fn main {
  let mut i = 2
  while i > 0 {
    println(i)
    i = i - 1
  } else {
    println(i)
  }
}
Output#
2
1
0

When there is an else clause, the while loop can also return a value. The return value is the evaluation result of the else clause. In this case, if you use break to exit the loop, you need to provide a return value after break, which should be of the same type as the return value of the else clause.

fn main {
  let mut i = 10
  let r = while i > 0 {
    i = i - 1
    if i % 2 == 0 {
      break 5
    }
  } else {
    7
  }
  println(r)
}
Output#
5

fn main {
  let mut i = 10
  let r = while i > 0 {
    i = i - 1
  } else {
    7
  }
  println(r)
}
Output#
7

For Loop#

MoonBit also supports C-style for loops. The keyword for is followed by variable initialization clauses, loop conditions, and update clauses separated by semicolons. They do not need to be enclosed in parentheses. For example, the code below creates a new variable binding i, which is scoped to the entire loop and is immutable. This makes it easier to write clear code and reason about it:

fn main {
  for i = 0; i < 5; i = i + 1 {
    println(i)
  }
}
Output#
0
1
2
3
4

The variable initialization clause can create multiple bindings:

for i = 0, j = 0; i + j < 100; i = i + 1, j = j + 1 {
  println(i)
}

It should be noted that in the update clause, when there are multiple binding variables, the semantics are to update them simultaneously. In other words, in the example above, the update clause does not execute i = i + 1, j = j + 1 sequentially, but rather increments i and j at the same time. Therefore, when reading the values of the binding variables in the update clause, you will always get the values updated in the previous iteration.
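
The simultaneous-update rule is easy to observe with a Fibonacci-style loop, where b = a + b must read the value of a from the previous iteration. This is a minimal sketch; the Array push method used to record values is assumed from the core library:

```moonbit
test {
  let seq : Array[Int] = []
  // `a = b` and `b = a + b` both read the values from the previous
  // iteration, so the pair (a, b) steps through the Fibonacci sequence.
  for i = 0, a = 0, b = 1; i < 6; i = i + 1, a = b, b = a + b {
    seq.push(a)
  }
  assert_eq!(seq, [0, 1, 1, 2, 3, 5])
}
```

If the updates ran sequentially instead, b would be computed from the already-updated a and the recorded values would be 0, 1, 2, 4, 8, 16.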

Variable initialization clauses, loop conditions, and update clauses are all optional. For example, the following two are infinite loops:

for i = 1; ; i = i + 1 {
  println(i)
}
for {
  println("loop forever")
}

The for loop also supports continue, break, and else clauses. Like the while loop, the for loop can also return a value using the break and else clauses.

The continue statement skips the remaining part of the current iteration of the for loop (including the update clause) and proceeds to the next iteration. The continue statement can also update the binding variables of the for loop, as long as it is followed by expressions that match the number of binding variables, separated by commas.

For example, the following program calculates the sum of even numbers from 1 to 6:

fn main {
  let sum = for i = 1, acc = 0; i <= 6; i = i + 1 {
    if i % 2 == 0 {
      println("even: \{i}")
      continue i + 1, acc + i
    }
  } else {
    acc
  }
  println(sum)
}
Output#
even: 2
even: 4
even: 6
12

for .. in loop#

MoonBit supports traversing elements of different data structures and sequences via the for .. in loop syntax:

for x in [1, 2, 3] {
  println(x)
}

A for .. in loop is translated to the use of Iter in MoonBit’s standard library. Any type with a method .iter() : Iter[T] can be traversed using for .. in. For more information about the Iter type, see Iterator below.

for .. in loop also supports iterating through a sequence of integers, such as:

test {
  let mut i = 0
  for j in 0..<10 {
    i += j
  }
  assert_eq!(i, 45)
  let mut k = 0
  for l in 0..=10 {
    k += l
  }
  assert_eq!(k, 55)
}

In addition to sequences of a single value, MoonBit also supports traversing sequences of two values, such as Map, via the Iter2 type in MoonBit’s standard library. Any type with method .iter2() : Iter2[A, B] can be traversed using for .. in with two loop variables:

for k, v in { "x": 1, "y": 2, "z": 3 } {
  println(k)
  println(v)
}

Another example of for .. in with two loop variables is traversing an array while keeping track of array index:

fn main {
  for index, elem in [4, 5, 6] {
    let i = index + 1
    println("The \{i}-th element of the array is \{elem}")
  }
}
Output#
The 1-th element of the array is 4
The 2-th element of the array is 5
The 3-th element of the array is 6

Control flow operations such as return, break and error handling are supported in the body of for .. in loop:

fn main {
  let map = { "x": 1, "y": 2, "z": 3, "w": 4 }
  for k, v in map {
    if k == "y" {
      continue
    }
    println("\{k}, \{v}")
    if k == "z" {
      break
    }
  }
}
Output#
x, 1
z, 3

If a loop variable is unused, it can be ignored with _.

Functional loop#

Functional loop is a powerful feature in MoonBit that enables you to write loops in a functional style.

A functional loop consumes arguments and returns a value. It is defined using the loop keyword, followed by its arguments and the loop body. The loop body is a sequence of clauses, each of which consists of a pattern and an expression. The clause whose pattern matches the input will be executed, and the loop will return the value of the expression. If no pattern matches, the loop will panic. Use the continue keyword with arguments to start the next iteration of the loop. Use the break keyword with arguments to return a value from the loop. The break keyword can be omitted if the value is the last expression in the loop body.

test {
  fn sum(xs : @immut/list.T[Int]) -> Int {
    loop xs, 0 {
      Nil, acc => break acc // <=> Nil, acc => acc
      Cons(x, rest), acc => continue rest, x + acc
    }
  }

  assert_eq!(sum(Cons(1, Cons(2, Cons(3, Nil)))), 6)
}
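
Functional loops are not limited to lists. As another sketch, Euclid's algorithm can be written with two integer loop arguments and a literal pattern for the terminating case; the function name gcd is chosen for this example:

```moonbit
fn gcd(a : Int, b : Int) -> Int {
  loop a, b {
    // Second argument reached zero: the first argument is the result.
    x, 0 => x
    // Otherwise restart the loop with the new pair of arguments.
    x, y => continue y, x % y
  }
}

test {
  assert_eq!(gcd(48, 18), 6)
  assert_eq!(gcd(7, 0), 7)
}
```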

Warning

Currently, in loop exprs { ... }, exprs must be a nonempty list, while for { ... } is accepted as an infinite loop.

Labelled Continue/Break#

When a loop is labelled, it can be referenced from a break or continue from within a nested loop. For example:

test "break label" {
  let mut count = 0
  let xs = [1, 2, 3]
  let ys = [4, 5, 6]
  let res = outer~: for i in xs {
    for j in ys {
      count = count + i
      break outer~ j
    }
  } else {
    -1
  }
  assert_eq!(res, 4)
  assert_eq!(count, 1)
}

test "continue label" {
  let mut count = 0
  let init = 10
  let res = outer~: loop init {
    0 => 42
    i => {
      for {
        count = count + 1
        continue outer~ i - 1
      }
    }
  }
  assert_eq!(res, 42)
  assert_eq!(count, 10)
}

Iterator#

An iterator is an object that traverses a sequence while providing access to its elements. Traditional OO languages, with iterators like Java’s Iterator<T>, use next() and hasNext() to step through the iteration process, whereas functional languages (JavaScript’s forEach, Lisp’s mapcar) provide a higher-order function that takes an operation and a sequence, then consumes the sequence by applying the operation to each element. The former is called an external iterator (visible to the user) and the latter an internal iterator (invisible to the user).

The built-in type Iter[T] is MoonBit’s internal iterator implementation. Almost all built-in sequential data structures have implemented Iter:

///|
fn filter_even(l : Array[Int]) -> Array[Int] {
  let l_iter : Iter[Int] = l.iter()
  l_iter.filter(fn { x => (x & 1) == 0 }).collect()
}

///|
fn fact(n : Int) -> Int {
  let start = 1
  let range : Iter[Int] = start.until(n)
  range.fold(Int::op_mul, init=start)
}

Commonly used methods include:

  • each: Iterates over each element in the iterator, applying some function to each element.

  • fold: Folds the elements of the iterator using the given function, starting with the given initial value.

  • collect: Collects the elements of the iterator into an array.

  • filter (lazy): Filters the elements of the iterator based on a predicate function.

  • map (lazy): Transforms the elements of the iterator using a mapping function.

  • concat (lazy): Combines two iterators into one by appending the elements of the second iterator to the first.
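
The methods listed above can be chained. In this sketch, filter and map only describe the pipeline, and the traversal happens when collect or fold is called. The closures carry explicit type annotations to keep the example unambiguous:

```moonbit
test {
  let xs = [1, 2, 3, 4, 5, 6]
  // Building the pipeline does not traverse `xs` yet.
  let squares_of_even = xs
    .iter()
    .filter(fn(x : Int) -> Bool { x % 2 == 0 })
    .map(fn(x : Int) -> Int { x * x })
  // `collect` runs the iteration and gathers the results into an array.
  assert_eq!(squares_of_even.collect(), [4, 16, 36])
  // `fold` also runs the iteration, starting from the `init` value.
  let total = xs.iter().fold(fn(acc : Int, x : Int) -> Int { acc + x }, init=0)
  assert_eq!(total, 21)
}
```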

Methods like filter and map are very common on sequence objects such as Array. What makes Iter special is that any method that constructs a new Iter is lazy (i.e. iteration doesn’t start on call because it’s wrapped inside a function), so no intermediate values are allocated. That’s what makes Iter superior for traversing a sequence: no extra cost. MoonBit encourages users to pass an Iter across functions instead of the sequence object itself.

Pre-defined sequence structures like Array and their iterators should be enough for most uses. But to take advantage of these methods with a custom sequence whose elements have type S, we need to implement Iter, namely, a function that returns an Iter[S]. Take Bytes as an example:

///|
fn iter(data : Bytes) -> Iter[Byte] {
  Iter::new(fn(visit : (Byte) -> IterResult) -> IterResult {
    for byte in data {
      guard let IterContinue = visit(byte) else { x => break x }

    } else {
      IterContinue
    }
  })
}

Almost all Iter implementations are identical to that of Bytes, the only main difference being the code block that actually does the iteration.

Implementation details#

The type Iter[T] is basically a type alias for ((T) -> IterResult) -> IterResult, a higher-order function that takes an operation. IterResult is an enum that tracks the state of the current iteration, which is one of two states:

  • IterEnd: marking the end of an iteration

  • IterContinue: marking the end of an iteration is yet to be reached, implying the iteration will still continue at this state.

To put it simply, Iter[T] takes a function (T) -> IterResult and uses it to transform Iter[T] itself into a new state of type IterResult. Whether that state is IterEnd or IterContinue depends on the function.

Iterators provide a unified way to iterate through data structures, and they can be constructed at basically no cost: as long as fn(yield) doesn’t execute, the iteration process doesn’t start.

Internally, Iter::run() is used to trigger the iteration. Chaining all sorts of Iter methods might be visually pleasing, but keep in mind the heavy work underneath the abstraction.

Thus, unlike an external iterator, once the iteration starts there’s no way to stop unless the end is reached. Methods such as count(), which counts the number of elements in an iterator, look like O(1) operations but actually have linear time complexity. Use iterators carefully, or performance issues might occur.

Custom Data Types#

There are two ways to create new data types: struct and enum.

Struct#

In MoonBit, structs are similar to tuples, but their fields are indexed by field names. A struct can be constructed using a struct literal, which is composed of a set of labeled values and delimited with curly brackets. The type of a struct literal can be automatically inferred if its fields exactly match the type definition. A field can be accessed using the dot syntax s.f. If a field is marked as mutable using the keyword mut, it can be assigned a new value.

struct User {
  id : Int
  name : String
  mut email : String
}
fn main {
  let u = User::{ id: 0, name: "John Doe", email: "john@doe.com" }
  u.email = "john@doe.name"
  //! u.id = 10
  println(u.id)
  println(u.name)
  println(u.email)
}
Output#
0
John Doe
john@doe.name

Constructing Struct with Shorthand#

If you already have variables like name and email, it’s redundant to repeat those names when constructing a struct. You can use the shorthand instead; it behaves exactly the same:

let name = "john"
let email = "john@doe.com"
let u = User::{ id: 0, name, email }

If there’s no other struct that has the same fields, it’s redundant to add the struct’s name when constructing it:

let u2 = { id : 0, name, email }

Struct Update Syntax#

It’s useful to create a new struct based on an existing one, but with some fields updated.

fn main {
  let user = { id: 0, name: "John Doe", email: "john@doe.com" }
  let updated_user = { ..user, email: "john@doe.name" }
  println(
    $|{ id: \{user.id}, name: \{user.name}, email: \{user.email} }
    $|{ id: \{updated_user.id}, name: \{updated_user.name}, email: \{updated_user.email} }
    ,
  )
}
Output#
{ id: 0, name: John Doe, email: john@doe.com }
{ id: 0, name: John Doe, email: john@doe.name }

Enum#

Enum types are similar to algebraic data types in functional languages. Users familiar with C/C++ may prefer calling it tagged union.

An enum can have a set of cases (constructors). Constructor names must start with a capital letter. You can use these names to construct the corresponding cases of an enum, or to check which branch an enum value belongs to in pattern matching:

/// An enum type that represents the ordering relation between two values,
/// with three cases "Smaller", "Greater" and "Equal"
enum Relation {
  Smaller
  Greater
  Equal
}
/// compare the ordering relation between two integers
fn compare_int(x : Int, y : Int) -> Relation {
  if x < y {
    // when creating an enum, if the target type is known, 
    // you can write the constructor name directly
    Smaller
  } else if x > y {
    // but when the target type is not known,
    // you can always use `TypeName::Constructor` to create an enum unambiguously
    Relation::Greater
  } else {
    Equal
  }
}

/// output a value of type `Relation`
fn print_relation(r : Relation) -> Unit {
  // use pattern matching to decide which case `r` belongs to
  match r {
    // during pattern matching, if the type is known, 
    // writing the name of constructor is sufficient
    Smaller => println("smaller!")
    // but you can use the `TypeName::Constructor` syntax 
    // for pattern matching as well
    Relation::Greater => println("greater!")
    Equal => println("equal!")
  }
}
fn main {
  print_relation(compare_int(0, 1))
  print_relation(compare_int(1, 1))
  print_relation(compare_int(2, 1))
}
Output#
smaller!
equal!
greater!

Enum cases can also carry payload data. Here’s an example of defining an integer list type using enum:

enum List {
  Nil
  // constructor `Cons` carries additional payload: the first element of the list,
  // and the remaining parts of the list
  Cons(Int, List)
}
// In addition to binding payload to variables,
// you can also continue matching payload data inside constructors.
// Here's a function that decides if a list contains only one element
fn is_singleton(l : List) -> Bool {
  match l {
    // This branch only matches values of shape `Cons(_, Nil)`, 
    // i.e. lists of length 1
    Cons(_, Nil) => true
    // Use `_` to match everything else
    _ => false
  }
}

fn print_list(l : List) -> Unit {
  // when pattern-matching an enum with payload,
  // in addition to deciding which case a value belongs to
  // you can extract the payload data inside that case
  match l {
    Nil => println("nil")
    // Here `x` and `xs` are defining new variables 
    // instead of referring to existing variables,
    // if `l` is a `Cons`, then the payload of `Cons` 
    // (the first element and the rest of the list)
    // will be bound to `x` and `xs`
    Cons(x, xs) => {
      println("\{x},")
      print_list(xs)
    }
  }
}
fn main {
  // when creating values using `Cons`, the payload of `Cons` must be provided
  let l : List = Cons(1, Cons(2, Nil))
  println(is_singleton(l))
  print_list(l)
}
Output#
false
1,
2,
nil

Constructor with labelled arguments#

Enum constructors can have labelled arguments:

enum E {
  // `x` and `y` are labelled arguments
  C(x~ : Int, y~ : Int)
}
// pattern matching constructor with labelled arguments
fn f(e : E) -> Unit {
  match e {
    // `label=pattern`
    C(x=0, y=0) => println("0!")
    // `x~` is an abbreviation for `x=x`
    // Unmatched labelled arguments can be omitted via `..`
    C(x~, ..) => println(x)
  }
}
fn main {
  f(C(x=0, y=0))
  let x = 0
  f(C(x~, y=1)) // <=> C(x=x, y=1)
}
Output#
0!
0

It is also possible to access labelled arguments of constructors like accessing struct fields in pattern matching:

enum Object {
  Point(x~ : Double, y~ : Double)
  Circle(x~ : Double, y~ : Double, radius~ : Double)
}

type! NotImplementedError derive(Show)

fn distance_with(self : Object, other : Object) -> Double!NotImplementedError {
  match (self, other) {
    // For variables defined via `Point(..) as p`,
    // the compiler knows it must be of constructor `Point`,
    // so you can access fields of `Point` directly via `p.x`, `p.y` etc.
    (Point(_) as p1, Point(_) as p2) => {
      let dx = p2.x - p1.x
      let dy = p2.y - p1.y
      (dx * dx + dy * dy).sqrt()
    }
    (Point(_), Circle(_)) | (Circle(_), Point(_)) | (Circle(_), Circle(_)) =>
      raise NotImplementedError
  }
}
fn main {
  let p1 : Object = Point(x=0, y=0)
  let p2 : Object = Point(x=3, y=4)
  let c1 : Object = Circle(x=0, y=0, radius=2)
  try {
    println(p1.distance_with!(p2))
    println(p1.distance_with!(c1))
  } catch {
    e => println(e)
  }
}
Output#
5
NotImplementedError

Constructor with mutable fields#

It is also possible to define mutable fields for a constructor. This is especially useful for defining imperative data structures:

// A set implemented using mutable binary search tree.
struct Set[X] {
  mut root : Tree[X]
}

fn Set::insert[X : Compare](self : Set[X], x : X) -> Unit {
  self.root = self.root.insert(x, parent=Nil)
}

// A mutable binary search tree with parent pointer
enum Tree[X] {
  Nil
  // only labelled arguments can be mutable
  Node(
    mut value~ : X,
    mut left~ : Tree[X],
    mut right~ : Tree[X],
    mut parent~ : Tree[X]
  )
}

// In-place insert a new element to a binary search tree.
// Return the new tree root
fn Tree::insert[X : Compare](
  self : Tree[X],
  x : X,
  parent~ : Tree[X]
) -> Tree[X] {
  match self {
    Nil => Node(value=x, left=Nil, right=Nil, parent~)
    Node(_) as node => {
      let order = x.compare(node.value)
      if order == 0 {
        // mutate the field of a constructor
        node.value = x
      } else if order < 0 {
        // cycle between `node` and `node.left` created here
        node.left = node.left.insert(x, parent=node)
      } else {
        node.right = node.right.insert(x, parent=node)
      }
      // The tree is non-empty, so the new root is just the original tree
      node
    }
  }
}
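
As a smaller illustration of the same mechanism, here is a minimal sketch with a single mutable labelled field. The names `Cell`, `Full`, `Empty` and `bump` are hypothetical and exist only for this example; the `as` binding is what gives access to the field for mutation.

```moonbit
// Hypothetical example type: a cell that is either empty or holds a counter
enum Cell {
  Empty
  // only labelled arguments can be mutable
  Full(mut value~ : Int)
}

// Increment the counter in place if the cell is `Full`
fn bump(c : Cell) -> Unit {
  match c {
    Empty => ()
    // `full` is known to be built with `Full`, so `full.value` is accessible
    Full(_) as full => full.value = full.value + 1
  }
}

test {
  let c : Cell = Full(value=1)
  bump(c)
  match c {
    Full(value~) => inspect!(value, content="2")
    Empty => ()
  }
}
```

Because the mutation happens in place, the caller observes the new value through the original `c` without `bump` returning anything.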

Newtype#

MoonBit supports a special kind of enum called newtype:

// `UserId` is a fresh new type different from `Int`, 
// and you can define new methods for `UserId`, etc.
// But at the same time, the internal representation of `UserId` 
// is exactly the same as `Int`
type UserId Int

type UserName String

Newtypes are similar to enums with only one constructor (with the same name as the newtype itself). So, you can use the constructor to create values of newtype, or use pattern matching to extract the underlying representation of a newtype:

fn main {
  let id : UserId = UserId(1)
  let name : UserName = UserName("John Doe")
  let UserId(uid) = id // uid : Int
  let UserName(uname) = name // uname: String
  println(uid)
  println(uname)
}
Output#
1
John Doe

Besides pattern matching, you can also use ._ to extract the internal representation of newtypes:

fn main {
  let id : UserId = UserId(1)
  let uid : Int = id._
  println(uid)
}
Output#
1

Type alias#

MoonBit supports type aliases via the syntax typealias Name = TargetType:

pub typealias Index = Int

// type aliases are private by default
typealias MapString[X] = Map[String, X]

Unlike all the other kinds of type declaration above, a type alias does not define a new type. It is merely a type macro that behaves exactly the same as its definition. For example, you cannot define new methods or implement traits for a type alias.
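
The following sketch illustrates that an alias and its target are interchangeable. The names `Meters` and `double_length` are hypothetical, chosen only for this example.

```moonbit
// `Meters` is only another name for `Int`, not a new type
typealias Meters = Int

fn double_length(len : Meters) -> Meters {
  len * 2
}

test {
  let x : Int = 21
  // An `Int` can be passed where `Meters` is expected, and the result
  // can be used as an `Int` again, with no conversion in either direction
  let y : Int = double_length(x)
  inspect!(y, content="42")
}
```

Compare this with a newtype such as `UserId`, where the constructor is needed to convert from the underlying type and `._` or pattern matching is needed to convert back.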

Tip

Type aliases can be used to refactor code incrementally.

For example, if you want to move a type T from @pkgA to @pkgB, you can leave a type alias typealias T = @pkgB.T in @pkgA, and incrementally port uses of @pkgA.T to @pkgB.T. The type alias can be removed after all uses of @pkgA.T have been migrated to @pkgB.T.

Local types#

MoonBit supports declaring structs/enums/newtypes at the top of a toplevel function. Such types are visible only within that function. Local types can use the generic parameters of the toplevel function but cannot introduce additional generic parameters of their own. Local types can derive methods using derive, but no additional methods can be defined manually. For example:

fn toplevel[T: Show](x: T) -> Unit {
  enum LocalEnum {
    A(T)
    B(Int)
  } derive(Show)
  struct LocalStruct {
    a: (String, T)
  } derive(Show)
  type LocalNewtype T derive(Show)
  ...
}

Currently, local types do not support being declared as error types.
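
A complete function using a local type might look like the sketch below. `show_wrapped` and `Wrapped` are hypothetical names, and the sketch assumes that the derived `Show` implementation makes `to_string` available on the local enum.

```moonbit
fn show_wrapped[T : Show](x : T) -> String {
  // `Wrapped` is visible only inside `show_wrapped`;
  // it reuses the function's type parameter `T`
  enum Wrapped {
    Value(T)
  } derive(Show)
  let w : Wrapped = Value(x)
  // uses the derived `Show` implementation
  w.to_string()
}

fn main {
  println(show_wrapped(42))
}
```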

Pattern Matching#

Pattern matching allows us to match on specific patterns and bind data from data structures.

Simple Patterns#

We can pattern match expressions against

  • literals, such as boolean values, numbers, chars, strings, etc

  • constants

  • structs

  • enums

  • arrays

  • maps

  • JSONs

and so on. We can define identifiers to bind the matched values so that they can be used later.

const ONE = 1

fn match_int(x : Int) -> Unit {
  match x {
    0 => println("zero")
    ONE => println("one")
    value => println(value)
  }
}

We can use _ as wildcards for the values we don’t care about, and use .. to ignore remaining fields of struct or enum, or array (see array pattern).

struct Point3D {
  x : Int
  y : Int
  z : Int
}

fn match_point3D(p : Point3D) -> Unit {
  match p {
    { x: 0, .. } => println("on yz-plane")
    _ => println("not on yz-plane")
  }
}

enum Point[T] {
  Point2D(Int, Int, name~: String, payload~ : T)
}

fn match_point[T](p : Point[T]) -> Unit {
  match p {
    //! Point2D(0, 0) => println("2D origin")
    Point2D(0, 0, ..) => println("2D origin")
    Point2D(_) => println("2D point")
    _ => panic()
  }
}

We can use as to give a name to some pattern, and we can use | to match several cases at once. A variable name can only be bound once in a single pattern, and the same set of variables should be bound on both sides of | patterns.

match expr {
  //! Add(e1, e2) | Lit(e1) => ...
  Lit(n) as a => ...
  Add(e1, e2) | Mul(e1, e2) => ...
  _ => ...
}
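
A runnable variant of the rules above, as a minimal sketch. The enum `Expr` and the function `count_ops` are hypothetical and defined only for this example.

```moonbit
enum Expr {
  Lit(Int)
  Add(Expr, Expr)
  Mul(Expr, Expr)
}

// Count the number of arithmetic operators in an expression
fn count_ops(e : Expr) -> Int {
  match e {
    Lit(_) => 0
    // both sides of `|` bind the same set of variables: `l` and `r`
    Add(l, r) | Mul(l, r) => 1 + count_ops(l) + count_ops(r)
  }
}

test {
  let e : Expr = Add(Lit(1), Mul(Lit(2), Lit(3)))
  inspect!(count_ops(e), content="2")
}
```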

Array Pattern#

For Array, FixedArray and ArrayView, MoonBit allows using array patterns.

Array patterns have the following forms:

  • [] : matching for an empty data structure

  • [pa, pb, pc] : matching for known number of elements, 3 in this example

  • [pa, ..] : matching for known number of elements, followed by unknown number of elements

  • [.., pa] : matching for known number of elements, preceded by unknown number of elements

test {
  let ary = [1, 2, 3, 4]
  let [a, b, ..] = ary
  inspect!("a = \{a}, b = \{b}", content="a = 1, b = 2")
  let [.., a, b] = ary
  inspect!("a = \{a}, b = \{b}", content="a = 3, b = 4")
}
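
Array patterns can also be used as the arms of a match. The sketch below combines the forms listed above; `describe` is a hypothetical helper, and the sketch assumes that `..` may appear between a leading and a trailing element pattern.

```moonbit
fn describe(xs : Array[Int]) -> String {
  match xs {
    // empty array
    [] => "empty"
    // exactly one element
    [x] => "one element: \{x}"
    // two or more elements: bind the first and the last, ignore the rest
    [first, .., last] => "from \{first} to \{last}"
  }
}

test {
  inspect!(describe([]), content="empty")
  inspect!(describe([7]), content="one element: 7")
  inspect!(describe([1, 2, 3]), content="from 1 to 3")
}
```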

Range Pattern#

For builtin integer types and Char, MoonBit allows matching whether the value falls in a specific range.

Range patterns have the form a..<b or a..=b, where ..< means the upper bound is exclusive, and ..= means inclusive upper bound. a and b can be one of:

  • literal

  • named constant declared with const

  • _, meaning the pattern has no restriction on this side

Here are some examples:

const Zero = 0

fn sign(x : Int) -> Int {
  match x {
    _..<Zero => -1
    Zero => 0
    1..<_ => 1
  }
}

fn classify_char(c : Char) -> String {
  match c {
    'a'..='z' => "lowercase"
    'A'..='Z' => "uppercase"
    '0'..='9' => "digit"
    _ => "other"
  }
}

Map Pattern#

MoonBit allows convenient matching on map-like data structures. Inside a map pattern, the key : value syntax matches if key exists in the map, and matches the value of key against the pattern value. The key? : value syntax matches whether or not key exists, and value is matched against map[key] (an optional).

match map {
  // matches if and only if "b" exists in `map`
  { "b": _ } => ...
  // matches if and only if "b" does not exist in `map` and "a" exists in `map`.
  // When matches, bind the value of "a" in `map` to `x`
  { "b"? : None, "a": x } => ...
  // compiler reports missing case: { "b"? : None, "a"? : None }
}
  • To match a data type T using map pattern, T must have a method op_get(Self, K) -> Option[V] for some type K and V (see method and trait).

  • Currently, the key part of map pattern must be a literal or constant

  • Map patterns are always open: unmatched keys are silently ignored

  • Map pattern will be compiled to efficient code: every key will be fetched at most once
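
A complete function using map patterns could look like the following sketch. `describe_port` and the "port" key are hypothetical, and a wildcard arm is used last so that the match is clearly exhaustive.

```moonbit
fn describe_port(config : Map[String, Int]) -> String {
  match config {
    // "port" exists and its value is the literal 80
    { "port": 80 } => "default HTTP port"
    // "port" exists with any other value, bound to `p`
    { "port": p } => "custom port \{p}"
    // "port" is absent
    _ => "no port configured"
  }
}

test {
  inspect!(describe_port({ "port": 80 }), content="default HTTP port")
  inspect!(describe_port({ "port": 8080 }), content="custom port 8080")
  inspect!(describe_port(Map::new()), content="no port configured")
}
```

Since map patterns are open, extra keys in `config` do not prevent any of these arms from matching.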

Json Pattern#

When the matched value has type Json, literal patterns can be used directly, together with constructors:

match json {
  { "version": "1.0.0", "import": [..] as imports } => ...
  { "version": Number(i), "import": Array(imports)} => ...
  _ => ...
}
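
As a runnable sketch of the same idea, the hypothetical function `version_of` below extracts a version field that may be stored either as a string or as a number. It assumes the `String` and `Number` constructors of the built-in `Json` type.

```moonbit
fn version_of(json : Json) -> String {
  match json {
    // the field is a JSON string: bind its content to `v`
    { "version": String(v) } => v
    // the field is a JSON number: convert it to text
    { "version": Number(n) } => n.to_string()
    _ => "unknown"
  }
}

test {
  let j : Json = { "version": "1.0.0", "import": [] }
  inspect!(version_of(j), content="1.0.0")
}
```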

Generics#

Generics are supported in top-level function and data type definitions. Type parameters can be introduced within square brackets. We can rewrite the aforementioned data type List to add a type parameter T to obtain a generic version of lists. We can then define generic functions over lists like map and reduce.

enum List[T] {
  Nil
  Cons(T, List[T])
}

fn map[S, T](self : List[S], f : (S) -> T) -> List[T] {
  match self {
    Nil => Nil
    Cons(x, xs) => Cons(f(x), map(xs, f))
  }
}

fn reduce[S, T](self : List[S], op : (T, S) -> T, init : T) -> T {
  match self {
    Nil => init
    Cons(x, xs) => reduce(xs, op, op(init, x))
  }
}
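
Type parameters work the same way for structs. The sketch below uses a hypothetical `Pair` type and `swap` function with two type parameters, to show that the parameters are instantiated independently at each use.

```moonbit
struct Pair[A, B] {
  first : A
  second : B
}

// Swapping the components also swaps the type arguments
fn swap[A, B](p : Pair[A, B]) -> Pair[B, A] {
  Pair::{ first: p.second, second: p.first }
}

test {
  let p : Pair[Int, String] = { first: 1, second: "one" }
  let q = swap(p) // q : Pair[String, Int]
  inspect!(q.first, content="one")
  inspect!(q.second, content="1")
}
```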

Special Syntax#

Pipe operator#

MoonBit provides a convenient pipe operator |>, which can be used to chain regular function calls:

5 |> ignore // <=> ignore(5)
[] |> push(5) // <=> push([], 5)
1
|> add(5) // <=> add(1, 5)
|> ignore // <=> ignore(add(1, 5))
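
A self-contained sketch of the same chaining, with the helper functions spelled out. `add` and `double` are hypothetical functions defined only for this example.

```moonbit
fn add(x : Int, y : Int) -> Int {
  x + y
}

fn double(x : Int) -> Int {
  x * 2
}

test {
  // the left-hand side becomes the first argument of each call:
  // double(add(1, 5))
  let r = 1 |> add(5) |> double
  inspect!(r, content="12")
}
```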

Cascade Operator#

The cascade operator .. is used to perform a series of mutable operations on the same value consecutively. The syntax is as follows:

x..f()

x..f()..g() is equivalent to {x.f(); x.g(); x}.

Consider the following scenario: for a StringBuilder type that has methods like write_string, write_char, write_object, etc., we often need to perform a series of operations on the same StringBuilder value:

let builder = StringBuilder::new()
builder.write_char('a')
builder.write_char('a')
builder.write_object(1001)
builder.write_string("abcdef")
let result = builder.to_string()

To avoid typing builder repeatedly, such methods are often designed to return self, so that calls can be chained with the . operator. MoonBit instead keeps mutable and immutable operations distinct: for any method that returns Unit, the cascade operator can chain consecutive calls without changing the method's return type.

let result = StringBuilder::new()
  ..write_char('a')
  ..write_char('a')
  ..write_object(1001)
  ..write_string("abcdef")
  .to_string()
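
The same operator works with any method that returns Unit. The sketch below uses `Array::push` and relies on the equivalence stated above, namely that the cascade expression evaluates to the receiver itself.

```moonbit
test {
  let xs : Array[Int] = []
  // equivalent to { xs.push(1); xs.push(2); xs }
  let same = xs..push(1)..push(2)
  inspect!(same, content="[1, 2]")
  // `xs` was mutated in place
  inspect!(xs.length(), content="2")
}
```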

TODO syntax#

The todo syntax (...) is a special construct used to mark sections of code that are not yet implemented or are placeholders for future functionality. For example:

fn todo_in_func() -> Int {
  ...
}