Fundamentals#
Built-in Data Structures#
Unit#
Unit is a built-in type in MoonBit that represents the absence of a meaningful value. It has only one value, written as (). Unit is similar to void in languages like C/C++/Java, but unlike void, it is a real type and can be used anywhere a type is expected.
The Unit type is commonly used as the return type for functions that perform some action but do not produce a meaningful result:
fn print_hello() -> Unit {
println("Hello, world!")
}
Unlike some other languages, MoonBit treats Unit as a first-class type, allowing it to be used in generics, stored in data structures, and passed as function arguments.
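As a small illustration of Unit being a first-class type, the sketch below stores () in an array and passes that array to a function. The helper name count_units is hypothetical and used only for this example.

```moonbit
// Hypothetical helper: Unit can be an element type like any other type.
fn count_units(units : Array[Unit]) -> Int {
  units.length()
}

test "unit is a first-class type" {
  let done : Unit = ()
  // Unit values can be stored in data structures and passed as arguments.
  let units : Array[Unit] = [done, (), ()]
  assert_eq(count_units(units), 3)
}
```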
Boolean#
MoonBit has a built-in boolean type, which has two values: true and false. The boolean type is used in conditional expressions and control structures. Use ! to negate a boolean value; not(x) is equivalent.
let a = true
let b = false
let c = a && b
let d = a || b
let e = !a
let f = !(a && b)
Number#
MoonBit has integer types and floating-point types:
| type | description | example |
|---|---|---|
| Int16 | 16-bit signed integer | (1 : Int16) |
| Int | 32-bit signed integer | 42 |
| Int64 | 64-bit signed integer | 1000L |
| UInt16 | 16-bit unsigned integer | (1 : UInt16) |
| UInt | 32-bit unsigned integer | 14U |
| UInt64 | 64-bit unsigned integer | 14UL |
| Double | 64-bit floating point, defined by IEEE754 | 3.14 |
| Float | 32-bit floating point | (3.14 : Float) |
| BigInt | represents numeric values larger than other types | (42 : BigInt) |
MoonBit also supports numeric literals, including decimal, binary, octal, and hexadecimal numbers.
To improve readability, you may place underscores in the middle of numeric literals such as 1_000_000. Note that underscores can be placed anywhere within a number, not just every three digits.
Decimal numbers can have underscores between the digits.
By default, an integer literal is a signed 32-bit number. For unsigned numbers, a postfix U is needed; for 64-bit numbers, a postfix L is needed.
let a = 1234
let b : Int = 1_000_000 + a
let unsigned_num : UInt = 4_294_967_295U
let large_num : Int64 = 9_223_372_036_854_775_807L
let unsigned_large_num : UInt64 = 18_446_744_073_709_551_615UL
A binary number has a leading zero followed by a letter "B", i.e. 0b/0B. Note that the digits after 0b/0B must be 0 or 1.
let bin = 0b110010
let another_bin = 0B110010
An octal number has a leading zero followed by a letter "O", i.e. 0o/0O. Note that the digits after 0o/0O must be in the range from 0 through 7:
let octal = 0o1234
let another_octal = 0O1234
A hexadecimal number has a leading zero followed by a letter "X", i.e. 0x/0X. Note that the digits after the 0x/0X must be in the range 0123456789ABCDEF.
let hex = 0XA
let another_hex = 0xA_B_C
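To make the literal forms above concrete, the following sketch checks that literals written in different radixes, with or without underscores, denote the same Int values. The expected numbers were worked out by hand from the positional values of the digits.

```moonbit
test "numeric literal forms" {
  // Underscores are purely visual separators.
  assert_eq(1_000_000, 1000000)
  // 0b110010 = 32 + 16 + 2
  assert_eq(0b110010, 50)
  // 0o1234 = 1 * 512 + 2 * 64 + 3 * 8 + 4
  assert_eq(0o1234, 668)
  // 0xA_B_C = 10 * 256 + 11 * 16 + 12
  assert_eq(0xA_B_C, 2748)
  // Upper-case and lower-case prefixes are interchangeable.
  assert_eq(0XA, 0xa)
}
```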
A floating-point number literal is a 64-bit floating-point number by default. To define a Float, a type annotation is needed.
let double = 3.14 // Double
let float : Float = 3.14
let float2 = (3.14 : Float)
A 64-bit floating-point number can also be defined using hexadecimal format:
let hex_double = 0x1.2P3 // (1.0 + 2 / 16) * 2^(+3) == 9
When the expected type is known, MoonBit automatically overloads the literal, so there is no need to specify the number type with a letter postfix:
let int : Int = 42
let uint : UInt = 42
let int64 : Int64 = 42
let double : Double = 42
let float : Float = 42
let bigint : BigInt = 42
String#
String holds a sequence of UTF-16 code units. You can use double quotes to create a string, or use #| to write a multi-line string.
let a = "兔rabbit"
debug_inspect(a.code_unit_at(0).to_char(), content="Some('兔')")
debug_inspect(a.code_unit_at(1).to_char(), content="Some('r')")
let b =
#| Hello
#| MoonBit\n
#|
println(b)
Hello
MoonBit\n
In a double-quoted string, a backslash followed by certain special characters forms an escape sequence:
| escape sequences | description |
|---|---|
| \n, \r, \t, \b | New line, Carriage return, Horizontal tab, Backspace |
| \\ | Backslash |
| \u5154, \u{1F600} | Unicode escape sequence |
MoonBit supports string interpolation. It enables you to substitute variables within interpolated strings. This feature simplifies the process of constructing dynamic strings by directly embedding variable values into the text. Variables used for string interpolation must implement the Show trait.
let x = 42
println("The answer is \{x}")
Note
The interpolated expression cannot contain a newline, {} or ".
Multi-line strings can be defined using the leading #| or $|, where the former will keep the raw string and the latter will perform the escape and interpolation:
let lang = "MoonBit"
let raw =
#| Hello
#| ---
#| \{lang}
#| ---
let interp =
$| Hello
$| ---
$| \{lang}
$| ---
println(raw)
println(interp)
Hello
---
\{lang}
---
Hello
---
MoonBit
---
Avoid mixing $| and #| within the same multi-line string; pick one style for the whole block.
The VSCode extension includes an action that can turn pasted documents into a plain multi-line string and switch between plain text and MoonBit multi-line strings.
When the expected type is String, the array literal syntax is overloaded to
construct the String by specifying each character in the string.
test {
let c : Char = '中'
let s : String = [c, '文']
inspect(s, content="中文")
}
Char#
Char represents a Unicode code point.
let a : Char = 'A'
let b = '兔'
let zero = '\u{30}'
let zero = '\u0030'
Char literals can be overloaded to type Int or UInt16 when it is the expected type:
test {
let s : String = "hello"
let b : UInt16 = s.code_unit_at(0) // 'h'
assert_eq(b, 'h') // 'h' is overloaded to UInt16
let c : Int = '兔'
// Not ok : exceed range
// let d : UInt16 = '𠮷'
}
Byte(s)#
A byte literal in MoonBit is either a single ASCII character or a single escape sequence, written in the form b'...'. Byte literals are of type Byte. For example:
fn main {
let b1 : Byte = b'a'
println(b1.to_int())
let b2 = b'\xff'
println(b2.to_int())
}
97
255
A Bytes is an immutable sequence of bytes. Similar to byte literals, bytes literals have the form b"...". For example:
test {
let b1 : Bytes = b"abcd"
let b2 = b"\x61\x62\x63\x64"
assert_eq(b1, b2)
}
Byte literals and bytes literals also support escape sequences, but these differ from the ones in string literals. The following table lists the supported escape sequences for byte and bytes literals:
| escape sequences | description |
|---|---|
| \n, \r, \t, \b | New line, Carriage return, Horizontal tab, Backspace |
| \\ | Backslash |
| \x41 | Hexadecimal escape sequence |
| \o102 | Octal escape sequence |
Note
You can use @buffer.T to construct bytes by writing various types of data. For example:
test "buffer 1" {
let buf : @buffer.Buffer = @buffer.new()
buf.write_bytes(b"Hello")
buf.write_byte(b'!')
assert_eq(buf.contents(), b"Hello!")
}
Array literals can also be overloaded to construct a Bytes sequence by
specifying each byte in the sequence.
test {
let b : Byte = b'\xFF'
let bs : Bytes = [b, b'\x01']
inspect(
bs,
content=(
#|b"\xff\x01"
),
)
}
See also
API for Byte: https://mooncakes.io/docs/moonbitlang/core/byte
API for Bytes: https://mooncakes.io/docs/moonbitlang/core/bytes
API for @buffer.T: https://mooncakes.io/docs/moonbitlang/core/buffer
Choosing a Byte Container#
MoonBit has several byte-oriented container types. They are related, but they serve different jobs:
| Type | Ownership / mutability | Resizable | Typical use |
|---|---|---|---|
| Bytes | owned, immutable | no | final byte payloads, API boundaries, serialized data |
| BytesView | borrowed, immutable view | no | slicing or parsing existing bytes without copying |
| Array[Byte] | owned, mutable | yes | general-purpose mutable byte storage |
| FixedArray[Byte] | owned, mutable | no | fixed-size working buffers |
| ArrayView[Byte] | borrowed array view | no | passing slices of array-backed byte storage without ownership |
| MutArrayView[Byte] | borrowed, mutable view | no | mutating borrowed array-backed byte storage in place |
| @buffer.Buffer | owned, mutable builder | yes | incrementally constructing bytes, then calling |
Two common distinctions matter:
- Bytes versus BytesView: owned immutable data versus a borrowed immutable slice.
- Array[Byte] versus ArrayView[Byte] / MutArrayView[Byte]: owned mutable storage versus borrowed read-only or mutable views over it.
ReadOnlyArray[Byte] and MutArrayView[Byte] are the corresponding read-only
and mutable view types when you need to express those constraints explicitly.
Pattern matching and bitstring parsing also work on these byte containers; see
Array Pattern and Bitstring Pattern.
Tuple#
A tuple is a collection of finite values constructed using round brackets () with the elements separated by commas ,. The order of elements matters; for example, (1,true) and (true,1) have different types. Here's an example:
fn main {
fn pack(
a : Bool,
b : Int,
c : String,
d : Double
) -> (Bool, Int, String, Double) {
(a, b, c, d)
}
let quad = pack(false, 100, "text", 3.14)
let (bool_val, int_val, str, float_val) = quad
println("\{bool_val} \{int_val} \{str} \{float_val}")
}
false 100 text 3.14
Tuples can be accessed via pattern matching or index:
test {
let t = (1, 2)
let (x1, y1) = t
let x2 = t.0
let y2 = t.1
assert_eq(x1, x2)
assert_eq(y1, y2)
}
Ref#
A Ref[T] is a mutable reference containing a value val of type T.
It can be constructed using { val : x }, and can be accessed using ref.val. See struct for detailed explanation.
let a : Ref[Int] = { val: 100 }
test {
a.val = 200
assert_eq(a.val, 200)
a.val += 1
assert_eq(a.val, 201)
}
Option and Result#
Option and Result are the most common types to represent a possible error or failure in MoonBit.
- Option[T] represents a possibly missing value of type T. It can be abbreviated as T?.
- Result[T, E] represents either a value of type T or an error of type E.
See enum for detailed explanation.
test {
let a : Int? = None
let b : Option[Int] = Some(42)
let c : Result[Int, String] = Ok(42)
let d : Result[Int, String] = Err("error")
match a {
Some(_) => assert_true(false)
None => assert_true(true)
}
match d {
Ok(_) => assert_true(false)
Err(_) => assert_true(true)
}
}
See also
API for Option: https://mooncakes.io/docs/moonbitlang/core/option
API for Result: https://mooncakes.io/docs/moonbitlang/core/result
Array#
An array is a finite sequence of values constructed using square brackets [], with elements separated by commas ,. For example:
let numbers = [1, 2, 3, 4]
You can use numbers[x] to refer to the xth element. The index starts from zero.
test {
let numbers = [1, 2, 3, 4]
let a = numbers[2]
numbers[3] = 5
let b = a + numbers[3]
assert_eq(b, 8)
}
There are Array[T] and FixedArray[T]. Views are provided by ArrayView[T]
and MutArrayView[T] (see below).
Array[T] can grow in size, while FixedArray[T] has a fixed size, so it must be created with an initial value.
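The difference between the two array types can be sketched as follows. The example assumes the standard library's push and length methods on arrays; FixedArray::make is the constructor used elsewhere in this section.

```moonbit
test "growable versus fixed-size arrays" {
  // Array[T] can grow after creation.
  let growable : Array[Int] = [1, 2, 3]
  growable.push(4)
  assert_eq(growable.length(), 4)
  // FixedArray[T] is created with a length and an initial value;
  // its elements can be updated, but its length stays the same.
  let fixed : FixedArray[Int] = FixedArray::make(3, 0)
  fixed[0] = 7
  assert_eq(fixed.length(), 3)
  assert_eq(fixed[0], 7)
}
```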
Warning
A common pitfall is creating FixedArray with the same initial value:
test {
let two_dimension_array = FixedArray::make(10, FixedArray::make(10, 0))
two_dimension_array[0][5] = 10
assert_eq(two_dimension_array[5][5], 10)
}
This is because all the cells refer to the same object (the FixedArray[Int] in this case). Use FixedArray::makei() instead, which creates a separate object for each index.
test {
let two_dimension_array = FixedArray::makei(10, fn(_i) {
FixedArray::make(10, 0)
})
two_dimension_array[0][5] = 10
assert_eq(two_dimension_array[5][5], 0)
}
When the expected type is known, MoonBit can automatically overload the array literal; otherwise
an Array[T] is created:
let fixed_array_1 : FixedArray[Int] = [1, 2, 3]
let fixed_array_2 = ([1, 2, 3] : FixedArray[Int])
let array_3 : Array[Int] = [1, 2, 3] // Array[Int]
ArrayView#
Analogous to slice in other languages, the view is a reference to a
specific segment of collections. You can use data[start:end] to create a
view of array data, referencing elements from start to end (exclusive).
Both start and end indices can be omitted.
Note
ArrayView is an immutable data structure on its own, but the underlying
Array or FixedArray could be modified. For a mutable view, use
MutArrayView[T] via data.mut_view(...).
test {
let xs = [0, 1, 2, 3, 4, 5]
let s1 : ArrayView[Int] = xs[2:]
inspect(s1, content="[2, 3, 4, 5]")
inspect(xs[:4], content="[0, 1, 2, 3]")
inspect(xs[2:5], content="[2, 3, 4]")
inspect(xs[:], content="[0, 1, 2, 3, 4, 5]")
let mv : MutArrayView[Int] = xs.mut_view(start=1, end=3)
mv[0] = 99
inspect(xs[1], content="99")
}
Map#
MoonBit's standard library provides Map, a hash map data structure that preserves insertion order.
Maps can be created via a convenient literal syntax:
let map : Map[String, Int] = { "x": 1, "y": 2, "z": 3 }
Currently, keys in the map literal syntax must be constants. Maps can also be destructured elegantly with pattern matching; see Map Pattern.
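The insertion-order guarantee can be observed with a short sketch. It assumes the standard library's set and get methods on Map and the push method on Array; the two-variable for .. in loop is the one described in the control structures part of this document.

```moonbit
test "map keeps insertion order" {
  let map : Map[String, Int] = { "x": 1, "y": 2 }
  // `set` is assumed here to add a new key after the existing ones.
  map.set("a", 3)
  let keys : Array[String] = []
  for k, _ in map {
    keys.push(k)
  }
  // Keys come back in the order they were inserted, not sorted.
  assert_eq(keys, ["x", "y", "a"])
  // `get` returns an Option, so a missing key is not an error.
  assert_true(map.get("y") is Some(2))
  assert_true(map.get("missing") is None)
}
```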
Json#
MoonBit supports convenient JSON handling by overloading literals.
When the expected type of an expression is Json, number, string, array and map literals can be used directly to create JSON data:
let moon_pkg_json_example : Json = {
"import": ["moonbitlang/core/builtin", "moonbitlang/core/coverage"],
"test-import": ["moonbitlang/core/random"],
}
Json values can be pattern matched too, see Json Pattern.
Overloaded Literals#
Overloaded literals allow you to use the same syntax to represent different types of values.
For example, you can use 1 to represent UInt or Double depending on the expected type. If the expected type is not known, the literal will be interpreted as Int by default.
fn expect_double(x : Double) -> Unit {
}
test {
let x = 1 // type of x is Int
let y : Double = 1
expect_double(1)
}
Overloaded literals can be composed. If an array literal can be overloaded to Bytes, and a number literal can be overloaded to Byte, then you can overload [1,2,3] to Bytes as well. Here is a table of overloaded literals in MoonBit:
| Overloaded literal | Default type | Can be overloaded to |
|---|---|---|
| | | |
| | | — |
| | | |
| | | |
| | | |
There are also some similar overloading rules in pattern. For more details, see Pattern Matching.
Note
Literal overloading is not the same as value conversion. To convert a variable to a different type, you can use methods prefixed with to_, such as to_int(), to_double(), etc.
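The distinction in the note can be sketched as follows: a literal adapts to the expected type, while a variable that already has a type needs an explicit to_ conversion.

```moonbit
test "overloading versus conversion" {
  // The literal 1 is overloaded to Double; no conversion takes place.
  let literal : Double = 1
  // n is an Int variable, so it must be converted explicitly.
  let n = 3
  let converted : Double = n.to_double()
  assert_eq(literal + converted, 4.0)
}
```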
Escape Sequences in Overloaded Literals#
Escape sequences can be used in overloaded "..." literals and '...' literals. The interpretation of escape sequences depends on the types they are overloaded to:
Simple escape sequences
Including \n, \r, \t, \\, and \b. These escape sequences are supported in any "..." or '...' literals. They are interpreted as their respective Char or Byte in String or Bytes.
Byte escape sequences
The \x41 and \o102 escape sequences represent a Byte. These are supported in literals overloaded to Bytes and Byte.
Unicode escape sequences
The \u5154 and \u{1F600} escape sequences represent a Char. These are supported in literals of type String and Char.
Functions#
Functions take arguments and produce a result. In MoonBit, functions are first-class, which means that functions can be arguments or return values of other functions. MoonBit's naming convention requires that function names not begin with uppercase letters (A-Z). Compare this with constructors in the enum section below.
Top-Level Functions#
Functions can be defined as top-level or local. We can use the fn keyword to define a top-level function that sums three integers and returns the result, as follows:
fn add3(x : Int, y : Int, z : Int) -> Int {
x + y + z
}
Note that the arguments and return value of top-level functions require explicit type annotations.
Top-level functions and methods can also be introduced with declare.
A declared function has a signature but no body, and a later implementation must match that signature.
This is useful when you want to make an API shape available before placing its implementation.
declare fn declared_add(x : Int, y : Int) -> Int
fn declared_add(x : Int, y : Int) -> Int {
x + y
}
struct DeclaredCounter(Int)
declare fn DeclaredCounter::value(self : Self) -> Int
fn DeclaredCounter::value(self : Self) -> Int {
self.0
}
test "declared functions" {
@test.assert_eq(declared_add(1, 2), 3)
@test.assert_eq(DeclaredCounter(4).value(), 4)
}
If a declared function has an implementation, the declaration and the implementation must agree on the function name, visibility, type parameters, parameters, return type, and effects.
Local Functions#
Local functions can be named or anonymous. Type annotations can be omitted for local function definitions: they can be automatically inferred in most cases. For example:
fn local_1() -> Int {
fn inc(x) { // named as `inc`
x + 1
}
// anonymous, instantly applied to integer literal 6
(fn(x) { x + inc(2) })(6)
}
test {
assert_eq(local_1(), 9)
}
For simple anonymous functions, MoonBit provides a very concise syntax called arrow function:
[1, 2, 3].eachi((i, x) => println("\{i} => \{x}"))
// parentheses can be omitted when there is only one parameter
[1, 2, 3].each(x => println(x * x))
Although local functions support type inference for parameter and return types,
effect inference is only supported for the arrow function syntax.
If a local function defined with fn may raise an error
or perform asynchronous operations,
it must be explicitly annotated with raise or async.
Functions, whether named or anonymous, are lexical closures: any identifiers without a local binding must refer to bindings from a surrounding lexical scope. For example:
let global_y = 3
fn local_2(x : Int) -> (Int, Int) {
fn inc() {
x + 1
}
fn four() {
global_y + 1
}
(inc(), four())
}
test {
assert_eq(local_2(3), (4, 4))
}
A local function can only refer to itself and other previously defined local functions.
To define mutually recursive local functions, use the syntax letrec f = .. and g = .. instead:
fn f(x) {
// `f` can refer to itself here, but cannot use `g`
if x > 0 {
f(x - 1)
}
}
fn g(x) {
// `g` can refer to `f` and `g` itself
if x < 0 {
f(-x)
} else {
f(x)
}
}
// mutually recursive local functions
letrec even = x => x == 0 || odd(x - 1)
and odd = x => x != 0 && even(x - 1)
Function Applications#
A function can be applied to a list of arguments in parentheses:
add3(1, 2, 7)
This works whether add3 is a function defined with a name (as in the previous example), or a variable bound to a function value, as shown below:
test {
let add3 = fn(x, y, z) { x + y + z }
assert_eq(add3(1, 2, 7), 10)
}
The expression add3(1, 2, 7) returns 10. Any expression that evaluates to a function value is applicable:
test {
let f = fn(x) { x + 1 }
let g = fn(x) { x + 2 }
let w = (if true { f } else { g })(3)
assert_eq(w, 4)
}
Partial Applications#
Partial application is a technique of applying a function to some of its arguments, resulting in a new function that takes the remaining arguments. In MoonBit, partial application is achieved by using the _ operator in function application:
fn add(x : Int, y : Int) -> Int {
x + y
}
test {
let add10 : (Int) -> Int = x => add(10, x) // the explicit form of add(10, _)
println(add10(5)) // prints 15
println(add10(10)) // prints 20
}
The _ operator represents the missing argument in parentheses. The partial application allows multiple _ in the same parentheses.
For example, Array::fold(_, _, init=5) is equivalent to fn(x, y) { Array::fold(x, y, init=5) }.
The _ operator can also be used in enum creation, dot style function calls and in the pipelines.
Labelled arguments#
Top-level functions can declare labelled arguments with the syntax label~ : Type. The label also serves as the parameter name inside the function body:
fn labelled_1(arg1~ : Int, arg2~ : Int) -> Int {
arg1 + arg2
}
Labelled arguments can be supplied via the syntax label=arg. label=label can be abbreviated as label~:
test {
let arg1 = 1
assert_eq(labelled_1(arg2=2, arg1~), 3)
}
Labelled arguments can be supplied in any order. The evaluation order of arguments is the same as the order of parameters in the function declaration.
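Because labels identify the parameters, swapping the arguments at the call site does not change the result, even for a non-commutative operation. A minimal sketch, with the hypothetical function name subtract:

```moonbit
// Hypothetical example function with two labelled parameters.
fn subtract(minuend~ : Int, subtrahend~ : Int) -> Int {
  minuend - subtrahend
}

test "labelled arguments in any order" {
  assert_eq(subtract(minuend=10, subtrahend=1), 9)
  // Same call with the arguments written in the opposite order.
  assert_eq(subtract(subtrahend=1, minuend=10), 9)
}
```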
Optional arguments#
An argument can be made optional by supplying a default expression with the syntax label?: Type = default_expr, where the default_expr may be omitted. If this argument is not supplied at call site, the default expression will be used:
fn optional(opt? : Int = 42) -> Int {
opt
}
test {
assert_eq(optional(), 42)
assert_eq(optional(opt=0), 0)
}
The default expression is evaluated every time it is used, and any side effects in it are triggered each time as well. For example:
fn incr(counter? : Ref[Int] = { val: 0 }) -> Ref[Int] {
counter.val = counter.val + 1
counter
}
test {
inspect(incr(), content="{val: 1}")
inspect(incr(), content="{val: 1}")
let counter : Ref[Int] = { val: 0 }
inspect(incr(counter~), content="{val: 1}")
inspect(incr(counter~), content="{val: 2}")
}
Optional argument values are regular expressions at the call site. You can pass
expressions that may raise errors or call async functions when in a raise or
async context:
fn may_fail(x : Int) -> Int raise Failure {
if x < 0 {
fail("negative")
}
x
}
fn add_with_optional(base : Int, extra? : Int = 1) -> Int {
base + extra
}
test {
inspect(add_with_optional(1, extra=may_fail(2)), content="3")
}
For async functions, optional argument expressions can call async functions as usual:
///|
async fn fetch_default() -> Int {
...
}
///|
async fn build(x? : Int = fetch_default()) -> Int {
...
}
///|
async fn use_value() -> Int {
build(x=fetch_default())
}
If you want to share the result of a default expression between different function calls, you can lift the default expression to a top-level let declaration:
let default_counter : Ref[Int] = { val: 0 }
fn incr_2(counter? : Ref[Int] = default_counter) -> Int {
counter.val = counter.val + 1
counter.val
}
test {
assert_eq(incr_2(), 1)
assert_eq(incr_2(), 2)
}
The default expression can depend on previous arguments, such as:
fn create_rectangle(a : Int, b? : Int = a) -> (Int, Int) {
(a, b)
}
test {
inspect(create_rectangle(10), content="(10, 10)")
}
Optional arguments without default values#
It is quite common to have different semantics when a user does not provide a value.
Optional arguments without default values have type T? and None as the default value.
When supplying this kind of optional argument directly, MoonBit will automatically wrap the value with Some:
fn new_image(width? : Int, height? : Int) -> Image {
if width is Some(w) {
...
}
...
}
let img2 : Image = new_image(width=1920, height=1080)
Sometimes, it is also useful to pass a value of type T? directly,
for example when forwarding optional argument.
MoonBit provides a syntax label?=value for this, with label? being an abbreviation of label?=label:
fn image(width? : Int, height? : Int) -> Image {
...
}
fn fixed_width_image(height? : Int) -> Image {
image(width=1920, height?)
}
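Since the snippets above elide their bodies, here is a self-contained sketch of the same two mechanisms: automatic wrapping with Some at the call site, and forwarding with label?. The names show_dim, describe and fixed_width are hypothetical, and a String stands in for the Image type.

```moonbit
// Hypothetical helper: renders an optional dimension.
fn show_dim(dim : Int?) -> String {
  match dim {
    Some(n) => "\{n}"
    None => "auto"
  }
}

// Inside the body, width and height have type Int?.
fn describe(width? : Int, height? : Int) -> String {
  let w = show_dim(width)
  let h = show_dim(height)
  "\{w} x \{h}"
}

// height? forwards the Int? value without unwrapping it.
fn fixed_width(height? : Int) -> String {
  describe(width=1920, height?)
}

test "optional arguments without defaults" {
  assert_eq(describe(), "auto x auto")
  assert_eq(describe(width=800), "800 x auto")
  assert_eq(fixed_width(height=600), "1920 x 600")
  assert_eq(fixed_width(), "1920 x auto")
}
```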
Autofill arguments#
MoonBit supports filling specific types of arguments automatically at each call site, such as the source location of a function call.
To declare an autofill argument, simply declare a labelled argument, and add a function attribute #callsite(autofill(param_a, param_b)).
Now if the argument is not explicitly supplied, MoonBit will automatically fill it at the call site.
Currently MoonBit supports two types of autofill arguments, SourceLoc, which is the source location of the whole function call,
and ArgsLoc, which is an array containing the source location of each argument, if any:
#callsite(autofill(loc, args_loc))
fn f(_x : Int, loc~ : SourceLoc, args_loc~ : ArgsLoc) -> String {
(
$|loc of whole function call: \{loc}
$|loc of arguments: \{args_loc}
)
// loc of whole function call: <filename>:7:3-7:10
// loc of arguments: [Some(<filename>:7:5-7:6), Some(<filename>:7:8-7:9), None, None]
}
Autofill arguments are very useful for writing debugging and testing utilities.
Function alias#
MoonBit allows calling functions with alternative names via function aliases. A function alias can be declared as follows:
#alias(g)
#alias(h, visibility="pub")
fn k() -> Bool {
true
}
You can also give a function alias a different visibility from the original function by using the visibility field.
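As a usage sketch, an alias is called exactly like the function it names. The names ready and is_ready below are hypothetical and chosen only for illustration.

```moonbit
// Hypothetical function with one alias.
#alias(is_ready)
fn ready() -> Bool {
  true
}

test "calling a function through its alias" {
  // Both names refer to the same function.
  assert_eq(is_ready(), ready())
  assert_true(is_ready())
}
```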
Control Structures#
Conditional Expressions#
A conditional expression consists of a condition, a consequent, and an optional else clause or else if clause.
if x == y {
expr1
} else if x == z {
expr2
} else {
expr3
}
The curly brackets around the consequent are required.
Note that a conditional expression always returns a value in MoonBit, and the return values of the consequent and the else clause must be of the same type. Here is an example:
let initial = if size < 1 { 1 } else { size }
The else clause can only be omitted if the return value has type Unit.
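The two situations can be put side by side in a short sketch: an if used as a value needs both branches, while an if whose body has type Unit may drop the else clause. The example assumes that Array's push method returns Unit.

```moonbit
test "if with and without else" {
  let size = 0
  // Used as a value: both branches are required and must have the same type.
  let initial = if size < 1 { 1 } else { size }
  assert_eq(initial, 1)
  // Used for its effect: the body has type Unit, so else can be omitted.
  let log : Array[String] = []
  if size == 0 {
    log.push("empty")
  }
  assert_eq(log, ["empty"])
}
```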
Match Expression#
The match expression is similar to the conditional expression, but it uses pattern matching to decide which consequent to evaluate and extracts variables at the same time.
fn decide_sport(weather : String, humidity : Int) -> String {
match weather {
"sunny" => "tennis"
"rainy" => if humidity > 80 { "swimming" } else { "football" }
_ => "unknown"
}
}
test {
assert_eq(decide_sport("sunny", 0), "tennis")
}
If a possible case is omitted, the compiler will issue a warning, and the program will terminate if that case is reached.
Guard Statement#
The guard statement is used to check a specified invariant.
If the condition of the invariant is satisfied, the program continues executing
the subsequent statements and returns. If the condition is not satisfied (i.e., false),
the code in the else block is executed and its evaluation result is returned (the subsequent statements are skipped).
fn guarded_get(array : Array[Int], index : Int) -> Int? {
guard index >= 0 && index < array.length() else { None }
Some(array[index])
}
test {
debug_inspect(guarded_get([1, 2, 3], -1), content="None")
}
Guard statement and is expression#
The let statement can be used with pattern matching, but it can only handle one case. Combining an is expression with a guard statement solves this issue.
In the following example, getProcessedText assumes that the input path points to resources that are all plain text,
and it uses the guard statement to ensure this invariant while extracting the plain text resource.
Compared to using a match statement, the subsequent processing of text can have one less level of indentation.
enum Resource {
Folder(Array[String])
PlainText(String)
JsonConfig(Json)
}
fn getProcessedText(
resources : Map[String, Resource],
path : String,
) -> String raise Error {
guard resources.get(path) is Some(resource) else { fail("\{path} not found") }
guard resource is PlainText(text) else { fail("\{path} is not plain text") }
process(text)
}
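The example above depends on a process function that is not shown. A smaller, self-contained sketch of the same guard ... is ... else pattern follows; the helper name lookup_or_default and the fallback value -1 are assumptions made for illustration.

```moonbit
// Hypothetical helper: returns -1 when the key is missing.
fn lookup_or_default(table : Map[String, Int], key : String) -> Int {
  // When the pattern matches, `value` is bound for the rest of the function.
  guard table.get(key) is Some(value) else { -1 }
  value
}

test "guard with an is expression" {
  let table : Map[String, Int] = { "x": 1, "y": 2 }
  assert_eq(lookup_or_default(table, "y"), 2)
  assert_eq(lookup_or_default(table, "missing"), -1)
}
```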
When the else part is omitted, the program terminates if the condition specified
in the guard statement is not true or cannot be matched.
guard condition // <=> guard condition else { panic() }
guard expr is Some(x)
// <=> guard expr is Some(x) else { _ => panic() }
While loop#
In MoonBit, a while loop executes a block of code repeatedly as long as a condition is true. The condition is evaluated before each execution of the block. The while loop is defined using the while keyword, followed by a condition and the loop body, which is a sequence of statements.
fn main {
let mut i = 5
while i > 0 {
println(i)
i = i - 1
}
}
5
4
3
2
1
The loop body supports break and continue. Using break allows you to exit the current loop, while using continue skips the remaining part of the current iteration and proceeds to the next iteration.
fn main {
let mut i = 5
while i > 0 {
i = i - 1
if i == 4 {
continue
}
if i == 1 {
break
}
println(i)
}
}
3
2
The while loop also supports an optional nobreak clause. When the loop condition becomes false, the nobreak clause will be executed, and then the loop will end.
fn main {
let mut i = 2
while i > 0 {
println(i)
i = i - 1
} nobreak {
println(i)
}
}
2
1
0
When there is a nobreak clause, the while loop can also return a value. The return value is the evaluation result of the nobreak clause. In this case, if you use break to exit the loop, you need to provide a return value after break, which should be of the same type as the return value of the nobreak clause.
fn main {
let mut i = 10
let r = while i > 0 {
i = i - 1
if i % 2 == 0 {
break 5
}
} nobreak {
7
}
println(r)
}
5
fn main {
let mut i = 10
let r = while i > 0 {
i = i - 1
} nobreak {
7
}
println(r)
}
7
For Loop#
MoonBit also supports C-style for loops. The keyword for is followed by variable initialization clauses, loop conditions, and update clauses separated by semicolons. They do not need to be enclosed in parentheses.
For example, the code below creates a new variable binding i, which has a scope throughout the entire loop and is immutable. This makes it easier to write clear code and reason about it:
fn main {
for i = 0; i < 5; i = i + 1 {
println(i)
}
}
0
1
2
3
4
The variable initialization clause can create multiple bindings:
for i = 0, j = 0; i + j < 100; i = i + 1, j = j + 1 {
println(i)
}
It should be noted that in the update clause, when there are multiple binding variables, the semantics are to update them simultaneously. In other words, in the example above, the update clause does not execute i = i + 1, j = j + 1 sequentially, but rather increments i and j at the same time. Therefore, when reading the values of the binding variables in the update clause, you will always get the values updated in the previous iteration.
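Simultaneous update is what makes the following Fibonacci sketch work: in the update clause, b = a + b reads the value of a from the current iteration, not the value just assigned by a = b. The example assumes Array's push method and records each value of a to make the behaviour visible.

```moonbit
test "simultaneous update in a for loop" {
  let seen : Array[Int] = []
  let result = for i = 0, a = 0, b = 1; i < 10; i = i + 1, a = b, b = a + b {
    // Record the value of a at the start of each iteration.
    seen.push(a)
  } nobreak {
    a
  }
  // With sequential updates, b would become b + b instead of a + b.
  assert_eq(seen, [0, 1, 1, 2, 3, 5, 8, 13, 21, 34])
  assert_eq(result, 55)
}
```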
Variable initialization clauses, loop conditions, and update clauses are all optional. For example, the following two are infinite loops:
for i = 1; ; i = i + 1 {
println(i)
}
for ;; {
println("loop forever")
}
The for loop also supports continue, break, and nobreak clauses. Like the while loop, the for loop can also return a value using the break and nobreak clauses.
The continue statement skips the remaining part of the current iteration of the for loop (including the update clause) and proceeds to the next iteration. The continue statement can also update the binding variables of the for loop, as long as it is followed by expressions that match the number of binding variables, separated by commas.
For example, the following program calculates the sum of even numbers from 1 to 6:
fn main {
let sum = for i = 1, acc = 0; i <= 6; i = i + 1 {
if i % 2 == 0 {
println("even: \{i}")
continue i + 1, acc + i
}
} nobreak {
acc
}
println(sum)
}
even: 2
even: 4
even: 6
12
for .. in loop#
MoonBit supports traversing elements of different data structures and sequences via the for .. in loop syntax:
for x in [1, 2, 3] {
println(x)
}
for .. in loop is translated to the use of Iter in MoonBit's standard library. Any type with a method .iter() : Iter[T] can be traversed using for .. in.
For more information about the Iter type, see Iterator below.
for .. in loop also supports iterating through a sequence of integers, such as:
test {
let mut i = 0
for j in 0..<10 {
i += j
}
assert_eq(i, 45)
let mut k = 0
for l in 0..<=10 {
k += l
}
assert_eq(k, 55)
}
In addition to sequences of a single value, MoonBit also supports traversing sequences of two values, such as Map, via the Iter2 type in MoonBit's standard library.
Any type with method .iter2() : Iter2[A, B] can be traversed using for .. in with two loop variables:
for k, v in { "x": 1, "y": 2, "z": 3 } {
println(k)
println(v)
}
Another example of for .. in with two loop variables is traversing an array while keeping track of array index:
fn main {
for index, elem in [4, 5, 6] {
let i = index + 1
println("The \{i}-th element of the array is \{elem}")
}
}
The 1-th element of the array is 4
The 2-th element of the array is 5
The 3-th element of the array is 6
Control flow operations such as return, break and error handling are supported in the body of for .. in loop:
fn main {
let map = { "x": 1, "y": 2, "z": 3, "w": 4 }
for k, v in map {
if k == "y" {
continue
}
println("\{k}, \{v}")
if k == "z" {
break
}
}
}
x, 1
z, 3
If a loop variable is unused, it can be ignored with _.
Range expression in for .. in loop#
for .. in loops can also be used with range expressions for iterating over a number range:
fn main {
for x in 0..<5 {
println(x)
}
}
0
1
2
3
4
There are four kinds of range expressions available in for .. in loop:
- a..<b: iterate from a to b in increasing order, excluding b
- a..<=b: iterate from a to b in increasing order, including b
- a>..b: iterate from a to b in decreasing order, excluding a
- a>=..b: iterate from a to b in decreasing order, including a
Labelled Continue/Break#
When a loop is labelled, it can be referenced from a break or continue from
within a nested loop. For example:
test "break label" {
let mut count = 0
let xs = [1, 2, 3]
let ys = [4, 5, 6]
let res = outer~: for i in xs {
for j in ys {
count = count + i
break outer~ j
}
} nobreak {
-1
}
assert_eq(res, 4)
assert_eq(count, 1)
}
test "continue label" {
let mut count = 0
let init = 10
let res = outer~: for i = init {
if i == 0 {
break outer~ 42
}
for ;; {
count = count + 1
continue outer~ i - 1
}
}
assert_eq(res, 42)
assert_eq(count, 10)
}
defer expression#
defer expression can be used to perform reliable resource cleanup.
The syntax for defer is as follows:
defer <expr>
<body>
Whenever the program leaves body, expr will be executed.
For example, the following program:
defer println("perform resource cleanup")
println("do things with the resource")
will first print do things with the resource, and then perform resource cleanup.
A defer expression is always executed, no matter how its body exits.
It handles errors
as well as control flow constructs such as return, break and continue.
Consecutive defer expressions are executed in reverse order. For example, the following:
defer println("first defer")
defer println("second defer")
println("do things")
will output first do things, then second defer, and finally first defer.
return, break and continue are disallowed in the right hand side of defer.
Currently, raising an error or calling an async function is also disallowed in the right hand side of defer.
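To illustrate that a deferred expression also runs on an early return, here is a minimal sketch. The function run and the log array are hypothetical names used only for this example; the log records the order in which things happen so it can be checked in a test.

```moonbit
// Hypothetical helper: records events in `log` so the order can be checked.
fn run(log : Array[String], early : Bool) -> Unit {
  defer log.push("cleanup")
  if early {
    log.push("early exit")
    return
  }
  log.push("normal exit")
}

test {
  let early_log : Array[String] = []
  run(early_log, true)
  // The deferred push runs after the early `return`.
  assert_eq(early_log, ["early exit", "cleanup"])
  let normal_log : Array[String] = []
  run(normal_log, false)
  assert_eq(normal_log, ["normal exit", "cleanup"])
}
```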
Iterator#
An iterator is an object that traverses a sequence while providing access
to its elements. Traditional OO languages, such as Java with Iterator<T>, use next()
and hasNext() to step through the iteration process, whereas functional languages
(JavaScript's forEach, Lisp's mapcar) provide a higher-order function that
takes an operation and a sequence, then consumes the sequence by applying that
operation to each element. The former is called an external iterator (visible
to the user) and the latter is called an internal iterator (invisible to the user).
The built-in type Iter[T] is MoonBit's external iterator implementation. It
exposes next() to pull the next value: it returns Some(value) and advances
the iterator, or None when the iteration is finished.
Almost all built-in sequential data structures have implemented Iter:
///|
fn filter_even(l : Array[Int]) -> Array[Int] {
let l_iter : Iter[Int] = l.iter()
l_iter.filter(x => (x & 1) == 0).collect()
}
///|
fn fact(n : Int) -> Int {
let start = 1
// `until` excludes its end by default, so include `n` to multiply 1 * 2 * ... * n
let range : Iter[Int] = start.until(n, inclusive=true)
range.fold(Int::mul, init=start)
}
Commonly used methods include:
each: Iterates over each element in the iterator, applying some function to each element.
fold: Folds the elements of the iterator using the given function, starting with the given initial value.
collect: Collects the elements of the iterator into an array.
filter (lazy): Filters the elements of the iterator based on a predicate function.
map (lazy): Transforms the elements of the iterator using a mapping function.
concat (lazy): Combines two iterators into one by appending the elements of the second iterator to the first.
Methods like filter and map are very common on sequence objects such as Array.
What makes Iter special is that any method that constructs a new Iter is
lazy: iteration does not start when the method is called, so no intermediate
collection is allocated. This makes Iter an efficient way to traverse a
sequence, and MoonBit encourages users to pass an Iter across functions
instead of the sequence object itself.
Pre-defined sequence structures like Array and their iterators are sufficient
for most uses. To take advantage of these methods with a custom sequence whose
elements have type S, we need to implement Iter, namely, a function that
returns an Iter[S]. Take Bytes as an example:
///|
fn iter(data : Bytes) -> Iter[Byte] {
let mut index = 0
Iter::new(fn() -> Byte? {
if index < data.length() {
let byte = data[index]
index += 1
Some(byte)
} else {
None
}
})
}
Iterators are single-pass: once you call next() or consume them with methods
like each, fold, or collect, their internal state advances and cannot be
reset. If you need to traverse the sequence again, request a new Iter from
the source.
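The single-pass behaviour can be sketched with next() alone, relying only on the Some/None contract described above. The variable names are illustrative.

```moonbit
test {
  let source = [1, 2, 3]
  let it = source.iter()
  // Each call to `next()` advances the same iterator.
  assert_eq(it.next(), Some(1))
  assert_eq(it.next(), Some(2))
  assert_eq(it.next(), Some(3))
  // The iterator is now finished.
  assert_eq(it.next(), None)
  // A fresh iterator from the source starts from the beginning again.
  assert_eq(source.iter().next(), Some(1))
}
```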
Custom Data Types#
There are two ways to create new data types: struct and enum.
Struct#
In MoonBit, structs are similar to tuples, but their fields are indexed by field names. A struct can be constructed using a struct literal, which is composed of a set of labeled values and delimited with curly brackets. The type of a struct literal can be automatically inferred if its fields exactly match the type definition. A field can be accessed using the dot syntax s.f. If a field is marked as mutable using the keyword mut, it can be assigned a new value.
struct User {
id : Int
name : String
mut email : String
}
fn main {
let u = User::{ id: 0, name: "John Doe", email: "john@doe.com" }
u.email = "john@doe.name"
//! u.id = 10
println(u.id)
println(u.name)
println(u.email)
}
0
John Doe
john@doe.name
Constructing Struct with Shorthand#
If you already have variables such as name and email, it's redundant to repeat those names when constructing a struct. You can use the shorthand instead, which behaves exactly the same:
let name = "john"
let email = "john@doe.com"
let u = User::{ id: 0, name, email }
If there's no other struct that has the same fields, it's redundant to add the struct's name when constructing it:
let u2 = { id: 0, name, email }
Struct Update Syntax#
It's useful to create a new struct based on an existing one, but with some fields updated.
fn main {
let user = { id: 0, name: "John Doe", email: "john@doe.com" }
let updated_user = { ..user, email: "john@doe.name" }
println(
(
$|{ id: \{user.id}, name: \{user.name}, email: \{user.email} }
$|{ id: \{updated_user.id}, name: \{updated_user.name}, email: \{updated_user.email} }
),
)
}
{ id: 0, name: John Doe, email: john@doe.com }
{ id: 0, name: John Doe, email: john@doe.name }
Custom constructor for struct#
MoonBit also supports defining a custom constructor for every struct type.
A constructor is a special method that can be called with the name of the
struct to create a value. First define the struct as usual:
struct IntBox {
value : Int
} derive(Debug)
The constructor should then be implemented as a method whose name is the same as the struct type. Its return value must be the struct itself:
fn IntBox::IntBox(value : Int) -> IntBox {
{ value, }
}
If a struct declares a constructor, it can be constructed by name directly:
let box = IntBox(10)
debug_inspect(box, content="{ value: 10 }")
The constructor call follows the constructor method signature, so unlabeled
parameters can be written in the familiar TypeName(value) form.
Constructors may also use labeled and optional arguments, just like normal functions:
struct StructWithConstr {
x : Int
y : Int
} derive(Debug)
fn StructWithConstr::StructWithConstr(x~ : Int, y? : Int = x) -> StructWithConstr {
{ x, y }
}
let s = StructWithConstr(x=1)
debug_inspect(s, content="{ x: 1, y: 1 }")
Because struct constructors are implemented by normal functions, they may raise errors:
suberror BuildError {
NegativeInput
} derive(Debug)
struct Positive {
value : Int
} derive(Debug)
fn Positive::Positive(x : Int) -> Positive raise BuildError {
guard x >= 0 else { raise NegativeInput }
{ value: x }
}
debug_inspect(try? Positive(10), content="Ok({ value: 10 })")
debug_inspect(try? Positive(-1), content="Err(NegativeInput)")
Asynchronous constructors are declared with async fn TypeName::TypeName and
can be used inside async code:
struct AsyncBox {
value : Int
} derive(Debug)
async fn AsyncBox::AsyncBox(x : Int) -> AsyncBox {
@async.sleep(0)
{ value: x }
}
async test "struct constructor async" {
let box = AsyncBox(10)
debug_inspect(box, content="{ value: 10 }")
}
Creating a value via a struct constructor has exactly the same semantics as
using an enum constructor,
except that struct constructors cannot be used for pattern matching.
For example, when creating a foreign struct using its constructor,
the package name can be omitted if the expected type of the expression is known.
Since struct constructors are implemented by normal functions,
they may raise errors or perform asynchronous operations.
Struct constructors also support optional arguments.
Default values for optional arguments are written on the constructor
implementation, just like normal function signatures.
Enum#
Enum types are similar to algebraic data types in functional languages. Users familiar with C/C++ may prefer to call them tagged unions.
An enum can have a set of cases (constructors). Constructor names must start with a capital letter. You can use these names to construct the corresponding cases of an enum, or to check which branch an enum value belongs to in pattern matching:
/// An enum type that represents the ordering relation between two values,
/// with three cases "Smaller", "Greater" and "Equal"
enum Relation {
Smaller
Greater
Equal
}
/// compare the ordering relation between two integers
fn compare_int(x : Int, y : Int) -> Relation {
if x < y {
// when creating an enum, if the target type is known,
// you can write the constructor name directly
Smaller
} else if x > y {
// but when the target type is not known,
// you can always use `TypeName::Constructor` to create an enum unambiguously
Relation::Greater
} else {
Equal
}
}
/// output a value of type `Relation`
fn print_relation(r : Relation) -> Unit {
// use pattern matching to decide which case `r` belongs to
match r {
// during pattern matching, if the type is known,
// writing the name of constructor is sufficient
Smaller => println("smaller!")
// but you can use the `TypeName::Constructor` syntax
// for pattern matching as well
Relation::Greater => println("greater!")
Equal => println("equal!")
}
}
fn main {
print_relation(compare_int(0, 1))
print_relation(compare_int(1, 1))
print_relation(compare_int(2, 1))
}
smaller!
equal!
greater!
Enum cases can also carry payload data. Here's an example of defining an integer list type using enum:
enum Lst {
Nil
// constructor `Cons` carries additional payload: the first element of the list,
// and the remaining parts of the list
Cons(Int, Lst)
}
// In addition to binding payload to variables,
// you can also continue matching payload data inside constructors.
// Here's a function that decides if a list contains only one element
fn is_singleton(l : Lst) -> Bool {
match l {
// This branch only matches values of shape `Cons(_, Nil)`,
// i.e. lists of length 1
Cons(_, Nil) => true
// Use `_` to match everything else
_ => false
}
}
fn print_list(l : Lst) -> Unit {
// when pattern-matching an enum with payload,
// in addition to deciding which case a value belongs to
// you can extract the payload data inside that case
match l {
Nil => println("nil")
// Here `x` and `xs` are defining new variables
// instead of referring to existing variables,
// if `l` is a `Cons`, then the payload of `Cons`
// (the first element and the rest of the list)
// will be bound to `x` and `xs`
Cons(x, xs) => {
println("\{x},")
print_list(xs)
}
}
}
fn main {
// when creating values using `Cons`, the payload of `Cons` must be provided
let l : Lst = Cons(1, Cons(2, Nil))
println(is_singleton(l))
print_list(l)
}
false
1,
2,
nil
Constructor with labelled arguments#
Enum constructors can have labelled arguments:
enum E {
// `x` and `y` are labelled arguments
C(x~ : Int, y~ : Int)
}
// pattern matching constructor with labelled arguments
fn f(e : E) -> Unit {
match e {
// `label=pattern`
C(x=0, y=0) => println("0!")
// `x~` is an abbreviation for `x=x`
// Unmatched labelled arguments can be omitted via `..`
C(x~, ..) => println(x)
}
}
fn main {
f(C(x=0, y=0))
let x = 0
f(C(x~, y=1)) // <=> C(x=x, y=1)
}
0!
0
It is also possible to access labelled arguments of constructors like accessing struct fields in pattern matching:
enum Object {
Point(x~ : Double, y~ : Double)
Circle(x~ : Double, y~ : Double, radius~ : Double)
}
suberror NotImplementedError derive(Debug)
fn Object::distance_with(
self : Object,
other : Object,
) -> Double raise NotImplementedError {
match (self, other) {
// For variables defined via `Point(..) as p`,
// the compiler knows it must be of constructor `Point`,
// so you can access fields of `Point` directly via `p.x`, `p.y` etc.
(Point(_) as p1, Point(_) as p2) => {
let dx = p2.x - p1.x
let dy = p2.y - p1.y
(dx * dx + dy * dy).sqrt()
}
(Point(_), Circle(_)) | (Circle(_), Point(_)) | (Circle(_), Circle(_)) =>
raise NotImplementedError
}
}
fn main {
let p1 : Object = Point(x=0, y=0)
let p2 : Object = Point(x=3, y=4)
let c1 : Object = Circle(x=0, y=0, radius=2)
try {
println(p1.distance_with(p2))
println(p1.distance_with(c1))
} catch {
_ => println("NotImplementedError")
}
}
5
NotImplementedError
Constructor with mutable fields#
It is also possible to define mutable fields for a constructor. This is especially useful for defining imperative data structures:
// A set implemented using mutable binary search tree.
struct Set[X] {
mut root : Tree[X]
}
fn[X : Compare] Set::insert(self : Set[X], x : X) -> Unit {
self.root = self.root.insert(x, parent=Nil)
}
// A mutable binary search tree with parent pointer
enum Tree[X] {
Nil
// only labelled arguments can be mutable
Node(
mut value~ : X,
mut left~ : Tree[X],
mut right~ : Tree[X],
mut parent~ : Tree[X]
)
}
// In-place insert a new element to a binary search tree.
// Return the new tree root
fn[X : Compare] Tree::insert(
self : Tree[X],
x : X,
parent~ : Tree[X],
) -> Tree[X] {
match self {
Nil => Node(value=x, left=Nil, right=Nil, parent~)
Node(_) as node => {
let order = x.compare(node.value)
if order == 0 {
// mutate the field of a constructor
node.value = x
} else if order < 0 {
// cycle between `node` and `node.left` created here
node.left = node.left.insert(x, parent=node)
} else {
node.right = node.right.insert(x, parent=node)
}
// The tree is non-empty, so the new root is just the original tree
node
}
}
}
Extensible enum#
An extenum defines an open enum type. Unlike a regular enum, an
extenum can receive more constructors later, including from another package.
This is useful when a package wants to define the shared event, message, or
extension-point type, while other packages contribute their own cases.
pub(all) extenum LogEvent[T] {
Info(T)
}
Use extenum Type += { ... } to add constructors to an extensible enum in the
same package:
pub(all) extenum LogEvent[T] += {
Warning(T)
Critical(T, T)
}
To extend an extensible enum from another package, qualify the target type with the package that defines the type:
pub(all) extenum @base.LogEvent[T] += {
Debug(T)
}
Extensible enum constructors are qualified by the package that defines the
constructor. For constructors from the current package, use the constructor name
directly when the expected type is known. For constructors from another
package, use @pkg.Constructor in expressions and patterns. When you want to
make both the extensible enum type and the constructor origin explicit, write
the constructor as @type_pkg.Type::@constructor_pkg.Constructor.
When a package imports both the base package and an extension package, values from both packages have the same extensible enum type:
pub fn describe(event : @base.LogEvent[String]) -> String {
match event {
@base.Info(message) => "info: \{message}"
@base.Warning(message) => "warning: \{message}"
@base.Critical(code, message) => "critical \{code}: \{message}"
@plugin.Debug(message) => "debug: \{message}"
_ => "unknown"
}
}
pub fn debug_event(message : String) -> @base.LogEvent[String] {
@plugin.Debug(message)
}
pub fn qualified_debug_event(message : String) -> @base.LogEvent[String] {
@base.LogEvent::@plugin.Debug(message)
}
Pattern matching must include a wildcard branch, because more constructors can be added outside the current declaration.
Only extenum declarations can be extended. Regular enum declarations are
closed.
Tuple Struct#
MoonBit supports a special kind of struct called tuple struct:
struct UserId(Int)
struct UserInfo(UserId, String)
Tuple structs are similar to enums with only one constructor (with the same name as the tuple struct itself). So, you can use the constructor to create values, or use pattern matching to extract the underlying representation:
fn main {
let id : UserId = UserId(1)
let name : UserInfo = UserInfo(id, "John Doe")
let UserId(uid) = id // uid : Int
let UserInfo(_, uname) = name // uname: String
println(uid)
println(uname)
}
1
John Doe
Besides pattern matching, you can also use an index to access the elements, as with tuples:
fn main {
let id : UserId = UserId(1)
let info : UserInfo = UserInfo(id, "John Doe")
let uid : Int = id.0
let uname : String = info.1
println(uid)
println(uname)
}
1
John Doe
Type alias#
MoonBit supports type alias via the syntax type NewType = OldType:
Warning
The old syntax typealias OldType as NewType may be removed in the future.
pub type Index = Int
pub type MyIndex = Int
pub type MyMap = Map[Int, String]
Unlike all other kinds of type declaration above, a type alias does not define a new type. It is merely a type macro that behaves exactly the same as its definition. So, for example, one cannot define new methods or implement traits for a type alias.
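A minimal sketch of what "behaves exactly the same as its definition" means in practice: a value of the underlying type can be passed where the alias is expected, and the other way around, without any conversion. The function next_index is a hypothetical helper for this example.

```moonbit
type Index = Int

// Hypothetical helper declared in terms of the alias.
fn next_index(i : Index) -> Index {
  i + 1
}

test {
  let plain : Int = 41
  // No conversion is needed: `Index` and `Int` are the same type.
  let n : Int = next_index(plain)
  assert_eq(n, 42)
}
```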
Tip
Type aliases can be used to refactor code incrementally.
For example, if you want to move a type T from @pkgA to @pkgB,
you can leave a type alias type T = @pkgB.T in @pkgA, and incrementally port uses of @pkgA.T to @pkgB.T.
The type alias can be removed after all uses of @pkgA.T are migrated to @pkgB.T.
Local types#
MoonBit supports declaring structs/enums at the top of a toplevel function, which are only visible within the current toplevel function. These local types can use the generic parameters of the toplevel function but cannot introduce additional generic parameters themselves. Local types can derive methods using derive, but no additional methods can be defined manually. For example:
fn[T : Debug] toplevel(x : T) -> Unit {
enum LocalEnum {
A(T)
B(Int)
} derive(Debug)
struct LocalStruct {
a : (String, T)
} derive(Debug)
struct LocalStructTuple(T) derive(Debug)
...
}
Currently, local types do not support being declared as error types.
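As a complete, small sketch of a local type, the function below declares an enum that is visible only inside its body. The names parity_label and Parity are hypothetical.

```moonbit
fn parity_label(n : Int) -> String {
  // `Parity` is only visible inside `parity_label`.
  enum Parity {
    Even
    Odd
  }
  let p : Parity = if n % 2 == 0 { Even } else { Odd }
  match p {
    Even => "even"
    Odd => "odd"
  }
}

test {
  assert_eq(parity_label(4), "even")
  assert_eq(parity_label(3), "odd")
}
```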
Pattern Matching#
Pattern matching allows us to match values against specific patterns and bind data from data structures.
Simple Patterns#
We can pattern match expressions against
literals, such as boolean values, numbers, chars, strings, etc
constants
structs
enums
arrays
maps
JSONs
and so on. We can define identifiers to bind the matched values so that they can be used later.
const ONE = 1
fn match_int(x : Int) -> Unit {
match x {
0 => println("zero")
ONE => println("one")
value => println(value)
}
}
We can use _ as wildcards for the values we don't care about, and use .. to ignore remaining fields of struct or enum, or array (see array pattern).
struct Point3D {
x : Int
y : Int
z : Int
}
fn match_point3D(p : Point3D) -> Unit {
match p {
{ x: 0, .. } => println("on yz-plane")
_ => println("not on yz-plane")
}
}
enum Point[T] {
Point2D(Int, Int, name~ : String, payload~ : T)
}
fn[T] match_point(p : Point[T]) -> Unit {
match p {
//! Point2D(0, 0) => println("2D origin")
Point2D(0, 0, ..) => println("2D origin")
Point2D(_) => println("2D point")
_ => panic()
}
}
We can use as to give a name to some pattern, and we can use | to match several cases at once. A variable name can only be bound once in a single pattern, and the same set of variables should be bound on both sides of | patterns.
match expr {
//! Add(e1, e2) | Lit(e1) => ...
Lit(n) as a => ...
Add(e1, e2) | Mul(e1, e2) => ...
...
}
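The fragment above can be made concrete with a small expression type. This is only a sketch: the enum Expr and the functions count_ops and leftmost are hypothetical names introduced for illustration.

```moonbit
enum Expr {
  Lit(Int)
  Add(Expr, Expr)
  Mul(Expr, Expr)
}

// Counts the operators in an expression.
// Both sides of `|` bind the same variables `e1` and `e2`.
fn count_ops(e : Expr) -> Int {
  match e {
    Lit(_) => 0
    Add(e1, e2) | Mul(e1, e2) => 1 + count_ops(e1) + count_ops(e2)
  }
}

// Returns the leftmost literal; `as` names the whole matched node.
fn leftmost(e : Expr) -> Expr {
  match e {
    Lit(_) as leaf => leaf
    Add(l, _) | Mul(l, _) => leftmost(l)
  }
}

test {
  let e : Expr = Add(Lit(1), Mul(Lit(2), Lit(3)))
  assert_eq(count_ops(e), 2)
  assert_true(leftmost(e) is Lit(1))
}
```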
Array Pattern#
Array patterns can be used to match on the following types to obtain their corresponding elements or views:
| Type | Element | View |
|---|---|---|
| Array[T], ArrayView[T], FixedArray[T] | T | ArrayView[T] |
| Bytes, BytesView | Byte | BytesView |
| String, StringView | Char | StringView |
Array patterns have the following forms:
[]: matching for empty array
[pa, pb, pc]: matching for array of length three, and bind pa, pb, pc to the three elements
[pa, ..rest, pb]: matching for array with at least two elements, and bind pa to the first element, pb to the last element, and rest to the remaining elements. The binder rest can be omitted if the rest of the elements are not needed. An arbitrary number of elements is allowed preceding and following the .. part. Because .. can match an uncertain number of elements, it can appear at most once in an array pattern.
test {
let ary = [1, 2, 3, 4]
if ary is [a, b, .. rest] && a == 1 && b == 2 && rest.length() == 2 {
inspect("a = \{a}, b = \{b}", content="a = 1, b = 2")
} else {
fail("")
}
guard ary is [.., a, b] else { fail("") }
inspect("a = \{a}, b = \{b}", content="a = 3, b = 4")
}
Array patterns provide a Unicode-safe way to manipulate strings: each element is matched as a whole Char, so a pattern never splits a character in the middle. For example, we can check if a string is a palindrome:
test {
fn palindrome(s : String) -> Bool {
for view = s.view() {
match view {
[] | [_] => break true
[a, .. rest, b] => if a == b { continue rest } else { break false }
}
}
}
inspect(palindrome("abba"), content="true")
inspect(palindrome("中b中"), content="true")
inspect(palindrome("文bb中"), content="false")
}
When there are consecutive char or byte constants in an array pattern, the
pattern spread .. operator can be used to combine them to make the code look
cleaner. Note that in this case the .. followed by string or bytes constant
matches exact number of elements so its usage is not limited to once.
const NO : Bytes = b"no"
test {
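fn match_string(s : String) -> Bool {
match s {
[.. "yes", ..] => true // equivalent to ['y', 'e', 's', ..]
_ => false // any string that does not start with "yes"
}
}
fn match_bytes(b : Bytes) -> Bool {
match b {
[.. NO, ..] => false // equivalent to ['n', 'o', ..]
_ => true // any bytes that do not start with "no"
}
}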
}
Bitstring Pattern#
Bitstring patterns can match packed bit fields from byte containers. They are
supported on BytesView, Bytes, Array[Byte], FixedArray[Byte],
ReadOnlyArray[Byte], and ArrayView[Byte]. Use explicit widths with
be/le suffixes to make endianness clear.
be supports widths 1..64; le is only defined for byte-aligned widths (8 *
n), since little-endian order is defined on bytes. Without .., the pattern
must consume the entire view.
test {
let packet : Bytes = b"\xD2\x10\x7F"
let header : BytesView = packet[0:2]
let (flag, kind, version, length) = match header {
[u1be(flag), u3be(kind), u4be(version), u8be(length)] =>
(flag, kind, version, length)
_ => fail("bad header")
}
assert_eq(flag, 1)
assert_eq(kind, 0b101)
assert_eq(version, 0b0010)
assert_eq(length, 16)
}
Use literal bit patterns to validate headers, and .. to capture the remaining
data for the next parse step.
test {
let data : Bytes = b"\xF1\xAA\xBB"
let view : BytesView = data[0:]
let tag = match view {
[u4be(0b1111), u4be(tag), .. rest] => {
assert_eq(rest, b"\xAA\xBB"[0:])
tag
}
_ => fail("bad prefix")
}
assert_eq(tag, 0b0001)
}
Examples over common byte containers (note the MutArrayView slice):
test {
let b : Bytes = b"\x80"
guard b is [u1be(1), ..] else { fail("Bytes") }
let a : Array[Byte] = [b'\x80']
guard a is [u1be(1), ..] else { fail("Array[Byte]") }
let f : FixedArray[Byte] = [b'\x80']
guard f is [u1be(1), ..] else { fail("FixedArray[Byte]") }
let r : ReadOnlyArray[Byte] = [b'\x80']
guard r is [u1be(1), ..] else { fail("ReadOnlyArray[Byte]") }
let v : ArrayView[Byte] = a[:]
guard v is [u1be(1), ..] else { fail("ArrayView[Byte]") }
let mv : MutArrayView[Byte] = a.mut_view()
guard mv[:] is [u1be(1), ..] else { fail("MutArrayView[Byte]") }
}
Signed patterns use two's-complement semantics. For example, u1be yields 0
or 1, while i1be yields 0 or -1:
test {
let bytes = b"\x80"
let u : UInt = match bytes[:] {
[u1be(u), ..] => u
_ => fail("u1be")
}
let i : Int = match bytes[:] {
[i1be(i), ..] => i
_ => fail("i1be")
}
assert_eq(u, 1U)
assert_eq(i, -1)
}
Result types depend on width:
| Width | Result type |
|---|---|
| 1..32 bits (u / i) | UInt / Int |
| 33..64 bits (u) | UInt64 |
| 33..64 bits (i) | Int64 |
Range Pattern#
For builtin integer types and Char, MoonBit allows matching whether the value falls in a specific range.
Range patterns have the form a..<b or a..=b, where ..< means the upper bound is exclusive, and ..= means inclusive upper bound.
a and b can be one of:
literal
named constant declared with const
_, meaning the pattern has no restriction on this side
Here are some examples:
const Zero = 0
fn sign(x : Int) -> Int {
match x {
_..<Zero => -1
Zero => 0
1..<_ => 1
}
}
fn classify_char(c : Char) -> String {
match c {
'a'..='z' => "lowercase"
'A'..='Z' => "uppercase"
'0'..='9' => "digit"
_ => "other"
}
}
Map Pattern#
MoonBit allows convenient matching on map-like data structures.
Inside a map pattern, the key : value syntax matches if key exists in the map, and matches the value of key against the pattern value.
The key? : value syntax matches whether or not key exists, and value is matched against map[key] (an optional).
match map {
// matches if and only if "b" exists in `map`
{ "b": _, .. } => ...
// matches if and only if "b" does not exist in `map` and "a" exists in `map`.
// When matches, bind the value of "a" in `map` to `x`
{ "b"? : None, "a": x, .. } => ...
// compiler reports missing case: { "b"? : None, "a"? : None }
}
To match a data type T using map pattern, T must have a method get(Self, K) -> Option[V] for some type K and V (see method and trait).
Currently, the key part of map pattern must be a literal or constant
Map patterns are always open: the unmatched keys are silently ignored, and .. needs to be added to identify this nature
Map pattern will be compiled to efficient code: every key will be fetched at most once
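A runnable sketch of the map patterns described above, using a Map[String, Int]. The function name describe_keys and the returned strings are made up for this example; a final wildcard case covers maps that contain neither key.

```moonbit
// Hypothetical helper: reports which of the keys "a" and "b" are present.
fn describe_keys(map : Map[String, Int]) -> String {
  match map {
    // matches if "b" exists, binding its value to `b`
    { "b": b, .. } => "b = \{b}"
    // matches if "b" is absent and "a" exists
    { "b"? : None, "a": a, .. } => "only a = \{a}"
    // neither key exists
    _ => "neither"
  }
}

test {
  assert_eq(describe_keys({ "a": 1, "b": 2 }), "b = 2")
  assert_eq(describe_keys({ "a": 1 }), "only a = 1")
  assert_eq(describe_keys(Map::new()), "neither")
}
```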
Json Pattern#
When the matched value has type Json, literal patterns can be used directly, together with constructors:
match json {
{ "version": "1.0.0", "import": [..] as imports, .. } => ...
{ "version": Number(i, ..), "import": Array(imports), .. } => ...
...
}
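The fragment above elides its branches. A complete sketch might look like the following, where version_of is a hypothetical function that extracts a string-valued "version" field and falls back to "unknown" otherwise.

```moonbit
// Hypothetical helper: reads the "version" field of a JSON object.
fn version_of(json : Json) -> String {
  match json {
    // `String(v)` is the Json constructor for string values
    { "version": String(v), .. } => v
    _ => "unknown"
  }
}

test {
  let doc : Json = { "version": "1.0.0", "import": ["core"] }
  assert_eq(version_of(doc), "1.0.0")
  let other : Json = { "name": "demo" }
  assert_eq(version_of(other), "unknown")
}
```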
Guard condition#
Each case in a pattern matching expression can have a guard condition. A guard condition is a boolean expression that must be true for the case to be matched. If the guard condition is false, the case is skipped and the next case is tried. For example:
fn guard_cond(x : Int?) -> Int {
fn f(x : Int) -> Array[Int] {
[x, x + 42]
}
match x {
Some(a) if f(a) is [0, b] => a + b
Some(b) => b
None => -1
}
}
test {
assert_eq(guard_cond(None), -1)
assert_eq(guard_cond(Some(0)), 42)
assert_eq(guard_cond(Some(1)), 1)
}
Note that the guard conditions will not be considered when checking if all patterns are covered by the match expression. So you will see a warning of partial match for the following case:
fn guard_check(x : Int?) -> Unit {
match x {
Some(a) if a >= 0 => ()
Some(a) if a < 0 => ()
None => ()
}
}
Warning
Calling a function that mutates part of the value being matched inside a guard condition is discouraged. If this happens, the mutated part will not be re-evaluated in subsequent patterns. Use it with caution.
Generics#
Generics are supported in top-level function and data type definitions. Type parameters are introduced within square brackets. For example, we can generalize the integer list type Lst defined earlier by adding a type parameter T, obtaining a generic List type. We can then define generic functions over lists, such as map and reduce.
///|
enum List[T] {
Nil
Cons(T, List[T])
}
///|
fn[S, T] List::map(self : List[S], f : (S) -> T) -> List[T] {
match self {
Nil => Nil
Cons(x, xs) => Cons(f(x), xs.map(f))
}
}
///|
fn[S, T] List::reduce(self : List[S], op : (T, S) -> T, init : T) -> T {
match self {
Nil => init
Cons(x, xs) => xs.reduce(op, op(init, x))
}
}
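To show a generic definition together with a call site, here is a separate self-contained sketch. The Tree type and its size method are hypothetical and unrelated to the Tree used earlier; the same method works for any element type.

```moonbit
enum Tree[T] {
  Leaf
  Node(Tree[T], T, Tree[T])
}

// Counts the nodes of a tree, independently of the element type `T`.
fn[T] Tree::size(self : Tree[T]) -> Int {
  match self {
    Leaf => 0
    Node(left, _, right) => left.size() + 1 + right.size()
  }
}

test {
  let names : Tree[String] = Node(Leaf, "a", Node(Leaf, "b", Leaf))
  assert_eq(names.size(), 2)
  let numbers : Tree[Int] = Node(Leaf, 1, Leaf)
  assert_eq(numbers.size(), 1)
}
```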
Special Syntax#
Pipelines#
MoonBit provides convenient pipe syntaxes x |> f(y) and f <| x, which can be used to chain regular function calls or make nested builder-style code easier to read:
5 |> ignore // <=> ignore(5)
[] |> Array::push(5) // <=> Array::push([], 5)
1
|> add(5) // <=> add(1, 5)
|> x => { x + 1 }
|> ignore // <=> ignore(add(1, 5) + 1)
MoonBit code follows a data-first style, meaning that a function takes its "subject" as the first argument.
Thus, the pipe operator inserts the left-hand side value into the first argument of the right-hand side function call by default.
For example, x |> f(y) is equivalent to f(x, y).
You can use the _ operator to insert x into any argument of the function f, such as x |> f(y, _), which is equivalent to f(y, x). Labeled arguments are also supported.
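The difference between the default first-argument insertion and the _ placeholder is easiest to see with a non-commutative function. In this sketch, minus is a hypothetical helper.

```moonbit
// Hypothetical helper: order of arguments matters.
fn minus(a : Int, b : Int) -> Int {
  a - b
}

test {
  // Default: the left-hand side becomes the first argument.
  let first = 10 |> minus(3) // <=> minus(10, 3)
  // With `_`: the left-hand side goes where the placeholder is.
  let second = 10 |> minus(3, _) // <=> minus(3, 10)
  assert_eq(first, 7)
  assert_eq(second, -7)
}
```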
The pipe operator can also connect to an arrow function. When piping into an arrow function, the function body must be wrapped in curly braces, for example value |> x => { x + 1 }.
The reverse pipe operator applies the right-hand side as the final argument of the left-hand side call. For example, f <| x is equivalent to f(x), and f(a, b) <| c is equivalent to f(a, b, c). This is especially useful for DSL-like code, since nested calls such as div([text("hello")]) can instead be written as div <| [text <| "hello"].
let page = div <| [
text <| "hello",
section("toolbar") <| fn() { [text <| "save", text <| "cancel"] },
]
inspect(
page,
content="div(text(hello), toolbar: div(text(save), text(cancel)))",
)
Because reverse pipe attaches the final argument, it also works well with functions whose last argument is a lambda, enabling a trailing-lambda style such as section("toolbar") <| fn () { ... }.
Cascade Operator#
The cascade operator .. is used to perform a series of mutable operations on
the same value consecutively. The syntax is as follows:
let arr = []..append([1])
Here, x..f() is equivalent to { x.f(); x }.
Consider the following scenario: for a StringBuilder type that has methods
like write_string, write_char, write_object, etc., we often need to perform
a series of operations on the same StringBuilder value:
let builder = StringBuilder::new()
builder.write_char('a')
builder.write_char('a')
builder.write_object(1001)
builder.write_string("abcdef")
let result = builder.to_string()
To avoid repeating builder, such methods are often designed to
return self, allowing operations to be chained using the . operator.
MoonBit instead keeps immutable and mutable operations distinct:
for any method that returns Unit, the cascade operator can be used to chain
consecutive operations without changing the return type of the method.
let result = StringBuilder::new()
..write_char('a')
..write_char('a')
..write_object(1001)
..write_string("abcdef")
.to_string()
is Expression#
The is expression tests whether a value conforms to a specific pattern. It
returns a Bool value and can be used anywhere a boolean value is expected,
for example:
fn[T] is_none(x : T?) -> Bool {
x is None
}
fn start_with_lower_letter(s : String) -> Bool {
s is ['a'..='z', ..]
}
Pattern binders introduced by is expressions can be used in the following
contexts:
In boolean AND expressions (&&): binders introduced in the left-hand expression can be used in the right-hand expression
fn f(x : Int?) -> Bool {
x is Some(v) && v >= 0
}
In the first branch of if expression: if the condition is a sequence of boolean expressions e1 && e2 && ..., the binders introduced by the is expression can be used in the branch where the condition evaluates to true.
fn g(x : Array[Int?]) -> Unit {
if x is [v, .. rest] && v is Some(i) && i is (0..=10) {
debug(v)
println(i)
debug(rest)
}
}
In the following statements of a guard condition:
fn h(x : Int?) -> Unit {
guard x is Some(v)
println(v)
}
In the body of a while loop:
fn i(x : Int?) -> Unit {
let mut m = x
while m is Some(v) {
println(v)
m = None
}
}
Note that is expression can only take a simple pattern. If you need to use
as to bind the pattern to a variable, you have to add parentheses. For
example:
fn j(x : Int) -> Int? {
Some(x)
}
fn init {
guard j(42) is (Some(a) as b)
println(a)
debug(b)
}
Regex Literal Expression#
re"..." is a regex literal expression. Its type is Regex.
Regex literals are ordinary expressions, so they can be stored in local bindings, passed as arguments, used as default argument values, and defined as constants:
let r : Regex = re"a(b+)"
const IDENT_START : Regex = re"[A-Za-z_]"
const IDENT : Regex = IDENT_START + re"[A-Za-z0-9_]*"
Regex values can also be combined with + for sequence and | for
alternation. In places that require a regex constant expression, such as
=~, named const values defined from regex
literals can be referenced directly.
Unlike ordinary string literals, regex literals do not require double-escaping
backslashes. For example, write re"/\*" instead of re"/\\*".
const REGEX_IDENT_START = re"[A-Za-z_]"
const REGEX_IDENT_CONT = re"[A-Za-z0-9_]*"
const REGEX_AB : Regex = re"a" + re"b"
fn regex_default_arg(re? : Regex = re"abc") -> Bool {
re.execute("zabc") is Some(_)
}
test {
let regex : Regex = re"a(b+)"
assert_true(regex.execute("abbb") is Some(_))
assert_true(regex.execute("ac") is None)
assert_true(REGEX_AB.execute("ab") is Some(_))
assert_true(REGEX_AB.execute("ac") is None)
assert_true(regex_default_arg())
}
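The example above combines regex values with +. Alternation with | can be sketched in the same way; the constant name YES_OR_NO is hypothetical, and the sketch assumes that | is accepted in a const definition just as + is.

```moonbit
// Hypothetical constant: matches either alternative.
const YES_OR_NO : Regex = re"yes" | re"no"

test {
  assert_true(YES_OR_NO.execute("yes") is Some(_))
  assert_true(YES_OR_NO.execute("no") is Some(_))
  // Neither "yes" nor "no" occurs anywhere in this input.
  assert_true(YES_OR_NO.execute("maybe") is None)
}
```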
Invalid regex literals are rejected at compile time.
Regex literals use MoonBit's regex syntax. The supported forms include:
Literal characters: ordinary characters match themselves
Wildcard: . matches any single character, including newline
Character classes: [abc], [^abc], [a-z]
POSIX classes inside character classes: [[:digit:]], [[:alpha:]], [[:space:]], [[:word:]], [[:xdigit:]], etc.
Quantifiers: *, +, ?, {n}, {n,}, {n,m}
Non-greedy quantifiers: *?, +?, ??, {n}?, {n,}?, {n,m}?
Grouping and alternation: ( ... ), (?: ... ), (?<name> ... ), a|b
Assertions: ^, $, \b, \B
Scoped modifier: (?i: ... ) for case-insensitive matching
Escape handling is regex-oriented rather than string-oriented. Common escapes
include \n, \r, \t, \f, \v, escaped metacharacters such as \. and
\(, and Unicode escapes \uXXXX / \u{X...}. To match a literal {, use
[{] rather than \{. This leaves room for future interpolation support in
regex literals, where \{ would conflict with the interpolation syntax.
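As a small sketch of these escape rules, the test below matches a literal { through a character class and a literal dot through an escaped metacharacter. The binding names are illustrative only.

```moonbit
test {
  // A literal `{` is written as a one-element character class.
  let open_brace : Regex = re"[{]"
  assert_true(open_brace.execute("fn main { }") is Some(_))
  assert_true(open_brace.execute("fn main") is None)
  // A single backslash escapes the `.` metacharacter.
  let dot : Regex = re"\."
  assert_true(dot.execute("a.b") is Some(_))
  assert_true(dot.execute("ab") is None)
}
```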
There are several important semantics and restrictions:
^ and $ are non-multiline anchors: they match only the beginning and end of the whole input.
\b and \B are currently usable when a regex literal is handled as a first-class Regex value. They are not currently available in regex match expression constant contexts such as =~, but this restriction is expected to be relaxed in the future.
POSIX character classes are ASCII-based.
\d, \D, \s, \S, \w, and \W are not supported. Use [[:digit:]], [^[:digit:]], [[:space:]], [^[:space:]], [[:word:]], and [^[:word:]] instead.
\xHH byte escapes are not supported in re"..."; use Unicode escapes or ordinary characters instead.
Lookahead, lookbehind, backreferences, and character-class set operations are not supported.
In character classes, - is used for ranges. To match a literal dash, escape it as \-; putting - at the start or end of a character class is not supported.
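Two of these restrictions can be sketched as follows: a POSIX class standing in for the unsupported \d, and an escaped dash inside a character class. Both patterns are anchored with ^ and $ so that they must cover the whole input; the binding names are illustrative.

```moonbit
test {
  // [[:digit:]] replaces the unsupported \d.
  let digits : Regex = re"^[[:digit:]]+$"
  assert_true(digits.execute("12345") is Some(_))
  assert_true(digits.execute("12a45") is None)
  // A literal dash inside a character class is written as \-.
  let slug : Regex = re"^[a-z\-]+$"
  assert_true(slug.execute("regex-literal") is Some(_))
  assert_true(slug.execute("regex_literal") is None)
}
```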
Named capture groups such as (?<id>[0-9]+) belong to the Regex value
itself. They are useful with APIs such as Regex::execute and
MatchResult::named_group, but they do not introduce MoonBit binders by
themselves.
When a regex literal is used as a first-class Regex value, operations such
as Regex::execute use first-match semantics: they return the first match
found from the search position. They do not provide a longest-match mode.
Regex Match Expression#
Regex match expressions use the =~ operator to search a StringView with a
regex constant expression. This is a newer regex-matching form intended to
replace the experimental lexmatch. The expression returns Bool.
input =~ re"abc"
input =~ ((PREFIX + SUFFIX) as whole, before=head, after=tail)
input =~ (re"b", before~, after~)
The right-hand side must be a regex constant expression: a regex literal such
as re"abc", a named const, or an expression built from constants with +
(concatenation), | (alternation), and parentheses. Arbitrary runtime values
are not allowed.
Use as to bind the matched substring. Use before and after to bind the
unmatched prefix and suffix as StringView; before~ and after~ are
shorthand forms that bind variables named before and after.
This is separate from regex named capture groups. For example, in
re"(?<id>[0-9]+)", the name id is part of the regex engine's capture
metadata, not a MoonBit binder. If you need a binder in =~, use as, such
as (re"(?<id>[0-9]+)" as digits).
Like is, binders introduced by =~ can be used in the same boolean-flow
contexts, such as the right-hand side of && and the true branch of if.
Regex matching is search-based by default, so "zabc!" =~ re"abc" is true.
Use anchors such as ^ and $ when you need to constrain the match to the
beginning or end of the input.
=~ also uses first-match semantics. It will not support longest-match
behavior.
test {
  let input = " let_name = 42 "
  if (input =~ (
    (REGEX_IDENT_START + REGEX_IDENT_CONT) as ident,
    before=head,
    after=tail
  )) {
    assert_true(head is " ")
    assert_true(ident is "let_name")
    assert_true(tail is " = 42 ")
  } else {
    fail("expected identifier")
  }
  if ("abc" =~ (re"b", before~, after~)) {
    assert_true(before is "a")
    assert_true(after is "c")
  } else {
    fail("expected middle match")
  }
  let source : StringView = "abc"
  if (source =~ (re"." as ch, after=rest)) {
    assert_eq(ch, 'a')
    assert_true(rest is "bc")
  } else {
    fail("expected leading char")
  }
  assert_true("zabc!" =~ re"abc")
  assert_true(!("zabc!" =~ re"^abc"))
}
In the example above, head, ident, tail, before, after, and rest
have type StringView. The binder ch has type Char, because re"."
matches exactly one character.
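The boolean-flow behaviour of =~ binders can also be sketched with two small helper functions. The function names are hypothetical; the sketch relies only on the as binder and on StringView::length, both of which appear elsewhere in this section.

```moonbit
// `digits` is bound in the true branch of `if`.
fn first_number_len(input : StringView) -> Int {
  if input =~ (re"[[:digit:]]+" as digits) {
    digits.length()
  } else {
    0
  }
}

// `digits` is bound on the right-hand side of `&&`.
fn starts_long_number(input : StringView) -> Bool {
  input =~ (re"[[:digit:]]+" as digits) && digits.length() >= 3
}

test {
  assert_eq(first_number_len("ab123cd"), 3)
  assert_eq(first_number_len("abc"), 0)
  assert_true(starts_long_number("ab123cd"))
  assert_true(!starts_long_number("a1b"))
}
```

Because matching is search-based, the first run of digits found in the input is the one that is bound.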
Lexmatch#
Warning
lexmatch and lexmatch? are deprecated. Prefer
regex match expressions (=~) in new code.
This section is kept as reference for existing code.
lexmatch matches a String against a regex pattern and lets you bind the
pieces of a match. The search-mode pattern is (before, regex pieces, after),
where before and after are optional bindings for the unmatched prefix and
suffix, separated by commas. The regex pieces in the middle are separated by
whitespace only. The regex itself is written as a sequence of string literals,
so you can split it across lines or insert comments between parts. You can
also bind a matched sub-pattern using as, such as ("b*" as b).
lexmatch? is a boolean check similar to is, and it can introduce binders
for use in the same contexts as is expressions.
In old code, search-mode lexmatch looked like this:
lexmatch text {
(before, "a" ("b*" as b) "c", after) => ...
_ => ...
}
if text lexmatch? ("a" ("b*" as b) "c") && b.length() > 0 {
...
}
In new code, write those search-mode checks with =~ instead.
lexmatch also supports a lexer-style mode: lexmatch <expr> with longest,
which picks the longest match among alternatives (for example, if|[a-z]*
matches iff as iff in longest mode, while first-match search mode matches
if first). Regex match expressions do not provide this longest-match mode.
Regex literals support \b and \B as part of the regex syntax, but these
word-boundary assertions are not currently available in regex match expression constant contexts. They do work when the regex is used as a
first-class Regex value, and this restriction is expected to be relaxed in
the future. Regex literals also do not support \d, \D, \s, \S, \w,
or \W. Use POSIX character classes like [[:digit:]] inside character
classes instead.
test {
  let text = "xxabbbcyy"
  if text =~ (re"a" + (re"b*" as b) + re"c", before~, after~) {
    inspect(before, content="xx")
    inspect(b, content="bbb")
    inspect(after, content="yy")
  } else {
    fail("")
  }
  if text =~ (re"a" + (re"b*" as b) + re"c") && b.length() > 0 {
    inspect(b, content="bbb")
  }
  let keyword = "iff"
  lexmatch keyword with longest {
    ("if|[a-z]*" as ident) => inspect(ident, content="iff")
    _ => fail("")
  }
}
Spread Operator#
MoonBit provides a spread operator to expand a sequence of elements when
constructing Array, String, and Bytes values using the array literal syntax.
To expand a sequence, prefix it with ..; the sequence must have an
iter() method that yields the corresponding type of element.
For example, we can use the spread operator to construct an array:
test {
let a1 : Array[Int] = [1, 2, 3]
let a2 : FixedArray[Int] = [4, 5, 6]
let a3 : @list.List[Int] = @list.from_array([7, 8, 9])
let a : Array[Int] = [..a1, ..a2, ..a3, 10]
inspect(a, content="[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]")
}
Similarly, we can use the spread operator to construct a string:
test {
let s1 : String = "Hello"
let s2 : StringView = "World".view()
let s3 : Array[Char] = [..s1, ' ', ..s2, '!']
let s : String = [..s1, ' ', ..s2, '!', ..s3]
inspect(s, content="Hello World!Hello World!")
}
The last example shows how the spread operator can be used to construct a bytes sequence.
test {
let b1 : Bytes = b"hello"
let b2 : BytesView = b1[1:4]
let b : Bytes = [..b1, ..b2, 10]
inspect(
b,
content=(
#|b"helloell\x0a"
),
)
}
TODO syntax#
The todo syntax (...) is a special construct used to mark sections of code that are not yet implemented or are placeholders for future functionality. For example:
fn todo_in_func() -> Int {
...
}