Manifest in Kumori: the CUE language

Kumori generic manifests

In Kumori, an artifact of the model is represented by a manifest with the following structure:

#Artifact: {
	spec: #Version      // Version of the spec being used
	ref:  #ManifestRef  // Reference to the element being defined
	description: _      // representation of the element
}

#Version: 3 * [uint]  // Tuple with three uints, [major, minor, patch]

#ManifestRef: {
	version: *[1, 0, 0] | #Version // Version of the element being defined
	domain:  string                // Domain owner of the definition
	module:  string                // The rest of the module name, after the domain name
	name:    string                // name of the artifact being described
	kind:    string                // kind of artifact: service/component
}

The structure is formally expressed using the CUE specification language, a superset of JSON, which we briefly present below.

In essence, every artifact’s manifest must specify what manifest spec version it is using (in the spec field), and it must provide a reference that identifies it in the field ref.

Such a reference includes a name for the artifact, a version, and a domain name to avoid name collisions. Artifact manifests can be distributed within Kumori modules. A Kumori module is immutable, so it does not matter where it is stored.

Immutability is enforced by shasum verification in our xref-deprecated:developers:index.adoc[artifact management mechanism].

The description field for a particular artifact depends on its kind. Currently only two kinds are supported: component and service.

Builtins

In addition, Kumori Platform introduces the notion of a builtin artifact, whose manifest describes only those aspects of its specification that affect interfacing with it (config and srv).

A builtin artifact has the field description: builtin: true.

A builtin service application does not describe any role or connections. It just describes its channels and config.

Likewise, a builtin component forgoes all description of its code.
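
As a rough sketch of what this could look like, the following CUE snippet declares a builtin component against the #Artifact structure shown earlier. The schema is repeated so that the snippet stands alone. The field name storage and the spec, domain, module and name values are made up for illustration, and the config and srv parts of a real builtin description are left out because their exact shape is not covered on this page.

```cue
// Schema repeated from the beginning of this page.
#Version: [uint, uint, uint] // same shape as 3 * [uint], written out

#ManifestRef: {
	version: *[1, 0, 0] | #Version
	domain:  string
	module:  string
	name:    string
	kind:    string
}

#Artifact: {
	spec:        #Version
	ref:         #ManifestRef
	description: _
}

// Hypothetical builtin component; every concrete value is illustrative.
storage: #Artifact & {
	spec: [1, 0, 0]
	ref: {
		// version is omitted, so the default [1, 0, 0] applies
		domain: "example.com"
		module: "builtins"
		name:   "storage"
		kind:   "component"
	}
	description: builtin: true
}
```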

Manifest storage and distribution

Kumori’s artifact manifests could (and probably should) reside in Git repositories. A Git commit hash could be used to fully identify a version of a Kumori element manifest. As explained above, it is not important where a concrete manifest is stored, as any given version of a manifest is immutable.

Keeping manifests in a Git repository allows for manifest definitions of arbitrary complexity, including sets of manifests that form part of the same project.

As Kumori artifact manifests are designed for portability and reuse, Kumori implements its own distribution mechanism, mediated by a variety of potential registry engines and supported by our own xref-deprecated:developers:index.adoc[tools].

Kumori requires that any element manifest be formulated as a CUE package exposing a top-level #Artifact definition.

As we will show later, CUE has its own way of organizing CUE code, and Kumori manifests piggyback on it, supplying extra conventions, where standard CUE falls short, to ensure the properties we need.

A (very) brief primer on CUE

Kumori uses the CUE language for all its specifications. CUE is already being used around the Kubernetes ecosystem of software (e.g., Istio).

Nevertheless, the CUE language is at an early stage of development (version 0.4.1 as of this writing), and some of its features need refinement, are incomplete, or still exhibit buggy behavior.

We expect a fairly high rate of new releases at this stage, which may also affect Kumori’s tooling and even the formal manifest specification.

To avoid problems with the versions of the tools being used, Kumori maintains its own builds of the CUE libraries and CLI.

It is out of scope to provide a tutorial on CUE in these pages. Instead we will introduce some basic concepts, leaving others out until we need them.

CUE’s Types and values

Much as in JSON, CUE handles structures, ultimately built out of atomic types (int, string, number, bool, null) and arrays. However, unlike JSON, CUE fields can be given types (e.g., string, int), and, as a matter of fact, a concrete value (e.g., "my name" or 4) is also considered a type (possibly a subset of a wider type).

It works as a lattice of types, where types are related by a subsumption relationship (think of it as set inclusion): a type A is subsumed by another type B if the set of values of A is a subset of the set of values of B.

By way of example, we show a valid CUE file:

CUE is a superset of JSON

ob1: "something"
ob2: string
"ob3": 45
ob4: uint

We see that some fields are acceptable JSON expressions (except that the field name need not be quoted), but others (ob2: string) clearly fall outside of what can be expressed with JSON. A peculiarity of CUE is that types are values and values are types: there is no distinction between them. In essence, this means that a concrete value forms the trivial type containing just that value.

In the above example, ob1’s type is the set of strings containing only "something", whereas the type of ob2 is the set of all possible strings.
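
To make the superset relationship more tangible, here is a small sketch in which the same data is written once in plain JSON syntax and once in lighter CUE syntax. The field names (jsonStyle, cueStyle, name, replicas, owner) are illustrative only.

```cue
// Plain JSON syntax (quoted names, commas) is accepted as-is.
"jsonStyle": {
	"name":     "demo",
	"replicas": 3
}

// The same data in lighter CUE syntax, plus a field that no JSON
// document could hold: its value is a type, not a concrete value.
cueStyle: {
	name:     "demo"
	replicas: 3
	owner:    string
}
```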

Type union

We can extend the expressiveness of CUE by defining types that are the union of other types. This is done using the | operator:

ob1: "something" | "another"
ob2: string | uint

In the above, ob1 can take one of the two string values provided. ob2 can take either a string or a non-negative integer.

In that CUE file, we can also assert that ob1’s type is included within ob2’s type.

The operation represented by | is known as the disjunction of two types, and it is a binary, commutative, and associative operation.

Type unification

Analogously to type disjunction, in CUE we can also express the unification (&) of two types, like so:

ob1: {
  a: string
  b: int
}
ob2: {
  a: "something"
}

ob3: ob1 & ob2

The way to understand the above CUE code is to compute the type of ob3 as the largest type that is included both in ob1’s type and in ob2’s type.

By largest we mean that there is no other type subsumed by both of the original types (ob1, ob2) that also subsumes the resulting type (see the CUE language spec for a deeper explanation).

As with disjunction, unification is also a commutative and associative operation.

In the above example, ob3’s resulting type would be:

ob3: {
  a: "something"
  b: int
}

In Kumori we make extensive use of both unification and disjunction to define the schemas that manifests must follow to represent the various elements of our Service Model. Of those two operations, unification (&) is the most powerful when trying to avoid boilerplate in configurations, and deserves further discussion.
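
The following sketch extends the ob1/ob2/ob3 example to show commutativity and the way the two operations combine. The fields ob4, kind, chosen and id are added purely for illustration and are not part of any Kumori schema.

```cue
ob1: {
	a: string
	b: int
}
ob2: {
	a: "something"
}

ob3: ob1 & ob2
ob4: ob2 & ob1 // commutativity: evaluates to the same struct as ob3

// A disjunction used as an enumeration of admissible values.
kind: "service" | "component"

// Unifying with one of the alternatives selects it.
chosen: kind & "service" // "service"

// Unification discards the alternatives that do not fit.
id: (string | uint) & 7 // 7
```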

In our previous example we explicitly used the unification operator. However, we could achieve the same unification operation this way:

ob1: {
  a: string
  b: int
}
ob2: {
  a: "something"
}

ob3: ob1
ob3: ob2

In this example, ob3 appears to be defined twice! This is not an error: in CUE, annotating a field with a type simply adds a constraint to that field. The set of constraints given to a field is then unified (using the unification operation shown above).

The way to interpret the above code is then as follows:

The first appearance of ob3 imposes the restriction that ob3 must have the "largest" type subsumed by ob1’s type
The second appearance of ob3 imposes the further restriction that ob3 must have the largest type subsumed by ob2’s type

Adding up both restrictions for ob3, we can satisfy them unambiguously by obtaining the largest type that is subsumed by ob1’s type and by ob2’s type, that is, ob1 & ob2.
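
The same accumulation of constraints works for any number of declarations of a field. As a minimal sketch (the field name port and its bounds are illustrative only):

```cue
port: int   // must be an integer
port: >1024 // and greater than 1024
port: <65536 // and smaller than 65536
port: 8080  // a concrete value that satisfies every constraint above

// Adding a further declaration such as
//   port: 80
// would be an error, as 80 conflicts with both >1024 and 8080.
```

The order of the declarations does not matter, since unification is commutative and associative.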

Using this logic, it is easy to provide a type definition with restrictions on its fields that can later be completed by unifying it with concrete values.

Example forcing constraints

def1: {
  size: > 0 & < 100
  age: > 18
}

ob1: def1 & {size: 0} // Error! conflicting constraints for the size field
ob2: def1 & {size: 1} // OK

The above example shows that a constraint declared on a struct cannot be violated by later unifications.

Example boilerplate removal with defaults

def1: {
  size: > 0 & < 100
  age: > 18 | * 18
}

ob: def1 & {size: 1} // OK

In the above example, ob is unified to

{
  size: 1
  age: > 18 | * 18
}

This means that ob can be exported as {size: 1, age: 18} right away, removing the need to fully specify the age field for ob.
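
A default only applies when nothing more specific is provided. The sketch below repeats def1 and adds a second, illustrative field (ob2) that supplies its own age:

```cue
def1: {
	size: >0 & <100
	age:  >18 | *18
}

ob:  def1 & {size: 1}          // age falls back to the default, 18
ob2: def1 & {size: 1, age: 30} // an explicit value wins: age is 30
```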

CUE definitions

So far, we have used normal fields in all our examples.

There is another kind of field that can be expressed in CUE: definitions. A definition field must have a name starting with the character #.

Like normal fields, definition fields introduce constraints. However, they have two additional peculiarities:

They are not exported (more about exporting later) to JSON (or YAML).
They define closed structures, that is, structures that forbid adding new fields when later unified.

It is useful to understand the concept of closedness to avoid mistakes later on. Closedness enables the definition of stricter constraints, as it keeps users of specifications from adding fields to structures that should not have them.
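
A minimal sketch of closedness, comparing a definition with a regular struct that has the same fields (all names here are illustrative):

```cue
// Definition: closed, so no fields beyond x and y may be added.
#Point: {
	x: int
	y: int
}

// Regular struct with the same fields: open by default.
openPoint: {
	x: int
	y: int
}

p1: #Point & {x: 1, y: 2}          // OK
p2: openPoint & {x: 1, y: 2, z: 3} // OK: open structs accept new fields

// Uncommenting the next line makes evaluation fail, because z is not
// allowed in the closed #Point:
// p3: #Point & {x: 1, y: 2, z: 3}
```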

Exporting values

The cue tool can produce data output from CUE code. By data we mean JSON (or YAML) compliant output. That is, given a field, it produces its JSON (YAML) representation.

When exporting, only "normal" fields are placed in the output. Definitions are ignored.

The Kumori toolset makes use of this property to produce the actual JSON manifests it uses to deploy services.

Non-visible fields

Fields whose names start with an underscore (_) are hidden: they can be referenced from other CUE expressions, but they are not visible when exporting.
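
The following sketch puts the three kinds of fields side by side. The file name export.cue and all field names are hypothetical.

```cue
// export.cue

// Definition: used as a schema, never exported.
#Limits: {
	max: int
}

// Hidden field: usable in expressions, never exported.
_base: 5

// Regular field: the only one that reaches the output.
limits: #Limits & {max: _base * 2}
```

Running cue export export.cue on this file should print only the regular field:

```json
{
    "limits": {
        "max": 10
    }
}
```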