YAML, Go-templated YAML, HOCON, HCL(2), GCL, JSON: just some of the common languages used for configuring the world we live in, chosen because they need to be human readable, not just machine readable. At first, it’s all fine, but then, to ensure consistency and reuse, templating features get added, and the templating languages grow more complex as projects demand more.
Some evolve to be Turing complete, some specifically eschew Turing completeness, as they need to ensure that at runtime, nothing bad happens. As the templating advances, things get harder to understand and debug as the configuration language rarely provides sufficient tooling. It can be hard to understand what the result of all the templating actually is. Eventually the templates, and sometimes their consumers, become impenetrable. That’s before you get to some of the odd quirks of some of them, to ensure you get the quoting or indentation correct as you’re templating. Either way, as things scale up, configuration gets messy.
There’s definitely a better way.
The title spoils what I propound, but it’s Object Building Languages (OBLs). If you’ve seen CDK8s, Pystachio, Bazel (sorta, as its intermediate form isn’t really exposed), or others of the sort, you’ll have seen the concepts. For the rest, the idea is that we use a normal programming language to instantiate configuration objects, and then generate the form that’s actually used by whatever the runtime system is.
The advantages are that you can standardize things, get the modularity you need, without having a poorly bolted on programming language to deal with. Not only that, but by picking an existing language as your base language, you get all the standard tooling for that language for debugging, editing, and so forth. Additionally, because the end system isn’t interpreting it directly, you can test the output (e.g. YAML) of the DSL to ensure that it’s actually doing what you think it is. This last one is critical when refactoring larger setups because you can be sure that the output didn’t change.
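The refactoring check can be as small as a golden-file comparison. Here is a minimal sketch, assuming you keep the previously generated YAML in a file and can get the newly generated YAML as a string; the function name check_golden and the file layout are hypothetical, not part of any particular OBL:

```python
import difflib
from pathlib import Path


def check_golden(rendered: str, golden_path: Path) -> str:
    """Return a unified diff between the stored output and the new output.

    An empty string means the refactor did not change the generated config.
    """
    expected = golden_path.read_text()
    diff = difflib.unified_diff(
        expected.splitlines(keepends=True),
        rendered.splitlines(keepends=True),
        fromfile=str(golden_path),
        tofile="rendered",
    )
    return "".join(diff)
```

Run it in CI after every change to the DSL files: a non-empty diff is either the change you intended or a regression, and in both cases you see it before the runtime system does.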
Because it’s also a real programming language, you can add whatever validation you need, enforce any opinions your org has, etc., rather than trying to wedge these into a conventional configuration language that never quite seems to fit.
So when there’s an OBL for your use case, you’d be wise to consider it. But if not, or if you don’t like the OBL that you can find, writing one isn’t that hard. For an example, I’ll build a mini version of a Tekton DSL for a Task, with Python as the base.
As to base language choice, Python has the advantage that just about everyone knows it, as it’s widely used, it has a good debugger, and tooling is plentiful. TypeScript could be a good choice also, but I don’t know enough to be sure. Compiled languages like Go or Rust could probably be used also, but the overhead there is probably more than it’s worth.
In any case, the DSL we’re going through would look like this:
build = Step(
    name = "build",
    image = "ubuntu:20.04",
    script = """
git clone git@github.com:foo/bar
cd bar
go build
""")
Task(
    name = "build-only",
    description = "",
    steps = [build],
)
That would generate our desired output:
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: build-only
spec:
  steps:
  - name: build
    image: ubuntu:20.04
    script: |2
      git clone git@github.com:foo/bar
      cd bar
      go build
Actually, for our Step here, I’d really like to make a standard way of making steps, as they should all use ubuntu:20.04:
def stdstep(name, script):
    return Step(name = name,
                image = "ubuntu:20.04",
                script = script)

build = stdstep(
    name = "build",
    script = """
git clone git@github.com:foo/bar
cd bar
go build
"""
)
A little overkill for our case, but illustrates the point.
Back to the implementation: it’s evident we need these classes:
from ruamel.yaml import YAML, RoundTripRepresenter
import sys

tasks = {}

class Task:
    def __init__(self, name, description, steps):
        self.name = name
        self.description = description
        self.steps = steps
        tasks[self.name] = self

class Step:
    def __init__(self, name, image, script):
        self.name = name
        self.image = image
        self.script = script
The tasks dictionary might be a bit unexpected, but you need a way to figure out ultimately what to output. You could take the Bazel approach and do everything by name, and it wouldn’t be wrong, but it is more complex, as you may have to deal with namespacing and other issues that Python’s code namespacing deals with all by itself.
With those in place, it’s just a matter of executing the configuration script:
ns = {
    "Task": Task,
    "Step": Step,
}
f = open(sys.argv[1]).read()
## compile it so filename info will be retained and reported in the event of error
co = compile(f, sys.argv[1], 'exec')
exec(co, ns)  # run the compiled code object, not the raw text
vs = [v.dict() for v in tasks.values()]
y = YAML()
y.dump_all(vs, stream=sys.stdout)
For the simple case, that’s all you need: you run your DSL runner against the configuration file and voilà, properly formatted YAML. Oh, not quite, we never implemented the dict() methods it calls.
class Task:
    ...
    def dict(self):
        return {
            "apiVersion": "tekton.dev/v1beta1",
            "kind": "Task",
            "metadata": {"name": self.name},
            "spec": {
                # self.steps holds Step objects, so call dict() on each directly
                "steps": [step.dict() for step in self.steps]
            }
        }

class Step:
    ...
    def dict(self):
        return {
            "name": self.name,
            "image": self.image,
            "script": self.script
        }
Ok, now you can run the dslrunner and get output. The downside here as the DSL maintainer is that you need to make sure your DSL exposes the necessary parts of the underlying YAML schema, but it’s only occasional toil after initially writing it, as your users will let you know if there’s something they need that you don’t expose. Or they may shoot you a PR because, looking at the runner, it’s very clear what needs to happen. You could cheese out and add a **kwargs to the constructor and merge that dict into what the dict() method returns, but I don’t recommend it, as people can then specify incorrect attributes on the objects and nothing will tell them.
If you have some sort of schema file for your output form, it can be a decent idea to write something to translate from the schema to the Python classes you’d require, rather than having to write these out by hand.
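As a rough illustration of that translation, here is a minimal sketch that builds DSL classes from a schema at startup. The SCHEMA format (class name mapped to a list of required field names) is a hypothetical, heavily simplified stand-in for a real schema file, and make_class is an illustrative helper, not part of the runner above:

```python
def make_class(kind, fields):
    """Build a DSL class whose constructor accepts exactly `fields`."""

    def __init__(self, **kwargs):
        unknown = set(kwargs) - set(fields)
        missing = set(fields) - set(kwargs)
        if unknown or missing:
            raise TypeError(
                f"{kind}: unknown={sorted(unknown)} missing={sorted(missing)}"
            )
        for key, value in kwargs.items():
            setattr(self, key, value)

    def to_dict(self):
        # Emit the fields in schema order
        return {name: getattr(self, name) for name in fields}

    return type(kind, (), {"__init__": __init__, "dict": to_dict})


# Hypothetical simplified schema: class name -> required field names
SCHEMA = {"Step": ["name", "image", "script"]}

classes = {kind: make_class(kind, fields) for kind, fields in SCHEMA.items()}
Step = classes["Step"]
```

Unlike the **kwargs shortcut, a misspelled attribute is rejected here, because the accepted names come from the schema. A real schema would also carry types, optional fields, and nesting, which this sketch ignores.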
First time right considerations
If you choose Python as your base DSL, there are some things you should consider at the beginning, as tightening this kind of thing down after the fact can be difficult/practically impossible:
Overriding the __import__ hook
Allowing arbitrary imports can lead to … unexpected behavior, especially depending on exactly when evaluation of the DSL occurs, vs. when it’s consumed. So you may want to turn off import entirely, or alternatively, have an allowlist of acceptable things users are allowed to import, only adding to it when you’ve considered the use case. Another way to deal with this is to import the modules you wish to expose and add them to the top level namespace, or make an import-like function to load them in from where you’ve imported them already in the runner.
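The allowlist variant can be sketched as follows. This relies on the fact that the import statement looks up __import__ in the builtins of the namespace the code runs in, so supplying your own __builtins__ dict to exec lets you swap the hook. The ALLOWED_MODULES set and the function names are assumptions for illustration:

```python
import builtins

ALLOWED_MODULES = {"math", "json"}  # hypothetical allowlist


def restricted_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Delegate to the real import only for allowlisted top-level modules."""
    if level != 0 or name.split(".")[0] not in ALLOWED_MODULES:
        raise ImportError(f"import of {name!r} is not allowed in DSL scripts")
    return builtins.__import__(name, globals, locals, fromlist, level)


def make_ns():
    # Copy the real builtins, then swap in the restricted import hook
    safe_builtins = dict(vars(builtins))
    safe_builtins["__import__"] = restricted_import
    return {"__builtins__": safe_builtins}


ns = make_ns()
exec("import math\nroot = math.sqrt(16)", ns)  # allowed by the allowlist
```

This guards against accidents rather than determined attackers: it is a guard rail for config authors, not a sandbox.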
If you don’t allow import, you’ll still need a way to include things from other files if you want the templating and consistency that caused you to set out to do this in the first place. Because you can control the namespace that the DSL scripts run in, you could implement something like this:
def make_ns():
    new_ns = {
        "Task": Task,
        "Step": Step,
    }
    new_ns['load'] = lambda spec, *names: load(new_ns, spec, names)
    return new_ns

ns = make_ns() # the top level DSL script namespace

def load(ns, some_kind_of_path_specifier, list_of_names):
    #get script_text however you choose (get_the_script is a placeholder)
    script_text = get_the_script(some_kind_of_path_specifier)
    load_ns = make_ns() # make a new namespace
    #use the specifier as the filename so errors point at the loaded file
    co = compile(script_text, some_kind_of_path_specifier, 'exec')
    exec(co, load_ns) #exec the script in the new namespace
    #extract the names from the loaded namespace into the caller's namespace
    for name in list_of_names:
        ns[name] = load_ns[name]
This would enable us to put stdstep into a separate file. For the current example, we’ll put it in common.in and then, in the top level DSL file,
load("common.in", "stdstep")
and the rest continues as before. Not shown in the example, but you’d want loads to be relative to the file loading them, so you’d need to add bookkeeping to track that and do the right thing.
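One way to do that bookkeeping is to give every namespace the directory of the file it belongs to. A minimal sketch, with Task and Step left out of the namespace for brevity and the run helper being a hypothetical entry point:

```python
import os


def make_ns(base_dir):
    """Namespace for one DSL file; load resolves paths against base_dir."""
    ns = {}
    ns["load"] = lambda spec, *names: load(ns, base_dir, spec, names)
    return ns


def load(ns, base_dir, spec, names):
    path = os.path.normpath(os.path.join(base_dir, spec))
    with open(path) as fh:
        code = compile(fh.read(), path, "exec")
    # The loaded file gets its own directory as the base for nested loads
    load_ns = make_ns(os.path.dirname(path))
    exec(code, load_ns)
    for name in names:
        ns[name] = load_ns[name]


def run(path):
    """Execute a top level DSL file and return its namespace."""
    path = os.path.abspath(path)
    ns = make_ns(os.path.dirname(path))
    with open(path) as fh:
        exec(compile(fh.read(), path, "exec"), ns)
    return ns
```

Because the base directory is captured when each namespace is created, a file in lib/ that loads "b.in" gets lib/b.in no matter where the top level file lives.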
Depending on your environment, you might want something more than just a file path as the path specifier. For example, / might not be / in the filesystem, but relative to some other root. You could have it be a URL, or even use Bazel-style targets as the specifier, as long as you know how to retrieve the configuration stored there.
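For the rooted-path case, a resolver might look like the sketch below. The convention (a leading / means relative to a configuration root) and the refusal to leave that root are assumptions for illustration; the resolve function is hypothetical:

```python
import posixpath


def resolve(spec, root, current_dir):
    """Map a DSL path specifier to a path under `root`.

    Specifiers starting with '/' are relative to `root`; anything else is
    relative to the directory of the file doing the load.
    """
    if spec.startswith("/"):
        path = posixpath.join(root, spec.lstrip("/"))
    else:
        path = posixpath.join(current_dir, spec)
    path = posixpath.normpath(path)
    # Refuse specifiers that climb out of the configuration root
    if path != root and not path.startswith(root.rstrip("/") + "/"):
        raise ValueError(f"{spec!r} escapes the configuration root")
    return path
```

For example, with a root of /srv/config, the specifier /common/steps.in resolves to /srv/config/common/steps.in regardless of which file asked for it.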
Consider what’s in __builtins__
The __builtins__ namespace has some things in there you might want to disallow also. open, compile, eval, and exec should be at the top of your list, but examine the rest to see what else to consider. Things that reach outside the interpreter are the things you want to be wary of.
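A minimal sketch of trimming the builtins, assuming a blocklist of just the four names mentioned above (what else belongs on the list is a judgment call for your environment):

```python
import builtins

# The four names called out above; extend after reviewing dir(builtins)
BLOCKED = {"open", "compile", "eval", "exec"}


def make_builtins():
    """Copy of the real builtins with the blocked names removed."""
    return {k: v for k, v in vars(builtins).items() if k not in BLOCKED}


ns = {"__builtins__": make_builtins()}
exec("total = sum(range(4))", ns)  # ordinary builtins still work
```

A script that calls open() in this namespace fails with a NameError, which at least tells the author that the function is not part of the DSL.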
How Paranoid?
You could also consider disallowing use of while loops or try/except or other constructs, if you were so inclined, by examining the AST of the source files before executing them. There’s the ast module which can help here.
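As a sketch of what that check could look like, the function below parses the source, rejects a few node types, and only then compiles it. The FORBIDDEN tuple and the check_source name are illustrative choices:

```python
import ast

# Node types to reject; TryStar (try/except*) only exists on newer Pythons
FORBIDDEN = (ast.While, ast.Try, getattr(ast, "TryStar", ast.Try))


def check_source(source, filename):
    """Parse `source`, raise SyntaxError on a forbidden construct, else compile."""
    tree = ast.parse(source, filename=filename)
    for node in ast.walk(tree):
        if isinstance(node, FORBIDDEN):
            raise SyntaxError(
                f"{type(node).__name__} is not allowed in DSL scripts",
                (filename, node.lineno, node.col_offset + 1, None),
            )
    return compile(tree, filename, "exec")
```

Raising SyntaxError with the file name and line number means the user sees the rejection in the same shape as any other syntax error in their DSL file.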
Mistakes
Obviously, people make mistakes. Something you can add to catch some of them is argument validation in the object constructors. Validation errors are one kind, but syntax errors are another. One nice thing about this implementation is that you get usable errors for the mere cost of explicitly compiling the script before execing it.
One example: a syntax error in a loaded file:
Traceback (most recent call last):
  File "/Users/drew/devel/dsl-py/dslrunner.py", line 63, in <module>
    exec(co, ns)
  File "DSL3.in", line 3, in <module>
    load("common.in", "stdstep")
  File "/Users/drew/devel/dsl-py/dslrunner.py", line 46, in <lambda>
    new_ns['load'] = lambda spec, names: load(new_ns, spec, names)
  File "/Users/drew/devel/dsl-py/dslrunner.py", line 53, in load
    co = compile(script_text, path, 'exec')
  File "common.in", line 2
    Step(name = name,
        ^
SyntaxError: '(' was never closed
Another example: if I put a validation in the Task constructor to check that the name isn’t None, and I change the input to pass None for the name, you get a usable error:
Traceback (most recent call last):
  File "/Users/drew/devel/dsl-py/dslrunner.py", line 63, in <module>
    exec(co, ns)
  File "DSL3.in", line 20, in <module>
    Task(
  File "/Users/drew/devel/dsl-py/dslrunner.py", line 11, in __init__
    raise ValueError("name cannot be None")
ValueError: name cannot be None
While there is some noise from the dslrunner in the tracebacks, the error points in the DSL code are clearly delineated.
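Constructor validation along those lines might look like the sketch below. Only the None check comes from the traceback above; the name pattern (lowercase alphanumerics and hyphens, in the style of Kubernetes object names) and the non-empty steps rule are assumptions added for illustration:

```python
import re

# Assumption: names are lowercase alphanumerics and '-', starting and
# ending with an alphanumeric; adjust to whatever your org requires
NAME_RE = re.compile(r"^[a-z0-9]([-a-z0-9]*[a-z0-9])?$")


class Task:
    def __init__(self, name, description, steps):
        if name is None:
            raise ValueError("name cannot be None")
        if not NAME_RE.match(name):
            raise ValueError(f"invalid task name: {name!r}")
        if not steps:
            raise ValueError(f"task {name!r} needs at least one step")
        self.name = name
        self.description = description
        self.steps = steps
```

Because the exception is raised inside the constructor, the traceback points at the exact Task(...) call in the DSL file that passed the bad value.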
As things get built out, you might consider a traceback filter to strip out the noisy parts and make things easier for your users.
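Such a filter can be built on the standard traceback module. A minimal sketch, assuming the runner knows which file names count as noise (in the runner itself that would likely be its own __file__); the format_dsl_error name is hypothetical:

```python
import traceback


def format_dsl_error(exc, hidden_files):
    """Format `exc`, dropping stack frames that come from `hidden_files`."""
    frames = [
        frame
        for frame in traceback.extract_tb(exc.__traceback__)
        if frame.filename not in hidden_files
    ]
    lines = ["Traceback (most recent call last):\n"]
    lines += traceback.format_list(frames)
    lines += traceback.format_exception_only(type(exc), exc)
    return "".join(lines)
```

The runner would wrap its top level exec in a try/except, print the result of format_dsl_error to stderr, and exit non-zero. Keeping a flag that shows the unfiltered traceback is worth considering for the times when the bug is in the runner.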
Summary
As configuration languages evolve, they don’t evolve especially well. They evolve into complexity, they’re hard to refactor safely, and they get hard to debug because they don’t have the tooling that more common languages like Python have. With a little code, most of it the objects you expose, you can write a flexible DSL preprocessor that can evolve much better, as you have the language tooling available to help.
The full dslrunner.py from this post:
from ruamel.yaml import YAML, RoundTripRepresenter
import sys

tasks = {}
steps = {}

class Task:
    def __init__(self, name, description, steps) -> None:
        self.name = name
        self.description = description
        self.steps = steps
        tasks[self.name] = self

    def dict(self):
        return {
            "apiVersion": "tekton.dev/v1beta1",
            "kind": "Task",
            "metadata": {"name": self.name},
            "spec": {
                # self.steps holds Step objects, so call dict() on each directly
                "steps": [step.dict() for step in self.steps]
            }
        }

class Step:
    def __init__(self, name, image, script) -> None:
        self.name = name
        self.image = image
        self.script = script
        steps[self.name] = self

    def dict(self):
        return {
            "name": self.name,
            "image": self.image,
            "script": self.script
        }

def make_ns():
    new_ns = {
        "Task": Task,
        "Step": Step,
    }
    new_ns['load'] = lambda spec, *names: load(new_ns, spec, names)
    return new_ns

def load(ns, path, list_of_names):
    script_text = open(path).read() #this is whatever you choose
    load_ns = make_ns() # make a new namespace
    co = compile(script_text, path, 'exec')
    exec(co, load_ns) #exec the script in the new namespace
    for name in list_of_names: #extract the names from the loaded namespace into the caller's namespace
        ns[name] = load_ns[name]

ns = make_ns()
f = open(sys.argv[1]).read()
## compile it so filename info will be retained in the event of error
co = compile(f, sys.argv[1], 'exec')
exec(co, ns)
vs = [v.dict() for v in tasks.values()]

# make it so the YAML output looks closer to the way a human would write it
def repr_str(dumper: RoundTripRepresenter, data: str):
    if '\n' in data:
        return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='|')
    return dumper.represent_scalar('tag:yaml.org,2002:str', data)

y = YAML()
y.representer.add_representer(str, repr_str)
y.dump_all(vs, stream = sys.stdout)