Langium: A New Toolkit That Simplifies Textual Language Creation

Monday, May 30, 2022 - 06:00

Langium is a new toolkit for textual languages that is built on the TypeScript programming language. The idea was to keep the concepts and features that made Eclipse Xtext successful, and to lift them onto a new technological foundation by relying completely on TypeScript, which has become the de facto standard for cloud-based developer tools.

The conceptual similarity makes it easy for developers who are familiar with Xtext to migrate to Langium:

A grammar language is provided to specify the syntax and structure of your domain-specific language (DSL).
Cross-references have first-class support with built-in scoping and indexing.
Dependency injection is used to wire everything together and customize on a fine-grained level.

We also simplified areas where we saw potential for reducing complexity. Most importantly, Langium relies solely on plain TypeScript types. The grammar rules of your language are mapped to JavaScript objects and accompanied by generated TypeScript interface declarations. Although this restricts direct integration with existing modeling tools, it is a great simplification for the majority of use cases.

Langium Creates a Compatible Development Environment for DSL Architectures

Before I provide more insight into how Langium works, here’s a closer look at why it’s needed.

In a presentation at EclipseCon 2018, Jan Köhnlein and I showed an architecture for a DSL with automatically generated graphical views in a web IDE (see the recording on YouTube). The example featured a state machine DSL with rich text editor support, side-by-side with a graphical representation of the states and transitions that are specified in text (Figure 1).

Figure 1: Two Representations of a State Machine DSL

The underlying frameworks in that architectural solution are Eclipse Xtext for the DSL, Eclipse Sprotty for the diagrams, and Eclipse Theia for the web IDE. These technologies work together well.

But the solution has a flaw: The language server is based on Java, while the rest of the application is written in TypeScript. Because the two development environments are incompatible, the software engineers implementing the solution must resolve the associated issues, potentially increasing the cost of developing and maintaining the application.

Hence the need for a more coherent technology stack and a new toolkit for textual languages: Langium. Here’s how it works.

Langium Grammar Is Built on Three Rules

A brief example of a Langium grammar declaration is shown below. It contains three grammar rules:

Statemachine
State
Transition

The first rule is marked with the keyword entry to indicate that the parser should start processing input documents with this rule. The rule definitions consist of keywords like 'state', property assignments like name=ID, and cross-references like [State].

grammar States

entry Statemachine:
    'statemachine' name=ID
    states+=State*;

State:
    'state' name=ID
    transitions+=Transition*;

Transition:
    '=>' state=[State];

Type Declarations Use Syntax Similar to TypeScript

For a given text document, Langium creates a data structure called an Abstract Syntax Tree (AST). When a grammar is written as in the example above, every grammar rule invocation leads to a corresponding node (a JavaScript object) in the AST, and Langium generates a TypeScript interface for every rule to provide static typing for these nodes. The interface properties are inferred from the assignments found in the respective rules. This approach is great for rapid prototyping, when the focus is on designing the syntax of a new language.
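To make the mapping from rules to nodes concrete, here is a rough sketch of what inferred interfaces for the grammar above could look like, together with a hand-built AST for a two-state machine. The `AstNode` and `Reference` types are simplified stand-ins declared locally for illustration; Langium ships its own, richer versions, and the code it actually generates will differ in detail.

```typescript
// Simplified stand-ins for Langium's core types (assumptions for illustration only).
interface AstNode {
    readonly $type: string;
}

interface Reference<T extends AstNode> {
    readonly $refText: string; // the identifier as written in the document
    readonly ref?: T;          // the resolved target, if linking succeeded
}

// One interface per grammar rule; properties follow the assignments in the rule.
interface Statemachine extends AstNode {
    readonly $type: 'Statemachine';
    name: string;              // from name=ID
    states: State[];           // from states+=State*
}

interface State extends AstNode {
    readonly $type: 'State';
    name: string;              // from name=ID
    transitions: Transition[]; // from transitions+=Transition*
}

interface Transition extends AstNode {
    readonly $type: 'Transition';
    state: Reference<State>;   // from state=[State]
}

// Hand-built AST for a document such as:
//   statemachine TrafficLight
//   state Red => Green
//   state Green => Red
const red: State = { $type: 'State', name: 'Red', transitions: [] };
const green: State = { $type: 'State', name: 'Green', transitions: [] };
red.transitions.push({ $type: 'Transition', state: { $refText: 'Green', ref: green } });
green.transitions.push({ $type: 'Transition', state: { $refText: 'Red', ref: red } });

const machine: Statemachine = {
    $type: 'Statemachine',
    name: 'TrafficLight',
    states: [red, green]
};

// Following a cross-reference is a plain property access on the node.
function successors(state: State): string[] {
    return state.transitions.map(
        t => t.state.ref?.name ?? `<unresolved ${t.state.$refText}>`
    );
}
```

In a real project the parser and linker would build these objects from the text; the point here is only that the result is ordinary, statically typed JavaScript data.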

However, more mature language projects should declare the AST types explicitly to avoid accidental changes, as large parts of the language-specific code — validation checks, code generator, and others — will depend on these types. In this case, types can be declared within the grammar language with a syntax that is very similar to TypeScript.

The code below shows type declarations for the “States” grammar. The only difference from TypeScript syntax is the reference notation @State, which denotes a cross-reference to a State.

interface Statemachine {
    name: string
    states: State[]
}

interface State {
    name: string
    transitions: Transition[]
}

interface Transition {
    state: @State
}
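Language-specific code that depends on these types, such as a validation check, is ordinary TypeScript. The following minimal sketch mirrors the declarations as plain interfaces, with a simplified `Reference` type standing in for the `@State` notation, and reports duplicate state names and transitions whose target could not be resolved. The function name `checkStatemachine` and the message format are hypothetical; Langium's own validation API is deliberately left out so that the example stays self-contained.

```typescript
// Simplified stand-in for a cross-reference (assumption for illustration only).
interface Reference<T> {
    $refText: string; // the name as written in the document
    ref?: T;          // undefined when the name could not be resolved
}

interface Statemachine {
    name: string;
    states: State[];
}

interface State {
    name: string;
    transitions: Transition[];
}

interface Transition {
    state: Reference<State>;
}

// Hypothetical check: returns one human-readable message per problem found.
function checkStatemachine(machine: Statemachine): string[] {
    const problems: string[] = [];
    const seen = new Set<string>();
    for (const state of machine.states) {
        if (seen.has(state.name)) {
            problems.push(`Duplicate state name '${state.name}'.`);
        }
        seen.add(state.name);
        for (const transition of state.transitions) {
            if (transition.state.ref === undefined) {
                problems.push(
                    `State '${state.name}' has a transition to unknown state '${transition.state.$refText}'.`
                );
            }
        }
    }
    return problems;
}

// Usage: two states share a name, and one transition points nowhere.
const first: State = { name: 'Idle', transitions: [{ state: { $refText: 'Missing' } }] };
const second: State = { name: 'Idle', transitions: [{ state: { $refText: 'Idle', ref: first } }] };
const findings = checkStatemachine({ name: 'Demo', states: [first, second] });
console.log(findings.join('\n'));
```

Because the check is written against the declared types, renaming a property in the grammar shows up as a compile error here, which is the stability argument made above.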

Langium and Eclipse Sprotty Are a Powerful Combination

Combining Langium and Sprotty in a VS Code extension creates a powerful text editor and graphical view for your DSL. You can embed this extension in a web IDE, based on Theia or VS Code, or publish it to the Open VSX Registry so others can benefit from it. Because the whole project is built with TypeScript, it has a consistent code base that can be compiled, tested, and packaged using a single set of tools. The result is a smooth onboarding experience for developers joining the project, along with long-term maintainability.

The diagram models rendered in the Sprotty frontend are backed by a diagram server. For the integration with Langium, the diagram server is included within the language server implementation, so it can be regarded as a “graphics extension” of the language server. The Language Server Protocol (LSP) that connects the editor with the language server is based on the extensible JSON-RPC protocol. Messages between the diagram server and the graphical view in the frontend are transported using that protocol with additional JSON-RPC methods (Figure 2).
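As a rough illustration of what "additional JSON-RPC methods" means on the wire, the sketch below builds a JSON-RPC 2.0 notification that carries a diagram action and frames it with the `Content-Length` header used by the LSP base protocol. The method name `diagram/accept`, the payload shape, and the client ID are assumptions modeled on how Sprotty integrations typically exchange actions; check the sprotty-vscode sources for the exact names before relying on them.

```typescript
// A JSON-RPC 2.0 notification: like a request, but without an "id",
// so the receiver does not send a response.
interface JsonRpcNotification<P> {
    jsonrpc: '2.0';
    method: string;
    params: P;
}

// Hypothetical payload: one diagram action addressed to one diagram client.
interface DiagramActionMessage {
    clientId: string;
    action: { kind: string; [key: string]: unknown };
}

// Wraps an action in a notification using an assumed custom method name.
function createDiagramNotification(
    message: DiagramActionMessage
): JsonRpcNotification<DiagramActionMessage> {
    return { jsonrpc: '2.0', method: 'diagram/accept', params: message };
}

// LSP base protocol framing: a header with the UTF-8 byte length of the body,
// a blank line, then the JSON body.
function frame(message: object): string {
    const body = JSON.stringify(message);
    const byteLength = new TextEncoder().encode(body).length;
    return `Content-Length: ${byteLength}\r\n\r\n${body}`;
}

// Usage: ask the server for the diagram model of a (made-up) client ID.
const notification = createDiagramNotification({
    clientId: 'states-diagram-1',
    action: { kind: 'requestModel' }
});
const framed = frame(notification);
console.log(framed);
```

Because standard LSP messages and the diagram messages share one framing and one connection, no second channel between the editor and the language server is needed.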

Figure 2: IDE-Language Server Communications in Langium

Use a VS Code Extension to Integrate Language Servers

While Theia generally offers a deep API for customizing every part of your application, its @theia/languages package was deprecated in version 1.4.0 in July 2020. Language servers can no longer be integrated using the Theia API, but only within a VS Code extension. As a result, the Sprotty-based diagrams must be embedded using the VS Code Webview API.

A Webview is an isolated web application embedded in its host IDE using an iframe, the HTML mechanism for embedding one page in another. For tool developers, the isolation of graphical views and the communication channels it requires increase complexity. The benefit for users is higher security, because the content of a Webview cannot interfere with the rest of your IDE.

The sprotty-vscode integration that is part of the Eclipse Sprotty project mitigates the complexity of Webviews by taking care of many technical details. The repository also contains the state machine example, of which an excerpt is shown in this article.

The state machine example is currently available with two language server implementations: one based on Xtext and the other on Langium. This demonstrates the functional equivalence of the two solutions. Aside from the already discussed technological advantage, Langium provides better language server start-up performance in the example: I measured a one-second start-up time for Langium and four seconds for Xtext.

Learn More

If you’re interested in using Langium:

Visit the Langium webpage for a guide and in-depth documentation
Check out this example of how to combine a Langium DSL with Sprotty diagrams
See whether using the Eclipse Layout Kernel or running the layout engine in a background process works best for you when laying out graph-like structures
Consider using Eclipse GLSP for graphics-first solutions
Follow me on Twitter for more news about DSLs, diagrams, and web IDEs
Watch our recent Cloud Tool Time episode on this topic

About the Author

Miro Spönemann
