Lattice 0.6 - Automatic, Fine-Grained Parallelization

Lattice is a high-performance visual scripting system targeting Unity ECS. Read more here.

Lattice 0.6 has just been released! This is a stability and performance update, with one big exception: automatic parallelization of Lattice scripts. What does that mean?

A Lattice Script looks like this:

Before 0.6, the Lattice compiler generated .NET IL that executed each node one at a time, serially. While that sounds slow, it’s how most programming languages work by default! Unless you tell it otherwise, code executes in one big line from beginning to end, on a single thread. In fact, Lattice is a vector execution engine: it executes each node across all entities as a chunk, so it was already a bit faster than plain serial code.
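To make the pre-0.6 model concrete, here is a minimal Python sketch of serial, chunk-at-a-time execution. It is not Lattice's generated IL; the node functions (`add_velocity`, `clamp`) and the data are hypothetical stand-ins chosen for illustration.

```python
# Hypothetical stand-ins for two Lattice nodes. Each one receives whole
# columns of entity data (a "chunk") rather than a single entity.
def add_velocity(position, velocity):
    return [p + v for p, v in zip(position, velocity)]

def clamp(values, lo, hi):
    return [min(max(x, lo), hi) for x in values]

def run_serial(position, velocity):
    # Node 1 runs across every entity in the chunk...
    moved = add_velocity(position, velocity)
    # ...and only once it has finished does node 2 start, again across
    # every entity. Nothing here runs concurrently.
    return clamp(moved, 0.0, 10.0)

result = run_serial([1.0, 9.5, 4.0], [0.5, 1.0, -5.0])
print(result)  # [1.5, 10.0, 0.0]
```

Each node is vectorized over entities, but the nodes themselves still run strictly one after another.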

In the new version of Lattice, nodes run in parallel wherever their data dependencies allow, without any annotations needed. The Lattice compiler automatically figures out how to split up jobs and schedule them under the hood with the Unity job system.

If you’re familiar with dataflow, it’s a variant of that style with some extra goodies.

If you have Lattice scripts from 0.5, in 0.6 they will just run faster — no work needed on your end!

Fine-Grained Parallelism

Parallelization is “fine-grained”. Normally when you work with parallel code (like with Unity jobs) you write larger pieces of code to run in parallel. There, the fundamental unit of parallelization is a function.

In Lattice, a node is made up of several underlying operations: sub-nodes that only the compiler sees. Parallelization works at the resolution of these operations. Here’s an example of a Lattice script as the compiler sees it:

The above script, as the compiler sees it.

Here’s the same script, but colored by ‘work unit’.

Each different color is scheduled in a different Unity job. The compiler is not required to schedule them separately — if work is scheduled too finely, the overhead of moving data around will eat into the benefits of parallel execution. However, so far I’ve found the best speed ups come from maximal parallelization.
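As a rough illustration of the granularity trade-off, the sketch below groups operations into coarser work units by merging straight-line chains: an operation joins its producer's unit when it has exactly one input and that input has exactly one consumer. This is one simple coarsening heuristic, not necessarily what the Lattice compiler does, and the operation names are made up.

```python
def fuse_chains(deps):
    """Group operations into work units by merging straight-line chains.

    `deps` maps each operation to the list of operations it reads from.
    Assumption: the dict is given in topological order (producers first).
    """
    # Count how many operations consume each operation's output.
    consumers = {op: 0 for op in deps}
    for inputs in deps.values():
        for src in inputs:
            consumers[src] += 1

    unit_of = {}  # operation -> the work unit (list) it belongs to
    units = []
    for op, inputs in deps.items():
        if len(inputs) == 1 and consumers[inputs[0]] == 1:
            # Sole consumer of a sole input: no parallelism is lost by
            # running it in the same unit as its producer.
            unit = unit_of[inputs[0]]
            unit.append(op)
        else:
            unit = [op]
            units.append(unit)
        unit_of[op] = unit
    return units

# Hypothetical operation graph, listed producers-first.
ops = {
    "load_a": [],
    "scale": ["load_a"],
    "load_b": [],
    "add": ["scale", "load_b"],
    "clamp": ["add"],
    "store": ["clamp"],
}
units = fuse_chains(ops)
print(units)
# [['load_a', 'scale'], ['load_b'], ['add', 'clamp', 'store']]
```

Six operations become three work units here. Scheduling every operation separately would be the other extreme, which is closer to the maximal parallelization that has given the best speedups so far.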

Here’s what the Lattice frame tick looks like now in our project, with parallelization enabled. The parallelization is dense, filling all job threads.

Hundreds of Lattice nodes split across all threads.

“Fearless Concurrency”

Writing parallel code is hard. It’s easy to run into race conditions, deadlocks, or worse. If you’ve ever worked with Unity’s job system directly, you’ll know that Unity goes to great lengths to catch race conditions.

What’s really cool about the Lattice compiler is that it generates code that is data-race free. You write your scripts and whether they run in parallel or in serial is up to the compiler. No annotations necessary.

One note: if any of your nodes call non-thread-safe Unity APIs, you’ll need to tag those nodes with [MainThread]. That will force those nodes onto the main thread.

Lattice is able to sidestep all the usual complexity, because it’s a pure dataflow language. Data dependencies are fully defined under the hood, and so scheduling is simply a topological sort across the entire compilation graph. Even cooler, the schedule for a graph is built at compile time, rather than at runtime, as Unity’s job system usually requires!
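The idea of scheduling by topological sort can be sketched in a few lines of Python. This is a simplified, level-based version of Kahn's algorithm, not Lattice's actual scheduler: it groups nodes into batches whose members have no dependencies on each other, whereas a real job system would more likely track per-job dependency handles. The node names are hypothetical.

```python
def build_schedule(deps):
    """Return a list of batches; nodes within a batch are independent.

    `deps` maps each node to the set of nodes whose outputs it reads.
    Because the graph is known up front, this can be computed once,
    ahead of time, instead of on every frame.
    """
    remaining = {node: set(inputs) for node, inputs in deps.items()}
    schedule = []
    while remaining:
        # Every node whose inputs are all satisfied can run right now.
        ready = sorted(n for n, inputs in remaining.items() if not inputs)
        if not ready:
            raise ValueError("cycle detected: not a valid dataflow graph")
        schedule.append(ready)
        for n in ready:
            del remaining[n]
        # Mark this batch's outputs as available to downstream nodes.
        for inputs in remaining.values():
            inputs.difference_update(ready)
    return schedule

# Hypothetical graph: two independent branches that join at the end.
deps = {
    "read_position": set(),
    "read_velocity": set(),
    "integrate": {"read_position", "read_velocity"},
    "read_health": set(),
    "regen": {"read_health"},
    "write_back": {"integrate", "regen"},
}
schedule = build_schedule(deps)
for batch in schedule:
    print(batch)
# ['read_health', 'read_position', 'read_velocity']
# ['integrate', 'regen']
# ['write_back']
```

Since nodes are pure and every dependency is an explicit edge, two nodes in the same batch cannot read and write the same value, which is where the freedom from data races comes from in this simplified picture.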

In a lot of ways, Lattice is just a really fancy scheduler for Unity’s job system. It manages the inputs and outputs of each function, and figures out where to store intermediate values.

Also in 0.6: Full Source!

Folks will be glad to hear that the source for the compiler is now distributed as a part of the package. This should make integrating the library much easier, and means you can quickly fix bugs, submit PRs against the repo, etc. Keep in mind the license is not MIT, but it’s intended to be free for most reasonable use cases. Check the readme for more info and let me know if you have any questions.

Distributing source also sidesteps a pernicious bug in the latest version of Unity ECS, which did not play well with precompiled DLLs.

Overall Status and Roadmap

Lattice is coming along nicely. A big change that isn’t visible is an underlying refactor preparing the way for subgraphs! I’m very excited about this because it’s going to be a major improvement in ergonomics, and will allow making larger, reusable components that you wouldn’t easily be able to define with C#.

I’ve also figured out a way to do subgraphs with zero overhead. To the compiler, subgraphs will just look like normal, inlined code, and as we add optimizations to the compiler, those optimizations will apply across subgraph boundaries. Excited to share more about this soon.

Thanks for reading!

You can download the latest Lattice release on GitHub, and you can chat with us on the Unity forum thread, or on our Discord channel! Come say hi!