Animating with img-genner

I talked about img-genner's glitch functions last time. Now you get to see the messier part of the library, as of (january 15th 2022), the part that needs refactoring and reconsideration. For one thing it is enormously picky in terms of types. So much so that much of the code broke regularly with only slightly wrong input. In addition, since I was not yet entirely aware of how the common lisp object system(CLOS) worked, the design does not flow nicely with the mechanics.

I need to warn you that I have a low opinion of this code and thus am looking to rewrite it, it doesn't have a consistent interface and too much of it is hidden away from the user, so this is part tutorial, part self-guided criticism; this isn't a very well organized blog post.

So, is there much value in it? Well, I tend to think so, but I suppose that I might be biased. What works well is the polygon code, the line code, the ellipse code, when it's not complaining that it's not an explicitly typed array you passed in, is quite functional.

Let's start with another file with something simple. Let's get a spray of ellipses.

(ql:quickload :img-genner)
(defparameter *image* (img-genner:make-image 800 800))
(defparameter *circles* nil)
(loop for i from 0 to 10
      do(push (img-genner:make-ellipse (random 800) (* i 20) 20.0 10.0) *circles*))
(loop for i in *circles*
      do(img-genner:fill-shape i *image* (img-genner:static-color-stroker (img-genner:get-random-color))))
(img-genner:save-image *image* "ellipses.png")
A spray of various colors of ellipses.

In this, make-ellipse is a helper function designed to save the user from specifying everything necessary in a make-instance call. Static-color-stroker creates a stroker, which is similar to a brush in various graphics systems where they probably decided deliberately to call it anything other than a stroker. In the case of static-color-stroker it creates a closure over a function that just contains the desired color value.

(defun static-color-stroker(color)
  (lambda (i x y frac)
    (declare (ignore frac)
             (type fixnum x y)
             (dynamic-extent color))
    (set-pixel i x y color)

Other strokers are possible, gradient-color-stroker creates a gradient over the value of frac, or radial-gradient-stroker, which takes two colors, a coordinate in the image, and a max-radius parameter that specifies the distance from the first color to shift to the second.

There are of course other varieties of shapes, such as polygon and convex-polygon(though, they should convert between each other automatically), and they have different interfaces.

We can add (img-genner:fill-shape (img-genner:make-regular-polygon 400.0 400.0 5 30.0) *image* (img-genner:static-color-stroker (img-genner:get-random-color))) to get an additional pentagon as below

The ellipses from before but now with a pentagon in the center

As you can see, there are issues between boundaries in the triangles that it breaks it down into.

classDiagram class Shape{ float rotation; float~2~ origin; } class Polygon{ List~Point~ points; } class Ellipse{ float~2~ radius; } class Rectangle{ float width float height } Shape *-- Ellipse Shape *-- Polygon Shape *-- Rectangle Polygon *--ConvexPolygon

I suppose that rectangles are a type of polygon, but unlike proper polygons, it doesn't make sense to represent them as a series of points, but representing that relationship properly would require more work than it seems like it's worth.

I've been finding that animating with this library is interesting. So let's make those ellipses vibe. I know, this is a still image library, but that doesn't have to stop us. All we have to do is write two functions, one to reset the image, and another to move the ellipses.

We'll also have to assign the ellipses consistent colors, otherwise they will flash constantly, and I doubt that's considered appealing.

(defun reset-image()
  (loop for i from 0 below (array-total-size *image*)
        do(setf (row-major-aref *image* i) 0)))
(defparameter *circles* nil)
(defparameter circle-colors (make-hash-table :test 'equal))
(defparameter circle-speeds (make-hash-table :test 'equal))
(defun mutate()
  (loop for i in *circles*
        do(incf (aref (gethash i circle-speeds) 0) (- (random 1.0) 0.5))
        do(incf (aref (gethash i circle-speeds 0.0) 1) (- (random 1.0) 0.5))
        do(img-genner:rotate-by i (- (random 1.0) 0.5))
        do(img-genner:move-by i (aref (gethash i circle-speeds) 0) (aref (gethash i circle-speeds) 1))))
(loop for i in *circles*
      for c = (img-genner:get-random-color)
      do(setf (gethash i circle-colors) c)
      do(setf (gethash i circle-speeds) (img-genner:point 0.0 0.0))
      do(img-genner:fill-shape i *image* (img-genner:static-color-stroker (img-genner:get-random-color))))

Then we run an animation loop for 160 frames

(loop for i from 0 below 160
      do(loop for circle in *circles*
              do(img-genner:fill-shape circle *image* (img-genner:static-color-stroker (gethash circle circle-colors))))
        do(img-genner:save-image *image* (format nil "~4,'0d.png" i))

And then with the proper ffmpeg invocation, you should end up with something like the following

The ellipses vibe subtly

For the above animation, the following command was used to generate it.

ffmpeg -i %04d.png -vcodec libwebp -filter:v 'fps=fps=20' -pix_fmt yuv420p -q:v 6 -p loop 0 output.webp

In the future however, I think I would like to refactor this so that it isn't necessary to work with the shape objects as the objects are really overkill for drawing ellipses and regular polygons, most libraries at this sort of level just use functions to write directly to the bitmap. I think that's much easier to fool around with since there's less state to manage.

Introducing img-genner, a Common Lisp Image Synthesis Library

I have been working on the img-genner library(repo here) for quite a long time now, over three years if my memory serves me correctly. It contains a number of different modules, and frankly, it is a bit of a Frankenstein. At first it was about polygon graphics, then it became about pixel 'glitch' effects, such as pixel sorting and other stuff, partially because I wanted to write the algorithms myself and gain an understanding of them from implementation.

I'm still not quite sure about the triangularization algorithm(which we won't demonstrate here because I'm not sure it still works), but for the most part it has worked in that way.

For the examples here I am assuming that you have a functional roswell/common lisp environment with quicklisp and all the other amenities.

Getting the library

First clone it to your local projects directory, the command that will work for you will vary from this if you are not using roswell. As of now I have no plans to put this library on quicklisp unless there is some interest.

$> git clone ~/.roswell/local-projects/img-genner

Now you should be able to execute ros run and successfully run (ql:quickload :img-genner) in the repl. But that's probably not a great experience. We need an environment that is at least somewhat durable and reset-able. So let's make and go into a project folder

$> mkdir post-examples-that-are-totally-cool
$> cd post-examples-that-are-totally-cool

Add two very cool images.

Lunala Avi

And a file called something like post.lisp

To start we should probably add load the library, but it would also be useful to have some way to clean up the folder if we don't like what we have so far.

(ql:quickload :img-genner)
(defun clean()
  (loop for i in (directory "*.png")
        when(not (member (pathname-name i) '("lunala" "fox-profile") :test #'string=))
          do(print i) and do(delete-file i)))

With that we can clear any new images that we decide to make.

We should also add (ql:quickload :img-genner) to the top of the file to make loading it easier(no need for defpackage or asdf yet).

So, let's see, let's... downscale the lunala picture. So add this to the end of post.lisp

                                        ; Using defparameter reloads the file each time the compiler 
                                        ; processes it again, which might be useful
(defparameter *lunala* (img-genner:load-image "lunala.png"))
(img-genner:save-image (img-genner:downscale-x2 (img-genner:downscale-x2 *lunala*)) "lunala-smol.png")

Well, that looks more or less as you would expect.

Let's also load the fox-profile picture and then combine the two images.

(defparameter *fox-profile* (img-genner:load-image "fox-profile.png"))
                                           ; The 30 30 refers to the size of the tiles in pixels. 
                                           ; In general, the smaller the tiles the longer this takes
(img-genner:save-image (img-genner:mosaify *lunala* *fox-profile* 30 30) "mosaic.png")

So this has combined the two images by replacing blocks with the most similar blocks of the other image. We can change the way that it scores similarity by changing the distance metric used.

If we use the color-brightness distance function then we can care more about the brightness of the tiles rather than color.

(img-genner:save-image (img-genner:mosaify *lunala* *fox-profile* 30 30 
                                           :distance #'img-genner:color-brightness)

You can see the result below.


And before I finish this post, let me show you a pixel-sort from the library. (This will be rather slow)

(img-genner:save-image (img-genner:fuck-it-up-pixel-sort *lunala* 600 600) "sorted.png")
Sorted.png(run through pngquant to reduce the file size)

Forth and Optimization

Forth is a strange old language. It's among a relatively small number of concatenative languages that achieved any measure of success. Anyway, this isn't something we're going to go into much here because we're thinking about something nearly tangential to the language itself.

How you can make it fast.

Stacks are somewhat slow compared to registers. Using the cache has a cost, even when your program is using optimal access patterns, especially on load store architectures(which seem to be coming back into vogue again).

Anyway, on register machines, forth isn't much in terms of hot shit. Sure there are compilers that manage to do great with it, but that doesn't change the fact that as an execution model it kinda leaves a lot to be desired these days. The other issue is that it's not entirely clear how one is to write a compiler that does that kind of transformation on the execution, or it isn't to us, though that's probably more a gap in our knowledge than any sort of extreme difficulty.

Okay, so forth is simple enough, it has a stack of words of some size, a return stack, and no particular datatypes handled seamlessly other than integers of the word size. Each function(in forth such things are called 'words' because they, at least in the past, were looked up in a 'dictionary), each function can consume any number of items from the stack and result in any number of items being placed onto the stack. There's some stuff about immediate mode execution (compile time stuff), but it uses the same kind of model.

This is hard to map onto registers because a function can return any number of things. But there's a possibility for analysis here that might help. Let's make a simple function that rotates the top three items on the stack twice, so that the topmost item becomes the bottom-most.

: rot2 ( a b c -- b a c ) 
   rot ( 3 in, 3 out)
   rot ( 3 in, 3 out) ;

The first line, the definition doesn't really enforce anything here. The ( a b c -- c b a ) is just a comment to help meagre creatures that need reminders of such things (as opposed to the gods of forth or whoever). But what we can see is that by the end of the function there it remains clearly balanced, it will consume and emit 3 items no matter what.

Let's assume we can assemble rot into something sensible, we're going to use some generic bastardized assembly here because it's the best we can do.

  move S0, T0
  move S1, S0
  move S2, S1
  move T0, S2

Of course, in general you'd want to use something like Single Static Assignment as an intermediate here, but we are even less familiar with that than assembly.

To call this without any analysis we think you would need to load the information from the stack into the registers and then call into the function, but since we know how many registers we need, among other things, we can do something like compile rot2 into something like this.

  pop S2
  pop S1
  pop S0
  call ROT
  call ROT
  push S0
  push S1
  push S2

But then that's not really realistic for our desired solution. Ideally it would generate native code through a transpiler into C, then calling whatever C compiler the user selects.

void rot(word *a, word *b, word *c){
    word tmp=*a;
: quadratic ( a b c x -- n )
 >R swap r@ * +
 swap R> dup * * + ;
: rot2 ( a b c -- c b a )
  rot rot ;
: quadratic ( a b c x -- n )
  >R ( R+1, S-1 )
  swap ( R,S )
  R@ ( R, S+1)
  * ( R, S-2,S+1 )
  + ( R, S-2,S+1 ) 
  R> ( R-1, S )
  dup ( R, S+1)
  * ( R, S-1)
  R@ ( R,S+1) 
  * (R, S-1) + ;

With some luck, the C compiler has a large attention span and optimization space. If it doesn't, then it may not be able to optimize the threaded code at all. And more to the point, the compiler still needs to try to be smart about combining operations prior to that, because some idioms in forth may not have obvious optimization opportunities to a C compiler.

This means that at some point, there is a need to accept the call, there's a need to write a compiler that understands the semantics as they apply to the underlying machine that you are targeting.

Of course, when you're Forth and everything is just words to you, maybe C compilers can peer deeply enough to handle most arithmetic optimizations you might want.

But part of what makes Forth attractive is that it's simple to implement. That it's easy to write something that'll give you lots of program space. It is a low level language, it doesn't bother itself tracking types, it trusts the user, for better for worse, and that's what makes forth so fast. It doesn't pause for garbage collection just as it doesn't tag its types.

Ultimately this sort of environment is more like DOS than it is our memory protected machines these days, and that means that this is likely less than suitable for handling untrusted input(not that you can't write 'provably' correct code in forth).

SB-SIMD, Early and Promising

Common lisp is a dream. Not always a great dream(such as when using strings), but it's sufficient and SBCL is a remarkably interesting compiler. One that tells you how you might let it make the code faster, such as adding type hints, etc.

More recently, I've had the opportunity to work with sb-simd, a library which adds automatic vectorization for certain forms. (Right now it needs to be done in a do-vectorized and it seems to be missing a few SIMD intrinsics, not to mention that if it can't vectorize what you give it it will fail loudly). The general recommendations are to also separate this sort of specialized code into another package that is loaded when it is available.

It's very rough around the edges, but for very simple things it works very well, as in, billions of subtractions a second more than otherwise.

However, it is a framework that was intended to provide acceleration for some sort of machine learning task, so it is much more developed with regards to the floating point type intrinsics than the integer types, but that actually appears to be more a limitation of what intel provides, but maybe that will change(and I'm lead to believe some of the missing operations can be substituted for, but I'm not sure which).

In any case it is likely to improve somewhat further before it makes it into SBCL's contributed libraries, and who knows, in the future it might be part of SBCL's built-in optimizer.

For example, below are a standard lisp version of a diff-image function, and an auto-vectorized one.

(defun diff-image(i1 i2 &optional result-image)
  (declare (type (simple-array u8 (* * 3)) i1 i2)
           (type (or (simple-array u8 (* * 3)) null) result-image)
           (optimize speed (safety 0)))
  (let ((r (if result-image
               (make-array (list (array-dimension i1 0) (array-dimension i1 1) 3) :element-type 'u8 :initial-element 0))))
    (declare (type (simple-array u8 (* * 3)) r))
    (do-vectorized (x 0 (1- (the u64 (array-total-size i1))))
      (setf (u8-row-major-aref r x)
            (u8- (u8-row-major-aref i1 x)
                  (u8-row-major-aref i2 x))))

(defun diff-image-slow(i1 i2 &optional result-image)
  (declare (type (simple-array u8 (* * 3)) i1 i2)
           (type (or null (simple-array u8 (* * 3))) result-image)
           (optimize speed (safety 0)))
  (loop with result = (if result-image result-image (make-array (list (array-dimension i1 0) (array-dimension i1 1) 3) :element-type 'u8 :initial-element 0))
        for y from 0 below (array-dimension i1 0)
        do(loop for x from 0 below (array-dimension i1 1)
                do(setf (aref result y x 0)
                        (- (aref i1 y x 0) (aref i2 y x 0))
                        (aref result y x 1)
                        (- (aref i1 y x 1) (aref i2 y x 1))
                        (aref result y x 2)
                        (- (aref i1 y x 2) (aref i2 y x 2))))
        finally(return result)))

Interestingly, according to the author of sb-simd, this is probably the first code ever written to use it with integers, so it took some work on their end before it worked here.

Optimizing Pointless things

So, we’ve been playing rimworld a lot recently. One of the mods we use is called Rim Factory, which is quite nice overall for building automated bases(yes this breaks the game’s economy, what of it?). One of the mechanics it adds is a buildable object called a sprinkler, which, instead of watering your plants(this isn’t necessary in vanilla rimworld), it causes planets within a certain radius(the second tier is within twelve tiles), to grow by two hours. This is, completely unbalanced and, when you get down to it, just another item in the pursuit of excess, but that’s how we like to play the game.

Anyway, we were wondering about figuring out an optimal pattern for the usage, and since the more rigorous tools we have to measure such things are largely forgotten, we built a little simulation script in python to try and get a feel for various optimization strategies, and the most fun part, of course, is how we went about building this kind of tool.

This tool is remarkably sketchy, it does not attempt to be rigorous, or even follow the watering pattern other than cycling through all of the tiles in range.

Since we’ve been building stuff with python lately, this is in python, which, as always, never ceases to amaze how easy the ecosystem makes it to do this kind of task.

Anyway, to get an idea of where this is going, let’s start with the import block at the top:

from typing import *
import numpy as np
import itertools
import functools
import math
import statistics
import matplotlib
import matplotlib.pyplot as plt
import random
from enum import IntEnum, auto

The building blocks

The core of the script is the iterators that provide the information about where the focus is, and since this is a script without much intention of going particularly far, we can use globals and constants, so let’s start with those!

# The size of the map being kept track of.
# In this case, this means that the map consists of the area from (-24,-24) to (24,24)
# The radius of a sprinkler mkii is 12 blocks


The heart and soul of this program is in its iterators, so let’s start by defining one that will make the rest of our work easier.

def cartesian(start: int, end: int) -> Iterator[Tuple[int,int]]:
    Turn a single range iterator as in range(start,end) into a cartesian product
    return itertools.product(range(start,end),repeat=2)

That’ll be useful for iterating over the map. The next are a bit less sensible, but, they work, and this isn’t meant to be perfect.

def map_tiles():
    r = RADIUS+1
    z = itertools.product(range(-MAP_SIZE,MAP_SIZE), repeat=2)
    z = list(z)
    for x in z:
        yield x

def sprinkler(position:Tuple[int,int]) -> Iterator[Tuple[int,int]]:
    for i, j in cartesian(0, RADIUS+1):
        if i==0 and j == 0:
        s = i**2+j**2
        if math.sqrt(s) <= RADIUS:
            yield (i+position[0], j+position[1])
            yield (position[0]+i,position[1]-j)
            yield (position[0]-i,j+position[1])
            yield (position[0]-i,position[1]-j)


Next comes the fitness function, which, to start off with, will just be a simple mean of how many times each tile has been hit by a sprinkler.

def evaluate_fitness(field: Dict[Tuple[int,int], int]) -> float:
    return sum(field.values()) / (2*MAP_SIZE)**2

Each round in the simulation consists of iterating across a given number of sprinklers n times, and is called for each candidate to figure out the best one.

Putting that together

def simulate(positions: Iterable[Tuple[int,int]], rounds: int)\
             -> Dict[Tuple[int,int],int]:
    field = {(x,y):0 for x,y in cartesian(-MAP_SIZE, MAP_SIZE)
             if (x,y) not in positions}
    for i in positions:
        if i in field:
            del field[i]
    sprinklers = list(map(lambda x: itertools.cycle(sprinkler(x)), positions))
    for _ in range(rounds):
        for s in sprinklers:
            pos = next(s)
            if pos in field:
                field[pos] += 1
    return field

So now that we can evaluate each choice we might make, we have to make it possible to put these together for any number of sprinklers, so we write another function called evaluate_alternatives. Here we decided to limit the breadth of each search step to a given number, here it’s called to_try.

def evaluate_alternatives(positions: Iterable[Tuple[int,int]],
                          to_try: int,
                          rounds: int) -> List[Tuple[int,int]]:
    # Eliminate already extant positions
    options = set(map_tiles()) - set(positions)
    options = list(options)
    to_try = min(to_try, len(options)-1)
    options = options[0:to_try]
    best_fitness = -2**48
    best_found = None
    for i in options:
        f = simulate(positions + [i], rounds)
        fitness = evaluate_fitness(f)
        if fitness > best_fitness:
            best_found = i
            best_fitness = fitness
    print(f"best fitness found this round {best_fitness}")
    return positions + [best_found]

Next we put together the final pieces in a function we called optimize.

def optimize(n: int, candidates:int, rounds: int):
    positions = []
    while len(positions) < n:
        positions = evaluate_alternatives(positions, candidates, rounds)
    fields = simulate(positions, rounds)
    heatmap = np.array([[(fields[x,y] if (x,y) in fields else -5)
                             for x in range(-MAP_SIZE, MAP_SIZE)]
                        for y in range(-MAP_SIZE,MAP_SIZE)])
    fig, ax = plt.subplots()
    im = ax.imshow(heatmap)
    # You can uncomment these if you want to have a label on each cell in the heatmap,
    # for 48x48, it is rather overwhelming
##    for (x,y), v in fields.items():
##        ax.text(x+MAP_SIZE,y+MAP_SIZE, math.floor(fields[x,y]), ha='center', va='center', color='w')

and to top it off, we invoke it at the end of the file:

optimize(10, 80, 1500)

The output should then look something like this

A Heatmap

Okay, that’s fine, but it’s not necessarily what you might want from such a tool. What if you’d rather have as much coverage as possible with a given number of sprinklers?

Well, we can do that.

Alternative fitness metrics

Okay, so what we need to optimize for to get the best coverage possible is to simply count the fields which are touched. In all likelihood, we might end up wanting further metrics to measure by, so we might as well make it a bit… well… Configurable, if you’re going to use a charitable term. So let’s introduce an Enum right before the fitness function.

class FitnessMetric(IntEnum):
    MEAN = auto()
    COVERAGE = auto()

fitness_metric = FitnessMetric.COVERAGE

Then we need to modify the evaluate_fitness function.

def evaluate_fitness(field: Dict[Tuple[int,int],int]) -> float:
    if fitness_metric == FitnessMetric.COVERAGE:
        r = len(list(filter(lambda x:x>0, field.values())))
    elif fitness_metric == FitnessMetric.MEAN:
        r = sum(field.values())/(2*MAP_SIZE)**2
    return r

Let’s run the script again and see what falls out.

Coverage That certainly seems to have the right effect. The majority of the map is covered by the sprinklers.


So, if you’re familiar with this mod, you might realize there are some things this isn’t even trying to account for, namely any infrastructure. Well, that should be easy to add. 🤞

Let’s add a set of positions that we don’t want filled. For now, let’s leave it to the center(we’re assuming that maybe you have a harvester station set up there), so let’s add another constant in front of evaluate_alternatives (yeah, this is getting pretty messy).

excluded_positions = frozenset([(0,0)])

Then we just have to modify evaluate_alternatives to avoid these positions.

def evaluate_alternatives(positions: Iterable[Tuple[int,int]],
                          to_try: int,
                          rounds: int) -> List[Tuple[int,int]]:
    # Eliminate already extant positions
    options = set(map_tiles()) - set(positions) - excluded_positions
#    --- SNIP ---
    heatmap = np.array([[(fields[x,y] if (x,y) in fields and (x,y) not in excluded_positions else -5)
                         for x in range(-MAP_SIZE, MAP_SIZE)]
                    for y in range(-MAP_SIZE,MAP_SIZE)])
#    --- SNIP ---

Alright, what does that make it do?

Alright, that’s a further improvement, we suppose, if nothing else it permits more flexibility in terms of describing a given scenario you want to optimize for.

Okay, so let’s say that you have a specific area that you want to optimize coverage for. How would we go about adding that?

Well, at this point it’s pretty clear that all this nonsense with enums is going to be an increasingly weak abstraction for dealing with additional fitness functions, but let’s keep it for now, if only because we’re not sure how we want to go about fixing this ugly feeling design as of writing this very paragraph.

Okay, so let’s assume that you want to optimize for an automatic harvester with an area between (-5,-5) and (5,5).

special_region = set(cartesian(-5,5))

And we can alter evaluate_fitness to this:

def evaluate_fitness(field: Dict[Tuple[int,int], int]) -> float:
    if fitness_metric == Fitness.MEAN:
        r = sum(field.values())/(MAP_SIZE*2)**2
    elif fitness_metric == Fitness.COVERAGE:
        r = len(list(filter(lambda x:x>0, field.values())))
    elif fitness_metric == Fitness.SPECIAL_REGION:
        r = sum(map(lambda x:x[1], filter(lambda x: x[0] in special_region,field.items())))
    return r

We end up with output like this:

You might not like this output because, among other things, it doesn’t penalize the sprinklers for showing up inside of our growing zone. So, we need to add another line to our much suffering evaluate_fitness function.

def evaluate_fitness(field: Dict[Tuple[int,int], int]) -> float:
    if fitness_metric == Fitness.MEAN:
        r = sum(field.values())/(MAP_SIZE*2)**2
    elif fitness_metric == Fitness.COVERAGE:
        r = len(list(filter(lambda x:x>0, field.values())))
    elif fitness_metric == Fitness.SPECIAL_REGION:
        r = sum(map(lambda x:x[1], filter(lambda x: x[0] in special_region,field.items())))
        r /= max(1, len([(x,y) for x,y in special_region
                         if (x,y) not in field.keys()
                         if (x,y) not in excluded_positions]))
    return r

And this helps a bit: Penalizing the region Let’s label a similar run to see if it has excluded the region altogether, if a square ends up labelled ‘Se’ it indicates that it’s a sprinkler in an excluded area.

Well, damn. Our code has a mistake in it that permitted the single offending sprinkler. We can fix this by changing the function, to this:

def evaluate_fitness(field: Dict[Tuple[int,int], int]) -> float:
    if fitness_metric == Fitness.MEAN:
        r = sum(field.values())/(MAP_SIZE*2)**2
    elif fitness_metric == Fitness.COVERAGE:
        r = len(list(filter(lambda x:x>0, field.values())))
    elif fitness_metric == Fitness.SPECIAL_REGION:
        r = sum(map(lambda x:x[1], filter(lambda x: x[0] in special_region,field.items())))
        penalty_squares = len([(x,y) for x,y in special_region
                              if (x,y) not in field.keys()
                              if (x,y) not in excluded_positions])
        r/= penalty_squares+1
    return r

Alright, and this fixes that issue(and with another fix applied it labels the spots with sprinklers correctly as ‘S’).


Okay, so what we want you to be able to take away from this post:

  1. When you’re processing data for something pointless, don’t bother being clean if doing it in a dirty way lets you figure out what matters faster
  2. Separate it into concerns that model, however roughly, the components that matter
  3. Build a metric for evaluating how good a result is
  4. Descend along randomly chosen positions based on how optimal they are

Feel free to ask questions or comment. The full, considerably messier code can be found here.

Trying Out Zig

So, we have heard good things about Zig. These boil down to the following things:

  • Good speed
  • Fast compilation
  • Decent type system
  • Simple Syntax

So far, a lot of these things seem to be born out by the experience we've had, though, we have some criticism that is probably more along the lines of our taste in programming, rather than issues we expect to be universal. For fairness sake, let's start with things we like, once again expressing our preferences rather than universals.

  • Low Level
    Low level languages feel liberating to us, because we get to make the decisions, rather than the compiler or virtual machine.
  • Trusts you, mostly.
    This is mostly in comparison to Rust, whose borrow checker we have wrestled with a great deal and whose Turing complete type system has, in the past, left us mystified when we have run into errors.
    And if you really, really want to throw caution to the wind, you can mostly just tell zig that you don't give a damn about whether or not the pointer is to a single element or an array.
  • Compile Time computation, Generics
    This is how generics are implemented, and for the most part, it provides a good experience, especially since it compares exceedingly favorably with pre-c++17 C++.
  • Not afraid of Bare Pointers
    Bare pointers aren't something to worry about too much, especially if they're typed. Bare pointers are quite a bit faster and lighter than other (tagged) references.

And honestly, we're impressed at how tight the language feels. There are some facilities we're not sure about, and some that we wish it had, but overall, this isn't a bad language. We like it more than Rust(which, admittedly, really isn't saying much for us), though, from familiarity alone we're likely to revert to using C or C++, but we're not nearly as skeptical as we once were about this little language.

The syntax is familiar enough, pretty similar to C likes, though, with a few elements that we can't really attribute to any language we know(which doesn't mean very much). So, let's see what something simple looks like.

const std = @import("std");
pub fn main() !void {
    const stdout =;
    var i: usize = 1;
    while (i < 10) {
        try stdout.print("{}\n", .{i});
        i += 1;

You can see that Zig is careful about errors, requiring that you either mark that main can return an error (designated with !void), or handle the error somehow. Anything that can fail in Zig has a type denoted by !<type>, which requires you to think about what errors can happen, or so communities as Go users and Rust users insist.

Let's see something that differs even more significantly than C.

const std = @import("std");
fn arraylist_user(alloc: *std.mem.Allocator) !std.ArrayList(i32) {
    var al = std.ArrayList(i32).init(alloc);
    var i: i32 = 0;
    while (i < 100) {
        var item = try al.addOne();
        item.* = i;
        i += 1;
    return al;
pub fn main() !void {
    var gpalloc = std.heap.GeneralPurposeAllocator(.{}){};
    defer std.debug.assert(!gpalloc.deinit());
    const gpa = &gpalloc.allocator;
    const stdout =;
    var array_list = try arraylist_user(gpa);
    defer array_list.deinit();
    for (array_list.items) |*item| {
        try stdout.print("{}\n", .{item.*});

So, in this little snippet we have used several new things, Generic types, Allocator, and the defer keyword(which go users will immediately recognize). Zig does not have a default allocator, and it wants you to pass the allocator you want to use to the function that allocates memory. This is rather different than C, where you would probably just use malloc/realloc/calloc/free to manage memory(Or define a macro that evaluates to that by default if you're building that kind of library). The reason that the documentation has an assert in gpalloc.deinit() is because this particular allocator can track memory leaks, so this causes it to report such occurrences. It also shows one of the syntactic divergences from C, for(array_list.items) |*item|. Pipes don't get much use in C other than as logical operators, but here it tells the loop to unwrap the value(a pointer to a position in the list), and calls it item. In Zig, you dereference this pointer with item.*. Another point of comparison to C++, the addOne method doesn't add the item itself, but returns a pointer to it, so that you can assign it afterwards.

Interestingly, for such a low level language, array_list.items isn't a pointer, it's a slice, which is a pointer+length, so in C++, we would say that it has most in common with a string_view.

Okay, so what if we want to go further? What if we want to do a reprisal of our C N-body simulator written with Raylib? In fact, that's not too bad. It's almost even easy. In fact, we can even use raylib without too much trouble.

To start off, we need to import raylib. Luckily, as it turns out, Zig's compiler can handle C interop quite nicely, it even has builtins for it. (Though, if you do this then you need to call the compiler with -lc -lraylib to make sure that it has all the c libraries linked into it).

const std = @import("std");
const ray = @cImport({

Okay, so the first thing we should do here is define a vector type, not the resizable array type of vector, but the directional vector. Since we prize reuse here, we're going to make it generic even though we probably don't really need to.

pub fn vector2(comptime value: type) type {
    return struct {
        const This = @This();
        x: value,
        y: value,
        //Write methods here!

Okay, so that's an easy enough definition. But let's add some vector math oriented methods to make using it easier and more pleasant.

    return struct {
        const This = @This();
        x: value,
        y: value,
        pub fn add(this: This, other: This) This {
            var t: This = v2{ .x = 0, .y = 0 };
            t = .{ .x = this.x + other.x, .y = this.y + other.y };
            return t;
        pub fn sub(this: This, other: This) This {
            return this.add(.{ .x = -other.x, .y = -other.y });
        pub fn scale(this: This, v: value) This {
            return .{ .x = this.x * v, .y = this.y * v };
        pub fn distance(this: This, other: This) value {
            const sqrt = std.math.sqrt;
            const pow = std.math.pow;
            return sqrt(pow(value, this.x - other.y, 2) + pow(value, this.y - other.y, 2));
        //Wrap the value around, like when pacman reaches the edge of the map
        pub fn wrap(c: This, a: This, b: This) This {
            var r: This = .{ .x = c.x, .y = c.y };
            if (r.x < a.x) {
                r.x = b.x;
            } else if (r.x > b.x) {
                r.x = a.x;
            if (r.y < a.y) {
                r.y = b.y;
            } else if (r.y > b.y) {
                r.y = a.y;
            return r;
        pub fn magnitude(a: This) value {
            const sqrt = std.math.sqrt;
            return sqrt(a.x * a.x + a.y * a.y);
        pub fn normalize(a: This) This {
            return a.scale(1.0 / a.magnitude());

Alright, the main things to note here are how we need to call std.math.pow, namely that we need to call it with a compile time type, in this case value. Later on we'll see it called with f32.

Now we need to define the type we use for particles, and while we're at it, we're going to make a shortcut to the kind of vector we're using here.

const v2 = vector2(f32);
const particle = struct {
    const This = @This();
    position: vector2(f32) = v2{ .x = 0, .y = 0 },
    velocity: vector2(f32) = v2{ .x = 0.0, .y = 0.0 },
    acceleration: vector2(f32) = v2{ .x = 0.0, .y = 0.0 },
    mass: f32,
    //Methods here!

We also need a radius property, but since it's derived from the mass, and later on that can change as the bodies absorb each other, so it needs to be a method. We should also write methods to determine if the particles overlap and just moving the position along by the velocity, as well as calculating the attraction between two particles.

const particle = struct {
    const This = @This();
    position: vector2(f32) = v2{ .x = 0, .y = 0 },
    velocity: vector2(f32) = v2{ .x = 0.0, .y = 0.0 },
    acceleration: vector2(f32) = v2{ .x = 0.0, .y = 0.0 },
    mass: f32,
    pub fn radius(this: *const This) f32 {
        var result: f32 = std.math.ln(this.mass) / std.math.ln(3.0);
        if (result > 40) {
            return 40 - std.math.ln(this.mass) / std.math.ln(5.0);
        return result;
    //Returns true if the two particles overlap
    pub fn overlaps(this: *const This, other: *const this) bool {
        var r1 = this.radius();
        var r2 = other.radius();
        var dist = this.position.distance(other.position);
        return (r1 + r2) > dist;
    //Handles the base movement
    pub fn motion_step(this: *This, timestep: f32) void {
        this.position = this.position.add(this.velocity.scale(timestep));
        this.velocity = this.velocity.add(this.acceleration.scale(timestep));
    pub fn attraction(this: *const This, other: *const This, g: f32) vector2(f32) {
        var dist = this.position.distance(other.position);
        var vector_to = other.position.sub(this.position).normalize();
        return vector_to.scale(g * (this.mass * other.mass) / std.math.pow(f32, dist, 2));

Now we should build a container for many particles, and the properties necessary to simulate them that aren't appropriate to put in the particle structures themselves. We need a method to initialize the particles to random positions, a method to handle the gravity simulation and such, and a method to actually draw the particles.

const ParticleCollection = struct {
    const This = @This();
    particles: [100]particle,
    window_start: vector2(f32) = v2{ .x = 0.0, .y = 0.0 },
    window_end: vector2(f32) = v2{ .x = 100, .y = 100 },
    timestep: f32 = 0.01,
    gravitational_constant: f32 = 1e-3,
    pub fn init_particles(this: *This, rand: *std.rand.Random) void {
        for (this.particles) |*p| {
            p.mass = @intToFloat(f32, rand.intRangeLessThan(i32, 1, 100));
            p.position = .{
                .x = @intToFloat(f32, rand.intRangeLessThan(i32, @floatToInt(i32, this.window_start.x), @floatToInt(i32, this.window_end.x))),
                .y = @intToFloat(f32, rand.intRangeLessThan(i32, @floatToInt(i32, this.window_start.y), @floatToInt(i32, this.window_end.y))),
            p.acceleration = .{ .x = 0.0, .y = 0.0 };
            p.velocity = .{ .x = 0.0, .y = 0.0 };
    pub fn step_world(this: *This) void {
        for (this.particles) |*p| {
            p.position = p.position.wrap(this.window_start, this.window_end);
        for (this.particles) |*a| {
            a.acceleration = .{ .x = 0.0, .y = 0.0 };
            for (this.particles) |*b| {
                //No self attraction please, allowing that would result in division by zero
                if (a == b)
                a.acceleration = a.acceleration.add(a.attraction(b, this.gravitational_constant));
    pub fn drawSystem(this: *const This) void {
        for (this.particles) |p| {
            ray.DrawCircle(@floatToInt(c_int, p.position.x), @floatToInt(c_int, p.position.y), p.radius(), ray.BLACK);

Note how we have to pass a random number generator into init_particles, this is inline with how Zig also requires that you pass the allocators into functions that require memory allocations to be done. You also see some of the somewhat jagged interaction between Zig and C, namely that Zig doesn't specify that its i32 type is equivalent to C's int type(which on many architectures it might not be), it also requires explicit conversions between floating point numbers and integers.

The main function here is the simplest part yet.

pub fn main() !void {
    const width = 800;
    const height = 450;
    ray.InitWindow(width, height, "Nbody");
    //This is very much *not* good practice, but it's the easiest way to start this
    var rand = std.rand.Xoroshiro128.init(0);
    //Don't initialize the particles yet.
    var p: ParticleCollection = .{ .particles = undefined };
    p.window_end = .{ .x = width, .y = height };
    while (!ray.WindowShouldClose()) {
        defer ray.EndDrawing();

And so we have a working prototype for an nbody simulator, considerably shorter than the C version of the same program.

Interestingly, it appears to be smaller compiled than the original program in C, with just zig build-exe nbody.zig -lc -lraylib, we get an executable of around 784Kb. With zig build-exe nbody.zig -lc -lraylib -OReleaseFast, we can get it down to 92Kb, and with the -OReleaseSmall option, we can get down to 84Kb.

All in all, we'd definitely watch Zig carefully, it's very well thought out, and if they get build their package manager right, then their ecosystem might become competitive with Rust's quite quickly. The langauge already quite nice to use, and it might not a bad choice for your next project you might consider doing in C or Rust if you're looking for a new language to mess around in.