Particles Tunneling Through Solid Earth

No, I'm not going to write about neutrinos, or anything else that is 'real'. Pixels are the particles of today's post. Their universe is very different than ours, it is solid in all directions, it is quantized(at a different level than ours).

At each moment, there are a countable number of choices.

To be a pixel is to be blind to all but your nearest neighbors, to be trapped in a matrix far more insidious and probably less pleasing than in the movie. Movement is always a cooperation between at least two particles here, there's no vacuum after all. So let's see what actions a pixel particle can perform:

  1. Move to any of the surrounding coordinates, swapping places with each pixel it displaces.
  2. Change Color
  3. Stop being active

Luckily, those things aren't too much of a tall order.

(require :img-genner)

(defparameter *ffmpeg-writer* (uiop:launch-program "ffmpeg -f png_pipe -i - -y  particles.webp" :input :stream))
(defparameter *image* (make-image 640 480))
(defstruct particle (x 0) (y 0))
(defparameter *particles* 
  (loop for i from 0 below 900 collect (make-particle :x (random 640) :y (random 480))))
(defun copy-pixel(x1 y1 x2 y2)
  (set-pixel *image* x1 y1 (get-pixel *image* x2 y2)))
(defun wrap(value min max)
  (if (< value min)
      (1- max)
      (if (>= value max)
          min value)))
(defun move-particle(particle dx dy)
  (incf (particle-x particle) dx)
  (setf (particle-x particle) (wrap (particle-x particle) 0 640))
  (incf (particle-y particle) dy)
  (setf (particle-y particle) (wrap (particle-y particle) 0 480))
  (swap-pixel (particle-x particle) (particle-y particle)
              (wrap (- (particle-x particle) dx) 0 640)
              (wrap (- (particle-y particle) dy) 0 480)))
(defun change-color(particle)
  (set-pixel *image* (particle-x particle) (particle-y particle) (get-random-color)))
(defun die(particle)
  (setf *particles* (remove particle *particles* :test #'eq)))

;;;; This is how we're going to choose what each particle does:
(defun perform-action(particle)
  (case (random 25)
    ((1 2 0 3 4 5 9  18 20 19 21) (move-particle particle (- (random 3) 1) (- (random 3) 1)))
    ((10 12 13 14 15 11 17 6 16 22 23 24) (change-color particle))
    ((7 ) (die particle))))
(loop while *particles*
      for count from 0
      do(loop for i in *particles*
              do(perform-action i))
      do(img-genner:save-image *image* (uiop:process-info-input *ffmpeg-writer*))
      do(print count))
(uiop:close-streams *ffmpeg-writer*)

It's mostly the same structure as before, and with this, we get the following

Kinda sparse huh? Well, we could add more particles, or we could make the particles leave streaks, so we'll do the latter, because that's what I want to write 🙂

If we replace swap-pixel with copy-pixel, then we get this:

If the first animation is a bunch of pixels tunneling through darkness, this is them devouring it.

That was generated with the following code:

(require :img-genner)

(defparameter *ffmpeg-writer* (uiop:launch-program "ffmpeg -f png_pipe -i - -y -pix_fmt yuv420p -compression_level 6  -qscale 50  particles.webm" :input :stream))
(defparameter *image* (make-image 640 480))
(defstruct particle (x 0) (y 0))
(defparameter *particles* 
  (loop for i from 0 below 60 collect (make-particle :x (random 640) :y (random 480))))
(defun copy-pixel(x1 y1 x2 y2)
  (set-pixel *image* x1 y1 (get-pixel *image* x2 y2)))
(defun wrap(value min max)
  (if (< value min)
      (1- max)
      (if (>= value max)
          min value)))
(defun move-particle(particle dx dy)
  (incf (particle-x particle) dx)
  (setf (particle-x particle) (wrap (particle-x particle) 0 640))
  (incf (particle-y particle) dy)
  (setf (particle-y particle) (wrap (particle-y particle) 0 480))
  (copy-pixel (particle-x particle) (particle-y particle)
              (wrap (- (particle-x particle) dx) 0 640)
              (wrap (- (particle-y particle) dy) 0 480)))
(defun change-color(particle)
  (set-pixel *image* (particle-x particle) (particle-y particle) (get-random-color)))
(defun die(particle)
  (setf *particles* (remove particle *particles* :test #'eq)))

;;;; This is how we're going to choose what each particle does:
(defun perform-action(particle)
  (case (random 25)
    ((1 2 0 3 4 5 9  18 20 19 21) (move-particle particle (- (random 3) 1) (- (random 3) 1)))
    ((10 12 13 14 15 11 17 6 16 22 23 24) (change-color particle))
    ;((7 ) (die particle))
(loop while *particles*
      for count from 0
      repeat 1500
      do(loop for i in *particles*
              do(perform-action i))
      do(img-genner:save-image *image* (uiop:process-info-input *ffmpeg-writer*))
      do(print count))
(uiop:close-streams *ffmpeg-writer*)


One of the other projects I've worked on, the one I'm not sure I've talked about here has been the timetracker tool. What it provides different than others that I have found is that it doesn't ask what you're doing, it just reads the window you're using and files it away for later, along with things like the number of keys pressed and the distance the mouse traversed.

With this you can build a usage pattern of your time on the computer. And honestly, it revealed something that I've should've known about myself, I switch between windows too often. That is, that I move around so much that my focus suffers, this is a pretty typical ADHD symptom, so nothing unusual there. That and that when I monitor my time and I'm aware of it, I tend to do a bit more work on projects rather than messing around on social media. Is that what people refer to as mindfulness? (please don't tell me)

But in the process of writing this horribly intrusive program, which monitors every input to your computer, discarding almost all the details, but still requiring membership in the input group.

The web interface is a study of barren styling and minimal svg rendering, but it lets you customize the categories the events it collects are entered as in your nifty little chart, matching by the window title or class of the window. It's buggy and easily broken by software like kdenlive(due to some specific circumstances in its window classes, but I've never found a reason to investigate), but it's a personal tool so it doesn't bother me too much.

I guess I don't really have much of a point here, just that I'm pleased that it works as well as it does, and that, for the moment, I can still use X11 as my display server.

A chart produced for the day I wrote this blog post

The fact of the matter is that this is likely going to go away in Linux at some point. At least in its current incarnation; it will not be able to be made to work with Wayland without a number of backends that I don't have the interest to develop at this time(I personally use KDE and don't really feel like messing around on other desktops).

Of course at this point we might as well just have it rip the image from the gpu, run ocr on it and guess at which parts are the window titles, but that sounds odious, infeasible, and even more prone to abuse.

Either way, this project remains in a strange place right now. But then I suppose that fact is entirely of my own making as some of the quandries I face with it are derived from the fact that I don't believe I should put things out into the world that can be used for evil. People rarely consider themselves to be doing evil, so any promise on their part to abstain from it is unlikely to be effective.

Recreating an old Visualizer I swear I saw

I have vague memories of a visualizer that I will likely never encounter without looking for it again. It was a waveform display but it had these rectangles that would ride them like boats, not really like surfing. Now, you might be wondering why I'm doing this when my library creates, at best, video.

Well, I'm not even going to recreate that much of it, doing the discrete waveform handling is always a massive pain, so instead we'll cheese it with the combined magic of img-genner, and the magic of limits.

(require :img-genner)
 | The idea here is that the rectangles ripple along a sine wave, rotating with
 | it as it moves.
 | To determine the angle the derivative of the wave must be known at the
 | center of the rectangles, and that must be turned into an angle

(defvar *rectangles* (loop for i from 0 below 24
                           collect(make-instance 'img-genner:rectangle
                                                 :width 10 :height 10
                                                 (img-genner:point (* 30.0 i) 240.0))))

With that in mind we can implement our derivation code, and the slope to angle conversion function. Normally, I wouldn't recommend using defconstant unless you're really sure that the constant won't change.

(defconstant +epsilon+ 0.0001)
(defun derive-at-point(function x)
  (/ (-
      (funcall function (+ x +epsilon+))
      (funcall function x))

(defun slope-to-angle(rise-over-run)
  (atan rise-over-run))

Then we have to write the wave and coordinate conversion functions. We need the coordinate conversions because trigonometric functions aren't very great over the 640.

(defun wave-1(x)
  (+ (sin (* 2.0 x))
     (sin (/ x 3))
     (sin x)))

(defun xcoord(x time)
  (+ (/ x 10.0) time))

Then we need to be able to update each rectangle for a given time and with a given wave function.

 | Move the rectangles to match the wave's height at that point
(defun match-wave-height(func time rectangle)
  (let* ((origin (slot-value rectangle 'img-genner:origin))
         (x (xcoord (aref origin 0) time)))
    (setf (aref origin 1) (+ 240 (* (funcall func x) 10)))
 | This does the same for the angles
(defun match-wave-angle(func time rectangle)
  (let* ((origin (slot-value rectangle 'img-genner:origin))
         (x (xcoord (aref origin 0) time))
         (dy (derive-at-point func x)))
    (setf (slot-value rectangle 'img-genner:rotation) (slope-to-angle dy))
; And this runs both
(defun update-rectangles(func time)
  (loop for i in *rectangles*
        do(match-wave-height func time i)
        do(match-wave-angle func time i))

Then we need to handle the streams and the drawing of it all.

(defparameter *image* (img-genner:make-image 640 480))
(defun draw-rectangles()
  (loop for i in *rectangles*
        do(img-genner:fill-shape i *image* (img-genner:static-color-stroker (img-genner:rgb 255 0 0)))))

(defun draw-wave(func time)
  (loop for ox from 0 below 640
        for x = (+ time (/ ox 10.0))
        for y = (- 480 (+ 240 (* 10 (funcall func x))))
                ; This is a cheap way to do a thick line poorly
                do(loop for off from 0 below 2
                        do(setf (aref *image* (floor (+ y off)) ox 2) 255))

(loop for time from 0.0 by (/ 1.0 25.0)
      for frame from 0
      repeat 1000
      do(update-rectangles #'wave-1 time)
      do(reset-image *image*)
      do(draw-wave #'wave-1 time)
      do(img-genner:save-image *image* (uiop:process-info-input *ffmpeg-writer*))
      do(print frame)

(uiop:close-streams *ffmpeg-writer*)
(uiop:wait-process *ffmpeg-writer*)

And with that we get the above. There's a lot of room for improvement

img-genner Switching away from cl-png

cl-png is a long lived package that mostly works, but it contains a single flaw, it relies on the presence of a library and the competence of the implementation in getting to it.

It grabs the source and tries to build it. That's one reason that I don't much care for it. As much as I don't mind C, I'd rather not force common lisp programmers to interact with that whole can of worms, and if that is its behavior in windows, then there may simply not be a suitable C compiler.

In the place of cl-png will slide pngload and zpng. zpng is a much more capable interface and pngload is actually remarkably fast. However, zpng and pngload are likely heavier than libpng, but, ultimately, the thing that takes time in this library is the rendering itself or pixel manipulation.

In the face of the immense demands of requiring a C compiler introduces, it's not unreasonable to trade a little bit of performance somewhere non-critical for ease of distribution.

The switch was remarkably painless for pngload, but zpng had a few differences that proved a tad inconvenient for me in paritcular

  1. The format was different than the one I had chosen.
    In cl-png the format used for images was a 3 dimensional array rather than a 1 dimensional array as zpng uses.

    I use a 3 dimensional array because it maps more closely to how I think about images,
  2. zpng has an accessor to obtain the 3 dimensional array, but it lacks a setter(which is probably all well and fair as the translation could be troublesome on the other end).
    What would've made this easier for me would've been if zpng had provided a conversion function(maybe it did and I missed it), but writing it myself wasn't terribly difficult in the end.

Rescaling Video Using img-genner

There's not much of a reason to do this because ffmpeg has a much faster and nicer scaler, but maybe we want to write common lisp to handle an effect. When I introduced img-genner, I wasn't sure that this was the direction to take, but now it seems appropriate, if nothing else.

You will most likely want to consult the installation instructions if you do not yet have img-genner set up.

So let's start out with the following to import the libraries we need, as well as start the ffmpeg processes.

(ql:quickload :img-genner)
(ql:quickload :pngload)
(defparameter *ffmpeg-reader* (uiop:launch-program "ffmpeg -i wander.mp4 -c:v png -pix_fmt rgb24 -f image2pipe -" :output :stream))
(defparameter *ffmpeg-writer* (uiop:launch-program "ffmpeg -f png_pipe -i - -y -b:v 1M hello4.webm" :input :stream))

We need pngload because it handles reading from the pipe correctly whereas the libpng binding I have been using does not seem to. I chose to output a webm because it can provide playback sooner, but of course there's no reason that you can't use whatever funky codec format that you prefer.

Okay, so overall, the flow should look like this:

graph LR; a(*ffmpeg-reader*) a -->scaler scaler --> b b(*ffmpeg-writer*)

Hmm, that's not very interesting though, so let's add the implicit stuff and a neat function I recently wrote.

graph LR; a(*ffmpeg-reader*) --> decode(pngload:decode) -->data(pngload:data)--> scaler; scaler --> colorizer(colorize-naive); colorizer-->b(*ffmpeg-writer*) b-->a;

There, that's an excuse to use this library!

In common lisp I couldn't remember if there was a way to check for eof explicitly, so instead I just have it handle the end-of-file condition to terminate this.

    (loop with i = 0
          with colors = (loop repeat 16 collect (img-genner:get-random-color))
          for input = (pngload:data (pngload:load-stream (uiop:process-output *ffmpeg-reader*)))
          for output = (img-genner:colorize-naive (img-genner:upscale-image-by-factor input 0.5 0.5) colors)
          do(img-genner:save-image output (process-input *ffmpeg-writer))
  (end-of-file (c)

Before you run the code like this though, you'll need to add this at the end:

(uiop:close-streams *ffmpeg-reader*)
(uiop:wait-process *ffmpeg-reader*)
(uiop:close-streams *ffmpeg-writer*)
(uiop:wait-process *ffmpeg-writer*)

This closes the input stream, causing ffmpeg to close as gracefully as possible and then collecting its exit code in order to let the process be cleaned up.

Okay, but that's not very efficient. For one thing, it's fairly slow, but it's also parallelizable. You can use the pcall library that img-genner uses in order to make it faster. However, this will require some significant changes in order to make it work well. We need to turn it into tasks and then wait for them in order.

We want to load as many as possible at once, but we have a limited amount of memory and if you're using sbcl it may be a rather painful experience once you hit your heap limit, so we can't load the entire video at once unless it is very small or you have a monstrous amount of memory and have configured your lisp to use it.

(loop ...
      for jobs = (loop with memory-used = 0
                       for input = (handle-case 
                                      (pngout:data (pngload:load-stream (uiop:process-output *ffmpeg-reader*)))
                                      (end-of-file (c) '())
                       while input
                       do(incf memory-used (array-total-size input))
                       until (> memory-used 20000000)
                       collect (let ((input input) (i (incf i)))
                                       (prog1 (img-genner:colorize-naive (img-genner:upscale-image-by-factor input 0.75 0.75) colors)
                                              (print i)))))
     while jobs
     do(loop for i in jobs
             do(img-genner:save-image (pcall:join i) (uiop:process-input *ffmpeg-writer*)))

This is a very substantial change to the control flow as you can see. A few things that are important to note, in pexec calls, you need to have made a closure of whatever variables that change that you use inside pexec, otherwise they may be modified mid use, and quite simply, that's not ideal.

This is what could be called a "Bulk Sequential Process" design, and while it uses all the processors on your system effectively, the utilization tends to have ups and downs unless it is dispatched very carefully. In this case in particular, even if we did everything perfectly, we're just communicating with ffmpeg over a pipe, and there's a lot of overhead there, and even then, it is possible to outpace the encoder ffmpeg uses, even in common lisp(This is more likely with AVIF or vp8/vp9 rather than h264/h265 in my experience), and during these times your program is sitting around doing nothing, and while ffmpeg might be keeping your computer busy at these times, it likely as not won't be able to parallelize on the images of a size that are useful, and for things with unpredictable runtimes, it very well be a bottleneck.

Now theoretically, it should be possible to also splice the audio from the first file into the second with something like a named pipe, or if that fails, just exporting the audio on its own and then using it in the encoding command, but that sounds like a project in itself.

Drawing with img-genner

I talked about img-genner's glitch functions last time. Now you get to see the messier part of the library, as of (january 15th 2022), the part that needs refactoring and reconsideration. For one thing it is enormously picky in terms of types. So much so that much of the code broke regularly with only slightly wrong input. In addition, since I was not yet entirely aware of how the common lisp object system(CLOS) worked, the design does not flow nicely with the mechanics.

I need to warn you that I have a low opinion of this code and thus am looking to rewrite it, it doesn't have a consistent interface and too much of it is hidden away from the user, so this is part tutorial, part self-guided criticism; this isn't a very well organized blog post.

So, is there much value in it? Well, I tend to think so, but I suppose that I might be biased. What works well is the polygon code, the line code, the ellipse code, when it's not complaining that it's not an explicitly typed array you passed in, is quite functional.

Let's start with another file with something simple. Let's get a spray of ellipses.

(ql:quickload :img-genner)
(defparameter *image* (img-genner:make-image 800 800))
(defparameter *circles* nil)
(loop for i from 0 to 10
      do(push (img-genner:make-ellipse (random 800) (* i 20) 20.0 10.0) *circles*))
(loop for i in *circles*
      do(img-genner:fill-shape i *image* (img-genner:static-color-stroker (img-genner:get-random-color))))
(img-genner:save-image *image* "ellipses.png")
A spray of various colors of ellipses.

In this, make-ellipse is a helper function designed to save the user from specifying everything necessary in a make-instance call. Static-color-stroker creates a stroker, which is similar to a brush in various graphics systems where they probably decided deliberately to call it anything other than a stroker. In the case of static-color-stroker it creates a closure over a function that just contains the desired color value, the definition may be seen below.

(defun static-color-stroker(color)
  (lambda (i x y frac)
    (declare (ignore frac)
             (type fixnum x y)
             (dynamic-extent color))
    (set-pixel i x y color)

Other strokers are possible, gradient-color-stroker creates a gradient over the value of frac, or radial-gradient-stroker, which takes two colors, a coordinate in the image, and a max-radius parameter that specifies the distance from the first color to shift to the second.

There are of course other varieties of shapes, such as polygon and convex-polygon(though, they should convert between each other automatically), and they have different interfaces.

We can add (img-genner:fill-shape (img-genner:make-regular-polygon 400.0 400.0 5 30.0) *image* (img-genner:static-color-stroker (img-genner:get-random-color))) to get an additional pentagon as below

The ellipses from before but now with a pentagon in the center

As you can see, there are issues between boundaries in the triangles that it breaks it down into.

classDiagram class Shape{ float rotation; float~2~ origin; } class Polygon{ List~Point~ points; } class Ellipse{ float~2~ radius; } class Rectangle{ float width float height } Shape *-- Ellipse Shape *-- Polygon Shape *-- Rectangle Polygon *--ConvexPolygon

I suppose that rectangles are a type of polygon, but unlike proper polygons, it doesn't make sense to represent them as a series of points, but representing that relationship properly would require more work than it seems like it's worth.

I've been finding that animating with this library is interesting. So let's make those ellipses vibe. I know, this is a still image library, but that doesn't have to stop us. All we have to do is write two functions, one to reset the image, and another to move the ellipses.

We'll also have to assign the ellipses consistent colors, otherwise they will flash constantly, and I doubt that's considered appealing.

(defun reset-image()
  (loop for i from 0 below (array-total-size *image*)
        do(setf (row-major-aref *image* i) 0)))
(defparameter *circles* nil)
(defparameter circle-colors (make-hash-table :test 'equal))
(defparameter circle-speeds (make-hash-table :test 'equal))
(defun mutate()
  (loop for i in *circles*
        do(incf (aref (gethash i circle-speeds) 0) (- (random 1.0) 0.5))
        do(incf (aref (gethash i circle-speeds 0.0) 1) (- (random 1.0) 0.5))
        do(img-genner:rotate-by i (- (random 1.0) 0.5))
        do(img-genner:move-by i (aref (gethash i circle-speeds) 0) (aref (gethash i circle-speeds) 1))))
(loop for i in *circles*
      for c = (img-genner:get-random-color)
      do(setf (gethash i circle-colors) c)
      do(setf (gethash i circle-speeds) (img-genner:point 0.0 0.0))
      do(img-genner:fill-shape i *image* (img-genner:static-color-stroker (img-genner:get-random-color))))

Then we run an animation loop for 160 frames

(loop for i from 0 below 160
      do(loop for circle in *circles*
              do(img-genner:fill-shape circle *image* (img-genner:static-color-stroker (gethash circle circle-colors))))
        do(img-genner:save-image *image* (format nil "~4,'0d.png" i))

And then with the proper ffmpeg invocation, you should end up with something like the following

The ellipses vibe subtly

For the above animation, the following command was used to generate it.

ffmpeg -i %04d.png -vcodec libwebp -filter:v 'fps=fps=20' -pix_fmt yuv420p -q:v 6 -p loop 0 output.webp

This is nice and all, but if we ran the animation for longer then eventually all the ellipses might end up offscreen, which makes viewing significantly less interesting. So we need another function to wrap the position to within the bounds.

(defun bound(pos min-x min-y max-x max-y)
  (setf (aref pos 0) (if (> (aref pos 0) max-x) min-x (if (< (aref pos 0) min-x) max-x (aref pos 0)))
        (aref pos 1) (if (> (aref pos 1) max-y) min-y (if (< (aref pos 1) min-y) max-y (aref pos 1)))))

And then we need to actually use slot-value to obtain the 'img-genner:origin symbol, and then all we need is to add the bounding to the mutate function.

(defun mutate()
  (loop for i in *circles*
        do(incf (aref (gethash i circle-speeds) 0) (- (random 1.0) 0.5))
        do(incf (aref (gethash i circle-speeds 0.0) 1) (- (random 1.0) 0.5))
        do(img-genner:rotate-by i (- (random 1.0) 0.5))
        do(img-genner:move-by i (aref (gethash i circle-speeds) 0) (aref (gethash i circle-speeds) 1))
        do(bound (slot-value i 'img-genner:origin) 0.0 0.0 800.0 800.0)))

If everything worked properly, then it should be easy to change the number of frames to 250 and generate something like the following.

In the future however, I think I would like to refactor this so that it isn't necessary to work with the shape objects as the objects are really overkill for drawing ellipses and regular polygons, most libraries at this sort of level just use functions to write directly to the bitmap. I think that's much easier to fool around with since there's less state to manage.

Update: See this post to avoid needing to run ffmpeg on the output.

Introducing img-genner, a Common Lisp Image Synthesis Library

I have been working on the img-genner library(repo here) for quite a long time now, over three years if my memory serves me correctly. It contains a number of different modules, and frankly, it is a bit of a Frankenstein. At first it was about polygon graphics, then it became about pixel 'glitch' effects, such as pixel sorting and other stuff, partially because I wanted to write the algorithms myself and gain an understanding of them from implementation.

I'm still not quite sure about the triangularization algorithm(which we won't demonstrate here because I'm not sure it still works), but for the most part it has worked in that way.

For the examples here I am assuming that you have a functional roswell/common lisp environment with quicklisp and all the other amenities.

Getting the library

First clone it to your local projects directory, the command that will work for you will vary from this if you are not using roswell. As of now I have no plans to put this library on quicklisp unless there is some interest.

$> git clone ~/.roswell/local-projects/img-genner

Now you should be able to execute ros run and successfully run (ql:quickload :img-genner) in the repl. But that's probably not a great experience. We need an environment that is at least somewhat durable and reset-able. So let's make and go into a project folder

$> mkdir post-examples-that-are-totally-cool
$> cd post-examples-that-are-totally-cool

Add two very cool images.

Lunala Avi

And a file called something like post.lisp

To start we should probably add load the library, but it would also be useful to have some way to clean up the folder if we don't like what we have so far.

(ql:quickload :img-genner)
(defun clean()
  (loop for i in (directory "*.png")
        when(not (member (pathname-name i) '("lunala" "fox-profile") :test #'string=))
          do(print i) and do(delete-file i)))

With that we can clear any new images that we decide to make.

We should also add (ql:quickload :img-genner) to the top of the file to make loading it easier(no need for defpackage or asdf yet).

So, let's see, let's... downscale the lunala picture. So add this to the end of post.lisp

                                        ; Using defparameter reloads the file each time the compiler 
                                        ; processes it again, which might be useful
(defparameter *lunala* (img-genner:load-image "lunala.png"))
(img-genner:save-image (img-genner:downscale-x2 (img-genner:downscale-x2 *lunala*)) "lunala-smol.png")

Well, that looks more or less as you would expect.

Let's also load the fox-profile picture and then combine the two images.

(defparameter *fox-profile* (img-genner:load-image "fox-profile.png"))
                                           ; The 30 30 refers to the size of the tiles in pixels. 
                                           ; In general, the smaller the tiles the longer this takes
(img-genner:save-image (img-genner:mosaify *lunala* *fox-profile* 30 30) "mosaic.png")

So this has combined the two images by replacing blocks with the most similar blocks of the other image. We can change the way that it scores similarity by changing the distance metric used.

If we use the color-brightness distance function then we can care more about the brightness of the tiles rather than color.

(img-genner:save-image (img-genner:mosaify *lunala* *fox-profile* 30 30 
                                           :distance #'img-genner:color-brightness)

You can see the result below.


And before I finish this post, let me show you a pixel-sort from the library. (This will be rather slow)

(img-genner:save-image (img-genner:fuck-it-up-pixel-sort *lunala* 600 600) "sorted.png")
Sorted.png(run through pngquant to reduce the file size)

Forth and Optimization

Forth is a strange old language. It's among a relatively small number of concatenative languages that achieved any measure of success. Anyway, this isn't something we're going to go into much here because we're thinking about something nearly tangential to the language itself.

How you can make it fast.

Stacks are somewhat slow compared to registers. Using the cache has a cost, even when your program is using optimal access patterns, especially on load store architectures(which seem to be coming back into vogue again).

Anyway, on register machines, forth isn't much in terms of hot shit. Sure there are compilers that manage to do great with it, but that doesn't change the fact that as an execution model it kinda leaves a lot to be desired these days. The other issue is that it's not entirely clear how one is to write a compiler that does that kind of transformation on the execution, or it isn't to us, though that's probably more a gap in our knowledge than any sort of extreme difficulty.

Okay, so forth is simple enough, it has a stack of words of some size, a return stack, and no particular datatypes handled seamlessly other than integers of the word size. Each function(in forth such things are called 'words' because they, at least in the past, were looked up in a 'dictionary), each function can consume any number of items from the stack and result in any number of items being placed onto the stack. There's some stuff about immediate mode execution (compile time stuff), but it uses the same kind of model.

This is hard to map onto registers because a function can return any number of things. But there's a possibility for analysis here that might help. Let's make a simple function that rotates the top three items on the stack twice, so that the topmost item becomes the bottom-most.

: rot2 ( a b c -- b a c ) 
   rot ( 3 in, 3 out)
   rot ( 3 in, 3 out) ;

The first line, the definition doesn't really enforce anything here. The ( a b c -- c b a ) is just a comment to help meagre creatures that need reminders of such things (as opposed to the gods of forth or whoever). But what we can see is that by the end of the function there it remains clearly balanced, it will consume and emit 3 items no matter what.

Let's assume we can assemble rot into something sensible, we're going to use some generic bastardized assembly here because it's the best we can do.

  move S0, T0
  move S1, S0
  move S2, S1
  move T0, S2

Of course, in general you'd want to use something like Single Static Assignment as an intermediate here, but we are even less familiar with that than assembly.

To call this without any analysis we think you would need to load the information from the stack into the registers and then call into the function, but since we know how many registers we need, among other things, we can do something like compile rot2 into something like this.

  pop S2
  pop S1
  pop S0
  call ROT
  call ROT
  push S0
  push S1
  push S2

But then that's not really realistic for our desired solution. Ideally it would generate native code through a transpiler into C, then calling whatever C compiler the user selects.

void rot(word *a, word *b, word *c){
    word tmp=*a;
: quadratic ( a b c x -- n )
 >R swap r@ * +
 swap R> dup * * + ;
: rot2 ( a b c -- c b a )
  rot rot ;
: quadratic ( a b c x -- n )
  >R ( R+1, S-1 )
  swap ( R,S )
  R@ ( R, S+1)
  * ( R, S-2,S+1 )
  + ( R, S-2,S+1 ) 
  R> ( R-1, S )
  dup ( R, S+1)
  * ( R, S-1)
  R@ ( R,S+1) 
  * (R, S-1) + ;

With some luck, the C compiler has a large attention span and optimization space. If it doesn't, then it may not be able to optimize the threaded code at all. And more to the point, the compiler still needs to try to be smart about combining operations prior to that, because some idioms in forth may not have obvious optimization opportunities to a C compiler.

This means that at some point, there is a need to accept the call, there's a need to write a compiler that understands the semantics as they apply to the underlying machine that you are targeting.

Of course, when you're Forth and everything is just words to you, maybe C compilers can peer deeply enough to handle most arithmetic optimizations you might want.

But part of what makes Forth attractive is that it's simple to implement. That it's easy to write something that'll give you lots of program space. It is a low level language, it doesn't bother itself tracking types, it trusts the user, for better for worse, and that's what makes forth so fast. It doesn't pause for garbage collection just as it doesn't tag its types.

Ultimately this sort of environment is more like DOS than it is our memory protected machines these days, and that means that this is likely less than suitable for handling untrusted input(not that you can't write 'provably' correct code in forth).

SB-SIMD, Early and Promising

Common lisp is a dream. Not always a great dream(such as when using strings), but it's sufficient and SBCL is a remarkably interesting compiler. One that tells you how you might let it make the code faster, such as adding type hints, etc.

More recently, I've had the opportunity to work with sb-simd, a library which adds automatic vectorization for certain forms. (Right now it needs to be done in a do-vectorized and it seems to be missing a few SIMD intrinsics, not to mention that if it can't vectorize what you give it it will fail loudly). The general recommendations are to also separate this sort of specialized code into another package that is loaded when it is available.

It's very rough around the edges, but for very simple things it works very well, as in, billions of subtractions a second more than otherwise.

However, it is a framework that was intended to provide acceleration for some sort of machine learning task, so it is much more developed with regards to the floating point type intrinsics than the integer types, but that actually appears to be more a limitation of what intel provides, but maybe that will change(and I'm lead to believe some of the missing operations can be substituted for, but I'm not sure which).

In any case it is likely to improve somewhat further before it makes it into SBCL's contributed libraries, and who knows, in the future it might be part of SBCL's built-in optimizer.

For example, below are a standard lisp version of a diff-image function, and an auto-vectorized one.

(defun diff-image(i1 i2 &optional result-image)
  (declare (type (simple-array u8 (* * 3)) i1 i2)
           (type (or (simple-array u8 (* * 3)) null) result-image)
           (optimize speed (safety 0)))
  (let ((r (if result-image
               (make-array (list (array-dimension i1 0) (array-dimension i1 1) 3) :element-type 'u8 :initial-element 0))))
    (declare (type (simple-array u8 (* * 3)) r))
    (do-vectorized (x 0 (1- (the u64 (array-total-size i1))))
      (setf (u8-row-major-aref r x)
            (u8- (u8-row-major-aref i1 x)
                  (u8-row-major-aref i2 x))))

(defun diff-image-slow(i1 i2 &optional result-image)
  (declare (type (simple-array u8 (* * 3)) i1 i2)
           (type (or null (simple-array u8 (* * 3))) result-image)
           (optimize speed (safety 0)))
  (loop with result = (if result-image result-image (make-array (list (array-dimension i1 0) (array-dimension i1 1) 3) :element-type 'u8 :initial-element 0))
        for y from 0 below (array-dimension i1 0)
        do(loop for x from 0 below (array-dimension i1 1)
                do(setf (aref result y x 0)
                        (- (aref i1 y x 0) (aref i2 y x 0))
                        (aref result y x 1)
                        (- (aref i1 y x 1) (aref i2 y x 1))
                        (aref result y x 2)
                        (- (aref i1 y x 2) (aref i2 y x 2))))
        finally(return result)))

Interestingly, according to the author of sb-simd, this is probably the first code ever written to use it with integers, so it took some work on their end before it worked here.

Trying Out Zig

So, we have heard good things about Zig. These boil down to the following things:

  • Good speed
  • Fast compilation
  • Decent type system
  • Simple Syntax

So far, a lot of these things seem to be born out by the experience we've had, though, we have some criticism that is probably more along the lines of our taste in programming, rather than issues we expect to be universal. For fairness sake, let's start with things we like, once again expressing our preferences rather than universals.

  • Low Level
    Low level languages feel liberating to us, because we get to make the decisions, rather than the compiler or virtual machine.
  • Trusts you, mostly.
    This is mostly in comparison to Rust, whose borrow checker we have wrestled with a great deal and whose Turing complete type system has, in the past, left us mystified when we have run into errors.
    And if you really, really want to throw caution to the wind, you can mostly just tell zig that you don't give a damn about whether or not the pointer is to a single element or an array.
  • Compile Time computation, Generics
    This is how generics are implemented, and for the most part, it provides a good experience, especially since it compares exceedingly favorably with pre-c++17 C++.
  • Not afraid of Bare Pointers
    Bare pointers aren't something to worry about too much, especially if they're typed. Bare pointers are quite a bit faster and lighter than other (tagged) references.

And honestly, we're impressed at how tight the language feels. There are some facilities we're not sure about, and some that we wish it had, but overall, this isn't a bad language. We like it more than Rust(which, admittedly, really isn't saying much for us), though, from familiarity alone we're likely to revert to using C or C++, but we're not nearly as skeptical as we once were about this little language.

The syntax is familiar enough, pretty similar to C likes, though, with a few elements that we can't really attribute to any language we know(which doesn't mean very much). So, let's see what something simple looks like.

const std = @import("std");
pub fn main() !void {
    const stdout =;
    var i: usize = 1;
    while (i < 10) {
        try stdout.print("{}\n", .{i});
        i += 1;

You can see that Zig is careful about errors, requiring that you either mark that main can return an error (designated with !void), or handle the error somehow. Anything that can fail in Zig has a type denoted by !<type>, which requires you to think about what errors can happen, or so communities as Go users and Rust users insist.

Let's see something that differs even more significantly than C.

const std = @import("std");
fn arraylist_user(alloc: *std.mem.Allocator) !std.ArrayList(i32) {
    var al = std.ArrayList(i32).init(alloc);
    var i: i32 = 0;
    while (i < 100) {
        var item = try al.addOne();
        item.* = i;
        i += 1;
    return al;
pub fn main() !void {
    var gpalloc = std.heap.GeneralPurposeAllocator(.{}){};
    defer std.debug.assert(!gpalloc.deinit());
    const gpa = &gpalloc.allocator;
    const stdout =;
    var array_list = try arraylist_user(gpa);
    defer array_list.deinit();
    for (array_list.items) |*item| {
        try stdout.print("{}\n", .{item.*});

So, in this little snippet we have used several new things, Generic types, Allocator, and the defer keyword(which go users will immediately recognize). Zig does not have a default allocator, and it wants you to pass the allocator you want to use to the function that allocates memory. This is rather different than C, where you would probably just use malloc/realloc/calloc/free to manage memory(Or define a macro that evaluates to that by default if you're building that kind of library). The reason that the documentation has an assert in gpalloc.deinit() is because this particular allocator can track memory leaks, so this causes it to report such occurrences. It also shows one of the syntactic divergences from C, for(array_list.items) |*item|. Pipes don't get much use in C other than as logical operators, but here it tells the loop to unwrap the value(a pointer to a position in the list), and calls it item. In Zig, you dereference this pointer with item.*. Another point of comparison to C++, the addOne method doesn't add the item itself, but returns a pointer to it, so that you can assign it afterwards.

Interestingly, for such a low level language, array_list.items isn't a pointer, it's a slice, which is a pointer+length, so in C++, we would say that it has most in common with a string_view.

Okay, so what if we want to go further? What if we want to do a reprisal of our C N-body simulator written with Raylib? In fact, that's not too bad. It's almost even easy. In fact, we can even use raylib without too much trouble.

To start off, we need to import raylib. Luckily, as it turns out, Zig's compiler can handle C interop quite nicely, it even has builtins for it. (Though, if you do this then you need to call the compiler with -lc -lraylib to make sure that it has all the c libraries linked into it).

const std = @import("std");
const ray = @cImport({

Okay, so the first thing we should do here is define a vector type, not the resizable array type of vector, but the directional vector. Since we prize reuse here, we're going to make it generic even though we probably don't really need to.

pub fn vector2(comptime value: type) type {
    return struct {
        const This = @This();
        x: value,
        y: value,
        //Write methods here!

Okay, so that's an easy enough definition. But let's add some vector math oriented methods to make using it easier and more pleasant.

    return struct {
        const This = @This();
        x: value,
        y: value,
        pub fn add(this: This, other: This) This {
            var t: This = v2{ .x = 0, .y = 0 };
            t = .{ .x = this.x + other.x, .y = this.y + other.y };
            return t;
        pub fn sub(this: This, other: This) This {
            return this.add(.{ .x = -other.x, .y = -other.y });
        pub fn scale(this: This, v: value) This {
            return .{ .x = this.x * v, .y = this.y * v };
        pub fn distance(this: This, other: This) value {
            const sqrt = std.math.sqrt;
            const pow = std.math.pow;
            return sqrt(pow(value, this.x - other.y, 2) + pow(value, this.y - other.y, 2));
        //Wrap the value around, like when pacman reaches the edge of the map
        pub fn wrap(c: This, a: This, b: This) This {
            var r: This = .{ .x = c.x, .y = c.y };
            if (r.x < a.x) {
                r.x = b.x;
            } else if (r.x > b.x) {
                r.x = a.x;
            if (r.y < a.y) {
                r.y = b.y;
            } else if (r.y > b.y) {
                r.y = a.y;
            return r;
        pub fn magnitude(a: This) value {
            const sqrt = std.math.sqrt;
            return sqrt(a.x * a.x + a.y * a.y);
        pub fn normalize(a: This) This {
            return a.scale(1.0 / a.magnitude());

Alright, the main things to note here are how we need to call std.math.pow, namely that we need to call it with a compile time type, in this case value. Later on we'll see it called with f32.

Now we need to define the type we use for particles, and while we're at it, we're going to make a shortcut to the kind of vector we're using here.

const v2 = vector2(f32);
const particle = struct {
    const This = @This();
    position: vector2(f32) = v2{ .x = 0, .y = 0 },
    velocity: vector2(f32) = v2{ .x = 0.0, .y = 0.0 },
    acceleration: vector2(f32) = v2{ .x = 0.0, .y = 0.0 },
    mass: f32,
    //Methods here!

We also need a radius property, but since it's derived from the mass, and later on that can change as the bodies absorb each other, so it needs to be a method. We should also write methods to determine if the particles overlap and just moving the position along by the velocity, as well as calculating the attraction between two particles.

const particle = struct {
    const This = @This();
    position: vector2(f32) = v2{ .x = 0, .y = 0 },
    velocity: vector2(f32) = v2{ .x = 0.0, .y = 0.0 },
    acceleration: vector2(f32) = v2{ .x = 0.0, .y = 0.0 },
    mass: f32,
    pub fn radius(this: *const This) f32 {
        var result: f32 = std.math.ln(this.mass) / std.math.ln(3.0);
        if (result > 40) {
            return 40 - std.math.ln(this.mass) / std.math.ln(5.0);
        return result;
    //Returns true if the two particles overlap
    pub fn overlaps(this: *const This, other: *const this) bool {
        var r1 = this.radius();
        var r2 = other.radius();
        var dist = this.position.distance(other.position);
        return (r1 + r2) > dist;
    //Handles the base movement
    pub fn motion_step(this: *This, timestep: f32) void {
        this.position = this.position.add(this.velocity.scale(timestep));
        this.velocity = this.velocity.add(this.acceleration.scale(timestep));
    pub fn attraction(this: *const This, other: *const This, g: f32) vector2(f32) {
        var dist = this.position.distance(other.position);
        var vector_to = other.position.sub(this.position).normalize();
        return vector_to.scale(g * (this.mass * other.mass) / std.math.pow(f32, dist, 2));

Now we should build a container for many particles, and the properties necessary to simulate them that aren't appropriate to put in the particle structures themselves. We need a method to initialize the particles to random positions, a method to handle the gravity simulation and such, and a method to actually draw the particles.

const ParticleCollection = struct {
    const This = @This();
    particles: [100]particle,
    window_start: vector2(f32) = v2{ .x = 0.0, .y = 0.0 },
    window_end: vector2(f32) = v2{ .x = 100, .y = 100 },
    timestep: f32 = 0.01,
    gravitational_constant: f32 = 1e-3,
    pub fn init_particles(this: *This, rand: *std.rand.Random) void {
        for (this.particles) |*p| {
            p.mass = @intToFloat(f32, rand.intRangeLessThan(i32, 1, 100));
            p.position = .{
                .x = @intToFloat(f32, rand.intRangeLessThan(i32, @floatToInt(i32, this.window_start.x), @floatToInt(i32, this.window_end.x))),
                .y = @intToFloat(f32, rand.intRangeLessThan(i32, @floatToInt(i32, this.window_start.y), @floatToInt(i32, this.window_end.y))),
            p.acceleration = .{ .x = 0.0, .y = 0.0 };
            p.velocity = .{ .x = 0.0, .y = 0.0 };
    pub fn step_world(this: *This) void {
        for (this.particles) |*p| {
            p.position = p.position.wrap(this.window_start, this.window_end);
        for (this.particles) |*a| {
            a.acceleration = .{ .x = 0.0, .y = 0.0 };
            for (this.particles) |*b| {
                //No self attraction please, allowing that would result in division by zero
                if (a == b)
                a.acceleration = a.acceleration.add(a.attraction(b, this.gravitational_constant));
    pub fn drawSystem(this: *const This) void {
        for (this.particles) |p| {
            ray.DrawCircle(@floatToInt(c_int, p.position.x), @floatToInt(c_int, p.position.y), p.radius(), ray.BLACK);

Note how we have to pass a random number generator into init_particles, this is inline with how Zig also requires that you pass the allocators into functions that require memory allocations to be done. You also see some of the somewhat jagged interaction between Zig and C, namely that Zig doesn't specify that its i32 type is equivalent to C's int type(which on many architectures it might not be), it also requires explicit conversions between floating point numbers and integers.

The main function here is the simplest part yet.

pub fn main() !void {
    const width = 800;
    const height = 450;
    ray.InitWindow(width, height, "Nbody");
    //This is very much *not* good practice, but it's the easiest way to start this
    var rand = std.rand.Xoroshiro128.init(0);
    //Don't initialize the particles yet.
    var p: ParticleCollection = .{ .particles = undefined };
    p.window_end = .{ .x = width, .y = height };
    while (!ray.WindowShouldClose()) {
        defer ray.EndDrawing();

And so we have a working prototype for an nbody simulator, considerably shorter than the C version of the same program.

Interestingly, it appears to be smaller compiled than the original program in C, with just zig build-exe nbody.zig -lc -lraylib, we get an executable of around 784Kb. With zig build-exe nbody.zig -lc -lraylib -OReleaseFast, we can get it down to 92Kb, and with the -OReleaseSmall option, we can get down to 84Kb.

All in all, we'd definitely watch Zig carefully, it's very well thought out, and if they get build their package manager right, then their ecosystem might become competitive with Rust's quite quickly. The langauge already quite nice to use, and it might not a bad choice for your next project you might consider doing in C or Rust if you're looking for a new language to mess around in.