Common Lisp is a dream. Not always a great dream (such as when using strings), but it's sufficient, and SBCL is a remarkably interesting compiler: one that tells you how you might help it make the code faster, for example by adding type declarations.
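As a minimal sketch of what such hints look like (the function `sum-bytes` is a made-up example, not something from this project), declaring the element type of the array and the type of the accumulator gives the compiler enough information to use specialized arithmetic. Compiling a version without the declarations under `(optimize speed)` will typically make SBCL print notes about what it could not optimize and why.

```lisp
;; Hypothetical example: sum the bytes of a specialized vector.
(defun sum-bytes (v)
  ;; Tell the compiler exactly what kind of array V is.
  (declare (type (simple-array (unsigned-byte 8) (*)) v)
           (optimize speed))
  (let ((acc 0))
    ;; Promise that the running total stays within fixnum range.
    (declare (type fixnum acc))
    (dotimes (i (length v) acc)
      (setf acc (the fixnum (+ acc (aref v i)))))))
```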

More recently, I've had the opportunity to work with `sb-simd`, a library which adds automatic vectorization for certain forms. (Right now the loop has to be written inside a `do-vectorized` form, it seems to be missing a few SIMD intrinsics, and if it can't vectorize what you give it, it fails loudly.) The general recommendation is also to separate this sort of specialized code into another package that is loaded only when `sb-simd` is available.
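One way to arrange that separation, sketched here in plain Common Lisp under my own assumptions: keep a portable fallback in the main package and call it through a variable that a separately loaded SIMD package can repoint once it has loaded successfully. The names `diff-bytes`, `diff-bytes-portable` and `*diff-bytes*` are hypothetical, and wrapping the result to 8 bits is an assumption about the desired behaviour, not something taken from `sb-simd`.

```lisp
;; Portable fallback: element-wise A - B into OUT, wrapped to 8 bits.
(defun diff-bytes-portable (a b out)
  (dotimes (i (length a) out)
    (setf (aref out i)
          (ldb (byte 8 0) (- (aref a i) (aref b i))))))

;; Names the implementation currently in use. A SIMD-specific package,
;; loaded only where it is supported, could set this to its own function.
(defvar *diff-bytes* 'diff-bytes-portable)

;; Public entry point: callers never need to know which version runs.
(defun diff-bytes (a b out)
  (funcall *diff-bytes* a b out))
```

The optional package would then end with something like `(setf *diff-bytes* 'its-own-function)`, so the main code keeps working unchanged when that package is never loaded.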

It's very rough around the edges, but for very simple things it works very well, as in billions more subtractions per second than without it.

However, it is a framework that was intended to accelerate some sort of machine learning task, so the floating-point intrinsics are much more developed than the integer ones. That actually appears to be more a limitation of what Intel provides, though maybe that will change. (I'm led to believe some of the missing operations can be substituted for, but I'm not sure which.)

In any case it is likely to improve somewhat further before it makes it into SBCL's contributed libraries, and who knows, in the future it might be part of SBCL's built-in optimizer.

For example, below are an auto-vectorized version of a `diff-image` function and a standard Lisp one, `diff-image-slow`.

```
;; Auto-vectorized version: treats both images as flat row-major storage.
(defun diff-image (i1 i2 &optional result-image)
  (declare (type (simple-array u8 (* * 3)) i1 i2)
           (type (or (simple-array u8 (* * 3)) null) result-image)
           (optimize speed (safety 0)))
  ;; Reuse the caller's result image if given, otherwise allocate one.
  (let ((r (if result-image
               result-image
               (make-array (list (array-dimension i1 0) (array-dimension i1 1) 3)
                           :element-type 'u8 :initial-element 0))))
    (declare (type (simple-array u8 (* * 3)) r))
    (do-vectorized (x 0 (1- (the u64 (array-total-size i1))))
      (setf (u8-row-major-aref r x)
            (u8- (u8-row-major-aref i1 x)
                 (u8-row-major-aref i2 x))))
    r))

;; Standard version: one pixel at a time, three channels per pixel.
(defun diff-image-slow (i1 i2 &optional result-image)
  (declare (type (simple-array u8 (* * 3)) i1 i2)
           (type (or null (simple-array u8 (* * 3))) result-image)
           (optimize speed (safety 0)))
  (loop with result = (if result-image
                          result-image
                          (make-array (list (array-dimension i1 0) (array-dimension i1 1) 3)
                                      :element-type 'u8 :initial-element 0))
        for y from 0 below (array-dimension i1 0)
        do (loop for x from 0 below (array-dimension i1 1)
                 do (setf (aref result y x 0)
                          (- (aref i1 y x 0) (aref i2 y x 0))
                          (aref result y x 1)
                          (- (aref i1 y x 1) (aref i2 y x 1))
                          (aref result y x 2)
                          (- (aref i1 y x 2) (aref i2 y x 2))))
        finally (return result)))
```
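The two functions address the same elements in different ways: the standard one uses three subscripts, while the vectorized one walks a single flat index over the array's row-major storage. A small sketch in standard Common Lisp (no `sb-simd` needed; `flat-index-demo` is a name invented for illustration) shows how the two kinds of index relate:

```lisp
;; Hypothetical demo: relate (y x channel) subscripts to the flat index.
(defun flat-index-demo ()
  (let ((img (make-array '(2 2 3)
                         :element-type '(unsigned-byte 8)
                         :initial-element 0)))
    ;; Write through the three-subscript view...
    (setf (aref img 1 0 2) 42)
    ;; ...and read back through the flat view.
    ;; For dimensions (2 2 3): (1*2 + 0)*3 + 2 = 8.
    (let ((k (array-row-major-index img 1 0 2)))
      (list k (row-major-aref img k)))))
```

Since subtraction here is purely element-wise, the order in which elements are visited does not matter, which is why the flat loop can stand in for the nested one.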

Interestingly, according to the author of `sb-simd`, this is probably the first code ever written to use it with integers, so it took some work on their end before it worked here.