Small Update

Once again we find ourselves working in Python. This is part of a cycle of programming preferences that we've noticed we go through. Sometimes we want something low-level like C, sometimes we want something easy like Python, and other times we want something a bit different, like Forth, Haskell, OCaml, or Lisp.

We don't have much to say about what we're doing, other than that we are always impressed by how easy Python is to use, and how often it manages to produce nicely performant results, or, at the very least, tends to be possible to make fast enough. Of course, there are downsides: the type system sometimes drives us mad when an error surfaces deep in execution, and then there's the copy semantics that sometimes trip us up, but since we're aware of these by now, we're mostly coping fine.
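The copy-semantics trip-up we mean is the usual reference-assignment behavior: assignment binds a new name to the same object rather than copying it. A minimal illustration:

```python
# Assignment copies the reference, not the list itself.
a = [1, 2, 3]
b = a           # b is another name for the same object
b.append(4)
print(a)        # [1, 2, 3, 4] -- mutating through b changed a too

# An explicit (shallow) copy avoids the aliasing.
c = list(a)
c.append(5)
print(a)        # still [1, 2, 3, 4]
```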

We don't have much to say otherwise, just felt that we should use our blog for something in the middle of our other posts that we're writing and stories that have yet to be finished. 🙂

Trying Out Zig

So, we have heard good things about Zig. These boil down to the following things:

  • Good speed
  • Fast compilation
  • Decent type system
  • Simple syntax

So far, a lot of these things seem to be borne out by our experience, though we have some criticisms that are probably more a matter of our taste in programming than issues we expect to be universal. For fairness's sake, let's start with the things we like, once again expressing our preferences rather than universals.

  • Low Level
    Low level languages feel liberating to us, because we get to make the decisions, rather than the compiler or virtual machine.
  • Trusts you, mostly.
    This is mostly in comparison to Rust, whose borrow checker we have wrestled with a great deal and whose Turing complete type system has, in the past, left us mystified when we have run into errors.
    And if you really, really want to throw caution to the wind, you can mostly just tell Zig that you don't give a damn about whether the pointer is to a single element or an array.
  • Compile Time computation, Generics
    This is how generics are implemented, and for the most part, it provides a good experience, especially since it compares exceedingly favorably with pre-C++17 C++.
  • Not afraid of Bare Pointers
    Bare pointers aren't something to worry about too much, especially if they're typed. Bare pointers are quite a bit faster and lighter than other (tagged) references.

And honestly, we're impressed at how tight the language feels. There are some facilities we're not sure about, and some that we wish it had, but overall, this isn't a bad language. We like it more than Rust (which, admittedly, really isn't saying much for us). From familiarity alone we're likely to revert to using C or C++, but we're not nearly as skeptical as we once were about this little language.

The syntax is familiar enough, pretty similar to the C-likes, though with a few elements that we can't really attribute to any language we know (which doesn't mean very much). So, let's see what something simple looks like.

const std = @import("std");
pub fn main() !void {
    const stdout = std.io.getStdOut().writer();
    var i: usize = 1;
    while (i < 10) {
        try stdout.print("{}\n", .{i});
        i += 1;
    }
}

You can see that Zig is careful about errors, requiring that you either mark that main can return an error (designated with !void) or handle the error somehow. Anything that can fail in Zig has an error union type denoted by !<type>, which requires you to think about what errors can happen, or so the Go and Rust communities insist.

Let's see something that differs even more significantly from C.

const std = @import("std");
fn arraylist_user(alloc: *std.mem.Allocator) !std.ArrayList(i32) {
    var al = std.ArrayList(i32).init(alloc);
    var i: i32 = 0;
    while (i < 100) {
        var item = try al.addOne();
        item.* = i;
        i += 1;
    }
    return al;
}
pub fn main() !void {
    var gpalloc = std.heap.GeneralPurposeAllocator(.{}){};
    defer std.debug.assert(!gpalloc.deinit());
    const gpa = &gpalloc.allocator;
    const stdout = std.io.getStdOut().writer();
    var array_list = try arraylist_user(gpa);
    defer array_list.deinit();
    for (array_list.items) |*item| {
        try stdout.print("{}\n", .{item.*});
    }
}

So, in this little snippet we have used several new things: generic types, an Allocator, and the defer keyword (which Go users will immediately recognize). Zig does not have a default allocator; it wants you to pass the allocator you want to use to any function that allocates memory. This is rather different from C, where you would probably just use malloc/realloc/calloc/free to manage memory (or define a macro that evaluates to those by default, if you're building that kind of library). The reason the documentation wraps gpalloc.deinit() in an assert is that this particular allocator can track memory leaks, so the assert makes it report such occurrences. The snippet also shows one of the syntactic divergences from C: for (array_list.items) |*item|. Pipes don't get much use in C other than as logical operators, but here the capture tells the loop to unwrap the value (a pointer to a position in the list) and call it item. In Zig, you dereference this pointer with item.*. As another point of comparison to C++, the addOne method doesn't add the item itself, but returns a pointer to it, so that you can assign to it afterwards.

Interestingly, for such a low level language, array_list.items isn't a pointer, it's a slice, which is a pointer+length, so in C++, we would say that it has most in common with a string_view.

Okay, so what if we want to go further? What if we want to reprise our C N-body simulator written with Raylib? As it turns out, that's not too bad; it's almost even easy. We can even use raylib without too much trouble.

To start off, we need to import raylib. Luckily, as it turns out, Zig's compiler can handle C interop quite nicely; it even has builtins for it. (Though if you do this, you need to call the compiler with -lc -lraylib to make sure that all the C libraries get linked in.)

const std = @import("std");
const ray = @cImport({
    @cInclude("raylib.h");
});

Okay, so the first thing we should do here is define a vector type, not the resizable array type of vector, but the directional vector. Since we prize reuse here, we're going to make it generic even though we probably don't really need to.

pub fn vector2(comptime value: type) type {
    return struct {
        const This = @This();
        x: value,
        y: value,
        //Write methods here!
    };
}

Okay, so that's an easy enough definition. But let's add some vector math oriented methods to make using it easier and more pleasant.

    return struct {
        const This = @This();
        x: value,
        y: value,
        pub fn add(this: This, other: This) This {
            return .{ .x = this.x + other.x, .y = this.y + other.y };
        }
        pub fn sub(this: This, other: This) This {
            return this.add(.{ .x = -other.x, .y = -other.y });
        }
        pub fn scale(this: This, v: value) This {
            return .{ .x = this.x * v, .y = this.y * v };
        }
        pub fn distance(this: This, other: This) value {
            const sqrt = std.math.sqrt;
            const pow = std.math.pow;
            return sqrt(pow(value, this.x - other.x, 2) + pow(value, this.y - other.y, 2));
        }
        //Wrap the value around, like when pacman reaches the edge of the map
        pub fn wrap(c: This, a: This, b: This) This {
            var r: This = .{ .x = c.x, .y = c.y };
            if (r.x < a.x) {
                r.x = b.x;
            } else if (r.x > b.x) {
                r.x = a.x;
            }
            if (r.y < a.y) {
                r.y = b.y;
            } else if (r.y > b.y) {
                r.y = a.y;
            }
            return r;
        }
        pub fn magnitude(a: This) value {
            const sqrt = std.math.sqrt;
            return sqrt(a.x * a.x + a.y * a.y);
        }
        pub fn normalize(a: This) This {
            return a.scale(1.0 / a.magnitude());
        }
    };
}

Alright, the main thing to note here is how we need to call std.math.pow, namely that we need to call it with a compile-time type, in this case value. Later on we'll see it called with f32.

Now we need to define the type we use for particles, and while we're at it, we're going to make a shortcut to the kind of vector we're using here.

const v2 = vector2(f32);
const particle = struct {
    const This = @This();
    position: vector2(f32) = v2{ .x = 0, .y = 0 },
    velocity: vector2(f32) = v2{ .x = 0.0, .y = 0.0 },
    acceleration: vector2(f32) = v2{ .x = 0.0, .y = 0.0 },
    mass: f32,
    //Methods here!
};

We also need a radius property, but since it's derived from the mass, which can change later as the bodies absorb each other, it needs to be a method. We should also write methods to determine whether two particles overlap, to move the position along by the velocity, and to calculate the attraction between two particles.

const particle = struct {
    const This = @This();
    position: vector2(f32) = v2{ .x = 0, .y = 0 },
    velocity: vector2(f32) = v2{ .x = 0.0, .y = 0.0 },
    acceleration: vector2(f32) = v2{ .x = 0.0, .y = 0.0 },
    mass: f32,
    pub fn radius(this: *const This) f32 {
        var result: f32 = std.math.ln(this.mass) / std.math.ln(3.0);
        if (result > 40) {
            return 40 - std.math.ln(this.mass) / std.math.ln(5.0);
        }
        return result;
    }
    //Returns true if the two particles overlap
    pub fn overlaps(this: *const This, other: *const This) bool {
        var r1 = this.radius();
        var r2 = other.radius();
        var dist = this.position.distance(other.position);
        return (r1 + r2) > dist;
    }
    //Handles the base movement
    pub fn motion_step(this: *This, timestep: f32) void {
        this.position = this.position.add(this.velocity.scale(timestep));
        this.velocity = this.velocity.add(this.acceleration.scale(timestep));
    }
    pub fn attraction(this: *const This, other: *const This, g: f32) vector2(f32) {
        var dist = this.position.distance(other.position);
        var vector_to = other.position.sub(this.position).normalize();
        return vector_to.scale(g * (this.mass * other.mass) / std.math.pow(f32, dist, 2));
    }
};

Now we should build a container for many particles, and the properties necessary to simulate them that aren't appropriate to put in the particle structures themselves. We need a method to initialize the particles to random positions, a method to handle the gravity simulation and such, and a method to actually draw the particles.

const ParticleCollection = struct {
    const This = @This();
    particles: [100]particle,
    window_start: vector2(f32) = v2{ .x = 0.0, .y = 0.0 },
    window_end: vector2(f32) = v2{ .x = 100, .y = 100 },
    timestep: f32 = 0.01,
    gravitational_constant: f32 = 1e-3,
    pub fn init_particles(this: *This, rand: *std.rand.Random) void {
        for (this.particles) |*p| {
            p.mass = @intToFloat(f32, rand.intRangeLessThan(i32, 1, 100));
            p.position = .{
                .x = @intToFloat(f32, rand.intRangeLessThan(i32, @floatToInt(i32, this.window_start.x), @floatToInt(i32, this.window_end.x))),
                .y = @intToFloat(f32, rand.intRangeLessThan(i32, @floatToInt(i32, this.window_start.y), @floatToInt(i32, this.window_end.y))),
            };
            p.acceleration = .{ .x = 0.0, .y = 0.0 };
            p.velocity = .{ .x = 0.0, .y = 0.0 };
        }
    }
    pub fn step_world(this: *This) void {
        for (this.particles) |*p| {
            p.motion_step(this.timestep);
            p.position = p.position.wrap(this.window_start, this.window_end);
        }
        for (this.particles) |*a| {
            a.acceleration = .{ .x = 0.0, .y = 0.0 };
            for (this.particles) |*b| {
                //No self attraction please, allowing that would result in division by zero
                if (a == b)
                    continue;
                a.acceleration = a.acceleration.add(a.attraction(b, this.gravitational_constant));
            }
        }
    }
    pub fn drawSystem(this: *const This) void {
        for (this.particles) |p| {
            ray.DrawCircle(@floatToInt(c_int, p.position.x), @floatToInt(c_int, p.position.y), p.radius(), ray.BLACK);
        }
    }
};

Note how we have to pass a random number generator into init_particles; this is in line with how Zig also requires that you pass allocators into functions that need to allocate memory. You can also see some of the somewhat jagged interaction between Zig and C, namely that Zig doesn't specify that its i32 type is equivalent to C's int type (which on many architectures it might not be), and it requires explicit conversions between floating point numbers and integers.

The main function here is the simplest part yet.

pub fn main() !void {
    const width = 800;
    const height = 450;
    ray.InitWindow(width, height, "Nbody");
    ray.SetTargetFPS(60);
    //This is very much *not* good practice, but it's the easiest way to start this
    var rand = std.rand.Xoroshiro128.init(0);
    //Don't initialize the particles yet.
    var p: ParticleCollection = .{ .particles = undefined };
    p.window_end = .{ .x = width, .y = height };
    p.init_particles(&rand.random);
    while (!ray.WindowShouldClose()) {
        p.step_world();
        ray.BeginDrawing();
        defer ray.EndDrawing();
        ray.ClearBackground(ray.RAYWHITE);
        p.drawSystem();
    }
}

And so we have a working prototype for an N-body simulator, considerably shorter than the C version of the same program.

Interestingly, it appears to be smaller compiled than the original program in C. With just zig build-exe nbody.zig -lc -lraylib, we get an executable of around 784 KB. With zig build-exe nbody.zig -lc -lraylib -OReleaseFast, we can get it down to 92 KB, and with the -OReleaseSmall option, we can get down to 84 KB.

All in all, we'd definitely watch Zig carefully. It's very well thought out, and if they get their package manager right, then their ecosystem might become competitive with Rust's quite quickly. The language is already quite nice to use, and it might not be a bad choice for a project you'd otherwise consider doing in C or Rust, if you're looking for a new language to mess around in.

Technical Interview

This is a different tack than we are used to, so excuse our ineptitude when it comes to the "difficult" art of writing tech related non-fiction.

We have been interviewing for a position at a startup lately, and we actually enjoyed doing a technical interview. This was a surprise to us, since in the past we've had bad experiences with them. The question is whether we've changed or whether we just had better tools, but that's not the only focus here. We wanna talk about the fun we had with it.

Our Last Interview

Back somewhere around 2014-2015 (forgive us, our memory doesn't slot neatly in there), we went to interview for a company called Onshape (if you use a parametric CAD system, their web-based one is actually both good and reasonably performant) for an internship position. We were evidently appealing enough, but we acted with a haste and confidence that really, really, really, really, really hurt us. Then our own mental issues also joined in the fun. Also, like, thinking back on it, we're embarrassed by how poorly we thought things through.

Back then we were in a fraternity. We lived in the house, and we were friends with many alcoholics, one of whom happened to be interviewing around the same time as we were, at a different company that also happened to be in Boston; most importantly, he had a car. We shared an uncomfortable night in a hotel room, where we didn't sleep at all. Then we started the morning by taking twice as much ADHD medication as we were supposed to (because we had forgotten that we had already taken it).

We were jittery in a way that the word almost fails to describe. We couldn't remember basic things about algorithmic performance. We were afraid and too anxious to do anything right.

We fucked up that interview badly.

But the worst part was that our friend's interview went on for many more hours than ours. We wandered around Boston from around 11AM to 5PM, absolutely tweaking out in a suit. It was a hot day, and it's amazing we didn't get badly dehydrated. We had basically no money, and we couldn't get in touch with our friend.

That experience honestly rates as one of the worst multiple clusterfucks that we can think of.

The Current Interview

This time a lot is different. We have our degree, better mental health, several partners that love us very deeply, and a pandemic that means that nobody is asking us to come to them for an interview. For one thing, the company found us instead of us finding them. Triplebyte is a nicer recruitment platform than the Rensselaer Polytechnic Career fair. We have a lot more experience working on code that isn't entirely ours, since we've worked on several open source projects with various purposes. And we even have some professional experience.

So, overall we have a lot less to be worried about, and given that they found us, they seem more likely to be interested in hiring us than the other way around (look, we know that they very well might not treat us any differently than if we had applied to them specifically).

So, anyway, we got to actually do a real interview with real actual code. The problem is actually even a bit interesting, since it is the basis for something a lot more powerful.

# We're writing a function called is_match
# It takes two arguments, pattern and query. If it matches, then it returns True, otherwise it returns False
# is_match('abc','red green blue') == True
# is_match('abc','red red red') == False

You might already be seeing the pattern here. If you don't, don't worry; it's easy to mistake what this is, so let's write a short test case. If, for some reason, you're following along in an editor, it's best to add this to the end of your Python script.

def test():
    assert is_match('aaa', 'red red red') == True
    assert is_match('aaa', 'red green green') == False
    assert is_match('abc', 'red green blue') == True
    assert is_match('aba', 'red green red') == True

Okay, so let's break it down, since code isn't necessarily what you're used to reasoning with. Each letter of the pattern (the first argument) must correspond to a unique word in the query. How do we achieve that, though?

The first thing we should do is split the query into individual words. In Python, that looks like this.

def is_match(pattern, query):
    words = query.split(' ')

This calls the split method on the query string, which breaks the string on every occurrence of the string you pass to it. There are plenty of edge cases for the determined newbie to run face-first into, but that's not important to this right now.
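A couple of those edge cases, for the curious (standard library behavior, nothing specific to this problem): splitting on a literal space preserves empty strings around repeated or leading separators, while the no-argument form collapses runs of whitespace entirely.

```python
# split(' ') keeps empty strings where separators are adjacent...
print('red  green'.split(' '))  # ['red', '', 'green']
print(' red'.split(' '))        # ['', 'red']
# ...while split() with no argument collapses whitespace runs.
print('red  green'.split())     # ['red', 'green']
```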

The second thing we should do is create some way to store which letter in the pattern corresponds to which word. There are lots of ways to do this, but in Python you generally use a dict (dictionary). In this case we'll call it matches.

def is_match(pattern, query):
    words = query.split(' ')
    matches = {}

So then we iterate over the pattern and see if the character in the pattern is already in the dictionary. If it is, then we make sure that the match already stored is equal to the current word, returning false if it isn't.

def is_match(pattern, query):
    matches = {}
    words = query.split(' ')
    for index in range(len(words)):
        word = words[index]
        pattern_char = pattern[index]
        if pattern_char in matches.keys():
            if matches[pattern_char] != word:
                return False

Otherwise we add it to the dictionary with the current word.

def is_match(pattern, query):
    matches = {}
    words = query.split(' ')
    for index in range(len(words)):
        word = words[index]
        pattern_char = pattern[index]
        if pattern_char in matches.keys():
            if matches[pattern_char] != word:
                return False
        else:
            matches[pattern_char] = word
    return False

Okay, so if it finishes the loop without returning false it should be correct, right? Well, running test() shows otherwise.

Traceback (most recent call last):
  File "/home/violet/interview_question.py", line 20, in <module>
    test()
  File "/home/violet/interview_question.py", line 15, in test
    assert is_match('aaa', 'red red red') == True
AssertionError

(Surprised Pikachu image)

Dang. That's not right, now is it? This is actually a sillier mistake than we intended, but that's why being able to run your code is so useful.

def is_match(pattern, query):
    matches = {}
    words = query.split(' ')
    for index in range(len(words)):
        word = words[index]
        pattern_char = pattern[index]
        if pattern_char in matches.keys():
            if matches[pattern_char] != word:
                return False
        else:
            matches[pattern_char] = word
    return True

Okay, that doesn't fail the assertions, but we should probably add more tests, because otherwise who knows what could go wrong?

def test():
    assert is_match('aaa', 'red red red') == True
    assert is_match('aaa', 'red green green') == False
    assert is_match('abc', 'red green blue') == True
    assert is_match('aba', 'red green red') == True
    assert is_match('aba', 'red red red') == False

Then let's run it.

Traceback (most recent call last):
  File "/home/violet/interview_question.py", line 20, in <module>
    test()
  File "/home/violet/interview_question.py", line 19, in test
    assert is_match('aba', 'red red red') == False
AssertionError

Oh no! (This is the error we did intend to run into). So what's wrong? Well, we didn't make sure that the words were unique. Luckily Python makes it easy to fix.

def is_match(pattern, query):
    matches = {}
    words = query.split(' ')
    for index in range(len(words)):
        word = words[index]
        pattern_char = pattern[index]
        if pattern_char in matches.keys():
            if matches[pattern_char] != word:
                return False
        else:
            if word in matches.values():
                return False
            matches[pattern_char] = word
    return True

This fixes the error alright. But for the sake of completeness, let's add some more test cases to make sure nothing silly gets through. Stuff like, "What if there were more than three entries?" and "What if it wasn't letters being used in the pattern but numbers?"

def test():
    assert is_match('aaa', 'red red red') == True
    assert is_match('aaa', 'red green green') == False
    assert is_match('abc', 'red green blue') == True
    assert is_match('aba', 'red green red') == True
    assert is_match('aba', 'red red red') == False
    assert is_match('abcd', 'red green magenta blue') == True
    assert is_match('1234', 'red green magenta blue') == True

These all pass, because we didn't make any mistakes that they would catch. We would argue this kind of testing is important anyway, because you can't be sure that you didn't do something silly and lazy, but that's beside the point.

Okay, so besides the correctness of our implementation here, how performant is it?

def is_match(pattern, query):
    matches = {}
    words = query.split(' ') # O(n) memory for each word
    for index in range(len(words)):
        word = words[index]
        pattern_char = pattern[index]
        if pattern_char in matches.keys():
            if matches[pattern_char] != word:
                return False
        else:
            if word in matches.values(): # O(n) time for each word seen
                return False
            matches[pattern_char] = word # O(n) memory for the number of distinct pattern elements
    return True

Forgive us if this isn't accurate; we're not a pythonista, so we can't speak to how performant word in matches.values() actually is, but if it's naive, then it's linear time. If that is indeed true, we can do better by adding a set, which brings the whole thing down from O(n^2) to O(n).

def is_match(pattern, query):
    matches = {}
    match_values = set()
    words = query.split(' ') # O(n) memory for each word
    for index in range(len(words)): # O(n) time for each word
        word = words[index]
        pattern_char = pattern[index]
        if pattern_char in matches.keys():
            if matches[pattern_char] != word:
                return False
        else:
            if word in match_values: # O(1) time
                return False
            matches[pattern_char] = word # O(n) memory for the number of distinct pattern elements
            match_values.add(word)
    return True

But there are still some potentially thorny issues here. Like, what if the pattern and query have mismatched lengths? Well, that should return false, shouldn't it?

def is_match(pattern, query):
    matches = {}
    match_values = set()
    words = query.split(' ') # O(n) memory for each word
    if len(words) != len(pattern):
        return False
    for index in range(len(words)): # O(n) time for each word
        word = words[index]
        pattern_char = pattern[index]
        if pattern_char in matches.keys():
            if matches[pattern_char] != word:
                return False
        else:
            if word in match_values: # O(1) time
                return False
            matches[pattern_char] = word # O(n) memory for the number of distinct pattern elements
            match_values.add(word)
    return True

The remaining issues that you might want to tackle include making sure that neither the query nor the pattern is None, and ensuring they're both strings (or making it work with other sequences in general), but these are just refinements that don't change the basis of the function.
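Here's a sketch of what those refinements might look like (the name and structure are ours, not from the interview): guard against None, and accept any pair of sequences, since nothing about the algorithm actually requires strings.

```python
def is_match_general(pattern, words):
    # Hypothetical generalization: pattern and words can be any sequences
    # of hashable elements, not just a string and a list of words.
    if pattern is None or words is None:
        return False
    if len(pattern) != len(words):
        return False
    matches = {}          # pattern element -> matched word
    match_values = set()  # words already claimed by some pattern element
    for pattern_elem, word in zip(pattern, words):
        if pattern_elem in matches:
            if matches[pattern_elem] != word:
                return False
        else:
            if word in match_values:
                return False
            matches[pattern_elem] = word
            match_values.add(word)
    return True

# The string version is then just a matter of splitting first.
print(is_match_general('aba', 'red green red'.split()))  # True
print(is_match_general('aba', 'red red red'.split()))    # False
```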

Little Warning

Hi. Uh, the one thing we feel we ought to make clear, for those who helped us diagnose the following issue, is that we are going to post a lot of weird stuff here. Lots of it lewd. We don't want to cross your boundaries, so if you're not here for that, it's probably not a good idea to follow this account.