This is a different tack than we are used to, so excuse our ineptitude when it comes to the “difficult” art of writing tech related non-fiction.
We have been getting interviewed for a position in a startup lately, and we actually enjoyed doing a technical interview. This was a surprise to us since in the past we’ve had bad experiences with them. The question is whether or not we’ve changed, or if we had better tools, but that’s not the only focus here. We wanna talk about the fun we had with it.
Our Last Interview
Back in somewhere around 2014-2015, forgive us, our memory doesn’t slot neatly in there, we went to interview for a company called Onshape(If you use a parametric cad system, their web based one is actually both good and reasonably performant), for an internship position. We were evidently appealing enough, but we acted with haste and confidence that really, really, really, really, really hurt us. Then our own mental issues also joined in the fun. Also, like, thinking back on it, we’re embarrassed by how poorly we thought things through.
Back then we were in a fraternity. We lived in the house, and we were friends with many alcoholics. One of whom happened to be interviewing around the same time as we were, at a different company, that also happened to be in Boston, and most importantly, he had a car. We shared an uncomfortable night in a hotel room, where we didn’t sleep at all. Then we started out the morning by taking twice as much adhd medication as we were supposed to(because we had forgotten that we took them).
We were jittery in a way that the word almost fails to describe. We couldn’t remember basic things about algorithmic performance. We were afraid and too anxious to do anything right.
We fucked up that interview badly.
But the worst part was that our friend’s interview was going on for many more ours than ours. We wandered around Boston from around 11AM to 5PM absolutely tweaking out in a suit. It was a hot day and it was amazing we didn’t get badly dehydrated. We had basically no money, and we couldn’t get in touch with our friend.
That experience honestly rates as one of the worst multiple clusterfucks that we can think of.
The Current Interview
This time a lot is different. We have our degree, better mental health, several partners that love us very deeply, and a pandemic that means that nobody is asking us to come to them for an interview. For one thing, the company found us instead of us finding them. Triplebyte is a nicer recruitment platform than the Rensselaer Polytechnic Career fair. We have a lot more experience working on code that isn’t entirely ours, since we’ve worked on several open source projects with various purposes. And we even have some professional experience.
So, overall we have a lot less to be worried about, and given that they found us, they seem more likely to be interested in hiring us than the other way around(look, we know that they very well might not treat us any differently than if we applied to them specifically).
So, anyway, we got to actually do a real interview with real actual code. The problem is actually even a bit interesting, since it is the basis for something a lot more powerful.
# We're writing a function called is_match # It takes two arguments, pattern and query. If it matches, then it returns True, otherwise it returns False # is_match('abc','red green blue') == True # is_match('abc','red red red') == False
You might already be seeing the pattern here. If you don’t, don’t worry, it’s easy to mistake what this is, so let’s write a short test case. If, for some reason, you’re following along in an editor, it’s best to add this to the end of your python script.
def test(): assert is_match('aaa', 'red red red') == True assert is_match('aaa', 'red green green') == False assert is_match('abc', 'red green blue') == True assert is_match('aba', 'red green red') == True
Okay, so let’s break it down, since code isn’t necessarily what you’re used to reasoning with. Each letter of the pattern(the first argument) must correspond to a unique word in the query. How do we achieve that however?
Well, the first thing that we should do is split the query
into individual words. Well, in python that looks like this.
def is_match(pattern, query): words = query.split(' ')
This calls the split method on the query string, which breaks the string on every occurrence of the string you pass to it. There are plenty of edge cases for the determined newbie to run face-first into, but that’s not important to this right now.
So the second thing we should try to do is create some way to store which letter in the pattern corresponds to the individual word, there are lots of ways to do this, but in python you generally use a dict
(dictionary). In this case we’ll call it matches
.
def is_match(pattern, query): words = query.split(' ') matches = {}
So then we iterate over the pattern and see if the character in the pattern is already in the dictionary. If it is, then we make sure that the match already stored is equal to the current word, returning false if it isn’t.
def is_match(pattern, query): matches = {} words = query.split(' ') for index in range(len(words)): word = words[index] pattern_char = pattern[index] if matches[pattern_char] != word: return False
Otherwise we add it to the dictionary with the current word.
def is_match(pattern, query): matches = {} words = query.split(' ') for index in range(len(words)): word = words[index] pattern_char = pattern[index] if matches[pattern_char] != word: return False else: matches[pattern_char] = word return False
Okay, so if it finishes the loop without returning false it should be correct, right? Well, running test()
shows otherwise.
Traceback (most recent call last):
File "/home/violet/interview_question.py", line 20, in <module>
test()
File "/home/violet/interview_question.py", line 15, in test
assert is_match('aaa', 'red red red') == True
AssertionError
Dang. That’s not right now is it? This is actually a sillier mistake than we intended, but that’s why being able to run your code is so useful
def is_match(pattern, query): matches = {} words = query.split(' ') for index in range(len(words)): word = words[index] pattern_char = pattern[index] if matches[pattern_char] != word: return False else: matches[pattern_char] = word return True
Okay, that doesn’t fail the assertions. but we should probably add more tests, because otherwise who knows what could go wrong?
def test(): assert is_match('aaa', 'red red red') == True assert is_match('aaa', 'red green green') == False assert is_match('abc', 'red green blue') == True assert is_match('aba', 'red green red') == True assert is_match('aba', 'red red red') == False
Then let’s run it.
Traceback (most recent call last):
File "/home/violet/interview_question.py", line 20, in <module>
test()
File "/home/violet/interview_question.py", line 19, in test
assert is_match('aba', 'red red red') == False
AssertionError
Oh no! (This is the error we did intend to run into). So what’s wrong? Well, we didn’t make sure that the words were unique. Luckily Python makes it easy to fix.
def is_match(pattern, query): matches = {} words = query.split(' ') for index in range(len(words)): word = words[index] pattern_char = pattern[index] if matches[pattern_char] != word: return False else: if word in matches.values(): return False matches[pattern_char] = word return True
This fixes the error alright. But for the sake of completion, let’s add some more test cases to make sure nothing silly gets through. Stuff like, “What if there were more than three entries?” and “What if it wasn’t letters being used in the pattern but numbers?”
def test(): assert is_match('aaa', 'red red red') == True assert is_match('aaa', 'red green green') == False assert is_match('abc', 'red green blue') == True assert is_match('aba', 'red green red') == True assert is_match('aba', 'red red red') == False assert is_match('abcd', 'red green magenta blue') == True assert is_match('1234', 'red green magenta blue') == True
These all run well because we didn’t make any mistakes that these would affect. We would argue this kind of testing is important anyway, because you can’t be sure that you didn’t do something silly and lazy, but that’s besides the point.
Okay, so besides the correctness of our implementation here, how performant is it?
def is_match(pattern, query): matches = {} words = query.split(' ') # O(n) memory for each word for index in range(len(words)): word = words[index] pattern_char = pattern[index] if pattern_char in matches.keys(): if matches[pattern_char] != word: return False else: if word in matches.values(): #O(n) time for each word seen return False matches[pattern_char] = word #O(n) memory for the number of distinct pattern elements return True
Forgive us if this isn’t accurate, we’re not a pythonista, so we can’t speak as to how performant word in matches.values
actually is, but if it’s naive, then it’s linear time. If that is indeed true, we can do better there by adding a set, this brings it down from O(n2)
to O(n)
.
def is_match(pattern, query): matches = {} match_values = set() words = query.split(' ') # O(n) memory for each word for index in range(len(words)): # O(n) time for each word word = words[index] pattern_char = pattern[index] if pattern_char in matches.keys(): if matches[pattern_char] != word: return False else: if word in match_values: #O(1) time return False matches[pattern_char] = word #O(n) memory for the number of distinct pattern elements match_values.add(word) return True
But there are still some potentially thorny issues here. Like, what if the pattern and query have mismatched lengths? Well, that should return false, shouldn’t it?
def is_match(pattern, query): matches = {} match_values = set() words = query.split(' ') # O(n) memory for each word if len(words) != len(pattern): return False for index in range(len(words)): #O(n) time for each word word = words[index] pattern_char = pattern[index] if pattern_char in matches.keys(): if matches[pattern_char] != word: return False else: if word in match_values: #O(1) time return False matches[pattern_char] = word #O(n) memory for the number of distinct pattern elements match_values.add(word) return True
The remaining issues that you might want to tackle here include things like making sure that both the query and pattern aren’t None
, ensuring they’re both strings(or making it work with other sequences in general), but these things are just refinements that don’t change the basis of the function.
Leave a Reply
Only people in my network can comment.