Want to Feel Old?
I make no attempt to hide that I am a fan of Randall Munroe’s excellent xkcd. I have, in fact, read all of the comics Mr. Munroe has ever published on xkcd.com. Despite this, I occasionally return to the website to enjoy a comic I haven’t seen for a while. This morning, I happened upon this comic:
https://xkcd.com/891/
With all due respect to Mr. Munroe, it is no longer 2011, but I would
still like to be able to make my acquaintances feel old. (Not my friends, of course; that would be mean!)
Thus, I want to be able to generate as many facts of this
particular shape as I like!
Data Acquisition
The first step in any project of this sort is to find data to work
with. After a brief attempt at scraping IMDb (which did not go well: the page was too big to grab by
copy/paste and they seemed to block curl’s user agent, so I
quickly gave up), a simple web search turned up this
CSV, which purports to be exactly what I want. A goal of mine is to
make this small script modular, so we’ll start with the phrasing used in
the first several of the examples and then upgrade from there. Remember,
fast prototyping works great at first, until it doesn’t.
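Since the CSV only "purports" to be what I want, a quick sanity check on its header is cheap insurance. Here is a minimal sketch, assuming the two columns the script reads later (`original_title` and `year`); the helper name `missing_columns` and the in-memory sample are made up for illustration and are not part of the real dataset.

```python
import csv
import io

# The two columns the script in this post reads from the CSV.
REQUIRED_COLUMNS = {"original_title", "year"}

def missing_columns(csv_file):
    """Return the required column names that are absent from the CSV header."""
    reader = csv.DictReader(csv_file)
    header = set(reader.fieldnames or [])
    return REQUIRED_COLUMNS - header

# Illustrative stand-in for the downloaded file (not the real dataset).
sample = io.StringIO(
    "original_title,year\n"
    "The Lord of the Rings: The Return of the King,2003\n"
)
print(missing_columns(sample))  # an empty set means the header is usable
```

To check the real file, pass an open file handle instead of the `io.StringIO` sample.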
Structural Overview
My first task was to sit down and decide how to organize the script to be as simple to write and extensible as possible. The idea is as follows:
1. Input age and calculate maximum nostalgia age range.
2. Take nostalgia age range and find movie within that range (randomly).
3. Select desired style of pain-causage (e.g. phrasing generator).
4. Pass 2 and 3 to a constructor function which outputs a painful string.
This seems like a pretty modular system, so let’s get to building it. I’ll start at the end and then work my way up; this system is simple enough that I don’t have to be concerned about integration hell.
Step 4
We’ll start with our sentence constructor. It’s quite simple, and should look like this:

from datetime import datetime

def construct_sentence(film, date, response_fn):
    s = "Did you realize that "  # note the extra space
    # note the extra space here too
    s += film + " came out "
    # response_fn gets (release year, current year) and returns the ending
    s += response_fn(date, datetime.now().year)
    return s

This function shows us an important bit of implementation detail: the API our response-building function will follow. It takes the year of the movie release and the current year.
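To make that contract concrete, here is a small sketch that plugs a placeholder response function into the constructor. The constructor is repeated so the snippet runs on its own; `stub_response` and the film title are hypothetical and exist only to show the call shape.

```python
from datetime import datetime

def construct_sentence(film, date, response_fn):
    # Same shape as the constructor above, repeated so this runs on its own.
    s = "Did you realize that "
    s += film + " came out "
    s += response_fn(date, datetime.now().year)
    return s

def stub_response(release_year, current_year):
    # Hypothetical placeholder: ignores both years and only proves the contract.
    return "a while ago?"

print(construct_sentence("Some Film", 2003, stub_response))
# Did you realize that Some Film came out a while ago?
```

Any function with this two-argument signature can be swapped in without touching the constructor.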
Step 3
Let’s build our ways to cause pain. The simplest is to list the decades it’s been since then:
def decades_since(date1, date2):
    # yes, we're doing dates as year integers. Deal with it.
    assert date1 < date2
    # get decade difference: round both years down to the start of their decade
    date1 -= date1 % 10
    date2 -= date2 % 10
    decades = (date2 - date1) // 10
    if decades != 1:
        plural = "s"
    else:
        plural = ""
    return "{} decade{} ago?".format(num_2_words(decades), plural)

num_2_words is lifted from a Stack Overflow post and is entirely uninteresting.
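If you want to run the function above without hunting for that post, a stand-in is easy to write. This is a minimal sketch and not the Stack Overflow version: it assumes only small non-negative integers matter here (decade counts), spells out 0 to 99, and falls back to digits for anything else.

```python
_ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
_TENS = ["", "", "twenty", "thirty", "forty", "fifty",
         "sixty", "seventy", "eighty", "ninety"]

def num_2_words(n):
    """Spell out an integer from 0 to 99; anything else falls back to digits."""
    if not 0 <= n <= 99:
        return str(n)
    if n < 20:
        return _ONES[n]
    tens, ones = divmod(n, 10)
    if ones == 0:
        return _TENS[tens]
    return _TENS[tens] + "-" + _ONES[ones]

print(num_2_words(2))   # two
print(num_2_words(21))  # twenty-one
```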
decades_since also makes one more implicit claim about response functions: they will include the question mark at the end!
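As a sketch of what a second generator could look like under the same contract (release year and current year in, a fragment ending in a question mark out), here is a hypothetical `years_since`. It is not part of the original script, and it prints digits rather than words to stay self-contained.

```python
def years_since(date1, date2):
    # Same contract as decades_since: (release year, current year) -> "... ago?"
    assert date1 < date2
    years = date2 - date1
    plural = "" if years == 1 else "s"
    return "{} year{} ago?".format(years, plural)

print(years_since(2003, 2024))  # 21 years ago?
```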
Step 2
The tricky part of making this script generate actually painful
movie-based facts is coming up with a way to generate the effective
nostalgia age range; that is, the range of ages at which a movie’s
release will cause maximum nostalgia (read: emotional damage).
After a bit of thinking, I came up with this:

$$\Delta = 15 - \sqrt{\text{age}}, \qquad \text{Range} = [16 - \Delta,\ 16 + \Delta]$$
This is pretty simple, and easy to tune if you think your
soon-to-be enemies (sorry, friends) have slightly different demographics than
mine. It’s also relatively simple to implement this function:
import math

def make_range(age):
    Delta = 15 - math.sqrt(age)
    # youngest and oldest ages in the nostalgia window
    l = 16 - Delta
    r = 16 + Delta
    # convert those ages into calendar years for this person
    year = datetime.now().year
    lh = year - (age - l)
    rh = year - (age - r)
    return (lh, rh)

This takes an age and returns the range of release years that is maximally nostalgic for that person.
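To see what the formula actually produces, here is a worked example as a small sketch. `nostalgia_window` is a hypothetical variant of the function above that takes the current year as a parameter, and 2024 is an assumed year chosen only to make the output reproducible.

```python
import math

def nostalgia_window(age, current_year):
    """Hypothetical variant of make_range with the year passed in explicitly."""
    delta = 15 - math.sqrt(age)
    youngest, oldest = 16 - delta, 16 + delta  # ages, not years
    birth_year = current_year - age
    # The year someone turns a given age is their birth year plus that age.
    return (birth_year + youngest, birth_year + oldest)

lo, hi = nostalgia_window(28, 2024)
print(round(lo, 2), round(hi, 2))  # 2002.29 2021.71
```

So for a 28-year-old, Delta is about 9.71, the age range is roughly 6.3 to 25.7, and the matching release years run from about 2002 to 2021.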
Step 1
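Now we just have to take this range and pick a movie, which it turns out pandas makes trivial:

def pick_movie(arange, db):
    l, h = arange
    # keep films whose release year falls in the range (range() excludes int(h) itself)
    dbf = db[db['year'].isin(range(int(l), int(h)))]
    # sample() returns a one-row DataFrame chosen at random
    return dbf.sample()

Now, all that’s left is to build a mini interface: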
import pandas as pd

def main():
    age = int(input("How old are you? "))
    db = pd.read_csv("./movies.csv")
    film = pick_movie(make_range(age), db)
    # pandas shenanigans: pull plain values out of the one-row DataFrame
    movie = str(film['original_title'].iloc[0])
    year = int(film['year'].iloc[0])
    print(construct_sentence(movie, year, decades_since))

if __name__ == "__main__":
    main()

Putting It All Together
If we run this script, we get something like:
$ python3 pain.py
How old are you? 28
Did you realize that The Lord of the Rings: The Return of the King came out two decades ago?
This is pretty great for a simple prototype! It works, hurts me deeply, and was simple to write. You can find this script on my GitHub.
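One detail in pick_movie is worth a note: range(int(l), int(h)) stops before int(h), so films released in the last year of the window are never picked. If that matters to you, here is a sketch of a variant that keeps both endpoints using pandas’ `Series.between`. The name `pick_movie_inclusive`, the `seed` parameter, and the tiny frame with placeholder titles are all assumptions for illustration, not part of the original script or the real CSV.

```python
import pandas as pd

def pick_movie_inclusive(arange, db, seed=None):
    """Hypothetical variant of pick_movie that keeps both endpoint years."""
    low, high = arange
    # between() is inclusive on both ends by default.
    in_window = db[db["year"].between(int(low), int(high))]
    # random_state makes the pick repeatable when a seed is given.
    return in_window.sample(n=1, random_state=seed)

# Tiny illustrative frame; the titles are placeholders, not rows from the real CSV.
db = pd.DataFrame({
    "original_title": ["Film A", "Film B", "Film C"],
    "year": [1999, 2010, 2021],
})
film = pick_movie_inclusive((2002.29, 2021.71), db, seed=0)
print(film["original_title"].iloc[0], int(film["year"].iloc[0]))
```

With the original half-open range, the 2021 row in this frame could never be selected; with `between`, it can.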
Exercises for the Reader
If you want to play with this script, here are a few good ideas to add to it:
- [Medium] Make more generator functions and randomly select one.
- [Easy] Make the UI nicer; error checking on input, looping until done to generate more facts, etc. etc.
- [Hard-ish] Instead of generating a single fact, generate every possible fact for every age in a given range and save it to a CSV file.