Want to Feel Old?

I make no attempt to hide that I am a fan of Randall Munroe’s excellent xkcd. I have, in fact, read all of the comics Mr. Munroe has ever published on xkcd.com. Despite this, I occasionally return to the website to enjoy a comic I haven’t seen for a while. This morning, I happened upon this comic:

An xkcd comic of 2 columns. First column is labeled ‘Their Age,’ and is numbered 16 through 35 & ‘over 35.’ The second column is labeled ‘You Say’ and is divided into four sub-columns. The first sub-column reads ‘“Did you realize that…’ from 16-35, and the third sub-column says ‘Came Out’ from 16-35. Transcript from explainxkcd.com.
https://xkcd.com/891/

With all due respect to Mr. Munroe, it is no longer 2011, but I would still like to be able to make my acquaintances feel old. (Not my friends, of course; that would be mean!)

So I wanted a way to generate as many facts of this particular shape as I like!

Data Acquisition

The first step in any project of this sort is to find data to work with. After a brief attempt at scraping IMDb (which did not go well: the page was too big to grab by copy/paste, and they seemed to block curl’s user agent, so I quickly gave up), a simple web search turned up this CSV, which purports to be exactly what I want. A goal of mine is to make this small script modular, so we’ll start with the phrasing used in the first several of the examples and then upgrade from there. Remember, fast prototyping works great at first, until it doesn’t.

Structural Overview

My first task was to sit down and decide how to organize the script to be as simple to write and extensible as possible. The idea is as follows:

  1. Input age and calculate maximum nostalgia age range.
  2. Take nostalgia age range and find movie within that range (randomly).
  3. Select desired style of pain-causage (e.g. phrasing generator).
  4. Pass 2 and 3 to a constructor function which outputs a painful string.

This seems like a pretty modular system, so let’s get to building it. I’ll start at the end and then work my way up; this system is simple enough that I don’t have to be concerned about integration hell.

Step 4

We’ll start with our sentence constructor. It’s quite simple, and should look like this:

from datetime import datetime

def construct_sentence(film, date, response_fn):
    s = "Did you realize that " # note the extra space

    # note the extra space here too
    s += film + " came out "
    # response_fn supplies the rest of the sentence
    s += response_fn(date, datetime.now().year)

    return(s)

This function shows us an important bit of implementation detail: the API our response building function will follow. It takes the year of the movie release and the current year.

Step 3

Let’s build our ways to cause pain. The simplest is to list the decades it’s been since then:

def decades_since(date1, date2):
    # yes, we're doing dates as year integers. Deal with it.
    assert date1 < date2

    # get decade difference
    date1 -= date1 % 10
    date2 -= date2 % 10

    decades = (date2 - date1) // 10
    if decades != 1:
        plural = "s"
    else:
        plural = ""

    return("{} decade{} ago?".format(num_2_words(decades), plural))

num_2_words is lifted from a Stack Overflow post and is entirely uninteresting.
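
For the curious, a minimal stand-in that handles 0 through 99 could look something like the following. This is a sketch written for illustration, not the Stack Overflow version the script actually uses, and the `ONES` and `TENS` tables are names I made up for it.

```python
# Hypothetical stand-in for num_2_words; NOT the Stack Overflow version.
ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven",
    "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
    "fifteen", "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = [
    "", "", "twenty", "thirty", "forty", "fifty",
    "sixty", "seventy", "eighty", "ninety",
]

def num_2_words(n):
    # decade counts are tiny, so this sketch refuses anything else
    if not 0 <= n <= 99:
        raise ValueError("this sketch only handles 0 through 99")
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    if ones == 0:
        return TENS[tens]
    return TENS[tens] + "-" + ONES[ones]
```

Decade counts will realistically stay in the single digits, so the narrow range is not much of a limitation here.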

This function makes one more implicit claim about the functions: they will include the question mark at the end!
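
To make that contract concrete, here is a second generator that follows the same rules: it takes the release year and the current year, and returns the tail of the sentence with the question mark attached. `years_since` is a hypothetical name, not part of the original script, and it prints digits rather than words to stay self-contained.

```python
def years_since(date1, date2):
    # same contract as decades_since: (release year, current year) in,
    # sentence tail ending in "?" out
    assert date1 < date2

    years = date2 - date1
    plural = "" if years == 1 else "s"
    return "{} year{} ago?".format(years, plural)

# "Some Film" is a placeholder title; the years are arbitrary examples
print("Did you realize that Some Film came out " + years_since(2003, 2024))
# prints: Did you realize that Some Film came out 21 years ago?
```

Any function with this shape can be handed to construct_sentence in place of decades_since.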

Step 2

The tricky part of making this script generate actually painful movie-based facts is coming up with a way to generate the effective nostalgia age range; that is, the range of ages whose movies will cause maximum emotional damage (sorry, nostalgia).

After a bit of thinking, I came up with this: $\Delta = 15 - \sqrt{\text{age}}$, where $\text{Range} = [16 - \Delta,\ 16 + \Delta]$.

This is pretty simple, and easy to tune if you think your soon-to-be enemies (I mean friends) have slightly different demographics than mine. It’s also relatively simple to implement this function:

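import math
from datetime import datetime

def make_range(age):
    # half-width of the nostalgia window, in years of the person's life
    Delta = 15 - math.sqrt(age)
    l = 16-Delta
    r = 16+Delta
    # convert "age when the movie came out" into calendar years
    year = datetime.now().year
    lh = year - (age - l)
    rh = year - (age - r)
    return((lh,rh))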

This takes an age, and then figures out the maximally nostalgic movie range for the person.
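
As a worked example, here is the same arithmetic with the current year passed in as a parameter, so the numbers stay reproducible. `make_range_for_year` is a hypothetical variant for illustration, and 2023 is just an assumed "current" year.

```python
import math

def make_range_for_year(age, year):
    # same arithmetic as make_range, but with the current year passed in
    delta = 15 - math.sqrt(age)
    low_age = 16 - delta    # youngest age in the nostalgia window
    high_age = 16 + delta   # oldest age in the nostalgia window
    # the year in which the person was low_age / high_age years old
    return (year - (age - low_age), year - (age - high_age))

low, high = make_range_for_year(28, 2023)
print(int(low), int(high))
# prints: 2001 2020
```

So a 28-year-old gets a window of roughly ages 6 to 26, which in 2023 would map to movies from about 2001 through 2020.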

Step 1

Now we just have to take this range and pick a movie, which it turns out pandas makes trivial:

def pick_movie(arange, db):
    l,h = arange
    dbf = db[db['year'].isin(range(int(l),int(h)))]
    return(dbf.sample())
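
If it helps to see what that filter is doing, here is a rough standard-library equivalent, assuming the rows are plain dicts with the same 'original_title' and 'year' keys that the script reads from the CSV. `pick_movie_plain` and the toy rows are made up for illustration.

```python
import random

def pick_movie_plain(arange, rows):
    low, high = arange
    # mirror range(int(l), int(h)): lower bound included, upper excluded
    candidates = [r for r in rows if int(low) <= r["year"] < int(high)]
    # like DataFrame.sample(), this raises if nothing is in range
    return random.choice(candidates)

# toy placeholder rows, not real data
rows = [
    {"original_title": "Film A", "year": 1995},
    {"original_title": "Film B", "year": 2005},
    {"original_title": "Film C", "year": 2015},
    {"original_title": "Film D", "year": 2020},
]

pick = pick_movie_plain((2001.3, 2020.7), rows)
print(pick["original_title"], pick["year"])
```

Note that truncating both ends with int() and excluding the upper bound means a window of (2001.3, 2020.7) covers 2001 through 2019, which is the same behavior as the pandas version above.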

Now, all that’s left is to build a mini interface:

import pandas as pd

def main():
    age = int(input("How old are you? "))
    db = pd.read_csv("./movies.csv")
    film = pick_movie(make_range(age), db)

    # pandas shenanigans
    movie = str(film['original_title'].iloc[0])
    year = int(film['year'].iloc[0])

    print(construct_sentence(movie, year, decades_since))

if __name__ == "__main__":
    main()

Putting It All Together

If we run this script, we get something like:

$ python3 pain.py
How old are you? 28
Did you realize that The Lord of the Rings: The Return of the King came out two decades ago?

This is pretty great for a simple prototype! It works, hurts me deeply, and was simple to write. You can find this script on my GitHub.

Exercises for the Reader

If you want to play with this script, here are a few good ideas to add to it:

  1. [Medium] Make more generator functions and randomly select one.
  2. [Easy] Make the UI nicer; error checking on input, looping until done to generate more facts, etc. etc.
  3. [Hard-ish] Instead of generating a single fact, generate every possible fact for every age in a given range and save it to a CSV file.
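
As a nudge for the first exercise, one possible shape is to keep the generators in a list and let random.choice pick between them. This is only a sketch: `years_since`, `way_back_in`, `GENERATORS`, and `construct_random_sentence` are all hypothetical names, and the current year is passed in explicitly to keep the example reproducible.

```python
import random

def years_since(release_year, current_year):
    return "{} years ago?".format(current_year - release_year)

def way_back_in(release_year, current_year):
    # current_year is unused, but accepted so every generator shares one signature
    return "way back in {}?".format(release_year)

GENERATORS = [years_since, way_back_in]

def construct_random_sentence(film, release_year, current_year, rng=random):
    # rng can be swapped for a seeded random.Random to get repeatable output
    response_fn = rng.choice(GENERATORS)
    tail = response_fn(release_year, current_year)
    return "Did you realize that {} came out {}".format(film, tail)

# "Some Film" is a placeholder title
print(construct_random_sentence("Some Film", 2003, 2024))
```

Because every generator follows the same two-argument contract, adding a new style of pain is just a matter of appending another function to the list.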
