Resiliency and Game Day Exercises at Acquia

In March of 2017 I came across the idea of “Game Day” in the DevOps Handbook by Gene Kim and others. Game Day is brilliantly advocated by Jesse Robbins in his presentation from 2011. It’s the idea that deliberately staging periodic system outages forces engineers to think about and design for resiliency in those systems. The extreme example is Chaos Monkey, which operates under only one constraint: the outages should happen during working hours. Other than that, the outages caused by Chaos Monkey can happen anywhere in the system (even production!) and at any time within those hours.

Game Day is a step removed from Chaos Monkey, conceived of as a planned activity for engineers to resolve systemic outages. The resiliency exercises held at Acquia were yet another step away from the extreme towards the approachable. Our exercise included two activities, one geared for non-support engineers and the other for support. The non-support engineers had to bring back up a down site, and the support engineers had to attack and compromise an insecure site. The idea was to challenge engineers to step outside their comfort zone, and attempt to resolve technical challenges beyond the requirements of their every-day work.

The Team
The personalities involved in Game Day were a strong influence on the event. There’s Amin Astaneh, an Ops manager with the temperament of the proverbial town crier, faithfully and urgently supporting us in our DevOps transformation. Then there’s Apollo Clark, expert in secure systems who contributed the idea of doing a security vulnerability exercise. Finally there’s James Goin, seasoned Ops warrior relentlessly invested in the improvement of systems administration, including resiliency and disaster recovery training.

It just so happened that the idea for Game Day came two months in advance of Acquia’s annual engineering-wide event called Build Week, a truly awesome gathering of the entire team at Acquia HQ in Boston (read more on Dries’ blog!). Holding our Game Day at the same time would allow it to reach a broader audience across the company, so we requested a slot on the calendar. We ended up with 8-9pm on the Tuesday during Build Week. We had our opportunity!

Build Week imposed two constraints that had a significant and positive influence on our interpretation of Game Day. The whole event needed to fit in a single hour, and the event had to be accessible to engineers other than just the Ops subject matter experts. A Game Day exercise typically involves only the core engineering team which works directly with critical systems, and it takes however long they need to bring the systems back up. These constraints made the whole thing more approachable, and inspired the introduction of an Easy Mode and a Hard Mode.

Game Day as Exercise
The original idea was to have a trouble-shooting session with an Acquia development installation of a Drupal site (managed Enterprise-grade Drupal being the chief product of Acquia). The site would have some failure that either smaller teams or the whole group would have to resolve. Since we needed to accommodate varying levels and areas of expertise in the product, we settled on two “modes”, Easy Mode and Hard Mode, that participants would opt into based on their familiarity with troubleshooting techniques. The difference between the modes would only be in the level of difficulty. Easy Mode would be for those who don’t handle troubleshooting support calls as part of their regular day-job, Hard Mode for those who do.

The Identity Crisis
At this point, it hit home for me that the exercise was not going to be what I had originally intended – it wasn’t going to be a cookie-cutter Game Day. Although this seemed disappointing at the time, looking back it was a blessing in disguise, since it motivated us to create a new idea instead of copying someone else’s.

Apollo’s suggestion which we ended up following was to stage a Hard Mode Capture the Flag exercise instead of a site outage. Capture the Flag in a security context is an exercise where teams gain access to privileged resources in a system by leveraging security vulnerabilities. We could hide hashes – randomized strings of a fixed length – throughout the site. The winner of the competition would be the team that found all the hashes first.
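To make the hash idea concrete, here is a minimal sketch of how per-team flags could be generated and checked. It is not the tooling we actually used; the function names, the team names, and the 32-character length are all assumptions made for illustration.

```python
import secrets


def make_flags(team_names, flags_per_team=5, hex_chars=32):
    """Return a dict mapping each team to its own list of random hex flags.

    hex_chars is an assumed flag length; token_hex(n) yields 2 * n hex digits.
    """
    return {
        team: [secrets.token_hex(hex_chars // 2) for _ in range(flags_per_team)]
        for team in team_names
    }


def team_has_won(found, expected):
    """A team wins once every expected flag appears among the ones it found."""
    return set(expected) <= set(found)


# Hypothetical team names, one set of flags per isolated environment.
flags = make_flags(["team-1", "team-2"])
```

Giving every team its own set of flags means a flag found in one environment is useless in another.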

The exercise would demonstrate that a site that works from a user perspective can still need work to become secure and performant. We would have Easy Mode to include some troubleshooting, which would then flow directly into the Capture the Flag exercise.

Trying It Out
We ran through the whole event a few weeks before Build Week. Easy Mode troubleshooting took up the first half hour, transitioning to Hard Mode Capture the Flag for the second half hour. This was pure thought experiment at this stage, and shockingly for me, it worked really really well.

During Easy Mode, non-Ops engineers drove the resolution with Ops experts only acting as consultants. Once the site was back up, we switched over to Capture the Flag. For this run through we only had one shared site for all the Hard Mode participants. One mischievous participant who found the site credentials deliberately locked out everyone else. This incident motivated much of the end-game setup for prevention of cross-site hacking.

Game Day!
Our Game Day-inspired exercise followed the flow established in our run through, with the addition of the isolated environments for Capture the Flag.

The Easy Mode troubleshooting took less time than we had allowed for, putting the start of Hard Mode right on time. The teams dove in, probing their environment – a Drupal site – for weaknesses. The narrative revolved around a fictional user submitting a question to the forum about how to enable the PHP module in Drupal, which would allow access to the bash shell on the server. The fictional admin replied that she had enabled the module for him, and reset his login to a “temporary password”. These were the credentials the participants were expected to use to hack the site. Since the user had access to the PHP module, they could also use it to gain shell access. Using this shell access to the server, they had easy access to the privileged resources and opportunities to discover the hashes.

When time ran out at 8:55, three of our twelve teams (forty participants in all) had found all five of their hashes. The first team with all five hashes won the grand prize, an invitation for morning coffee with our resident tech celebrity, Drupal founder and Acquia CTO Dries Buytaert. As an aside, when I thanked Dries for agreeing to have coffee with our winners, he graciously replied, “No, thank you – now I get to have coffee!”

The decision to pivot from the established Game Day resulted in a new kind of learning in the spirit of Game Day. This learning was more accessible for our engineers and bridged the gap between where we are and where we are headed. While this isn’t the end of the story, I think it’s a fantastic start. Game Day, Day 2, here we come …

All about Free Geek Providence

I’m on the Board of Directors for a nonprofit called Free Geek Providence based in RI. Free Geek Providence works to refurbish, reuse and recycle older computers. The organization also promotes the use of Linux, an open source operating system, as well as the open source ecosystem around it. I joined Free Geek Providence when it was starting up in 2008 and found it a perfect fit.

Access into modern society via technology is the new divide between the haves and the have-nots. As a result of this divide, we need to build bridges to bring back those people on the outside and introduce them to the skills needed as part of a computerized workforce.

The work we have done giving out free computers has benefited many nonprofits and individuals. Rhode Island Nurses Institute or RINI became the pilot for our Adopt a Classroom program. We installed 14 computers in a computer lab that students without computers at home could use to do their homework. An added benefit was exposure to the free operating system Linux. At first the students were uncomfortable using Linux because they were used to Microsoft Windows, but they soon made the switch and had no problem. Now those same students know that Linux and open source software are out there and can take advantage of them for the rest of their lives.

Follow freegeekpvd on Twitter and learn more on Facebook!

Adventures in Slipstreaming

I had the chance to get fairly geeky back in mid February with the purchase of a new copy of Adobe Creative Suite 4. The issue was that this product was not compatible with the operating system on my Dell Vostro 1500 laptop which comes with Vista Home Basic out of the box. My options were either to downgrade to Windows XP or upgrade to Windows Vista Business. I chose the former and that’s when the ‘fun’ began …

I have a student account that lets me download certain software for free, so I used it to grab a copy of Windows XP. I loaded it on my laptop and found that it just booted from Vista rather than installing XP. Keep in mind I have tried to back up my computer before with Macrium Reflect but never figured it out successfully. Faced with the dilemma of apparently not being able to install XP with Vista still on there, I went for broke and started installing XP in the C drive, overwriting the current installation.

The first issue I ran into was the SATA drivers on my laptop – XP predates this type of hard disk controller, so the installer doesn’t recognize the disk when it runs into one. I found a solution through googling on my functioning XP desktop. If you press F2 quickly as the computer boots up, it puts the computer into BIOS setup, which is a level lower than the operating system and works even when there is no operating system. If you arrow down you’ll see the option for Drivers and SATA. Hit enter and it lets you change the option from AHCI to ATA. The ATA setting is compatible with XP.

I went along and got the installation page that looks like Windows XP. That’s when I ran into the i386/asms Access Denied error. This was as far as I got. From here on out it was one denial after another.

I tried to slipstream my Windows Installation CD – this means inserting the drivers that XP needs into the installation itself so it can run the computer. I followed a guide I found online, but no matter how I tried, either Windows wouldn’t recognize the CD as a proper XP boot CD or it would get back to the same asms Access Denied error.

In the end, I accepted defeat and started contacting my geeky friends. It just so happened that one of them has access to Vista Business DVDs for free at his work, so I got out of having to pay the $200 or so for the OS, plus I got back full use of the laptop. Unfortunately, the data on my laptop was lost and I had to reinstall absolutely everything.

Here’s what I learned

  1. Slipstreaming=customizing an installation CD, usually by inserting necessary drivers
  2. $OEM$ is a folder that needs to include the license for Windows XP. The dollar signs mean it will be copied over to the new computer automatically. You can find a more complete guide to these folders at
  3. There is a folder structure basic to Windows installations that also contains identifiers for the installation. All these components must be included for it to work.
  4. XP folder structure

Here’s what I did wrong

  1. Overwrote my operating system AND my data
    • I should have done something, anything, to back up my data. Luckily for me I have FTP access to both of my live websites so I was able to get a working copy from there, plus I have a repository for one of them. Still.
    • All my music and pictures were erased because they were stored in the Windows My Documents folders.
    • I have since learned that Macrium Reflect allows you to create a Linux boot CD from a restore point, which would have come in really handy had I not been able to obtain a Vista Business DVD, and it might have saved my data. Restore point=good
    • I should not have wiped Vista at the start. If I had not had another computer to google with and find solutions, I wouldn’t have even had the option to troubleshoot the issue. Lesson being NEVER do this to a primary or g*d forbid a work computer. Check out comment #8 on this forum post for an idea of the process the way it should be done.

  2. Took my chances muddling about with the command line and WOULD’VE tried to figure out the registry key if I could have created a restore point for that.
    • This would have been WAY too advanced in the sense that the only thing wrong with my laptop at this point was there was no operating system. If I had messed up the registry keys that would have been bad.
    • There’s only so much you can do to solve a problem the hard way before you should give up and go with the easy way (in my case, giving up on XP and going with the Vista Business DVD). I know I am the type that wants to figure everything out and learn which is good to a point, but when nine hours go by and you still hit the brick wall, that would be time to stop learning and fix the thing.

  3. I should have read ALL the requirements of Adobe CS4.

I realized after the fact that Adobe CS4 also requires at least a 1.8 GHz processor. My laptop has a 1.4 GHz processor. So in the end I still couldn’t install the software on my laptop and it went on my desktop. This would be the part where I go “doh!”

URLs, Models, Views, Templates, and Shell – Useful Snippets for Each

There have been quite a few times when I have resorted to my coding friends and mentors and good ole Prof. Google to teach me how to write the code I needed to get the job done.

In particular, Adam (@adamjt) and Andy (@ashearer) have been right there alongside me helping me out.

Learning Django has been particularly involved because there are so many “in” things to know.  My understanding of Django has come down to the fact that there are 5 things to master in order to code Django and python successfully – the urls, the models, the views, the templates, and the python shell (command line interface).  I’ll touch on each one and list something I didn’t get at first until my friends explained it to me or I found it on google.  It is written in the spirit of writing it down so I don’t forget it …

URLS:   All about the media … explained to me by Andy (@ashearer on twitter)

In order to link my templates to the site media such as images and javascript and all that fun stuff I had to follow some specific steps in a very un-PHP manner:

  1. put a special global variable in the settings file of my project:
    MEDIA_URL = '/site_media/'
  2. add an extra expression in my urls file:
    (r'^site_media/(?P<path>.*)$', 'django.views.static.serve', {'document_root': os.path.join(os.path.dirname(__file__), 'site_media')}),
  3. make a directory on the same level as my urls and settings files called “site_media” that holds all my files (which can then contain subdirectories for css, images, javascript that I can include in my links in the template)
  4. link to the MEDIA_URL in the template itself, like so:
    <link href="{{ MEDIA_URL }}your_stylesheet_name.css" type="text/css" rel="stylesheet" />
    Notice that in the template itself, {{ MEDIA_URL }} is a variable that Django fills in from the settings file (as long as the view renders the template with a RequestContext).  There’s no need to write out the exact relative path to the site_media folder.
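The document_root expression in step 2 is easier to follow when pulled out on its own. This is only a sketch: the helper names and the example path are made up for illustration, and nothing here touches Django itself.

```python
import os


def site_media_root(module_file):
    """Return the site_media directory that sits next to the given file.

    This mirrors os.path.join(os.path.dirname(__file__), 'site_media')
    from the urls file, with the file path passed in explicitly.
    """
    return os.path.join(os.path.dirname(module_file), "site_media")


def media_link(media_url, filename):
    """Build the href that {{ MEDIA_URL }}filename produces in the template."""
    return media_url + filename


# Hypothetical project location, matching the /home/myname example later on.
root = site_media_root("/home/myname/myproject/urls.py")
href = media_link("/site_media/", "your_stylesheet_name.css")
```

The point is that the URL prefix (MEDIA_URL) and the folder on disk (document_root) are two separate things that the urls entry ties together.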

MODELS: Optional ForeignKey attribute … explained to me by Adam (@adamjt on twitter)

When I specify a one to many relationship in my models file using a ForeignKey, sometimes I want that relationship to be optional.  Just blank=True or null=True on its own doesn’t work though, you have to use both:

mynamedattribute = models.ForeignKey(MyModel, blank=True, null=True)

This is an optional field, not required, as ForeignKey fields usually are.
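As I understand it, null=True is the database half of the pair (the column may hold NULL) while blank=True is the form validation half. The sketch below shows only the database half, using Python’s built-in sqlite3 module rather than Django; the table and column names are made up for illustration.

```python
import sqlite3


def can_save_without_category(nullable):
    """Try to insert an article row with no category and report success.

    nullable=True stands in for null=True on the ForeignKey column;
    nullable=False stands in for the default, which adds NOT NULL.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE category (id INTEGER PRIMARY KEY)")
        constraint = "" if nullable else "NOT NULL"
        conn.execute(
            "CREATE TABLE article ("
            "id INTEGER PRIMARY KEY, "
            "category_id INTEGER %s REFERENCES category(id))" % constraint
        )
        try:
            conn.execute("INSERT INTO article (category_id) VALUES (NULL)")
            return True
        except sqlite3.IntegrityError:
            # The NOT NULL column rejects the empty relationship.
            return False
    finally:
        conn.close()
```

With only blank=True the form would accept an empty value but the database would still reject it, which is why both are needed.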

VIEWS: Authenticate me … Adam and Django Docs (@adamjt)

Logging users in and out is quite the hassle, but there are some built-in methods that take care of the work for me.  In order to get a user logged in and out, there needs to be a url side of it, a view side, and a template side.  There are more advanced registration and authentication modules out there ( Google code has one I recommend ) but there are also some short-cuts that come with Django that work very well.

  1. On the urls side, these are the urls that link to the views that are built into django to log users in and out:
    (r'^login/$', 'django.contrib.auth.views.login'),
    (r'^logout/$', 'django.contrib.auth.views.logout'),
    Teeny sidenote here:  there was a release of django that had the login template moved from registration to admin but referenced the old location in the view, so to fix it I took the non-coding way out and copied the template over to the proper directory
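  2. For views that need authentication, “login_required” is an easy way to check for logged in users.  The view file first imports the necessary decorators and helpers:
    from django.contrib.auth.decorators import login_required
    from django.contrib.auth import logout
    from django.shortcuts import render_to_response
    from django.template import RequestContext
  3. Then the view can either start with login_required:
    @login_required
    def view(request):
        variable_name = "some value"  # placeholder value for illustration
        # 'my_template.html' is a placeholder template name
        return render_to_response('my_template.html',
            {"ElementName": variable_name}, context_instance=RequestContext(request))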
  4. or it can use “if request.user.is_authenticated” within the view (if it came from a login form it will look like the code below):
    def otherview(request):
        variable_name = "Not logged in so no name"
        if request.method == 'POST':
            if request.user.is_authenticated():
                # assumed: the logged in user's name is what gets displayed
                variable_name = request.user.username
        # 'my_template.html' is a placeholder template name
        return render_to_response('my_template.html',
            {"LoggedInName": variable_name}, context_instance=RequestContext(request))
  5. Then in the template there is the same operator for segments of code that should be displayed for logged in users:
    {% if user.is_authenticated %}
    <a href="/account/">My Account ({{ user }})</a>
    {% else %}
    <a href="/login/">Login or Register</a>
    {% endif %}
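It helped me to see what a decorator like login_required is doing in principle. The sketch below is a toy version in plain Python, not Django’s actual implementation: FakeUser, FakeRequest, toy_login_required, and the returned strings are all stand-ins invented for illustration.

```python
from functools import wraps


class FakeUser:
    """Stand-in for a Django user; a name of None means not logged in."""

    def __init__(self, name=None):
        self.name = name

    def is_authenticated(self):
        return self.name is not None


class FakeRequest:
    """Stand-in for a Django request that only carries a user."""

    def __init__(self, user):
        self.user = user


def toy_login_required(view):
    """Wrap a view so anonymous users are sent to the login page instead."""

    @wraps(view)
    def wrapper(request, *args, **kwargs):
        if not request.user.is_authenticated():
            return "redirect:/login/"
        return view(request, *args, **kwargs)

    return wrapper


@toy_login_required
def account(request):
    return "account page for " + request.user.name
```

The wrapped view never runs for an anonymous user, which is why the decorator form needs no extra check inside the view itself.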

TEMPLATES:  Ordering output and Grouping … explained to me by Adam (@adamjt)

For an article-based engine, I needed to group a list of articles in order to show them according to category and then by date within the category.  First I had to order them in the view and then regroup them in the template:

  1. The view is simple enough, generate a variable that stands for all objects in class MyArticle in order by category, and by date within each category:
    variable_articles = MyArticle.objects.order_by("category", "date")
  2. Pass the variable to the template in the dictionary:
    return render_to_response('articles.html', {'articles': variable_articles}, context_instance=RequestContext(request))
  3. Then in the template, I make use of the “regroup” template tag to take the already ordered-by-category articles and group them further by date:
    {% regroup articles by category as article_list %}
    {% for article in article_list %}
    <h3>Category: {{article.grouper}}</h3><ul>
    {% regroup article.list by date as datedarticle_list %}
    {% for datedarticle in datedarticle_list %}
    {% ifchanged %}<li>{{|date:"F j, Y"}}{% endifchanged %}
    <ul>{% for item in datedarticle.list %}
    <li><a href="/article/{{}}/">{{item.title}}</a>, {{}}</li>{% endfor %}
    {% endfor %}</ul><br />{% endfor %}</ul>
    Notice that we had to regroup by category aside from using “order_by” in the view, but that was for the purpose of printing out the attribute by which the variable was grouped.
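The reason the ordering has to happen before the regroup is that regroup only groups items that are already next to each other; it does not sort. Here is a rough stand-in using itertools.groupby, which behaves the same way. The article dictionaries and the regroup helper are made-up examples, not Django code.

```python
from itertools import groupby
from operator import itemgetter


def regroup(items, key):
    """Group consecutive items sharing the same key, like the template tag."""
    return [
        (grouper, list(group))
        for grouper, group in groupby(items, key=itemgetter(key))
    ]


# Hypothetical articles, deliberately out of order.
articles = [
    {"title": "A", "category": "django", "date": "2009-03-01"},
    {"title": "B", "category": "python", "date": "2009-03-02"},
    {"title": "C", "category": "django", "date": "2009-03-01"},
]

# Without ordering first, "django" shows up as two separate groups.
unordered_groups = regroup(articles, "category")

# Ordering by category, then date, keeps each group together.
ordered = sorted(articles, key=itemgetter("category", "date"))
ordered_groups = regroup(ordered, "category")
```

The same applies one level down: the inner regroup by date only works cleanly if the articles inside each category are already in date order.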

Finally, the shell:

SHELL:  Python shell scripts … figured out through pieces on Google

Using the administration interface is nice since you can see everything neatly represented by menus and drop down fields in a GUI, but there are tasks that are more efficiently accomplished using the command line.  Even that has its limit, in the sense that I could only ever figure out how to give it one command at a time.  A script on the other hand will give me the efficiency of the command line with the advantage of being able to write out an automated, saveable, multi-command executable.  You do need to have access to a Linux-type terminal in order to do this (the script starts off with a shebang line that points at python).

  1. Create a file called and chmod it to have execute privileges:
    chmod +x
  2. Create the python environment with the python path and the django environmental setting:
    #!/usr/bin/env python
    import sys, os
    # Add a custom Python path.
    sys.path.insert(0, "/usr/bin/python")

    # replace myname with your username
    sys.path = ['/home/myname'] + sys.path

    # Set the DJANGO_SETTINGS_MODULE environment variable.
    os.environ['DJANGO_SETTINGS_MODULE'] = "myproject.settings"

    from django.contrib.auth.models import User #for example, this can be any django code

  3. Write in any commands you normally would type into the command line, for example making repetitious code into for loops like so:
    for i in range(1,51,5): # where i is the iterator and range(start,stop,step) – stop is non-inclusive
        article=MyArticle.objects.get(pk=i) #to select the objects with primary keys 1,6,11 and so forth up to 46
  4. Execute the script from your normal command line:

And there you have it!
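One thing worth checking before running a loop like the one in step 3 against real data is exactly which primary keys range() will produce. A quick sketch, with a made-up helper name:

```python
def primary_keys(start, stop, step):
    """Return the primary keys the for loop would visit, as a list."""
    return list(range(start, stop, step))


keys = primary_keys(1, 51, 5)
# keys is [1, 6, 11, 16, 21, 26, 31, 36, 41, 46]
```

Because stop is non-inclusive and the step is 5, the last key visited is 46, not 50 or 51.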

Is the Graphic Designer a Geek?

How geeky is a graphic designer?  Can’t code, probably doesn’t know half the acronyms your first year newbie does, and uses a lot of ‘short-cuts’ to generating web pages – Photoshop and Dreamweaver, to name the most likely tools in the toolbox.

Some would say a graphic designer does the ‘soft core’ web development – asks questions like how exactly should that image be placed in relation to the text, how can we achieve drop-shadow, or which font is most effective in which circumstance.  The fundamental question behind graphic design is does form follow function – is the purpose of the website being conveyed clearly by its visual elements?  As long as this is the question a graphic designer is asking, then yes it is a useful and vital task in web development.

So is a graphic designer a geek?  I would say a thorough to the point of obsessive competence in anything, especially something computer related, makes one a geek in that area.  Sure a graphic designer can be a geek – a very specific type of geek, whose expertise has great relevance to the work of the ‘hard core’ back-end coding geeks.

Geeky Bloopers

So part of the learning experience, as I as a newbie grow to learn more about coding and web development, is the collection of silly things I mistakenly do as I am trying to accomplish a task.  Rather than feel embarrassed about them, I want to learn from them and even have a good chuckle over them.  Once I do solve the problem, that is – it’s not as much fun when you still haven’t figured out where you made the mistake.

So here goes, some good geeky bloopers – and I welcome readers to submit their own in comments (preferably mistakes you made because you were inexperienced and that were easily fixed, once you found out how):

  •  php-ini – if anyone has had to install PHP on their machine, you will know what I’m talking about.  My problem was I would call up the php info document and it told me I was running an older version than the one I had installed.  Took forever but we finally figured out I had a single outdated php.ini left over in my Apache folder which the web server was reading – I should have put the php.ini in my PHP folder, not my Apache folder.  Oops.
  • Capital letters – capital letters screw me up sometimes.  I had a jpeg image with the file extension in all caps, so the server couldn’t find the image.  I kept re-checking the spelling and finally it dawned on me the server was case-sensitive.  You don’t need to learn that one twice!
  • Uploading via FTP – had to figure out that the port number really does matter when I first started uploading files to my website via FTP

Then there are bloopers using software – way easier to think of these because they affect how I interact with others:

  • Operator 11 – spent 20 minutes talking to myself thinking I was live on the air because I didn’t realize I had to press the Netcast button
  • WordPress – sent out e-mails to a bunch of people to read a post I had written but had forgotten to publish – they kept getting Not Found errors trying to access the link I posted to the page
  • Skype – tried to host a video conference call with multiple people, it was kind of a disaster because I kept inadvertently putting people on hold, not realizing that Skype doesn’t allow multi-user video conference calls

These are just the few I can think of off the top of my head, but I’m sure there have been many more.  All part of the newbie experience 🙂

Taking back “Geek”

Pinhead, nerd, encyclopedia, and … geek?  In fourth grade that’s an insult.   Now it’s increasingly used as a compliment in various media. 

By the way, did you know a technologically savvy person isn’t the first definition given on Merriam-Webster?


In fact, some people actually promote themselves as geeks.  Where I live there are the Providence Geeks, an IT professional’s networking dream.  Speaking of whom, the headline on the Providence Phoenix a few weeks ago was “Geek Power”.

And you must have heard by now of the “Geek Auction” at Washington State University, in which the resident computer club members (apparently all men) have decided that auctioning themselves to a sorority is a good way to recruit women to their club and get dates.

As for whether I consider myself a geek, I actually aspire to geekiness, or as I would call it, geekdom (that sounds more dignified, IMO).  Call me a geek-in-training.  Just not a pinhead.