The Sledgehammer – Version 2.0

January 11, 2014

What Software Testers Can Learn From Video Game Speedrunners

Filed under: Games, Quality Assurance. Posted by Brian Lutz @ 2:53 pm

I see a lot of this when I attempt to play video games.

Over the course of the past week, I’ve spent far more time than I care to admit watching other people play video games far better than I ever could.  Every year around this time, Speed Demos Archive puts on an event known as Awesome Games Done Quick, where a group of speedrunners gets together and plays games nonstop as fast as they possibly can for an entire week, streaming it online as a fundraiser for the Prevent Cancer Foundation.  It’s a pretty successful one too: last year’s event raised nearly $450,000, and as of this writing the total for AGDQ 2014 is sitting at roughly $663,000 with about a day to go, plus whatever bonus streams follow the main event.  For someone such as myself who has pretty much no skill whatsoever when it comes to anything requiring fast twitch reflexes, it is fascinating to watch this type of thing for a couple of reasons.  First of all, the amount of skill being put on display by the various speedrunners is amazing.  The second (and perhaps more compelling) reason is that as the various speedrunners go through their runs, they tend to provide a running commentary explaining what they’re doing as they go along.  And quite a bit of what they’re doing is, quite frankly, breaking the games.

But as I’ve watched the marathon and seen the types of techniques that speedrunners use, it has occurred to me that there are actually some things I can learn in my professional career as a software QA engineer from watching this type of thing.  Even though I don’t do anything related to games in my job (and only one or two things I have ever done in my career have come even remotely close to it), it seems to me that a lot of what people do in the course of speedrunning games is quite similar to what I do in testing software, with one significant difference:  as a tester, I’m trying to find problems to get them fixed, while speedrunners are typically trying to find them to completely break things.  And to be perfectly honest, I think the speedrunners might be winning on this one, judging from some of the ways they can take tiny little glitches and completely break entire games with them.  In most cases this has no real impact other than to beat games far more quickly than they were ever intended to be beaten, but we’re generally talking about twenty-year-old games here.  If you’re running mission-critical software in an enterprise environment and things like this are happening, you might find the impact to be far more problematic.  Naturally, it’s best to catch these types of things well before the software (be it a game or something more functional) goes out into the wild.  As such, I thought I’d put together a post that goes through some of the lessons that I have learned from watching speedrunners during AGDQ.


1. People will go to great lengths to make even trivial gains in performance.  Although the speedruns in AGDQ are compelling enough on their own, the part that really makes it interesting is the commentary that goes along with most of the runs.  Whether it’s coming from the speedrunner(s) playing the game or from someone providing a play-by-play from the couch, it quickly becomes clear that the people doing this stuff have put as much thought and effort into this as most people would put into far more serious subjects.  I suspect that the collective knowledge that has been gleaned from one of the more popular speedrunning games such as Super Mario Bros. could fill a book, or at least an article in an academic journal.  Nonetheless, even for games that have been well documented and well understood for years, people are still trying to find ways to shave fractions of seconds off their times.  In particular, one of the popular (yet somewhat controversial) categories in speedrunning is tool-assisted speedrunning, also known as TAS.  Tool-assisted speedrunners use various tools to do things like run games a single frame at a time and use savestates to keep running through segments until they can figure out the optimal paths through or pull off difficult tricks, which allows them to eventually work toward what could be considered a fully optimized run.  In many cases, these optimized runs can be much faster (often by multiple minutes) than what even the best human players can manage, but they also tend to do this by using tricks that human players would not be able to do.  Nonetheless, the TAS players can find hidden strategies that save time in regular speedruns, even if those strategies can also be very difficult and/or risky.  It’s not uncommon to see speedrunners taking big risks on difficult tricks that might save them a fraction of a second if they pull it off, but can cost them much more than that if they don’t.
Speedrunning is by its very nature competitive, and at times it can be mere fractions of a second that separate players racing each other in an hour-long speedrun (I don’t have a way to link to it yet, but the 4-way Super Metroid race from AGDQ 2014 is a very good illustration of this).

Although this isn’t a scenario that necessarily translates to real-world software in the same manner (as you might imagine, when working with most types of hardware and software the goal is far more to reduce risk as much as possible than to reward it), one thing I do typically see in the course of my daily workflow as a tester is that there are a lot of repetitive tasks that come up, not just in the actual testing, but in the course of dealing with the other associated tasks that come along with it such as bug tracking, test case management, setting up test environments and reporting results.  Although the use of automation in test case execution is widespread and can save significant time over manual testing in situations where it can be applied, I’m not dealing with much of it in my current job.  Nonetheless, even if you’re not automating your test cases, you can probably identify little repetitive tasks here and there that you might be able to automate with something like a batch script or a macro.  Even little things that don’t seem like much can add up over time, and in the long run you can make significant performance gains out of little things.
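As a rough illustration of the kind of small chore that can be scripted, here is a minimal Python sketch that tallies test results for a status report.  The CSV layout, the column names (`test_case`, `status`) and the sample rows are all made up for this example; a real test case management tool will have its own export format.

```python
import csv
import io
from collections import Counter

# Hypothetical export from a test case management tool (assumed format).
SAMPLE = """test_case,status
login_valid,pass
login_bad_password,pass
export_report,fail
import_large_file,blocked
"""

def summarize_results(csv_text):
    """Count how many test cases ended in each status."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Normalize the status text so "Pass" and "pass" land in the same bucket.
    return Counter(row["status"].strip().upper() for row in reader)

def format_report(counts):
    """Turn the tallies into a few lines suitable for pasting into an email."""
    lines = [f"Total: {sum(counts.values())}"]
    for status in sorted(counts):
        lines.append(f"{status}: {counts[status]}")
    return "\n".join(lines)

if __name__ == "__main__":
    print(format_report(summarize_results(SAMPLE)))
```

Nothing here is clever, which is rather the point: a task that takes a minute or two by hand every day is a reasonable candidate for a dozen lines of script.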

2. Things that may seem random rarely are.  As you watch the various speedrunners going through their runs, one of the things they point out frequently is where things are or aren’t random in the games.  As you watch the various runs, you realize that at least under specific conditions, most seemingly random things aren’t actually random.  This is frequently important because a lot of the strategies (speedrunners typically call them “Strats”) depend on certain things happening at certain times.  On the flip side of the coin, random events tend to be a hindrance, as they can often interfere with those strategies.  Mostly through exploration using TAS and other playthroughs of the game, it is possible for them to determine what is going to happen when, and also to figure out ways to precisely control the circumstances in which certain things happen and manipulate them to their advantage.  While testing software, often one of the biggest challenges testers face is trying to come up with consistently reproducible scenarios for bugs that have been reported, because you have no way to verify that you actually fixed a bug if you don’t have a reliable way to get that bug to manifest itself in the first place.  This can be difficult, especially for bugs that may have been seen only once or twice, or issues that have been reported by non-technical users who provide only limited information, often in a production environment where you might not have access to the debugging tools you’re used to having on your test bench.  It is for this reason that you need to be familiar with the environment you’re working in, and that you know what circumstances might lead to one particular code path instead of another.  If possible, you also want to have ways to collect at least some sort of data from low-information users in situations like this.
In many cases, understanding what circumstances might cause certain unwanted behaviors to occur in a piece of software can be largely a matter of determining the state of the environment at the time the problem happened.  Granted, this can require going rather deep into things, but speedrunners (and especially TAS runners) have gone surprisingly deep into the games they’re speedrunning, and have managed to do some rather surprising things, as this tool-assisted run of Super Mario World from AGDQ 2014 illustrates.  It starts out unusual, gets downright weird, and goes…  Well, you’ll just have to watch.
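The same idea can be sketched in a few lines of Python.  In the example below, `shuffle_and_process` is a made-up stand-in for code that fails "randomly"; the only real technique on display is that a pseudo-random generator started from a known seed produces the same sequence every time, so recording the seed turns an intermittent failure into a repeatable one.

```python
import random

def shuffle_and_process(items, rng):
    """Hypothetical code under test: fails only when the shuffle
    happens to put the largest item first."""
    shuffled = items[:]
    rng.shuffle(shuffled)
    if shuffled[0] == max(items):
        raise ValueError("largest item processed first")
    return shuffled

def find_failing_seed(items, max_seeds=1000):
    """Search for a seed that triggers the failure, so it can be replayed later."""
    for seed in range(max_seeds):
        try:
            shuffle_and_process(items, random.Random(seed))
        except ValueError:
            return seed
    return None

def reproduces(items, seed):
    """Replay a recorded seed and report whether the failure shows up again."""
    try:
        shuffle_and_process(items, random.Random(seed))
    except ValueError:
        return True
    return False

if __name__ == "__main__":
    data = [1, 2, 3, 4, 5]
    seed = find_failing_seed(data)
    print("failing seed:", seed)
    # Replaying the same seed gives the same shuffle, so the bug is now repeatable.
    print("reproduced on replay:", reproduces(data, seed))
```

In practice the "seed" is often something less tidy, such as a timestamp, a configuration file or the order in which records arrived, but the approach is the same: capture whatever state drives the behavior, then feed it back in.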

3. You’re always going to miss something no matter how much testing you do.  The vast majority of games being played in the AGDQ marathon were some of the best-selling and best known games of the time when they were created.  Although the tools and services available today to game developers have allowed many smaller indie developers to put out products that can rival the big-name studios, in general a lot of games being shown were produced by rather large teams of developers, testers, artists and other support staff, often across multiple companies.  That means that by the time these products made it to the store shelves back in the day (something that has, ironically, become less and less of a reliable indicator of a product’s quality as console technology has reached the point where patching has become not only possible but practically expected) they may have had hundreds of people involved along the way, including large numbers of testers dedicated to finding and reporting bugs to be fixed.  In spite of all that, the speedrunners still manage to find glitches, exploits and other bugs.  Not all of these are necessarily going to be useful for reducing speedrun time (in fact, a lot of these don’t do much more than crash things), but these can be little things, big things, or somewhere in between.

Of course, very few of these glitches are things that a player going through the course of the game in the intended manner would ever run into (a lot of them involve finding ways into areas that the player is not supposed to be able to go into), but unless they’re specifically restricting themselves to this, most speedrunners are going to use every glitch they can manage to get.  And I’m sure that there are developers and testers out there who have smacked themselves in the head after seeing some of the stuff that the speedrunners have pulled off in their games.  In the course of running a test pass on a game like the ones featured here, a lot of the scenarios where the glitches appear would be considered edge cases, which are things that very few users would even go anywhere near.  The main problem with these edge cases is that you’re generally wandering well off the “happy path” that normal users would be on, and in general the returns on these test scenarios tend to be very low in terms of the amount of time spent running them.  Then again, if you aren’t going to find the problems here, there’s a very good chance that someone else will gladly find the problems for you.  And you’re probably not going to like the results when they do.
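One inexpensive way to cover at least some edge cases is to generate them mechanically rather than think each one up.  The sketch below applies ordinary boundary value testing to a made-up bounds check; `in_bounds`, `off_by_one` and the 0 to 255 range are assumptions chosen purely for illustration.

```python
def boundary_values(low, high):
    """Values just outside, on, and just inside each end of an inclusive range."""
    return [low - 1, low, low + 1, high - 1, high, high + 1]

def in_bounds(x, low, high):
    """Hypothetical check under test: is x inside the playable area?"""
    return low <= x <= high

def off_by_one(x, low, high):
    """The same check with a deliberate bug: the upper edge is excluded."""
    return low <= x < high

def check_boundaries(check, low, high):
    """Run a check against every boundary value and collect the mismatches."""
    failures = []
    for value in boundary_values(low, high):
        expected = low <= value <= high
        if check(value, low, high) != expected:
            failures.append(value)
    return failures

if __name__ == "__main__":
    print(check_boundaries(in_bounds, 0, 255))   # the correct check: no mismatches
    print(check_boundaries(off_by_one, 0, 255))  # the buggy check: flags the upper edge
```

Six values per range will not catch everything a determined speedrunner would find, but they cost almost nothing to run, which changes the math on those low-return scenarios a little.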

4. Anything that is deemed unnecessary will be skipped one way or another.  During this year’s AGDQ, one of the featured runs was for Resident Evil 4, a game that I’ve never played (it’s not the type of game I’m interested in) but which was still quite interesting to watch.  One of the biggest things I took away from this particular speedrun was that the player basically just ran right by probably 75% of the enemies in the game without a second thought, and suffered no ill consequences for doing so.  A lot of these fights would likely be rather difficult (and time-consuming) if the player were to actually do them the way the developers intended, but oftentimes it turns out to be completely unnecessary, as they just run right by and keep going.  Of course, in a speedrun saving as much time as possible wherever possible is crucial, so a lot of effort goes into cutting out even trivial things.  In particular, cutscenes and dialog are frequent targets of speedrunners, who will often take rather unusual steps to keep them from happening or exit them as quickly as possible.  In some cases, you’ll see people literally reset the game or quit out and reload in the middle of a speedrun, because starting from scratch and reloading from a save can in some cases be much quicker than watching a cutscene.  As long as the established ground rules for a particular game allow it, this is considered perfectly normal.

Another thing you see quite a bit is that players will intentionally take damage in a lot of instances in order to use the temporary invincibility that typically goes with it to bypass things.  In games, people tend to think of health or energy (or even lives) as something they have to try to keep as much of as possible, but speedrunners tend to treat these things primarily as a tool.  In particular, games like F-Zero GX (one of the most notoriously difficult games in recent memory, and one of the major highlights of the past couple of AGDQs as speedrunners have absolutely destroyed it) give you an energy bar that acts as both your health meter and something that can be consumed as a boost, allowing you to go faster but significantly increasing your risk of failure by doing so.  Then again, this is a normal (and expected) mechanic of this particular game, but taking intentional damage to bypass obstacles and improve speed is surprisingly common in many speedruns, especially for 8-bit games like the Mega Man and Ninja Gaiden series.  In some games, strategically placed intentional deaths are a common occurrence as well if some advantage can be obtained by doing so.  Then again, most players need to use that health and those lives just to keep themselves from hitting the game over screen too soon, so a lot of this comes down to having enough skill in the first place to avoid unintentional damage as much as possible, since it becomes a lot riskier when people play this way.  This means that in addition to all the various strategies and optimizations involved in the whole process, there’s also quite a bit of raw skill required just to even be able to think about speedrunning a game (of course, even if you can’t do live speedruns you can always try to do TAS, but that’s basically something entirely different).

5. If there’s a way to pull things off the rails, someone will find it.  In many ways, this really ties into #4, but I feel it should also be considered separately.  One of the most popular genres of games for speedrunners is the so-called “Metroidvania” games (of which the 2D Metroid games and the non-linear Castlevania games such as Symphony of the Night are the most prominent examples), which typically are played on large non-linear maps but ultimately still have a linear progression that the user is expected to follow.  Of course, it is possible to follow this linear progression and do a speedrun that way, but most of the time the goal is to finish things as quickly as possible no matter how this is accomplished, so when a new game of this genre comes out, the first thing the speedrunners do is try to find so-called “sequence breaks,” which are strategies that allow the player to subvert the expected linear progression of the game, skip significant portions of the game entirely and acquire items that they aren’t expected to have until much later in the game.  Of course, it’s gotten to the point that a lot of developers these days just hide intentional sequence breaks in the games, but in most cases these have come about as a result of players messing around with things they aren’t supposed to be messing around with, trying to actively subvert the intended order of the game.

The effort that goes into testing a particular piece of software is as much a matter of planning as it is execution.  After all, you (generally) have a specific set of requirements that the software must be able to meet, and you need to be able to demonstrate that the software can meet those requirements.  And these days more than ever, security testing becomes a very important part of those test plans.  After all, no matter what type of software you are working with, it’s highly likely that someone out there will be trying to find ways to get around whatever limitations happen to be in it, especially if you’re dealing with any system that stores sensitive data.  But even aside from that, you can find yourself surprised by some of the things you’ll see users try to do with your software, things you would never expect.  As you go through the various test passes and validations that you might do over the course of a software development life cycle, you start to develop a surprisingly deep understanding of how things tend to work in the system, even if you aren’t working directly with the code.  As a result, you tend to build a bit of an intuition for some of the unusual things users might try somewhere along the line.  Don’t hesitate to try some of these things out; you never know just what kind of weird issues you might manage to run into.  Not that all of it will necessarily get fixed (after all, developers’ time is a finite resource, and you eventually have to ship something) but if you can think of it, chances are that at some point someone else will do the same.
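Outside of games, the closest thing to a sequence break is usually a workflow that lets you skip a step.  Here is a small Python sketch of how you might go hunting for those; the four-step order workflow, the `ALLOWED` transition list and the deliberately sloppy `buggy_can_move` are all invented for the example and don’t come from any real system.

```python
from itertools import product

# Hypothetical order workflow, listed in the sequence users are expected to follow.
STATES = ["cart", "shipping", "payment", "confirmed"]

# The only transitions the requirements allow: one step forward at a time.
ALLOWED = {("cart", "shipping"), ("shipping", "payment"), ("payment", "confirmed")}

def buggy_can_move(current, target):
    """Hypothetical implementation under test: it only checks that the
    target comes later in the workflow, not that it is the next step."""
    return STATES.index(target) > STATES.index(current)

def strict_can_move(current, target):
    """Reference behavior: accept only the transitions listed in ALLOWED."""
    return (current, target) in ALLOWED

def find_sequence_breaks(can_move):
    """Try every ordered pair of states and report any transition that
    is accepted even though the requirements do not allow it."""
    breaks = []
    for current, target in product(STATES, repeat=2):
        if can_move(current, target) and (current, target) not in ALLOWED:
            breaks.append((current, target))
    return breaks

if __name__ == "__main__":
    for current, target in find_sequence_breaks(buggy_can_move):
        print(f"sequence break: {current} -> {target}")
```

With only four states, trying every pair is trivial.  For larger workflows the number of combinations grows quickly, so you would probably want to sample pairs or focus on the transitions that skip something important, such as payment.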

All things considered, there’s actually a surprisingly large overlap between the methods used by speedrunners and software testers for their respective tasks.  In both cases, people are going deep into the inner workings of the software they’re using to try to find things that don’t work the way they’re supposed to.  Both use a lot of the same methods, and both find a lot of the same issues.  It’s how these issues are used where things tend to diverge though.  As a tester, it’s naturally your job to find these issues in order to get them fixed.  As a speedrunner, you’re trying to find issues that you can use to break things even further.  Either way, the results can be fascinating to watch.
