Category Archives: Productivity

Adventures in Story Points

Story points are a cornerstone of the agile development process, conveying a key piece of information from those who implement a story to those who decide what gets built. They're not the only input into planning at the product level, but from a process point of view they are the most important. Story points don't just estimate how long a story will take; they bundle in estimates of technical risk and uncertainty. From a PM's perspective, two stories can sound the same but end up with a large gap in estimated story points. That could be because the engineers are deep within the code base that pertains to one story and not the other, and thus feel more confident about exactly what needs to be done. Or it could be because some things sound easy to implement but aren't. In my days as a consultant I've seen story points described in a few different ways:

  1. The number of days an engineer thinks it will take.
  2. An estimate of the size and complexity of a story; time estimate + risk factor
  3. A comparison of how big one story is to another; story1.size == story2.size

The pure agilist will tell you that story points are #3, but in the real world it's hard to compare the size of one story to another without an inherent engineering-hours estimate creeping in. Where teams get in trouble is when they start measuring their velocity from one sprint to the next as a 'team performance metric'. If engineers get hounded when their velocity dips, they are incentivized to either over-point or give points in terms of time, aka method #1, which is a flawed way of pointing a story. Points are for size, which is why we also estimate hours for story tasks during sprint planning: you have the best ability to judge how long a task will take right before you start it. If you point a story and it sits in the backlog for six months, and by the time it comes out the assumptions around implementing it have changed, say it was pointed assuming a certain library would be available and now the legal team has nixed using that library, then you're going to be in trouble when it comes to measuring your velocity. Ideally you'd re-point anything that has sat around that long, but unless someone flagged that story as dependent on library xyz, the team might miss the fact that it no longer has accurate points.

A team's velocity will change over time, so using story point velocity as a golden ruler of performance will create a lot of unneeded stress. As a project moves from laying the architectural framework to fleshing out all the details, the team will gain or lose velocity depending on everyone's abilities. Throw in team member churn, ramping up new members, triage meetings to support legacy code, and time spent fixing bugs, and a downward trend in story points per engineer per sprint isn't unheard of.
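To make that arithmetic concrete, here's a minimal sketch with entirely made-up sprint numbers; the data, the per-engineer division, and the rolling average are just illustrations of why the raw number jitters for reasons that have nothing to do with "team performance".

```python
# Made-up sprint history: completed story points and engineers available each sprint.
sprints = [
    {"name": "Sprint 1", "points_done": 34, "engineers": 5},  # laying the architecture
    {"name": "Sprint 2", "points_done": 40, "engineers": 5},
    {"name": "Sprint 3", "points_done": 27, "engineers": 4},  # one engineer out, legacy triage
    {"name": "Sprint 4", "points_done": 31, "engineers": 5},  # ramping up a new team member
]

for s in sprints:
    per_engineer = s["points_done"] / s["engineers"]
    print(f'{s["name"]}: velocity={s["points_done"]} points, '
          f'{per_engineer:.1f} points/engineer')

# A rolling average is a gentler planning signal than a sprint-over-sprint ruler.
avg = sum(s["points_done"] for s in sprints) / len(sprints)
print(f"average velocity: {avg:.1f} points/sprint")
```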

So how should you use story points? There is no perfect solution, and that's what agile is all about: you have to find the right implementation for your team. How's that for copping out of a decent answer? On a more specific note, I tend to favor deciding how many points an engineer should take in a sprint first, and then using that as a measuring stick for how big a story is. As an added bonus, to help keep story points consistent over time, I find it helpful to pick out stereotypical stories during sprint closing as a reference for future grooming.
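As a rough illustration of that "capacity first, yardsticks second" approach, here's a small sketch; the points-per-engineer figure, the reference stories, and the backlog are all invented for the example, not a prescription.

```python
# Hypothetical numbers: decide up front how many points one engineer should take
# per sprint, then use that as the measuring stick for how big a story is.
POINTS_PER_ENGINEER = 8
ENGINEERS = 5
sprint_capacity = POINTS_PER_ENGINEER * ENGINEERS  # 40 points of capacity

# Stereotypical stories picked out during sprint closing, kept around as
# yardsticks so grooming stays consistent over time.
reference_stories = {
    "tweak copy on an existing page": 1,
    "new report over a known data source": 5,
    "integrate an unfamiliar third-party API": 8,
}

# Groomed backlog: (story, points agreed on by comparing against the yardsticks).
backlog = [("story A", 5), ("story B", 8), ("story C", 13),
           ("story D", 5), ("story E", 8), ("story F", 8)]

planned, remaining = [], sprint_capacity
for story, points in backlog:
    if points <= remaining:  # greedily fill the sprint up to capacity
        planned.append(story)
        remaining -= points

print("yardsticks:", reference_stories)
print(f"capacity {sprint_capacity} pts -> planned {planned}, {remaining} pts unused")
```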

Story points are an integral part of the agile development process, and one that is often contentious. I've found it helps to have a written definition of the team's interpretation of story points (along with all other process terms) to ensure everyone is on the same page. So don't fall into the trap of militant points-per-sprint velocity measurements or bickering about what a story point means each sprint. In the end, story points are what you make of them.

MapReduce for Clearing Clutter

My desk is cluttered.  Some would call it a train wreck.  Some might even feel terrible about it being so cluttered.  I am one of those people.  But I've let it slide because of the priorities in my life; family, work, and personal health take precedence over battling a chaotic desk.

Of course, everything on it is there because I didn’t have time to deal with it in the first place.  But as I have transitioned out of school and into the working world, my life has become more routine, with more free time. And my desk has been taunting me.  It calls me names when I walk by, and earlier this week it started a war when it tried to dump a kitchen knife prototype on my foot.  The line had been crossed.

I dove right into the problem in ‘Naive Desk Clearing’ mode and soon felt overwhelmed.  I needed a strategy, and in a flash I decided my giant cluttered desk was a clustering problem.  Before me lay a giant pile of unstructured data.  There were distinct categories of stuff, each of which required a different thought process to deal with.  So trying to just iterate through the pile would have me context switching with each Desk Object, and thus wasting lots of time. And since I’ve been working with Hadoop at work, it seemed like an interesting way to tackle this real world problem. As they say, if all you have is a hammer, then every problem becomes a nail.

Abstracting a monoid from a sea of random stuff on a desk is tough, but seeing it as a clustering problem came from thinking in terms of the attributes of the Desk Objects as what is being processed, not the Objects themselves.  Attributes -> Features -> Feature Sets -> Vectors -> K-means clusters. With my mind in feature set mode, it was time to do some mapping.
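If you squint, that pipeline looks something like the sketch below; the Desk Objects, their feature scores, and the tiny hand-rolled k-means are all stand-ins I made up for illustration, not anything I actually ran against the desk.

```python
import random

# Hypothetical Desk Objects encoded as feature vectors (scores I made up):
# [financial-ness, kid-art-ness, electronic-ness]
desk_objects = {
    "bank statement": [0.9, 0.1, 0.0],
    "tax form":       [0.8, 0.0, 0.1],
    "crayon drawing": [0.1, 0.9, 0.0],
    "macaroni art":   [0.0, 0.8, 0.1],
    "USB cable":      [0.0, 0.0, 0.9],
    "old phone":      [0.1, 0.2, 0.9],
}

def distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=10):
    """Bare-bones k-means: seed centroids from the data, then alternate between
    assigning points to their nearest centroid and recomputing centroids."""
    centroids = random.sample(points, k)
    labels = []
    for _ in range(iterations):
        labels = [min(range(k), key=lambda i: distance(p, centroids[i])) for p in points]
        for i in range(k):
            members = [p for p, lbl in zip(points, labels) if lbl == i]
            if members:
                centroids[i] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels

names = list(desk_objects)
labels = kmeans([desk_objects[n] for n in names], k=3)
for name, label in zip(names, labels):
    print(f"pile {label}: {name}")
```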

Mapping and Reducing the Desk Objects:

I start by mentally chunking out sections of the desk. Next, I process each chunk, mentally scoring each object and putting it into a pile based on its highest-scored feature.  This is where my single-processor humanity was at odds with MapReduce.  If I were a cluster, I would score all the objects in one chunk of Desk Objects while others scored the other chunks; then we'd switch gears, shuffle up our objects, do some clustering on each new chunk, and then try to combine our chunks of data.  But I'm not a cluster of computers, so I put all the financial docs in one pile, the kids' art in another, electronics in another, etc., in one step, and then moved on to the next chunk of desk space.
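For the curious, here's roughly what that would look like written out as an actual (if silly) map/shuffle/reduce over one chunk of the desk; the object names, scores, and helper functions are all hypothetical.

```python
from collections import defaultdict

# One mental chunk of the desk: hypothetical Desk Objects with per-pile scores.
chunk_of_desk = [
    {"name": "bank statement", "scores": {"financial": 0.9, "kid art": 0.1, "electronics": 0.0}},
    {"name": "crayon drawing", "scores": {"financial": 0.1, "kid art": 0.9, "electronics": 0.0}},
    {"name": "USB cable",      "scores": {"financial": 0.0, "kid art": 0.0, "electronics": 0.9}},
]

def map_object(obj):
    """Map step: emit (pile, object) keyed by the object's highest-scored feature."""
    pile = max(obj["scores"], key=obj["scores"].get)
    return pile, obj["name"]

def shuffle(pairs):
    """Shuffle step: group the mapped pairs by pile."""
    piles = defaultdict(list)
    for pile, name in pairs:
        piles[pile].append(name)
    return piles

def reduce_pile(pile, names):
    """Reduce step: deal with one whole pile in a single context switch."""
    return f"{pile}: file/toss {len(names)} item(s) -> {', '.join(names)}"

for pile, names in shuffle(map(map_object, chunk_of_desk)).items():
    print(reduce_pile(pile, names))
```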

In the end it was more of an iterative approach because I couldn’t parallelize the process, but seeing the problem in terms of MapReduce helped me get past the overwhelming boredom that comes with a mundane task.

And pulled from the chaos was an old Seahawks belt, just in time for the Super Bowl.