How we approach engineering problems

In previous posts we’ve described a bit about our product design processes here at the Zoo.  Now we’d like to share some of the lessons we’ve learned about how to approach the engineering processes for building out the product.

We’re strongly influenced by Agile methodologies in general.  Though we’re not religious about any particular methodology, we apply our core principles in our daily work — when we’re writing stories, prioritizing, setting direction, and developing implementation plans.

Here are some principles we find particularly useful:

Reflective design and refactoring are at least as useful as predictive design.

This comes into play in many places, but especially when we are deploying a major new feature.

For example, when we developed our ‘try’ feature over IM (it sends you a question on demand), we first created an architecture that supported the basic end-to-end functionality.  Then we iteratively evolved the design so that the targeting of these questions to the user could become more intelligent.

Since the development is incremental, we can interweave user testing and user feedback with the evolving design process, and make sure that the changes we’re introducing are always delivering real value.  Sometimes this involves refactoring a previous version of the implementation — but that always means that the new implementation is much cleaner, and has clear benefits.

Transparency into what is happening in the live system is priceless.

Since developer time is much more precious than machine time, it is worth spending extra machine resources in order to capture and organize all of the data we need to evaluate our system’s health.

For example, we are happy to run our servers with far more memory, storage, and CPU than the minimum required, so that we can capture detailed logging and telemetry about their internal state.  This makes our code much easier to debug and test.

Some of the tools we use for monitoring and measuring include Splunk, Munin, and Monit.
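
As a rough illustration of what "detailed logging and telemetry" can look like inside application code, here is a minimal Python sketch.  It is not our production code: the `timed` helper, the in-memory `timings` store, and the `handle_question` handler are all hypothetical names made up for this example.  It only shows the general idea of spending a little machine time to record how long each internal step takes.

```python
import logging
import time
from collections import defaultdict
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(name)s %(message)s")
log = logging.getLogger("telemetry")

# Hypothetical in-memory store: component name -> list of durations in seconds.
# A real system would ship these to a log indexer or metrics collector instead.
timings = defaultdict(list)


@contextmanager
def timed(component):
    """Record how long the wrapped block takes, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings[component].append(elapsed)
        log.info("component=%s elapsed_ms=%.2f", component, elapsed * 1000)


def handle_question(user_id):
    """Stand-in for a real request handler; the steps are illustrative only."""
    with timed("lookup_user"):
        time.sleep(0.001)  # pretend to fetch the user
    with timed("pick_question"):
        time.sleep(0.002)  # pretend to choose a question
    return f"question for {user_id}"
```

Calling `handle_question("u1")` logs one line per step and leaves the raw durations in `timings`, where they can be summarized later.  The `key=value` log format is just one convention that tends to be easy to search.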

Solve real problems we actually have now, and trust ourselves to solve the problems of the future in the future.

For example, this comes into play in the way we approach scaling issues here.  We first gather as much data as possible about current system performance to find any bottlenecks and determine their impact on overall latency and throughput.  Note that this is starkly contrasted with a strategy of jumping ahead and trying to architect the kind of system we imagine might be needed later.  (The short rule, here as elsewhere:  don’t speculate!)

It turns out that by carefully monitoring and measuring each component, and investigating curious things when they pop up, we have been able to smoothly add capacity to our system in an incremental manner.
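
To make the "measure first" idea concrete, here is a small Python sketch of one way to turn per-component latency samples into a ranked list of suspects.  The `summarize` function and its field names are assumptions invented for this example, not a description of our actual tooling, and the input is assumed to be a dict mapping a component name to a list of measured durations.

```python
from statistics import mean, quantiles


def summarize(samples):
    """Return per-component stats sorted by share of total time, largest first.

    `samples` maps a component name to a list of measured durations
    (any consistent unit).  Components with no samples are skipped.
    """
    total = sum(sum(values) for values in samples.values())
    rows = []
    for name, values in samples.items():
        if not values:
            continue
        # quantiles(n=20) returns 19 cut points; the last is the 95th percentile.
        # It needs at least two samples, so fall back to the single value.
        p95 = quantiles(values, n=20)[-1] if len(values) > 1 else values[0]
        rows.append({
            "component": name,
            "count": len(values),
            "mean": mean(values),
            "p95": p95,
            "share": sum(values) / total if total else 0.0,
        })
    return sorted(rows, key=lambda row: row["share"], reverse=True)
```

The component at the top of the list is where measured time is actually going, which is a better starting point for capacity work than a guess about where it might go later.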

Process is like code:  these principles also apply to developing, maintaining, and evolving the way that we work.

One of the things we’re most proud of here is that we continue to evolve our process on a monthly basis.  As the team grows, we figure out how to change our design pipeline, our prioritization meetings, our acceptance process, and so forth, to accommodate that growth.  In weekly meetings we hold quick post-mortems on how the process served us in the past week and what tweaks are needed in the week to come.  Then, after every major cycle, we hold a full retrospective and decide how we want to organize our efforts for the next round.

Some of these points might seem obvious or like common sense, especially to those in Agile circles.  But, as many startups can attest, it’s not easy to keep your principles in mind when in the throes of intense sprints and rapid course corrections.  We’ve found that the effort of sticking with these principles has really paid off in terms of our overall efficiency and quality, and has been hugely helpful in making this an exciting and rewarding place to work.

So to the developers out there – any other ideas along these lines to suggest?  We’d love to hear them.

4 Comments

  1. Posted July 24, 2009 at 5:29 pm | Permalink

    Wow. We are an MSP / IT support company here in NYC. We just began writing about solving problems and the philosophy behind our approach to it (You can find it here: Problem Solving: A Philosophical View)

    This will be the first article of many. Defining your ideology / philosophy is very important for developing an ongoing company culture which determines the quality of what you produce.

    Still learning….:)

  2. Posted July 27, 2009 at 9:43 pm | Permalink

    “Note that this is starkly contrasted with a strategy of jumping ahead and trying to architect the kind of system we imagine might be needed later.”

    Can you tell us more about how you think about the longer term (3 mos, 6 mos, 1yr)?

    Thanks for an interesting post.

  3. Posted June 9, 2010 at 1:53 am | Permalink

    Thanks for an interesting post.

  4. Posted June 9, 2010 at 1:54 am | Permalink

    Transparency to what is happening in the live system is priceless.

One Trackback

  1. By Welcome Yahoo! Messenger users! on August 19, 2009 at 10:49 am

    [...] to move fast, with agility, while keeping bugs and production incidents rare. (Read more about our approach to [...]
