In our latest study, over 85% of the experimentation program leaders we spoke to expressed a strong desire to test far more than they currently do.
But that’s not necessarily the case for Brianna Warthan, who leads testing at AT&T. With an output that’s grown to hundreds of tests per year, Brianna knows firsthand that velocity isn’t necessarily the best metric, or goal, for a testing program, whether you’re at the largest telecom provider in the world or only just building momentum for your program at a fast-growing startup.
We connected with Brianna to discuss why more isn’t always more.
Velocity is often used to measure the success of an overall program, but you’ve spoken to us in the past about a propensity for high-velocity, low-impact programs. Why isn’t velocity the best metric for an experimentation program?
I should begin by saying that I don’t want to discount the value of consistent, stable velocity. If you have a fluid process in place where every win can get implemented, then velocity can be a really valuable metric. But unless there’s measurable output and data utilization from a testing program, an increase in velocity doesn’t necessarily equate to an increase in impact. It can be used as part of the strategy, but using it as the only strategy or goal won’t lead to success.
You started to encounter this in your experience at AT&T. How did you know you were running too many tests?
I realized that we just weren’t utilizing enough of our resulting test data and learnings. The processes that supported the program weren’t scaled to meet our velocity, which led to a lack of deployed results. Customer needs, online experiences, and your products themselves can all change quite rapidly, so there’s a shelf life on test results. We realized we had to raise the bar for our intake quality to help ensure our team’s efforts weren’t in vain.
You ended up reducing the number of tests you were running by as much as 40% at your peak. How did you do it?
We did not intentionally decrease velocity, but rather prioritized concept relevance and quality with an understanding that velocity might suffer. We insisted on data-driven test cases and worked very hard to teach best practices and develop good skills across our stakeholder teams. We prioritized product-sponsored concepts to increase alignment early in the process, and eventually our velocity recovered. That early alignment gave our product teams awareness of what results might be on the horizon so that they could ready their backlogs and more easily connect concepts to features that were already active in their spaces. It helped us to deprioritize test concepts that our product teams just weren’t going to use, and instead focus our resources on tests that would have the most immediate impact on the business.
What helped you to create that alignment between these different stakeholder groups?
We refined our criteria for intake decisioning and worked to ensure that each concept we approved for the testing backlog had proactive collaboration between stakeholder groups. We also created a cross-functional committee to collaborate across teams, share ideas, results, and best practices, and talk about cross impacts. The committee has become a regular part of our program and a valuable interaction with our partners.
Executives often still look to metrics like velocity as a key indicator of program success. How do you manage their expectations when velocity starts to dial back? Are there more relevant metrics?
When you’re creating a program, participation is the first barrier you have to break through, and velocity can be a good measure to drive interest and engagement. But as excited as executives tend to be about driving investment into testing, they want to see how a testing program will realize its potential value. As an alternative, I recommend using program metrics that focus on value and demonstrate how your program data is impacting decisions.
How did this approach impact your program?
Doing this has shortened our runway to production, and by quite a bit! Our teams are more frequently testing concepts that are related to active projects, and our test data is informing day-to-day decisions. In the past, our test ideas were great but not necessarily relevant to active projects, so results were useful to inform funding of new projects. In a large organization, it can take a long time to progress through project planning and approvals. When we align with active projects and win, we can often influence the next Program Increment. We still test transformative things, but we’ll prioritize a concept that we can implement faster when we have to choose.
We’d be remiss not to address one elephant in the room: AT&T is a massive organization and your testing output is huge. What can smaller companies and programs learn from your own outlook on velocity?
Gather data to inform your test concepts and consider how your tests will inform decisions. Leverage analytics to find high-value opportunities, prioritize things that you can build faster, and use planning to make sure you’re not cannibalizing your own program by throwing too much into the testing pipeline at one time. If an SMB or someone building their program can home in on concept quality in parallel to that engagement velocity metric, they will be well ahead of the curve.
Brianna Warthan is Associate Director of Product Management & Development at AT&T where she manages a team focused on optimization and experimentation that supports multiple products and business stakeholder teams.