Our most recent webinar, Driving Business Growth in the Experimentation Era, sparked a lot of discussion around the best practices and challenges that teams face when launching an experimentation program. Many of the questions submitted apply to most, if not all, businesses across industries. The Q&A session from the webinar with our CEO, Chris Goward, will give you some best practices and tips for experiment design, identifying and overcoming roadblocks, scaling your experiments, and more.
Q: What are the most common agility blocks that you’ve seen and how are they resolved?
Chris Goward (CG): Agility can be a lot of different things. There are five categories of factors that need to be developed: Process, Accountability, Culture, Expertise and Technology. This is what I call the Experimentation PACET®.
All of them can be agility blockers, depending on the situation in the organization. Expertise is one of the biggest ones. To create a full experimentation program, you need several types of experts: Planning, Data & Analytics, Qualitative Research, Quantitative Research, Behavioral Science, UX design, UI design for conversion, engineering, project management, and more.
Companies that can’t hire or find all those experts at once usually use experimentation consulting agencies like Widerfunnel, where they benefit from fractional ownership. In other words, they get access to dozens of specialist experts who are at the top of their field, without having to pay all of their full salaries. Plus, these experts spend the rest of their time working on other business models and situations, where they’re constantly learning new ideas to bring back to the business.
Q: How do you connect an A/B test audience to a customer satisfaction audience?
CG: Yeah, that’s a really good question. We’ve actually developed qualitative testing methodologies, such as Motivation Lab, within our behavioral science team. We have a user testing process that combines looking for UX problems and insights with emotional frameworks to allow us to understand the users’ emotional expectations and their emotional types. Essentially we’ve developed screeners to help us identify the right target audience that matches the customer profiles and different segments.
You can, of course, go through your customer list and find representative customers. Or, you can look for potentially representative customers by using emotional and psychographic screening, which is what we do; we’ve developed some pretty advanced screening methodologies. One of the most important steps in user testing is to not take the blanket panel that comes with the off-the-shelf online user testing tools, because those are often professional user testing participants. They’re not necessarily the ones who represent your customers, and they’re likely not in the market for your product. So are they really answering questions from a representative perspective? Usually they’re not. It’s very difficult, and very important, to find the right representative emotional context and product need for your behavioral sample.
Q: You said at the beginning of the presentation that some companies run thousands of experiments a year. How does it affect the test results when there are a large number of other tests running on the same page that could be hurting or creating false positive test results?
CG: Right. So, you have to have the volume of customer touch points to be able to handle that kind of velocity. Those kinds of testing programs run on very high-traffic, enterprise-scale properties.
It actually takes a lot of sophistication to run that kind of a testing program, not only in making sure that your tests are controlled and run for the right amount of time, but also that the tests are valid and you have enough sample size. There are also components within all five areas – Process, Accountability, Culture, Expertise, and Technology – to consider. In the technology area, you have to have a lot of robust ways to keep your sample groups clean and reliable. You’ve got to make sure that you can run parallel experiments without polluting each other and have flags for stopping the experiment(s) if things go sideways.
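One common technique for keeping parallel experiments from polluting each other is deterministic, salted hash bucketing: each experiment hashes the user ID with its own experiment name as a salt, so assignments across experiments are statistically independent and a user always sees the same variant. This is a minimal illustrative sketch, not Widerfunnel's or any specific vendor's implementation; the function and names are hypothetical.

```python
import hashlib

def assign_variant(user_id: str, experiment: str, variants: list[str]) -> str:
    """Deterministically assign a user to a variant.

    Salting the hash with the experiment name keeps assignments in
    parallel experiments independent of each other, so one test's
    split doesn't bias another test's sample groups.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]

# The same user always gets the same variant within an experiment,
# but their assignment in one experiment says nothing about another.
arm = assign_variant("user-42", "checkout-redesign", ["control", "treatment"])
```

Mature in-house platforms layer more on top of this (mutually exclusive layers, holdout groups, automated stop flags), but the salted-hash core is what keeps the sample groups clean.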
We’ve learned a lot by working with some of these large enterprise companies. They’re often building their own testing tools that are baked into their technology and into their product, so they have a deeper set of metrics and flags built in. So, yes, there are all kinds of complexities to planning and executing that kind of high volume deployment. It doesn’t happen overnight. That’s why there are those five levels of maturity. You have to build the processes and technologies at every stage to handle the next stage of velocity.
Recommended next reading
Going from 10 to 100 experiments per year: Building the frame
Q: What challenges do you see with large multinational companies that have many separate stakeholders all over the world, and how can they overcome that decentralization?
CG: So we actually are very experienced with this. In 2010, I created an organization that is now called the Global Optimization Group. It is a network of companies, similar to Widerfunnel, who are the leaders in experimentation. We’ve created a joint venture, so that we can service large multinational global companies with aligned frameworks, methodologies, and perspectives on experimentation.
The specific answer depends on the company’s situation. If the culture of the organization is decentralized, then the experimentation program should often be decentralized to match it. If the culture is more centralized, then what you really need is regionalization to make it applicable, with central control and a center of excellence for experimentation expertise.
We’ve done both and we know how to navigate these kinds of situations, but there’s definitely some complexity to that because there are so many variables when you’re talking about a global organization. We have a network of companies that spans Germany, Austria, France, the Netherlands, the UK, Australia, and India. With this, we have the expertise to work with a wide spectrum of different types of cultures and organizational structures.
Q: Normally we measure the conversion rate or other main metrics during the tests, but there are a lot of factors that can affect that rate. Is it ok to only check the conversion rate while running the tests and understand the uplift as the outcome of the program?
CG: What we always do in our experiments is not just measure one metric. When you only measure for conversion rate, for example, you can end up with a misleading result because conversion rate is very one dimensional. We’re often measuring many micro conversions. If you increase your conversion rate, for example, what happens to the return rate? What happens to the average order value? What happens to the customer satisfaction?
We had a really interesting example with our HP clients a few months ago, where an increase in subscription rate actually increased unhappiness or dissatisfaction with the service because a critical piece of information had become less prominent. People were signing up for something, but they thought they were getting something a little bit different. So it looked like a successful result, but if we hadn’t been measuring post sign-up customer satisfaction, we would have missed a really important insight into that particular experience design. This actually led us to create something better that increased the conversion rate even more while maintaining customer satisfaction. So that’s an example of why a blend of metrics is really important.
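The pattern described here, declaring a win only when the primary metric improves and no secondary "guardrail" metric (satisfaction, return rate, average order value) degrades, can be sketched as a simple decision rule. This is an illustrative example only; the function name, metric names, and the -2% tolerance threshold are assumptions, not values from the HP engagement.

```python
def evaluate_test(primary_lift: float, guardrails: dict[str, float],
                  max_guardrail_drop: float = -0.02) -> str:
    """Return a verdict for an experiment.

    primary_lift: relative lift of the primary metric (e.g. 0.05 = +5%).
    guardrails: relative lift per guardrail metric (negative = decline).
    max_guardrail_drop: largest tolerated guardrail decline (illustrative).
    """
    if primary_lift <= 0:
        return "no win"
    breached = [name for name, lift in guardrails.items()
                if lift < max_guardrail_drop]
    if breached:
        # A "winner" that hurts satisfaction or returns gets flagged
        # for investigation instead of being shipped.
        return "flagged: " + ", ".join(breached)
    return "win"

# A +5% conversion lift that drops satisfaction 8% is flagged, not shipped.
verdict = evaluate_test(0.05, {"csat": -0.08, "aov": 0.01})
```

The exact thresholds and metrics vary by program; the point is that the verdict is a function of the whole metric blend, not of conversion rate alone.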
Q: There’s a lot of work that goes into an experimentation program, and getting executive buy-in is easier said than done. How would you recommend we go about this?
CG: Probably the most common blocker is a decision-making methodology in place that stops you from moving quickly and making fast decisions. The best thing you can do is to find a way to get executive-level support, or to find a way around the roadblocks, perhaps by doing something similar to some of our clients that create Skunk Works projects.
For example, if you can carve off a piece of your digital experience to create an individual program that flies under the radar, then you can build quick wins and successes to validate the approach. You can then sell it internally with more weight behind it and find your champions at the executive level who can help fight that battle with you. There are a few techniques for doing that. I wrote a blog post about it a few years ago, which you can find here.
Q: If you could run only one experiment at a time, how do you prioritize what to test?
CG: This is exactly why we created the PIE prioritization framework. It identifies the criteria that are important for the business and customers to make sure resources and traffic are allocated to the most beneficial experiments.
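The PIE framework scores each candidate test on three criteria, Potential for improvement, Importance of the page or traffic, and Ease of implementation, and ranks candidates by the average. A minimal sketch, assuming the common 1-10 scale and equal weighting (the candidate names and scores below are purely illustrative):

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    potential: float   # how much room for improvement (1-10)
    importance: float  # how valuable this page/traffic is (1-10)
    ease: float        # how cheap it is to build and test (1-10)

    @property
    def pie_score(self) -> float:
        # Equal-weighted average of the three PIE criteria.
        return (self.potential + self.importance + self.ease) / 3

candidates = [
    Candidate("Checkout redesign", potential=8, importance=9, ease=3),
    Candidate("Headline copy test", potential=6, importance=7, ease=9),
]

# Highest PIE score gets the next available test slot.
ranked = sorted(candidates, key=lambda c: c.pie_score, reverse=True)
# Here the cheap, decent-impact copy test outranks the expensive redesign.
```

Some teams weight the criteria differently or add factors of their own; the value is less in the arithmetic than in forcing an explicit, comparable score for every idea competing for traffic.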
Q: What are the differences between how experimentation is applied in a B2B versus a B2C organization?
CG: I would say that for the most part, there is no difference between how experimentation is run between B2B and B2C.
There are some general trends that tend to be more likely. For example, in the consideration cycle, the way decisions are made is often different. B2B tends to have more decision makers involved, it’s usually a more considered purchase with a longer decision cycle, and the decision tends to be more weighted towards lead generation versus immediate e-commerce. But none of those things is necessarily the case. Sometimes B2B is a fast decision with an immediate purchase, and B2C is a longer decision cycle.
What’s most important to understand is that people are people. Regardless of what they’re buying or the type of decision they’re making, they go through the same decision steps. As they’re making a decision, whether they’re a decision maker or an influencer, they’re asking the same kinds of questions.
The LIFT Model, for example, applies to every type of purchase. The clients we’ve worked with span B2B, B2C, e-commerce, lead gen, affiliate marketing, publishing, you name it. What we’ve done is try to keep a broad perspective on the types of companies so that we can bring ideas from different industries across to your industry. For example, we try things that maybe haven’t been tried before in your sector, which often are innovative. Most people only look at their immediate competitive set to try to get ideas for how to improve their business, and that’s really limiting. It’s more valuable to look across the aisle at different types of industries to find out how they’re solving similar user experience and conversion funnel problems.
Q: What is Liftmap?
CG: Liftmap is a cloud platform for experimentation planning and insight management. Widerfunnel designed and developed it to manage and store all the experiments we run for our clients. It’s now become a powerful database of validated insights that we use to find patterns and research what drives customer behavior.
Within Liftmap, our team conducts LIFT Analyses on customer experiences, designs experiment plans, manages variation design approvals, tracks development stages and workflows, and captures experiment results and insights. It also serves as an interface for our clients, so they can see real-time updates on the experimentation roadmap and a dashboard of individual experiment performance as well as overall program metrics.
Q: Do you offer Liftmap software separately?
CG: We actually are doing that on a beta trial basis right now. So if you want to be in the beta program, you can contact us and we can see if it’s a good fit for you. We’re being very selective, as it’s normally only offered to our enterprise clients, but we are offering it as a beta program.
Q: Will there be a future research report in 2020, and how can we participate?
CG: Yes, there will be. We’ve started planning next year’s report as we speak. If you’d like to be involved, just contact us and we’ll be in touch to make sure you’re included in the research.
Discover how your experimentation program stacks up!
Benchmark your experimentation maturity with our new 7-minute maturity assessment and get proven strategies to develop an insight-driving growth machine. Get started