Lee Wong, Experimentation & Data Analytics Program and Product Manager at Intuit, on how to scale an experimentation program & foster an experimentation culture.
The foundation is the same for both, and really for all products. We start with the customer problem, compose measurable hypotheses with clear success metrics, prioritize the most effective hypotheses to experiment on, and analyze the data without bias. This ensures that we test the right leaps of faith based on customer insights, craft the right experiences for customers, get the right learnings when we run the experiment, and iterate on our business plans.
What differs between the products is the customer problems, how customers behave while using the product, and how we visualize and analyze the results, since the success metrics differ.
Engage cross-functional teams early and communicate, communicate, communicate.
Teams are more invested in the experiment when they understand the hypothesis and success metrics at kickoff and are involved throughout the entire journey. It’s better to over-communicate to ensure alignment of the experiment and to share any implementation pivots that could change our learning plan, especially when teams are in various geographies and time zones. This is needed more so now with a majority of the workforce working from home.
While leading op mechs to review experiment results, I consistently noticed churn when the key metrics didn't clearly prove or disprove the hypothesis. Our experiment dashboards have guardrail metrics to protect the business as well as another set of data to review for goodness when key metrics are not clear. However, the analyst would spend days digging into the data to assess if the goodness is accurate and actionable. By raising this question early and identifying the next steps in the learning plan, we accelerate our speed to insights to learn and iterate quickly if the results are flat.
This question is very timely as our data teams were in the midst of data validation after upgrading our data collection library at the enterprise level. Since the new library modified data definitions and how data was collected, new business logic was needed to support this change. Our analytics and data engineering teams had spent weeks validating the data, improving business processes and assuring year-over-year business decisions. We also collaborated with product teams on the requirements to ensure data collection was implemented correctly.
The resources and time prioritized for this effort speak to the criticality of a solid data foundation. To have confidence in our data to make business decisions and drive optimization, a solid data foundation has to start with trustworthy data and reliable data availability.
Experimentation is a proven technique to test ideas with real customers, collect data, and analyze their interactions to learn if an experience serves the customers' needs. An experimentation program can only be effective if we measure metrics tied to the desired learning with clear success metrics and analytics for decision-making.
How do product managers and marketers craft the right experiences to serve our customers' needs? At Intuit, we use Customer-Driven Innovation (CDI) to assess which opportunities to pursue and which solutions to focus on. CDI empathizes with the customer problem to deliver a customer-centric experience. When we identify problems that are important for the customer and understand why existing solutions do not solve these problems, we can deliver experiences that will benefit and improve our customers’ lives. By applying CDI in our experimentation program, we create user-centric experiences and evaluate actual customer behaviors based on data and test results, not opinions or conjecture.
An experimentation learning plan that supports business priorities should pursue multiple opportunities simultaneously to find the one with the most impact and value. If we only experiment for acquisition and not retention, we won’t learn if new customers will return next week, next month or next year to consume our products and services.
An end-to-end model that spans the product consists of multiple experiments and iterations. We start with rapid experiments and quick prototypes with a small percentage of customers to quickly learn what works and what does not. We review the data to understand what delighted the customer and iterate on these experiences to a higher percentage of customers. We also drill into what didn't work to improve these experiences and continue to experiment so these customers will return to our product.
By looking at the customer journey across the product, we increase the opportunities to improve their experience through an iterative learning cycle, saving valuable time and resources when making our next decision.
Some of the challenges at the enterprise level are building a standardized metrics platform, a community of experimentation expertise, and data literacy.
1. A standardized metrics platform
A standardized metrics platform with consistent metrics definitions across the organization can quickly surface top performing segments. Using business intelligence and advanced data analytics, it can detect anomalies in experiments. A standardized metrics platform can make experiment dashboards readily available after test launches without the need to publish the results, and enable self-service analysis. This however requires an efficient, automated framework that accommodates schema changes and new metrics from data sources. Data ingestion needs to adjust for these changes. Data compliance and regulations are handled automatically. This is no small undertaking as it requires alignment across the business on the metric definitions, how to apply business logic to metric definitions, and visualization of key dashboards.
2. A community of experimentation expertise
A community of experimentation expertise promotes the right testing practices, so that teams design measurable experiments and analyze the results accurately. The key components seem simple: recognizing the important considerations for a test plan, composing measurable hypotheses with clear success metrics, and analyzing test data without bias. Yet without adequate training or experts within the organization to support their peers, teams may not have the tools to create great learning plans or the knowledge to avoid pitfalls.
3. Data literacy
Data literacy is the ability to read and communicate data in a purposeful manner. This is typically thought of as the role of data analysts and scientists; however, it should be everyone’s role in a data-driven culture. To enable self-service analysis and increase shared ownership of business analysis, organizations should implement learning programs that teach experimenters data-driven analysis and how to interpret results from multiple inputs without relying on data analysts and scientists. Experimenters can then partner with analysts and scientists when their experiments need deep dives into impacted segments.
Balancing short-term and long-term gains of experimentation depends on the goal of the initiative, business priority, and resourcing. Let’s say an experimenter has an idea for a new feature, but she is not sure customers are interested in it and doesn’t have the resources to develop a fully engaging experiment. A quick way to gauge interest is to run iterative lean experiments, where the first experiment is a design prototype shown to a small number of customers. If experiment 1 results in high interest, the next iteration can be an interactive feature where 30% of the feature works, shown to a larger sample. If experiment 2 also yields high interest, the experimenter can use the data to ask for resources to code an experience where most of the feature works, shown to an even larger sample as the next iteration. However, if experiment 1 results in low interest, the experimenter has learned fast that the idea failed, without investing a lot of resources.
To scale experimentation in companies with a culture of experimenting, companies need to strengthen their experimentation framework to enable a low cost and low effort process to successfully run experiments. A standardized metrics platform that automatically calculates sample size and test duration needed for statistical significance of overall evaluation criterion and uses advanced data analytics to surface top segments and analyze impact across treatments in multiple experiments will increase speed to insights.
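As a rough illustration of the sample size and test duration calculation mentioned above, here is a minimal Python sketch using the standard two-sided, two-proportion z-test approximation. It is not a description of Intuit's platform: the function names, the default significance level and power, and the traffic figures are all assumptions chosen for the example.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variant(baseline_rate, relative_lift, alpha=0.05, power=0.80):
    """Approximate users needed per variant to detect a relative lift
    in a conversion rate with a two-sided two-proportion z-test.

    All parameter names and defaults are illustrative assumptions.
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


def duration_days(users_per_variant, variant_count, daily_eligible_users):
    """Days needed to fill every variant, assuming steady daily traffic
    split evenly across variants (a simplifying assumption)."""
    return ceil(users_per_variant * variant_count / daily_eligible_users)


if __name__ == "__main__":
    # Hypothetical inputs: 10% baseline conversion, 10% relative lift,
    # control plus one treatment, 5,000 eligible users per day.
    n = sample_size_per_variant(0.10, 0.10)
    print(n, duration_days(n, 2, 5000))
```

Smaller expected lifts drive the required sample size up quickly, which is why automating this calculation at planning time helps teams decide early whether an experiment is feasible with the traffic they have.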
By leveraging AI/ML capabilities, the experimentation framework can automate decision-making, dynamically shifting traffic to higher-performing variants or stopping an underperforming variant, or the whole experiment, without human intervention. This of course requires real-time alerting to notify collaborators when changes are made to their experiments, so the business is not surprised.
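One common way to shift traffic dynamically is a multi-armed bandit. The sketch below uses Thompson sampling over Beta posteriors as an example; the interview does not say which algorithm the framework uses, so this choice, the function names, the 1% stopping floor, and the conversion counts are all assumptions made for illustration.

```python
import random


def thompson_allocation(successes, failures, draws=10000, seed=0):
    """Estimate the traffic share each variant should receive.

    For every draw, sample a conversion rate for each variant from a
    Beta(successes + 1, failures + 1) posterior and credit the variant
    with the highest sample. The share of draws a variant wins is its
    suggested traffic share. Illustrative sketch, not a production policy.
    """
    rng = random.Random(seed)
    names = list(successes)
    wins = {name: 0 for name in names}
    for _ in range(draws):
        samples = {
            name: rng.betavariate(successes[name] + 1, failures[name] + 1)
            for name in names
        }
        wins[max(samples, key=samples.get)] += 1
    return {name: wins[name] / draws for name in names}


def variants_to_stop(allocation, floor=0.01):
    """Flag variants whose suggested share fell below a hypothetical floor.
    A real system would alert the experiment owners before acting."""
    return [name for name, share in allocation.items() if share < floor]


if __name__ == "__main__":
    # Hypothetical counts: conversions and non-conversions per variant.
    successes = {"control": 100, "treatment": 150}
    failures = {"control": 900, "treatment": 850}
    allocation = thompson_allocation(successes, failures)
    print(allocation, variants_to_stop(allocation))
```

In this sketch the stop decision only returns a list of names; wiring that list into a notification step is where the real-time alerting described above would fit.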
This framework takes the burden off analysts and test owners to manually analyze and deliver accurate data-driven decisions. The automated capabilities eliminate manual processes and allow teams to focus on growth initiatives and move big needles.