
Background info
During my time at Trustpilot between 2015 and 2017, I had the pleasure of working on many interesting initiatives. One of them was the design of an A/B split-testing tool for the Trustbox widgets that show reviews and scores on customers’ websites.
The main objective was to gather data-based proof of how customer reviews impact the conversion rate on our customers’ websites. This way they could select the right widget for the right place to increase their conversion rate.
I contributed UI/UX designs and prototypes to enable the development team and to validate the ideas. I was also responsible for facilitating design sprint workshops and user interviews.
Outcomes
An easy-to-use, self-service A/B testing tool.
Enabled customer success and sales reps to upsell.
Proof of how Trustboxes support the conversion rate in the checkout flow.
One customer upgraded to a higher paid tier in the process of user testing the idea. This served as strong evidence for the business value of developing the tool further.
The hypothesis we started with
If we enable our customers to transparently test the Trustbox widgets’ impact on their revenue, we will be able to upsell them to higher tiers, and Trustpilot will have data to help us improve the performance of the existing widgets.
To test this hypothesis, we used a discovery approach based on the book “Inspired” by Marty Cagan. We needed to test our assumptions as quickly as possible and validate whether our idea would bring additional value to our customers and to Trustpilot.

Discovery approach
After an initial discovery (design) sprint, the team continued with rapid prototyping. I was building prototypes in Axure RP8 at the time, and in parallel the developers were working on the designs and ideas that had been validated through user testing. The PO and I ran weekly user tests and interviews to enable further learning and to make sure the feature was being developed on the right track.
Assumptions
Our team was working with too many assumptions to list here. The point I want to make is that a big part of them were disproven through iterative testing with customers and weekly interviews with beta testers of the new feature.
Our team’s developers built a minimum viable technical solution to collect tracking data, and I then used that data to create prototype reports to show customers during interviews and get feedback on how much detail was the right amount.
For example, one of the assumptions was that customers would want full data transparency. However, that was not the case for the majority of the people we interviewed. They wanted a high-level overview and a summary of the results, with an option to get the extra details if needed.
If you’re curious to learn more, feel free to reach out and we can talk about the process.

Designing a user-friendly setup flow
Through the user interviews and by testing the setup flow multiple times, we ended up with a simplified version that even the not-so-technical customers could use to set up their own tests. Alternatively, users could send the setup instructions and code to a developer who could help them.
If you are curious to learn more, I can give you a tour of one of the prototypes that we used to test the designs and flows.
Main challenges during this project
Learning and understanding how Bayesian statistics apply to the design and how the tracking data would translate into actionable points for our customers (see the sketch after this list).
Creating a setup flow simple enough for “Mom & Pop shop” users so they could set up the A/B test without a developer.
Designing a report page that provides simple-to-understand data while also giving the detailed numbers behind the statistical analysis.
Many of the beta users did not have enough traffic to reach statistically significant results in under 3 weeks of testing. This made testing the report page with real data very challenging and slow.
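To give a sense of the kind of analysis that sits behind such a report page, here is a minimal sketch of a Bayesian comparison of two conversion rates. This is not the actual implementation of the Trustpilot tool; the visitor and conversion counts, the uniform prior, and the variant names are all illustrative assumptions.

# Minimal sketch of a Bayesian A/B comparison for conversion rates.
# Illustrative only: counts, priors, and variant names are made up and
# this is not the analysis code from the Trustpilot tool.
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical tracking data: visitors and conversions per variant.
variants = {
    "A (no Trustbox)": {"visitors": 4100, "conversions": 123},
    "B (with Trustbox)": {"visitors": 3950, "conversions": 142},
}

# With a uniform Beta(1, 1) prior, the posterior for each conversion rate
# is Beta(1 + conversions, 1 + non-conversions). Draw samples from each.
samples = {}
for name, d in variants.items():
    alpha = 1 + d["conversions"]
    beta = 1 + d["visitors"] - d["conversions"]
    samples[name] = rng.beta(alpha, beta, size=100_000)

a = samples["A (no Trustbox)"]
b = samples["B (with Trustbox)"]

prob_b_better = (b > a).mean()          # "chance that B beats A"
relative_uplift = (b - a) / a           # relative uplift per sample
low, high = np.percentile(relative_uplift, [2.5, 97.5])

print(f"Chance that B beats A: {prob_b_better:.1%}")
print(f"Expected relative uplift: {relative_uplift.mean():.1%}")
print(f"95% credible interval for the uplift: {low:.1%} to {high:.1%}")

In a report built on numbers like these, the “chance that B beats A” figure maps well to the high-level summary most of our interviewed customers asked for, while the expected uplift and the credible interval are the kind of detailed figures that can sit behind an “extra details” option.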
What would I change if I could
The initial design sprint could have been more productive if we had done better preliminary research on A/B testing and the statistical methods that could help us develop the feature. The whole team was new to the domain, and the initial plans were perhaps too ambitious. 🙂
The criteria for recruiting users could have been better, so we could have avoided the low-traffic users, especially for the A/B test reporting part, where we had to wait a long time to get real data. Alternatively, we could have found a prospect to win over by getting them to test the idea with us and proving that having the widget can increase their revenue.