How To Run an A/B Test in App Store Connect (aka. Product Page Optimization)

Ariel

9 minute read 5/3/22

App Store Optimization is about discovery, which is why keywords are so important. But... Keywords don't earn downloads. You know what does? The story you tell users that land on your app's product page. And that story is best told through screenshots, which draw attention almost immediately.

Which means your screenshots earn downloads. Or, on the flip side, your screenshots could be hurting your downloads.

Just like keywords, screenshots need to be optimized too.

In this guide:

Don't Guess, Test!
How Does A/B Testing Work in the App Store?
Setting Up an A/B Test in App Store Connect
Tracking Experiment Performance
Tracking Overall Performance
What To Do When Your A/B Test Ends
Easy Things to Start Testing
Ready. Set. Test!

Don't Guess, Test!

The naive approach to screenshot optimization is to change your screenshots when uploading a new version and see if downloads improve.

But how can you tell if the change in screenshots made an impact on downloads and not the time of the month, organic promotion, or something else like that?

Generally, it's pretty hard to isolate improvements.

The best way to tell if a new set of screenshots is better is by running it against the existing screenshots at the same time, so any external forces will impact both. Not only that, it also gives you a very simple, scientific way to determine which set is better. Gut is good, but data is better.

That approach is called A/B testing, and recently Apple gave all developers a gift in the form of the ability to run A/B tests right in the App Store, something that's never been possible before. Aaaand, Apple manages it all for us!

How Does A/B Testing Work in the App Store?

A/B testing is a rather simple concept. When it comes to screenshots, an A/B test pits the ones you have on your App Store page against a new set.

When you run an A/B test in App Store Connect, Apple will show both variations to different users in the App Store and count how many of those convert to downloads for each of the versions. You do that for a short amount of time, and at the end, take the variation that has the best conversion and show it to everyone.
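
To make the mechanics concrete, here's a minimal sketch of the bookkeeping behind a test: count impressions and downloads per variation, compute each one's conversion rate, and pick the highest. Apple does this for you, so the `Variant` class, the `best_variant` helper, and the numbers are all hypothetical and only there for illustration.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    name: str
    impressions: int  # users who saw this set of screenshots
    downloads: int    # users who went on to download

    @property
    def conversion_rate(self) -> float:
        # Guard against a variant that hasn't been shown yet.
        return self.downloads / self.impressions if self.impressions else 0.0


def best_variant(variants: list[Variant]) -> Variant:
    """Return the variant with the highest conversion rate."""
    return max(variants, key=lambda v: v.conversion_rate)


# Made-up numbers, for illustration only.
original = Variant("original", impressions=10_000, downloads=300)
treatment = Variant("treatment A", impressions=10_000, downloads=345)

winner = best_variant([original, treatment])
print(f"{winner.name}: {winner.conversion_rate:.2%}")
```

Picking the highest number is only the last step. Whether the difference is real or just noise is a separate question, which is why the test needs time to run.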

Let's have a closer look at how to actually do it.

Setting Up an A/B Test in App Store Connect

Setting up an A/B test in App Store Connect is pretty straightforward and consists of two main stages: setting up the test itself and uploading what you're going to test.

FYI - You can test screenshots, App Preview videos, and icons, but icons require uploading a new version of the app with the icons to test. In this guide I'll focus on testing the first two.

Setting Up the Test

1. Log into your App Store Connect account.
2. Select My Apps and then the app you want to start the A/B test for.
3. Select Product Page Optimization from the Features menu on the left.
4. Select the + in the header. If you don't see that button, you probably have a test running or in draft. You can only have one active or draft test at a time.
5. Give the test a name. I like to use a name that describes what I'm testing (ex. "Serif font").
6. Select the number of alternatives you want to test. Apple calls them "treatments".
7. Select how much traffic the alternative(s) will get. I always divide traffic evenly, so one alternative = 50% to keep things balanced.
8. Select the localizations this applies to. I keep tests focused, so mine usually span a single localization per test.
9. Click Create Test.
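
If you're wondering what the traffic setting means for each set of screenshots, here's a rough sketch. It assumes the share you give to alternatives is divided evenly among them and the rest stays with the original. The `traffic_split` function is hypothetical and isn't part of any Apple tool.

```python
def traffic_split(treatment_share: float, num_treatments: int) -> dict[str, float]:
    """Split traffic between the original and its treatments.

    Assumes the treatment share is divided evenly among the treatments
    and whatever is left goes to the original.
    """
    if not 0 < treatment_share < 1:
        raise ValueError("treatment_share must be between 0 and 1")
    if not 1 <= num_treatments <= 3:
        raise ValueError("a test can have 1 to 3 treatments")

    per_treatment = treatment_share / num_treatments
    split = {"original": 1 - treatment_share}
    for i in range(num_treatments):
        # Labels "treatment A", "treatment B", ... are just for display.
        split[f"treatment {chr(ord('A') + i)}"] = per_treatment
    return split


# One alternative with half of the traffic: a balanced 50/50 test.
print(traffic_split(0.5, 1))

# Three alternatives sharing 75% of traffic: every set gets a quarter.
print(traffic_split(0.75, 3))
```

Keep in mind that every extra alternative shrinks each set's slice of traffic, so the test needs more time to collect enough data.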

Uploading Alternatives (aka. Treatments)

As soon as the test is set up, meaning right after the last step above, you'll see a new page with a place to upload new screenshots. You may have one or more, depending on how many alternatives (treatments) you selected in step 6.

Now comes the easy part - simply upload the new screenshots into each treatment. Treatment is a fancy name for alternatives, and I like to call them alternatives because that's really what they are.

After you add the new screenshots, make sure to sort them the way you want by dragging them, and when they're all ready, click Start Test at the top.

That's it. You did it. You're now the proud new owner of an A/B test in the App Store.

Things to Keep in Mind

Tests can run for a maximum of 90 days. This will be important in a few short sentences.
Tests can have up to 3 additional variations to test. However, I recommend no more than one to keep things simple.
Tests can run in any localization your app already has set up, but not new ones. In an effort to keep things simple, one is enough.
You can't set up tests while a test is running. That's a bummer because it makes setup a hassle, but that's okay.

Tracking Experiment Performance

The easy part's done, now comes the fun part. Sitting and waiting... You can stand and wait, but this is now a waiting game. Kind of.

Once the test goes live, Apple will start showing the new set(s) of screenshots, your alternatives, to users who find your app in the App Store according to the percent of traffic you set in step 7.

Every time a specific set is shown, it's also tracked and reported in the analytics section of the experiment on App Store Connect.

Within a few days, Apple will start showing you how many people saw the set (impressions) and the percentage of viewers who went on to download the app (conversion rate).

The conversion rate is the most important metric for determining whether the new screenshots are "better" than the existing ones, and it's the one to focus on as you wait for the test to accumulate data.

How long should you wait? Until you have reasonable results, which in my experience, means at least a week of data before you can even look at results. Looking too early will almost always be misleading. This is because you want enough people to see the app in a way that is statistically representative of all possible users who see the app.

For most apps, that's a week at a minimum. Very popular apps can do that more quickly, but when it comes to statistics more is better, and unless you're in a rush, I'd give it at least a week, but realistically longer. A month.
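
To get a feel for why a week is the bare minimum, here's a back-of-the-envelope sketch that estimates how many impressions each set needs before a given improvement can be detected. It uses the textbook sample-size formula for comparing two proportions, which isn't necessarily what Apple uses behind the scenes. The function names, the 5% significance level, and the 80% power are my assumptions.

```python
from math import ceil, sqrt
from statistics import NormalDist


def impressions_per_variant(baseline_cr: float, relative_lift: float,
                            alpha: float = 0.05, power: float = 0.8) -> int:
    """Estimate impressions needed per variant (two-sided, two proportions)."""
    p1 = baseline_cr
    p2 = baseline_cr * (1 + relative_lift)
    p_bar = (p1 + p2) / 2

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = NormalDist().inv_cdf(power)           # chance of catching a real lift

    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


def days_needed(impressions_needed: int, daily_impressions_per_variant: int) -> int:
    """Convert a required sample size into days of traffic."""
    return ceil(impressions_needed / daily_impressions_per_variant)


# Hypothetical app: 3% conversion, hoping to detect a 10% relative improvement,
# with 2,000 impressions per day going to each set.
needed = impressions_per_variant(baseline_cr=0.03, relative_lift=0.10)
print(needed, "impressions per variant")
print(days_needed(needed, 2_000), "days")
```

With numbers like these, the requirement lands in the tens of thousands of impressions per set, which is weeks of traffic for most apps. Smaller improvements need even more.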

The key here is to make sure that your conversion rate isn't tanking. It can drop a little below the original, but not much below it in a way that's obviously hurting downloads. If it's terrible, end the campaign and start over with a new set. But only if it's terrible.

But there's one more thing you need to keep an eye on that is even more important than conversion and isn't included in the analytics report in App Store Connect.

Tracking Overall Performance

Zooming out a bit, it's important to keep track of what you're really A/B testing for, and that's revenue! Unless your app is totally free, of course, and you can get away with just the conversion rate.

But if you're monetizing in the store, be it with paid downloads, in-app purchases, or subscriptions, your bottom line is still the most important metric to watch. Even when testing.

See, a test can improve conversion rate from impression to download, but could attract the type of user that won't pay. There are many reasons why a test would do that, and many revolve around relevance, but that's for a different guide. What's important to remember is that higher conversion doesn't necessarily mean more money. In extreme cases, it could mean less.

So you have to watch revenue!

I like to overlay revenue on downloads or on page views and watch them throughout the test. And just like a drop in conversion rate, if there's a significant drop in revenue, end the test and restart it.
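
If you'd rather put a number on it than eyeball a chart, here's a minimal sketch that compares revenue per download before the test with revenue per download during it. The data shape, the function names, and the 10% threshold for a "significant drop" are all assumptions you should adjust for your own app.

```python
# Each day is a (revenue, downloads) pair. Made-up numbers for illustration.
Day = tuple[float, int]


def revenue_per_download(days: list[Day]) -> float:
    total_revenue = sum(revenue for revenue, _ in days)
    total_downloads = sum(downloads for _, downloads in days)
    return total_revenue / total_downloads if total_downloads else 0.0


def revenue_check(before: list[Day], during: list[Day],
                  max_drop: float = 0.10) -> tuple[float, bool]:
    """Return the relative change and whether it crosses the drop threshold."""
    baseline = revenue_per_download(before)
    current = revenue_per_download(during)
    change = (current - baseline) / baseline if baseline else 0.0
    return change, change < -max_drop


before_test = [(500.0, 1_000), (520.0, 1_040)]
during_test = [(480.0, 1_200), (470.0, 1_150)]

change, should_stop = revenue_check(before_test, during_test)
print(f"revenue per download changed by {change:.0%}, stop test: {should_stop}")
```

In this made-up example downloads went up during the test while revenue went down, which is exactly the situation the conversion rate alone won't show you.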

What To Do When Your A/B Test Ends

We've covered the easy and fun parts, so naturally, now comes the hard part. Kind of.

After enough time has passed, a trend will emerge. That trend can either be statistically significant or not, and can show the original performed better or not. What do those mean?

Before we can answer that question, we have to understand how statistics are used to "guess" how everyone will behave by only looking at a small number of people. I'm pretty sure you don't want to brush up on statistics right now, so I'll skip to the important bit.

Statistical significance is a measure of how likely it is that "everyone" will behave the way the few people in the test did, and the way Apple has it set up, there's a very clear line between results that are and aren't statistically significant.
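
If you're curious what a significance check looks like, here's a small sketch using a standard two-proportion z-test. This is a textbook approach and an assumption on my part, not a description of how Apple calculates significance. The function name and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(downloads_a: int, impressions_a: int,
                           downloads_b: int, impressions_b: int) -> float:
    """Two-sided p-value for the difference between two conversion rates."""
    rate_a = downloads_a / impressions_a
    rate_b = downloads_b / impressions_b
    # Pooled rate: the conversion rate if both sets performed the same.
    pooled = (downloads_a + downloads_b) / (impressions_a + impressions_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / impressions_a + 1 / impressions_b))
    if std_err == 0:
        return 1.0  # no downloads (or all downloads) on both sides: no evidence
    z = (rate_b - rate_a) / std_err
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Same conversion rates (3.00% vs 3.45%), different amounts of data.
early = two_proportion_p_value(300, 10_000, 345, 10_000)
later = two_proportion_p_value(1_200, 40_000, 1_380, 40_000)

print(f"early p-value: {early:.3f}")
print(f"later p-value: {later:.4f}")
```

The smaller the p-value, the less likely the difference is a fluke. A common cutoff is 0.05. In this example the early result misses that cutoff and the later one clears it, even though the conversion rates are identical. More data is what separates the two.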

Let's say you want to run the test for 4 weeks. If after 4 weeks the results are statistically significant, you should end the test and choose whichever set converted better. If it's the original, great. If it's the new alternative, great. Whichever won should be set as the new set for everyone.

When results aren't statistically significant, you can do one of three things: let the test continue running, end the test and go back to the original set, or end the test and pick the set that seems to be winning.

Don't do the last one. Guessing isn't science. Thanks.

The best way is to let the test continue running. If you hit Apple's 90-day limit, set it up to run again as a new test. I'd let a test run at least twice before calling it quits. It's a lot of time that feels wasted though, so if you have other things to test this would be a good time to test them.
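
The decision rules above are simple enough to write down as a function. Here's a sketch of how I'd encode them. The function, its parameters, and the returned strings are all hypothetical. The 90-day limit and the "run it at least twice" rule come straight from this guide.

```python
MAX_TEST_DAYS = 90          # Apple's limit for a single test
RUNS_BEFORE_QUITTING = 2    # my rule of thumb, not Apple's


def next_step(significant: bool, leader: str, days_run: int, run_number: int = 1) -> str:
    """Decide what to do with a test given where it currently stands."""
    if significant:
        # A significant result wins, whether it's the original or an alternative.
        return f"end the test and use {leader}"
    if days_run < MAX_TEST_DAYS:
        return "keep the test running"
    if run_number < RUNS_BEFORE_QUITTING:
        return "set it up again as a new test"
    # Never pick the apparent leader without significance.
    return "end the test and keep the original"


print(next_step(significant=True, leader="treatment A", days_run=28))
print(next_step(significant=False, leader="treatment A", days_run=28))
print(next_step(significant=False, leader="treatment A", days_run=90))
print(next_step(significant=False, leader="treatment A", days_run=90, run_number=2))
```

Notice there's no branch that returns the leader when the result isn't significant. That's on purpose.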

Easy Things to Start Testing

Testing sure is easy, but... where do you start?

The easiest thing to test is captions. They need to tell a story, and getting that story right the first time isn't always easy. Perfect for an A/B test! Another is presentation. Colors, fonts, and layout can turn a dull story into an exciting one. Or ruin an exciting one.

If you want more ideas, check out this list with 16 A/B tests you should try with examples.

But if you just want to try A/B testing with minimal effort, you can always try sorting your screenshots in a different order. It probably won't change things significantly, but something is better than nothing.

Ready. Set. Test!

A/B testing in App Store Connect is nothing short of a gift from Apple. It's simple and easy to use, so you really don't have any excuse for not testing. And you read this guide to the end, so you definitely know how.

Happy testing!