Screen Flow Pattern in XCUITests with statically typed flows and interactions

Published: 25th September 2021; Updated: 10th August 2022

We are strong believers in UI automated tests and would like to help everybody to write better, cleaner, more reliable and more sustainable tests.

We find that the code examples around XCTest, including Apple's official XCTest documentation, focus too much on individual framework APIs, such as how to find a button and tap on it. These examples set a poor standard for structuring larger test code bases.

In the following chapters, we will take a look at how to structure your UI automated tests with the goal of having a robust, scalable and sustainable test code base.

Screen Flow Pattern

One of the most robust structural patterns for native mobile UI tests is something we call the Screen Flow Pattern, which is similar to Microsoft's Fluent Page Object Pattern¹. The advantages of using this pattern:

Strongly typed screens tied to their actions. It is impossible to even compile code that tries to tap on a button that doesn't exist on that particular screen.
Highly reusable test code. All querying and interaction code sits inside reusable Screens, which results in:
Highly readable and scalable test code. Test classes have minimum code and have a "natural language" level of readability.

Implementing Screen Flow Pattern

It is always more interesting to see the outcome from the caller/test perspective first, especially as this is what we are optimizing our code for. Afterwards, we will deep-dive into the implementation details.

func testChangingEmail() {
    XCUIApplication().launch()

    WelcomeScreen()
        .tapOnLoginButton()
        .submitLoginForm(username: "demo@appelium", password: "App3l1um")
        .navigateToSettings()
        .tapOnEditProfile()
        .changeEmail(to: "new@appelium.com")
}

The Screen Flow Pattern requires that the screen verification (synchronization point) is done within the init function of each Screen. To make the verification logic reusable, we can create a Screen protocol and implement the generic verification logic as an extension, so that all types that conform to this protocol effectively become screens and are verifiable in a standardized way:

protocol Screen { }

extension Screen {
    var app: XCUIApplication { XCUIApplication() }

    func verify(timeout: WaitingTime, file: StaticString = #file, line: UInt = #line, expression: (XCUIElementSnapshot) -> Bool) {
        // 1. Take a snapshot of the app
        // 2. Assert general (screen agnostic) conditions
        // 3. Assert screen specific conditions from `expression`
    }
}

Notice a few things in the code example:

WaitingTime

We are using this type in order to standardize the waiting time for screen appearance/verification. This can be further simplified by using the plain TimeInterval directly, but from our experience, it is better to limit and standardize the waiting time to only a few possible values. Later on we will declare: .uiInteraction, .networking and .appLaunch.

Test failure location

We pass file and line into the verify(...) function. This improves the failure messages by reporting the location of the caller as the source of a failure, rather than having every test failure point at Screen.swift:20.
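
To illustrate why forwarding matters, here is a minimal, framework-free sketch. The names reportLocation and verifyScreen are hypothetical stand-ins for XCTFail and verify(...); instead of failing a test, they only return the location that would be reported.

```swift
/// Hypothetical stand-in for XCTFail: instead of failing a test, it returns
/// the source location the failure would be attributed to.
func reportLocation(file: StaticString = #file, line: UInt = #line) -> (file: String, line: UInt) {
    return (file: "\(file)", line: line)
}

/// Mirrors the shape of verify(...): the default arguments are evaluated at
/// the call site, and the values are forwarded unchanged to the reporter.
func verifyScreen(file: StaticString = #file, line: UInt = #line) -> (file: String, line: UInt) {
    return reportLocation(file: file, line: line)
}
```

Calling verifyScreen() from a test yields the line of that call, not a line inside the helper, which is the behaviour we want from verify(...).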

Verification block with `XCUIElementSnapshot`

The verification function accepts a closure (XCUIElementSnapshot) -> Bool - this is where the screen-specific verification is done. In order to make sure the verification is atomic and fast, we will use the AppeliumTests SnapshotQuery APIs.

SnapshotQuery APIs

If you are unfamiliar with SnapshotQuery APIs, have a look at our previous article: Running XCTests radically faster using SnapshotQuery and getting better insights.

Now, let's implement the verify(...) function:

func verify(timeout: WaitingTime, file: StaticString = #file, line: UInt = #line, expression: (XCUIElementSnapshot) -> Bool) {
    let screenDidVerify = wait(timeout: timeout) {
        // Take a snapshot of the app
        guard let snapshot = try? self.app.snapshot() else {
            return false
        }

        // Verify that all loading indicators are gone
        if snapshot.activityIndicators.count > 0 {
            return false
        }

        // Finally verify the screen specific conditions
        return expression(snapshot)
    }

    if !screenDidVerify {
        XCTFail("\(type(of: self)) was not shown or is found in an invalid state.", file: file, line: line)
    }
}

For the sake of simplicity, let's just assume that wait(timeout:for:) is going to return true only in case the block gets evaluated as true within the timeout, otherwise it will return false. We will dive into the implementation of waiting a bit later, but for now, let's examine the steps we perform within the verification block:

Taking a snapshot of the app

This way we can perform all the verification queries on a single snapshot, making the whole verification atomic and significantly faster. If you want to read more, we encourage you to check out the SnapshotQuery article.

Waiting for loading indicators to disappear

This is not a must, but we found that having this as part of general screen verification helped in making tests more stable and reliable. And yes, this includes the tiny little loading indicator in the iOS status bar.

Pro tip

If you have a custom and generic loading indicator component, you can extend the verify(...) method to also make sure that one is dismissed before performing any other verification steps. Just make sure that this component always uses the same accessibilityIdentifier.
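
As a rough sketch of this tip, the general checks could be grouped as follows. FakeSnapshot is a hypothetical stand-in for the real snapshot type so that the example runs without XCTest, and the "loading-indicator" identifier is an assumption; use whatever identifier your component actually exposes.

```swift
/// Hypothetical stand-in for an app snapshot, reduced to what this sketch needs.
struct FakeSnapshot {
    var activityIndicatorCount = 0
    var visibleIdentifiers: Set<String> = []
}

/// General (screen-agnostic) checks: no system activity indicator and no
/// custom loading component may be visible.
func passesGeneralChecks(
    _ snapshot: FakeSnapshot,
    customLoaderIdentifier: String = "loading-indicator" // assumed identifier
) -> Bool {
    if snapshot.activityIndicatorCount > 0 { return false }
    if snapshot.visibleIdentifiers.contains(customLoaderIdentifier) { return false }
    return true
}

/// The general checks run first; the screen-specific expression runs only
/// when they pass.
func isVerified(_ snapshot: FakeSnapshot, expression: (FakeSnapshot) -> Bool) -> Bool {
    return passesGeneralChecks(snapshot) && expression(snapshot)
}
```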

Screen specific verification

This is the customizable, screen-specific part. The closure is injected by the concrete Screen type, so let's examine how we are going to implement those.

Modeling screen-specific actions

Let's start with the WelcomeScreen which has a static text label and 2 buttons for login and registration:

struct WelcomeScreen: Screen {
    init() {
        self.verify(timeout: .appLaunch) { snapshot in
            snapshot.staticTexts["Welcome"].exists &&
            snapshot.buttons["Login"].exists &&
            snapshot.buttons["Register"].exists
        }
    }

    @discardableResult
    func tapOnLoginButton() -> LoginScreen {
        self.app.buttons["Login"].tap()
        return LoginScreen()
    }
}

Note how we often refer to the conditions inside Screen as a synchronization point. This is very important. It is essential to make sure that the conditions inside the verification block are unique to that screen. Without this, the app might still be on the previous screen but since the next screen condition is not unique enough, the test code would already move to the next step and the test might fail.
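
The following framework-free sketch shows what "not unique enough" can look like. FakeSnapshot and the two sample screens are hypothetical; the point is only that a condition shared by two screens cannot serve as a synchronization point.

```swift
/// Hypothetical stand-in for an app snapshot: only the visible identifiers.
struct FakeSnapshot {
    var visibleIdentifiers: Set<String>
}

// Two assumed screens that happen to share the same input fields.
let loginSnapshot = FakeSnapshot(
    visibleIdentifiers: ["username", "password", "Submit"])
let registerSnapshot = FakeSnapshot(
    visibleIdentifiers: ["username", "password", "confirmPassword", "Create account"])

/// Too weak: both screens contain these two fields.
func looksLikeLoginWeak(_ snapshot: FakeSnapshot) -> Bool {
    return snapshot.visibleIdentifiers.isSuperset(of: ["username", "password"])
}

/// Unique: requires an element only the login screen has and rules out
/// an element only the registration screen has.
func looksLikeLoginUnique(_ snapshot: FakeSnapshot) -> Bool {
    return snapshot.visibleIdentifiers.isSuperset(of: ["username", "password", "Submit"])
        && !snapshot.visibleIdentifiers.contains("confirmPassword")
}
```

With the weak condition, a test could continue while the registration screen is still shown; the unique condition only passes on the login screen.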

The @discardableResult is used in order to avoid warnings in tests about unused results when this is the last test action. Since the test can end with any screen being shown, we will practically annotate every function like this.

WaitingTime

This is where we can see how the WaitingTime is used - knowing that the WelcomeScreen will be the first screen to appear, and also knowing that the XCTests can take a longer time to launch and set up the test automation engine, it might be worth having WaitingTime.appLaunch defined as 60s.

struct WaitingTime {
    private(set) var timeInterval: TimeInterval

    private init(timeInterval: TimeInterval) {
        self.timeInterval = timeInterval
    }

    static let appLaunch = WaitingTime(timeInterval: 60)
    static let uiInteraction = WaitingTime(timeInterval: 5)
    static let networking = WaitingTime(timeInterval: 20)
}

These constants are going to be largely app specific - maybe your app would need a 4th constant, or maybe 20s for networking isn't going to be sufficient. Feel free to fine-tune it according to your specific app constraints and experience.

Screen actions and chaining

Notice how in the previous code example the WelcomeScreen returns an instance of LoginScreen. This is how all the magic of chaining happens. And, given that the init function of every screen serves as a synchronization/verification point, the caller never receives an instance of a Screen that is not verified (that is, one that is not present on the device screen in the expected state).

For completeness, this is what the implementation of LoginScreen might look like:

struct LoginScreen: Screen {
    init() {
        self.verify(timeout: .uiInteraction) { snapshot in
            snapshot.textFields["username"].exists &&
            snapshot.secureTextFields["password"].exists
        }
    }

    @discardableResult
    func submitLoginForm(username: String, password: String) -> HomeScreen {
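        // typeText requires keyboard focus, so tap each field first
        let usernameField = self.app.textFields["username"]
        usernameField.tap()
        usernameField.typeText(username)

        // The password is a secure text field, matching the check in init()
        let passwordField = self.app.secureTextFields["password"]
        passwordField.tap()
        passwordField.typeText(password)

        self.app.buttons["Submit"].tap()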

        return HomeScreen()
    }
}

Waiting for conditions to succeed

In the previous code example, we have used wait(timeout:for:) function which returns a Bool but until now we haven't explored its implementation.

It is worth mentioning that XCTest provides XCTWaiter, which sounds similar, but we found its APIs to be too tied to XCTestExpectation and not flexible enough for waiting on arbitrary block-based conditions.

Let's take a look at a very basic implementation of waiting:

func wait(timeout: WaitingTime, for condition: () -> Bool) -> Bool {
    let startTime = Date()
    while true {
        if condition() {
            return true
        }

        // Spin the runloop and retry the condition after a while
        CFRunLoopRunInMode(CFRunLoopMode.defaultMode, 0.5, false)

        // If the maximum waiting time is exceeded, return false
        let elapsedTime = Date().timeIntervalSince(startTime)
        if elapsedTime > timeout.timeInterval {
            return false
        }
    }
}

This is a simple and yet quite powerful implementation of waiting that will be sufficient for most projects. As your code-base grows, you might find a need to extend the waiting with:

Retry mechanism - Think of waiting for a post to appear in a feed. You might want to trigger pull-down-to-refresh or tap on a reload button every few seconds because the passive waiting might not be sufficient.
Idle time optimization - Notice that we used 0.5s of idle time between two condition checks. We found this value to perform great for general-purpose UI automated tests. However, there is always room for improvement. You might want to make it progressive, for example: starting with low values ~0.1s and gradually rising to 1-2s. This is a performance optimization technique based on probability theory which aims to minimize time spent on waiting for quick events (like screen transitions which don't involve networking) but at the same time free up some CPU power for longer running queries (involving slow network calls).
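
Both extensions can be sketched in one function. This is only an illustration under assumptions: the name waitWithBackoff, the doubling strategy and the 0.1s/2s bounds are ours, and Thread.sleep stands in for spinning the run loop. The pause closure is injectable so that the behaviour can be checked without actually waiting.

```swift
import Foundation

/// Hypothetical variant of wait(timeout:for:) with a retry hook and a
/// progressively growing idle time between condition checks.
func waitWithBackoff(
    timeout: TimeInterval,
    initialIdle: TimeInterval = 0.1,
    maxIdle: TimeInterval = 2.0,
    retryAction: () -> Void = {},
    pause: (TimeInterval) -> Void = { Thread.sleep(forTimeInterval: $0) },
    condition: () -> Bool
) -> Bool {
    let deadline = Date().addingTimeInterval(timeout)
    var idle = initialIdle

    while true {
        if condition() {
            return true
        }
        if Date() >= deadline {
            return false
        }

        // Give the app a nudge (for example pull-to-refresh) before idling
        retryAction()
        pause(idle)

        // Double the idle time after every failed check, up to maxIdle
        idle = min(idle * 2, maxIdle)
    }
}
```

Quick events are detected after a short first pause, while slow ones are polled less and less frequently until the idle time reaches maxIdle.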

Limitations with static type checking

All the examples above work marvelously when each test action always results in the same screen being shown. But, what happens when that is not the case?

Imagine having an app with a media gallery that is implemented using a UICollectionView. Tapping on any photo should open PhotoDetailScreen but a tap on any video should open VideoDetailScreen.

We can solve the problem of multiple return types in two ways. One way is simply to copy the implementation of a function and just return a different type:

@discardableResult
func openPhoto(atIndex index: Int) -> PhotoDetailScreen {
    self.app.cells.element(boundBy: index).tap()
    return PhotoDetailScreen()
}

@discardableResult
func openVideo(atIndex index: Int) -> VideoDetailScreen {
    self.app.cells.element(boundBy: index).tap()
    return VideoDetailScreen()
}

The problem with the above code is that it doesn't scale well when the method implementation grows and leads to a lot of code duplication.

Let's examine the ScreenBuilder pattern, which can help us solve the code duplication problem while still providing static type safety:

struct ScreenBuilder {
    func `as`<T: Screen>(_ nextScreen: @autoclosure () -> T) -> T {
        return nextScreen()
    }
}

This allows us to rewrite the media opening code into:

@discardableResult
func openMedia(atIndex index: Int) -> ScreenBuilder {
    self.app.cells.element(boundBy: index).tap()
    return ScreenBuilder()
}

and then inject the concrete type within the test using elegant .as(...) chaining syntax:

func testOpeningMedia() {
    XCUIApplication().launch()

    MediaScreen()
        .openMedia(atIndex: 1).as(PhotoDetailScreen())
        .navigateBack()
        .openMedia(atIndex: 2).as(VideoDetailScreen())
        .playVideo() // Compiler knows that this is a VideoDetailScreen
}
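
To make the order of operations visible, here is a framework-free sketch. EventLog and the logging inside the screens are hypothetical additions for illustration; real screens would tap elements and call verify(...) instead.

```swift
/// Hypothetical helper that records events in the order they happen.
final class EventLog {
    private(set) var events: [String] = []
    func record(_ event: String) { events.append(event) }
}

protocol Screen { }

struct ScreenBuilder {
    // The autoclosure is evaluated here, so the next screen is only
    // constructed (and verified) once `as` is called.
    func `as`<T: Screen>(_ nextScreen: @autoclosure () -> T) -> T {
        return nextScreen()
    }
}

struct PhotoDetailScreen: Screen {
    init(log: EventLog) {
        log.record("verify PhotoDetailScreen") // stands in for verify(...)
    }
}

struct MediaScreen: Screen {
    let log: EventLog

    @discardableResult
    func openMedia(atIndex index: Int) -> ScreenBuilder {
        log.record("tap cell \(index)") // stands in for the real tap
        return ScreenBuilder()
    }
}

let log = EventLog()
let detail = MediaScreen(log: log)
    .openMedia(atIndex: 1)
    .as(PhotoDetailScreen(log: log))
```

The log ends up containing the tap followed by the verification, and detail is statically typed as PhotoDetailScreen.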

Support for multiple tested apps

So far, we have assumed that we are only testing a single app. This might be the most common use case, but sometimes you might need to test two different apps and the interaction between them. If your setup only involves a single app, you can skip this chapter, in which we extend the Screen protocol to enable support for multiple apps.

Let's take a look at what this would mean in code for an Uber-like setup with two apps: Driver.app and Passenger.app:

protocol Screen {
    var appBundleIdentifier: String { get }
}

protocol DriverAppScreen: Screen { }
extension DriverAppScreen {
    var appBundleIdentifier: String { "com.uber.driver" }
}

protocol PassengerAppScreen: Screen { }
extension PassengerAppScreen {
    var appBundleIdentifier: String { "com.uber.passenger" }
}

With only a few changes, we have created two new protocols, one per app. The only thing left to do is to conform each screen to its app-specific protocol instead of conforming directly to the Screen protocol:

struct DriverLoginScreen: DriverAppScreen { }

struct PassengerProfileScreen: PassengerAppScreen { }
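
One piece not shown above is how the app property picks the right app. A possible sketch, assuming the app is derived from appBundleIdentifier: AppHandle is a hypothetical stand-in for XCUIApplication so that the example runs without XCTest, and the commented line shows what the property might look like in a real UI test target.

```swift
/// Hypothetical stand-in for XCUIApplication.
struct AppHandle: Equatable {
    let bundleIdentifier: String
}

protocol Screen {
    var appBundleIdentifier: String { get }
}

extension Screen {
    // In a UI test target this could instead be (an assumption on our side):
    //   var app: XCUIApplication { XCUIApplication(bundleIdentifier: appBundleIdentifier) }
    var app: AppHandle { AppHandle(bundleIdentifier: appBundleIdentifier) }
}

protocol DriverAppScreen: Screen { }
extension DriverAppScreen {
    var appBundleIdentifier: String { "com.uber.driver" }
}

protocol PassengerAppScreen: Screen { }
extension PassengerAppScreen {
    var appBundleIdentifier: String { "com.uber.passenger" }
}

struct DriverLoginScreen: DriverAppScreen { }
struct PassengerProfileScreen: PassengerAppScreen { }
```

Each screen now resolves its own app, so queries and taps in a DriverAppScreen never accidentally target the passenger app.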

Summary

We have explored what it means to structure UI automated tests using the Screen Flow Pattern. From our experience, this pattern scales incredibly well with code bases that have hundreds of UI tests, a large number of screens and multiple apps. We hope that these code examples will help many engineers write great UI automated tests, and that everybody will benefit from using great and highly tested apps.

¹ Fluent Page Object Pattern
