The unintended consequences of automated UI tests

Whenever I see people talking about automated testing, I always wonder what type of testing they actually mean. Eventually someone will mention the framework they are using, and all too often it’s a UI-based automation tool that allows tests to be written end-to-end (A-E2E-UI).
They are usually very good at articulating what they think these tests will give them: fast automated tests that they no longer need to run manually, amongst other reasons.

But what they fail to look at is the types of behaviours these A-E2E-UI tests encourage and discourage within teams.

They have a tendency to encourage

Writing more integrated tests against the full stack rather than isolated tests
- Isolated behaviour tests (e.g. unit, integration, contract tests etc.) run faster and help pinpoint where issues could be
- An A-E2E-UI test will just indicate that a specific user journey is not working. While useful from an end-user perspective, someone still needs to investigate why. This can lead to just re-running the test to see if it’s an intermittent error, which is only made worse by tests failing when nothing is actually wrong. Full-stack tests are more likely to do this because they have more moving parts
Testing becomes someone else’s responsibility
- This is more apparent when the A-E2E-UI tests are done by somebody else in the team and not the pair developing the code
- Notice ‘pair’: if you’re not a one-person development army then why are you working alone?
  - Pairs tend to produce better code of higher quality with instant feedback from a real person
  - It might be slower at first but it’s worth it to go faster later
  - This is really important for established businesses with paying customers
  - A research paper called The Costs and Benefits of Pair Programming backs this up but it’s nearly 20 years old now so if you know of anything more recent let me know in the comments.
Pushing testing towards the end of the development life cycle
- The only way A-E2E-UI tests work is through a fully integrated system, so testing gets pushed later into the development cycle
- You could use test doubles for parts of the system, but then it is not an end-to-end test.
Slower feedback loops for development teams
- Due to testing being pushed to the later stages of development, developers go longer without feedback on how their work is progressing
- This problem is made worse when the A-E2E-UI tools are not familiar to the developers, who subsequently wait for the development pipeline to run their tests instead of running them locally
Duplication of testing
- As the A-E2E-UI test suites get bigger and bigger it becomes harder and harder to see what is and isn’t covered by automation
- This leads to teams starting to test things at other levels (code and most likely exploratory testing), which all adds to the development time

These are just some of the behaviours I’ve observed A-E2E-UI tests encourage, but they also discourage other behaviours which may be desirable.

They can discourage development teams from

Building testability into the design of the systems
- Why would you if you know you can “easily” test something end-to-end with an automation tool?
Maintainability of the code base
- By limiting the opportunities to build a more testable design you decrease the maintainability of the code through tests
- If you need to make a change it’s harder to see what the change in the code affects
- By having more fine grained tests you can pinpoint where issues exist
- A-E2E-UI tests just indicate that a journey has broken and how it could affect the end users
- Not where the problem was actually introduced
Building quality at the source
- You are deferring testing towards the end of the development pipeline when everything has been integrated. Instead of when you are actively developing the code.
- Are you really going to go back and add in the tests especially if you know an end-to-end test is going to cover it?
The responsibility to test your work
- With the “safety net” of the A-E2E-UI tests you send the message that it’s ok if something slips through development
- If it affects anything the A-E2E-UI tests will catch it
- What we should be encouraging is that it’s the developers’ responsibility to build AND test their work
- They should be confident that once they have finished that piece of code it can be shipped
- The A-E2E-UI tests should act as another layer to build on your team’s confidence that nothing catastrophic will impact the end users. Think of them as a canary in the coal mine. If it stops chirping then something is really wrong…
More granular feedback loops
- By having A-E2E-UI tests you’re less likely to write unit and integration tests which give you fast feedback on how that part of the code behaves
- Remember code level tests should be testing behaviour not implementation details

If A-E2E-UI tests cause undesirable behaviours in teams, should we stop writing them? While they are valuable at demonstrating end users’ journeys, we shouldn’t be putting so much of our confidence that our system works as intended into them. They should be another layer which helps build the team’s confidence that the system hangs together.

If we put the vast majority of our effort and confidence into these automated end-to-end tests then we risk losing one of the team’s greatest abilities: building testability into the design of our systems. But just like the automated UI tests, building in testability takes conscious effort. It will take time, patience and experience for the whole team to understand and benefit from.

The Do’s and Don’ts of Mobile UI Automation

(…from my experience)

Originally posted on Medium

My name is Jitesh ‘Jit’ Gosai and I’m a Senior Developer in Test (DiT) at the BBC working in Mobile Platforms, Digital.

I originally wrote this post well over a year ago (2014!) and recently came across it again, so thought I might as well get it out there. Most of the points still stand, so it should be useful for anyone getting into automation or already on the road to doing so. Got any other tips? Then let me know in the comments or on Twitter (@JitGo).

During our journey of automating our mobile user interface (UI) testing for iPlayer, we’ve learnt many things (good and bad) that I would like to share with you to help you with your work.
Below, I have compiled some do’s and don’ts to help you and hopefully save you from making the same mistakes as we did.

Android: adb (Android Debug Bridge) is your friend

Learn all the little things you can do with adb, from simply listing connected devices to grabbing screenshots. Adb will allow you to do quite a lot with a connected device so make sure you are comfortable with it. My suggestion is to begin with the Google developer docs and then start searching around for things you would like to do, for example waiting for a connected device to have started, restarting devices, or sending simple commands to control the UI (don’t try to automate your tests this way, you’ll be looking at a lot of pain).
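To give a feel for the kind of adb commands mentioned above, here is a minimal Ruby sketch that only builds the command lines without running them. The `AdbCommands` module and its method names are made up for this example; the adb sub-commands themselves (`devices`, `wait-for-device`, `reboot`, `shell input keyevent`, `shell screencap`, `pull`) are standard adb.

```ruby
# Hypothetical helper: builds adb command lines as argument arrays.
# Nothing is executed here; pass an array to system(*cmd) to run it.
module AdbCommands
  KEYCODE_HOME = 3 # Android key code for the Home button

  module_function

  # Prefix a command with the target device serial when one is given.
  def base(serial = nil)
    serial ? ['adb', '-s', serial] : ['adb']
  end

  # List the devices adb can currently see.
  def list_devices
    ['adb', 'devices']
  end

  # Block until the device is connected and visible to adb.
  def wait_for_device(serial = nil)
    base(serial) + ['wait-for-device']
  end

  # Restart the device.
  def reboot(serial = nil)
    base(serial) + ['reboot']
  end

  # Send the Home key to get back to a known screen.
  def press_home(serial = nil)
    base(serial) + ['shell', 'input', 'keyevent', KEYCODE_HOME.to_s]
  end

  # Save a PNG screenshot on the device at remote_path.
  def screenshot(remote_path, serial = nil)
    base(serial) + ['shell', 'screencap', '-p', remote_path]
  end

  # Copy a file from the device to the local machine.
  def pull(remote_path, local_path, serial = nil)
    base(serial) + ['pull', remote_path, local_path]
  end
end
```

On a machine with adb installed, something like `system(*AdbCommands.press_home)` would send the device back to its home screen. This is fine for housekeeping, but as noted above it is not a sensible way to drive whole tests.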

Do learn to control your devices remotely

Learning to control your devices remotely will save you a lot of effort both in time and flow, especially if you have devices connected to a build server which is not physically nearby (in our case, locked in a server room).
Being able to remotely view the device screen and restart the device is very useful and can save you countless journeys when your device ends up in a state you can’t identify. It is also great for being able to return your device to the home screen and therefore into a known state before resuming testing.

Do ditch the Android SDK emulator

We tried to use this but found it too slow and unreliable, randomly crashing or disconnecting itself from adb. Intel HAXM was faster but proved to be too flaky and would also crash intermittently. Maybe it was something to do with our setup, but we just went with connecting a device to run the jobs, which proved to be a lot more reliable. Genymotion is another option which is free for small teams so could work for you.

Do be patient!

When first starting out don’t be tempted to keep playing with your test/build environment for little improvements. Let it settle and run. Know exactly what needs fixing and do just that, not the things that are nice to have but not essential. Once you have stability, slowly start to add the things that may help improve speed, but with an eye on reliability.

Do be available for pairing

Pair with devs and build tests together. This gets devs familiar with how to write them, encourages them to write tests and stops you being the bottleneck when they break. Also show the UI tests to your testers to get them familiar with what can and can’t be automated. I’ve found that walking testers through an automated UI test vastly improves their knowledge of how the tests work and the things they can miss.

Do test one thing and one thing only

Don’t be tempted to cover lots of things in one test, as it is harder to tell what has broken when the test fails and harder to debug.

Don’t use (or use very sparingly) canned steps

When using step definitions for automating tests it is very tempting to keep reusing steps you’ve already written, which at first works well. It can even allow non-technical members of the team to automate tests, but it can result in very verbose scenarios, which are harder to read and more difficult to amend at a later date. We always create methods for interactions with the app, e.g. go to home or go to channels in the case of iPlayer. We then use these in our step definitions. This way, if the path to get to the home or channels page changes, you just update the method and all tests get the change, with no need to update every test. It also allows you to write tests faster in the long run.
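As a rough illustration of keeping navigation in methods rather than in chains of canned steps, here is a small Ruby sketch. Everything in it is hypothetical: the `Navigation` module, the `touch` method on the app object and the screen labels are invented for the example and are not taken from the iPlayer code base.

```ruby
# Hypothetical navigation helpers: each method hides the path to a screen,
# so a change to that path is fixed in exactly one place.
module Navigation
  # `app` is any object that responds to `touch(label)` (an assumption
  # for this sketch, standing in for whatever drives the real app).
  def go_to_home(app)
    app.touch('Menu')
    app.touch('Home')
  end

  def go_to_channels(app)
    go_to_home(app)
    app.touch('Channels')
  end
end

# Stand-in for a real driver so the sketch runs without a device.
# It simply records which labels were touched, in order.
class RecordingApp
  attr_reader :touches

  def initialize
    @touches = []
  end

  def touch(label)
    @touches << label
  end
end

# A Cucumber step definition would then be a one-liner, for example:
#   Given(/^I am on the channels page$/) { go_to_channels(app) }
```

If the route to the channels page later gains or loses a screen, only `go_to_channels` changes and every scenario that uses it picks up the fix.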

Do use the Page object pattern or similar

This will vastly improve the readability, maintainability and reusability of your tests and teach anyone who will be working with the code how to use it effectively.
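For anyone who hasn’t met the pattern before, here is a minimal sketch of two page objects in Ruby. The class names, element labels and driver methods (`enter_text`, `touch`, `texts_of`) are all assumptions made for the example; a real implementation would call your automation framework through whatever driver you use.

```ruby
# Hypothetical page object for a home screen with a search box.
class HomePage
  SEARCH_FIELD  = 'search_field'.freeze
  SEARCH_BUTTON = 'search_button'.freeze

  def initialize(driver)
    @driver = driver
  end

  # Performs a search and returns the page the user lands on,
  # so tests read as a chain of user actions.
  def search_for(term)
    @driver.enter_text(SEARCH_FIELD, term)
    @driver.touch(SEARCH_BUTTON)
    SearchResultsPage.new(@driver)
  end
end

# Hypothetical page object for the search results screen.
class SearchResultsPage
  RESULT_TITLE = 'result_title'.freeze

  def initialize(driver)
    @driver = driver
  end

  # Returns the titles currently shown on screen.
  def titles
    @driver.texts_of(RESULT_TITLE)
  end
end

# Fake driver so the sketch runs without a device. It logs each call
# and returns canned results for any text lookup.
class FakeDriver
  attr_reader :log

  def initialize(results)
    @results = results
    @log = []
  end

  def enter_text(label, text)
    @log << [:enter_text, label, text]
  end

  def touch(label)
    @log << [:touch, label]
  end

  def texts_of(_label)
    @results
  end
end
```

A test then reads as `HomePage.new(driver).search_for('doctor').titles`, with no element labels or framework calls in the test itself.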

Do push your test framework (Calabash, Appium etc) as far down in the stack as possible

Don’t litter your test code with your framework’s commands but instead delegate that to a module that you access via an interface.
This way if a command changes in a new version you only need to update it in one place, and if you decide to switch frameworks you can potentially do so a lot more easily.
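One way to picture this is a thin wrapper that is the only thing your tests and page objects ever talk to. The sketch below is an assumption about how that could look in Ruby: `AppDriver`, `FakeBackend` and the two methods `touch` and `visible?` are invented for the example, and the real framework calls are deliberately left out.

```ruby
# Hypothetical interface the test code depends on. The framework-specific
# calls live in a backend object, never in the tests themselves.
class AppDriver
  REQUIRED_METHODS = %i[touch visible?].freeze

  def initialize(backend)
    missing = REQUIRED_METHODS.reject { |name| backend.respond_to?(name) }
    unless missing.empty?
      raise ArgumentError, "backend is missing: #{missing.join(', ')}"
    end
    @backend = backend
  end

  def touch(label)
    @backend.touch(label)
  end

  def visible?(label)
    @backend.visible?(label)
  end
end

# In-memory backend so the sketch runs on its own. A backend built on
# Calabash or Appium would implement the same two methods.
class FakeBackend
  attr_reader :touched

  def initialize(visible_labels)
    @visible = visible_labels
    @touched = []
  end

  def touch(label)
    @touched << label
  end

  def visible?(label)
    @visible.include?(label)
  end
end
```

Switching frameworks then means writing one new backend class rather than touching every test.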

Don’t fall in love (with your tools!)

They are just that: tools to help you in your task. If you find that a tool is causing you more problems than it solves, and no matter how much searching you do no one can help, then dump it and switch to something else (if there is an alternative).

Don’t automate everything under the sun

Automate the areas that are going to give the most valuable feedback, i.e. that something has broken. We’ve found that UI tests are most useful when the devs are actively working in those areas, so focus your UI tests there.

Do have the need for speed

Keep your tests as fast as possible. Running iOS tests through the simulator is faster than on devices so we tend to stick with running them there. But for Android we use devices as it’s more stable. So you need to weigh up what you need: stability always trumps speed. With that said, keep your tests as fast as you can within that constraint. Don’t use sleeps in your tests (or at least avoid them at all costs). Use waits that check repeatedly for something to be on or off screen before proceeding or raising an error.
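To show the difference between a fixed sleep and a wait, here is a minimal polling helper in plain Ruby. The name `wait_until`, its keyword arguments and the `WaitTimeout` error are made up for this sketch; most automation frameworks ship their own equivalent, which you should prefer where it exists.

```ruby
# Raised when the condition never becomes true within the timeout.
class WaitTimeout < StandardError; end

# Hypothetical helper: re-checks the block every `interval` seconds and
# returns as soon as it is truthy, instead of always sleeping a fixed time.
def wait_until(timeout: 10, interval: 0.25, message: 'condition not met')
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result

    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise WaitTimeout, "#{message} after #{timeout}s"
    end

    # The only sleep is the short pause between checks.
    sleep interval
  end
end
```

In a test this might read `wait_until(message: 'play button never appeared') { driver.visible?('play_button') }`, where `driver` is whatever object you use to query the screen. The test moves on the moment the element appears and fails with a clear message if it never does.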

Do stub it out!

Stub out the data your app uses. We use Charles Proxy with some Ruby scripts to automate the launching, closing and loading in of config files. The main reason we used this approach was that devs and testers were already familiar with it, so the learning curve was gentler. It was also a quick solution to tide us over until a more appropriate one could be developed or re-purposed, such as REST-assured or WireMock.

Do use Metrics

Stats, stats and more stats. Use data to back up your ideas and collect as much as you can.
Collect stats such as test execution time, pass/fail rates, number of tests and number of runs, and chart them. We use Dashing as it’s quick to set up and offers a very nice visual way to display data.

Dashboards displayed by the app development team. From the top left: app statistics via Grafana and app usage. Bottom left: Dashing board showing test status and code metrics using custom bubble graphs. Far right: screen showing current build status.

I’ve found the Rickshaw widget to be very useful. Some teams have also been experimenting with AtlasBoard, which looks promising.

Do learn your tools

If you are using Ruby (as we are) then learn to use a REPL such as Pry. This will save you countless hours when debugging or even creating tests. Watch this video by Conrad Irwin for a great introduction to Pry and REPL driven development.

Do have multiple tests around any given area of functionality

This way if a test fails you know it could be flaky but if all the tests in that area fail you know straight away something is wrong.
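The reasoning here can be written down as a tiny rule of thumb. The Ruby sketch below is only an illustration: the `classify_area` name, the labels it returns and the idea of treating any mixed result as “possibly flaky” are assumptions, not something we run in our pipeline.

```ruby
# Hypothetical rule of thumb for one area of functionality.
# `results` is an array of booleans, one per test (true means passed).
def classify_area(results)
  return :no_tests if results.empty?
  return :healthy  if results.all?   # every test passed
  return :broken   if results.none?  # every test failed

  # Some passed and some failed: worth a re-run or a closer look.
  :possibly_flaky
end
```

So `classify_area([true, false, true])` gives `:possibly_flaky`, while `classify_area([false, false, false])` gives `:broken` and tells you to stop and investigate straight away.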

Do read these posts!

Read this excellent post by Gojko Adzic if you are attempting to automate testing at the UI level.
If you are on the journey of UI automation or plan to start then I also highly recommend reading this post by Dan North, The Siren Song of Automated UI Testing. It gives a balanced view on automated UI testing, showing that it’s no silver bullet for removing manual testing altogether, nor a replacement for other (arguably better) forms of automated unit/component testing.

Got any tips on automated testing? Then let us know in the comments and you never know it may make it in the list above.

Automating testing for BBC iPlayer mobile part two: automation

Originally posted on the BBC website 30 June 2014

This is the second part of a three post series exploring how the BBC iPlayer Mobile testing team has integrated automated user interface (UI) testing into their development practice.

This post will deal with automation.

Having created collaborative feature files through the “3 Amigos” sessions and set up a robust system for creating and disseminating them, the natural next step is to begin automating them to increase productivity and quality.

To make the tests as easy as possible to write we implemented the page object pattern so that the developers were clear about how to write more maintainable and less flaky tests. This also meant that tests were written more consistently and allowed for more code reuse.

In addition to the page object pattern, we created helper modules containing all the commands needed to drive the app, so developers could quickly look up what commands are available. We also demonstrated how to use the inbuilt debug tools to query the app to find the screen elements.

Although we explored many different options, we decided to use Calabash and Ruby as the predominant tools to automate our tests as they worked cleanly with Cucumber (which is our test runner) and because Calabash had support for both iOS and Android. To help everyone get to grips with the new systems, internal workshops are held to step developers through real-life examples, helping them to organise the feature folders, create page objects and learn the types of Calabash commands available to drive the app. With step-by-step guidance, everyone is able to get a strong understanding of the process and where they come into it.

Initially creating the automated UI tests is a slow process as you are required to create a fair amount of support code (including page objects, working out how to access elements on screen and working around timing issues with the app) but once these foundational aspects are set in place, automating tests gets faster and faster.

If a developer ever gets into difficulty, Developers in Test are available to pair up and help iron out any problems.

There are many advantages to developers writing the automation tests. Ownership creates a sense of responsibility and a smoother process for delivering and testing the products. It also drives the developers to look at the results and take advantage of the benefits of faster feedback.

With developers using the feature files to write their tests, the product is built as intended rather than on assumptions, which speeds up the development process. The benefit of this is that everyone takes mutual responsibility for automation, and testing is not pushed to manual when a DiT is absent or unavailable, which keeps the process moving smoothly and effectively.

Another benefit of using Calabash is that it uses the accessibility labels to access on screen elements. If the developers build the tests they have to enable the labels therefore helping to make the app more accessible. For more information on accessibility practices see Senior Accessibility Specialist Henny Swan’s blog posts.

You may be wondering what the DiTs are doing if the developers are creating all the automation code?

DiTs remain embedded in the team and available for pairing to help automate tests that are not straightforward. They help build up tools to aid automation, e.g. worker methods to carry out complex interactions, and work out how a feature could be automated when it is not immediately obvious. They help keep Continuous Integration (CI) jobs running and investigate brittle tests. DiTs also tend to be the experts with the automation frameworks, so they advise on whether a feature is worth automating or is better tested manually.

Once the feature file has been automated the tests are pushed into the main build pipeline. They are run approximately four times a day, with a subset run on each check-in of code. We have our build job statuses displayed on large screens (one of the advantages of working near the TV platforms team is that they have a lot of reference TVs that we can use when they are not being tested on) so if anything fails the whole team knows straight away.

In the final post of this series I’ll tell you how we handle legacy and new features and what the future holds for our team.

Originally posted on the BBC website 06 August 2014

Automating testing for BBC iPlayer mobile part one: 3 Amigos

Originally posted on the BBC website 30 June 2014

In this three part series of blog posts I will be exploring how the BBC iPlayer Mobile team has integrated automated user interface (UI) integration testing into their development practice.

I’m a Senior Developer in Test (DiT) working in Mobile Platforms, BBC Future Media. I work with the BBC iPlayer Mobile team to help them automate their testing, investigate new tools and advise on how best to use them in their everyday work, sharing this with other teams across the BBC. In the 16 months that I have been with the BBC I have seen a great deal of change in development practice, which I will be sharing in this series of posts.

I was initially brought onto the team to identify how to automate a greater number of tests in order to increase the speed of release without risking the quality of the end product.

When I first joined the team it was apparent that the developers had all individually started to automate some of the tests. However, there was no consistency between the test scripts, with each developer using their own style. Inevitably, when a script broke and it wasn’t investigated by the developer who wrote the test, it would take a long time to identify and repair the problem. This would usually result in a simple patch being added to keep the script running, or in the test being disabled. Because of these issues around automation, the team began to lose confidence in the testing method and reverted to manual testing.

The lack of systems within the process was problematic in itself with some features having a lot of automation testing carried out and others receiving little or none, with no-one taking responsibility for ensuring that the testing was happening. This meant that each test was insular with only the designated developer having access to the results.

From the outset, it was decided to take things slowly and begin with the area that would give the most value with the least amount of effort. The team understood that feature files are a great way to describe how the systems should work and that a collaborative approach was needed for successful implementation. It was here that we decided to use the idea of the ‘3 Amigos’ to write the features.

3 Amigos

To set up the ‘3 Amigos’ we needed to recruit a developer from each platform (iOS and Android), a tester, a product owner/business analyst and a DiT. Now this is obviously more than three “amigos” however we needed to have a representative from each area of the process and the DiT to lead the sessions until everyone felt comfortable with the process and able to run them independently.

The advantage of having a DiT, or anyone experienced in writing feature files, is that they can act as chair and mentor. They are able to guide the team to write concise scenarios and ensure conversation stays on track. They also help to make sure that everyone in the meeting contributes and is comfortable with what the features were specifying.

Ordinarily, the process would start with the user story, created earlier by the Business Analyst (BA) working with the Product Owner. The team would use this to identify each scenario needed to cover the feature, only going into the given/when/then steps if it wasn’t immediately clear how a scenario would play out or if there was confusion amongst the team. Once the sessions are over the DiT or BA will flesh out the remaining given/when/then steps, attaching them to the user story in Jira.

3 Amigos gather around a BBC iPlayer screen

Because BBC iPlayer is available on iOS and Android we only ever had one feature file that both products would use. This made sure that we kept feature parity and helped us to start delivering features on both platforms at the same time.

‘3 Amigos’ helped everyone involved develop a strong understanding of a feature and how it may need to be altered to work on each platform. This also helped to foster a more collaborative approach to creating feature files and to develop a better understanding of what the Product Owners wanted without the solution being prescriptive on the team, letting them decide how it should work.

Anyone not involved with the 3 Amigos session could read the feature file or speak with any of the developers or testers present to get a heads up. We try to make sure that different developers and testers attend the ‘3 Amigos’ to make sure everyone can run a session without a particular person becoming a bottleneck.

Once the developer has picked up the ticket to develop, they will submit the feature file into our source control system removing any reference to the feature apart from the user story and any acceptance criteria leaving only a link to the location of the feature file for future access. This ensured there was only ever one version of the truth and if any changes were required then there would be an audit trail to identify who made the alteration.

In my next post I will expand on how we use the feature files to automate our testing.
