Automated cross-browser testing with BrowserStack and CircleCI

By now, automated testing of code has hopefully become an industry standard. Ideally, you write your tests first and make them a runnable specification of what your code should do. When done right, test-driven development can improve code design, not to mention that you get a regression test suite that stops you from accidentally breaking things in the future.

However, unit testing does just what it says on the tin: it tests units of code (modules, classes, functions) in isolation. To know that the whole application or system works, you need to test the integration of those modules.

That's nothing new either. At least in the web application world, which this post is about, we've had tools like Cucumber (which lets you write user scenarios in an almost human language) for years. You can then run these tests on a continuous integration server (we use the amazing CircleCI) and get a green light for every commit you push.

But when it comes to testing how things work in different web browsers, the situation is far from ideal. Or rather, it was.

Automated testing in a real browser

The gold standard of automated testing against a real browser is Selenium, the browser automation tool that can drive many different browsers using a common API. In the Ruby world, there are tools on top of Selenium that provide a nice DSL for driving the browsers using domain-specific commands like page.click 'Login' and expectations like page.has_content?('something').

Selenium will open a browser, run through your scripted scenario and check that everything you expected to happen actually happened. This should still be old news to you. You can improve on the default setup by using a faster headless browser (like PhantomJS), although watching your test complete a payment flow on PayPal is kinda cool. There is still a big limitation though.

When you need to test your application on multiple browsers, versions, operating systems and devices, first you need to have all that hardware and software, and second, you need to run your test suite on all of them.

So far, we've mostly solved this by having human testers. But making humans test applications is a human rights violation, and a good tester's time is much better spent creatively trying to break things in unexpected ways. For some projects, there isn't even enough budget for a dedicated tester.

This is where cloud services, once again, come to the rescue. And the one we'll use is called BrowserStack.

BrowserStack

BrowserStack allows you to test your web applications in almost every combination of browser and OS/device you can think of, all from your web browser. It spins up the right VM for you and gives you a remote screen to play around with. That solves the first part of our problem: we no longer need to have all those devices and browsers. You can try it yourself at http://www.browserstack.com/.

Amazingly, BrowserStack solves even the second part of the problem by offering the Automate feature: it can act as a Selenium server, to which you can connect your test suite using the Selenium remote driver and automate the testing. It even offers up to ten parallel testing sessions!

Testing an existing website

To begin with, let's configure a Cucumber test suite to run against a staging deployment of your application. That has its limitations - you can only do things to the application that a real user could, so forget mocking and stubbing for now (but keep on reading).

We'll demonstrate the setup with a Rails application, using Cucumber and Capybara, and assume you already have some scenarios to run.

First, you need to tell Capybara what hostname to use instead of localhost

[gist id="c71e00bed59979a2d02e" file="01 capybara.rb"]

Next, loosely following the BrowserStack documentation, we'll configure the remote driver. Start by building the BrowserStack URL, using environment variables to set the username and API authorization key.

[gist id="c71e00bed59979a2d02e" file="02 cross_browser.rb"]
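As a rough illustration of the idea, here is a minimal sketch of such a URL builder. It assumes the hub endpoint hub.browserstack.com/wd/hub from BrowserStack's Automate documentation and a BS_USERNAME variable for the username; that variable name is our assumption, while BS_AUTHKEY is the one used later in this post when starting the tunnel.

```ruby
require "erb"

# Build the remote WebDriver URL from credentials in the environment.
# BS_USERNAME is an assumed variable name; BS_AUTHKEY matches the one
# used later in this post. Both values are percent-encoded so special
# characters cannot break the URL.
def browserstack_url(env = ENV)
  username = ERB::Util.url_encode(env.fetch("BS_USERNAME"))
  auth_key = ERB::Util.url_encode(env.fetch("BS_AUTHKEY"))
  "http://#{username}:#{auth_key}@hub.browserstack.com/wd/hub"
end
```

Using fetch rather than [] means a missing variable fails loudly with a KeyError instead of producing a URL with empty credentials.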

Then we need to set the desired capabilities of the remote browser. Let's ask for Chrome 33 on OS X Mavericks.

[gist id="c71e00bed59979a2d02e" file="03 capabilities.rb"]
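For readers who want to see the shape of such a request, the following sketch expresses it as a plain hash. The key names (browser, browser_version, os, os_version) are taken from BrowserStack's capability documentation as we remember it, so treat them as an assumption; the gist may spell them differently.

```ruby
# Desired capabilities for Chrome 33 on OS X Mavericks.
# The key names are assumed from BrowserStack's documentation and are
# not copied from the gist.
def chrome_33_on_mavericks
  {
    "browser"         => "Chrome",
    "browser_version" => "33.0",
    "os"              => "OS X",
    "os_version"      => "Mavericks"
  }
end
```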

The next step is to register a driver with these capabilities in Capybara

[gist id="c71e00bed59979a2d02e" file="04 register_driver.rb"]

and use it

[gist id="c71e00bed59979a2d02e" file="05 use_driver.rb"]

If you run cucumber now, it should connect to BrowserStack and run your scenario. You can even watch it happen live in the Automate section!

OK, that was a cool experiment, but we wanted multiple browsers, and it would also be good to run on BrowserStack only when needed.

Multiple different browsers

What we want, then, is a simple command that runs cross-browser tests in one browser or in a whole set of them. Something like

rake cross_browser

and

rake cross_browser:chrome

In fact, let's do exactly that. First of all, list all the browsers you want in a browsers.json in the root of your project

[gist id="c71e00bed59979a2d02e" file="06 browsers.json"]

Each of those browser configurations is stored under a short key we'll use throughout the configuration to make things simple.
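To make the structure concrete, here is a sketch of what such a file could contain and how it parses. The four entries and their key names are hypothetical examples, not the contents of the gist; in the real project the text would come from File.read("browsers.json").

```ruby
require "json"

# Hypothetical contents of browsers.json: a short key per browser,
# each mapping to the capabilities BrowserStack should provide.
EXAMPLE_BROWSERS_JSON = <<~JSON
  {
    "chrome":  { "browser": "Chrome",  "browser_version": "33.0", "os": "OS X",    "os_version": "Mavericks" },
    "firefox": { "browser": "Firefox", "browser_version": "27.0", "os": "Windows", "os_version": "7" },
    "ie_9":    { "browser": "IE",      "browser_version": "9.0",  "os": "Windows", "os_version": "7" },
    "safari":  { "browser": "Safari",  "browser_version": "7.0",  "os": "OS X",    "os_version": "Mavericks" }
  }
JSON

# Parse the JSON text into a hash keyed by the short browser name.
def load_browsers(json_text)
  JSON.parse(json_text)
end
```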

The rake task will look something like the following

[gist id="c71e00bed59979a2d02e" file="07 rake_task.rb"]

First we load the JSON file and store it in a constant. Then we define a task that goes through the list and, for each browser, executes a browser-specific task. The browser tasks live under a cross_browser namespace.

To pass the browser configuration to Capybara when Cucumber gets executed, we'll use an environment variable. Instead of passing the whole configuration, we can just pass the browser key and load the rest in the configuration itself. To be able to set the environment variable based on the task name, we need to wrap the actual Cucumber task in another task.

The inner task then extends the Cucumber::Rake::Task and provides some configuration for cucumber. Notice especially the --tags option, which means you can specifically tag Cucumber scenarios for cross-browser execution, only running the necessary subset to keep the time down (your daily time running BrowserStack sessions is likely limited after all).
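The wrapping idea can be sketched with plain Rake, without Cucumber installed. In this sketch the inner task is a stand-in that only records which browser it was asked to run; in the real setup it would be a Cucumber::Rake::Task. The BROWSER variable name, the inner task name and the two browser keys are our assumptions for illustration.

```ruby
require "rake"

# Hypothetical browser keys; the post loads them from browsers.json.
BROWSER_KEYS = %w[chrome firefox].freeze

# Stand-in for real Cucumber runs so the sketch works without Cucumber.
COMPLETED_RUNS = []

namespace :cross_browser do
  # Inner task: in the real setup this is a Cucumber::Rake::Task
  # configured with the --tags option.
  task :run do
    COMPLETED_RUNS << ENV.fetch("BROWSER")
  end

  # One wrapper task per browser that sets the environment variable
  # and then invokes the inner task.
  BROWSER_KEYS.each do |key|
    desc "Run cross-browser scenarios in #{key}"
    task key do
      ENV["BROWSER"] = key
      inner = Rake::Task["cross_browser:run"]
      inner.reenable # Rake runs a task once by default; allow one run per browser
      inner.invoke
    end
  end
end

desc "Run cross-browser scenarios in every configured browser"
task cross_browser: BROWSER_KEYS.map { |key| "cross_browser:#{key}" }
```

The reenable call matters: without it, Rake would consider the inner task done after the first browser and silently skip the rest.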

The cross_browser.rb changes to the following:

[gist id="c71e00bed59979a2d02e" file="08 cross_browser_2.rb"]
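The lookup by key might look roughly like the sketch below. The BROWSER variable name and the session name capability are assumptions made for illustration, and the browsers hash is whatever was parsed from browsers.json.

```ruby
# Resolve the capabilities for the browser key passed in the environment.
# BROWSER is an assumed variable name; `browsers` is the parsed
# browsers.json hash (short key => capabilities).
def capabilities_for(browsers, env = ENV)
  key = env.fetch("BROWSER") do
    raise ArgumentError, "set BROWSER to one of: #{browsers.keys.join(', ')}"
  end
  browser = browsers.fetch(key) do
    raise ArgumentError, "unknown browser #{key.inspect}"
  end
  # Label the session with the short key so it is easy to spot in Automate.
  { "name" => "cross-browser (#{key})" }.merge(browser)
end
```

Failing early on an unknown key is cheaper than discovering the typo after BrowserStack has already spun up a session.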

That should now let you run

rake cross_browser

and watch the four browsers fly through your scenarios one after another.

We've used this setup with a few modifications for a while. It has a serious limitation, however. Because the remote browser is accessing a real site, it can only do as much as a real user can do. Setting up the initial state and making runs repeatable is difficult, not to mention that it isn't the fastest solution. We really need to run the application locally.

Local testing

Running your application locally and letting Capybara start your server enables you to do everything you are used to in your automated tests - load fixtures, create data with factories, mock and stub pieces of your infrastructure, etc. But how can a browser running in the cloud access your local machine? You will need to dig a tunnel.

BrowserStack provides a set of binaries that open a tunnel to the remote VM and let it reach any hostname and port accessible from your local machine. The remote browser can then connect to that hostname as if it had direct access to it. You can read all about it in the documentation.

After you've downloaded the BrowserStack tunnel binary for your platform, you'll need to change the configuration again. The app_host is localhost once again, and we also need Capybara to start a local server for us.

[gist id="c71e00bed59979a2d02e" file="09 capybara_2.rb"]

We also need to tell BrowserStack we want to use the tunnel. Just add

[gist id="c71e00bed59979a2d02e" file="10 tunnel.rb"]

to the list of capabilities. Start the tunnel and run the specs again

./BrowserStackLocal -skipCheck $BS_AUTHKEY 127.0.0.1,3001 & rake cross_browser

This time everything should go a bit faster. You can also test more complex systems that need external APIs or direct access to your data store because you can now mock those.

This is great! I want this to run for every single build before it's deployed, just like my unit tests. Testing everything as much as possible is what CI servers are for, after all.

Running on CircleCI

We really like CircleCI for its reliability, great UI and especially its ease of configuration and its support for libraries and services.

On top of that, their online chat support deserves praise in a paragraph of its own. Someone is in the chat room all the time and responds almost immediately, and they are always very helpful. They even fix the occasional bug in near real time.

To run our cross-browser tests on CircleCI, we will need a circle.yml file and a few changes to the configuration. The circle.yml will contain the following

[gist id="c71e00bed59979a2d02e" file="11 circle_2.yml"]

We run unit tests, then the Cucumber specs as normal, then open the tunnel and run our rake task. When it's done, we can close the tunnel again. To download the tunnel binary and later stop the tunnel, we wrote a little shell script

[gist id="c71e00bed59979a2d02e" file="12 script.bash"]

It downloads the 64-bit Linux BrowserStack binary and unpacks it into a browserstack directory (which is cached by CircleCI). When passed a stop parameter, it will kill all the running BrowserStack tunnels. (We will eventually make the script start the tunnel as well, but we had problems with backgrounding the process, so it's done as an explicit step for now.)

Finally, we can update the configuration to use the project name and build number supplied by Circle to name the builds for BrowserStack

[gist id="c71e00bed59979a2d02e" file="13 capabilities_3.rb"]
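A hedged sketch of that naming step follows. CIRCLE_PROJECT_REPONAME and CIRCLE_BUILD_NUM are environment variables CircleCI sets for each build; the project and build capability names and the local fallbacks are assumptions, so check them against the gist and the BrowserStack documentation.

```ruby
# Name the BrowserStack project and build after the CircleCI build.
# Falls back to placeholder values when run outside CircleCI.
def build_naming_capabilities(env = ENV)
  project = env.fetch("CIRCLE_PROJECT_REPONAME", "local")
  build_number = env.fetch("CIRCLE_BUILD_NUM", "dev")
  {
    "project" => project,
    "build"   => "#{project} build #{build_number}"
  }
end
```

Merging this hash into the browser capabilities groups all sessions from one CI build together in the Automate dashboard.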

That setup should work, but it will take a while going through all the browsers. That is a problem when you work in multiple branches in parallel, because the testing becomes a race for resources. We can use another brilliant feature of CircleCI to limit the impact of this issue: we can run the tests in parallel.

The holy grail

Marking any task in circle.yml with parallel: true will make it run in multiple containers at the same time. You can then scale your build up to as many containers as you want (and are willing to pay for). We are limited by the concurrency BrowserStack offers us, and on top of that we're using just four browsers anyway, so let's start with four, but plan for more devices.

First, we need to spread the individual browser jobs across the containers. We can use the environment variables provided by CircleCI to see which container we're running on. Our final rake task will look like this

[gist id="c71e00bed59979a2d02e" file="14 raketask_3.rb"]

Reading the nodes environment variable we check the concurrency limit and spread the browsers across the same number of buckets. For each bucket, we'll only run the actual test if the CIRCLE_NODE_INDEX is the same as the order of the bucket.
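The bucketing logic can be sketched as a small pure function. This version assumes a round-robin split (browser at position i goes to bucket i modulo the number of nodes), which may differ from how the gist forms its buckets; nodes is the variable passed from circle.yml and CIRCLE_NODE_INDEX is set by CircleCI.

```ruby
# Pick the browsers the current container should run.
# Assumes a round-robin split; defaults make the function return every
# browser when run outside CircleCI.
def browsers_for_this_node(browser_keys, env = ENV)
  node_total = Integer(env.fetch("nodes", "1"))
  node_index = Integer(env.fetch("CIRCLE_NODE_INDEX", "0"))
  browser_keys.each_with_index
              .select { |_key, position| position % node_total == node_index }
              .map { |key, _position| key }
end
```

With four browsers and nodes=4 each container gets exactly one browser; with nodes=2 each container runs two of them in sequence.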

Because we're now opening multiple tunnels to BrowserStack, we need to name them. Add

[gist id="c71e00bed59979a2d02e" file="15 tunnels_3.rb"]

to the capabilities configuration in cross_browser.rb. The final file looks like this

[gist id="c71e00bed59979a2d02e" file="16 cross_browser_final.rb"]
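The tunnel-related part of that configuration might look like the following sketch. The capability names browserstack.local and browserstack.localIdentifier come from BrowserStack's local testing documentation as we understand it, and the node-N naming scheme is hypothetical; the gist may use different keys or identifiers.

```ruby
# Capabilities that route traffic through a named local tunnel.
# One tunnel per CircleCI container, identified by the node index.
# Capability names and the identifier format are assumptions.
def tunnel_capabilities(env = ENV)
  node_index = env.fetch("CIRCLE_NODE_INDEX", "0")
  {
    "browserstack.local"           => "true",
    "browserstack.localIdentifier" => "node-#{node_index}"
  }
end
```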

We need to supply the same identifier when opening the tunnel from circle.yml. We also need to run all the cross-browser related commands in parallel. The final circle.yml will look like the following (notice the added nodes=4 when running the tests)

[gist id="c71e00bed59979a2d02e" file="17 circle_final.yml"]

And that's it. You can now scale your build out to four containers and run the tests in parallel. For us this gets the build time down to about 12 minutes on a complex app and 5 minutes on a very simple one.

Conclusions

We are really happy with this setup. It's stable and fast, individual test runs are completely isolated, and we don't need to deploy anything anywhere. It has just one drawback compared to the previous setup, which first deployed the application to a staging environment and then ran cross-browser tests against it: it doesn't test the app in its real runtime environment (Heroku in our case). Otherwise it's a complete win on all fronts.

We plan to solve that remaining problem by writing a separate test suite that tests our whole system (consisting of multiple services consuming each other's APIs) cleanly from the outside. It won't go into as much detail as the normal tests, since it is only there to confirm that the different pieces fit together and users can complete the most important journeys. Coupled with Heroku's slug promotion feature, we will actually test the exact thing that will end up in production in the exact same environment. And you can look forward to another blog post about that soon.