Recently I was exploring Swing GUI testing tools. This was a particularly difficult task for me because I’m not only a novice at Swing, but as a developer I’m new to acceptance testing too. The latter was the harder problem: I just didn’t know the common pitfalls of testing. The unknown unknowns, so to speak.
Folti gave me tremendous help: He pointed out that whatever tool I choose to manipulate the GUI, I’m going to need an acceptance testing tool like FitNesse. He was the one who showed me this post about having legacy code in test code.
In our office we have the book “Scripted GUI Testing with Ruby”. Although I don’t speak Ruby (yet), it gave me the basic ideas behind acceptance testing. The book shows an example of how to decompose test code into layers. It relies heavily on RSpec, so it also gave me a good understanding of how such tools work.
The rest was relatively easy: I had to find, explore and play with Swing testing tools. These are the lessons learned:
Test code should be robust
This one is easy to say but hard to do. Everybody agrees that tests should make changes to your system easier: the benefit of finding regressions should be higher than the cost of writing and maintaining your tests. But what does a robust GUI test look like?
It should be immune to graphic design issues like button size, background color and so forth. This eliminates testing tools like Sikuli or Abbot. Those tools are fancy but in the long run you’re better off without them.
It should be immune to refactorings in your code. More precisely, if you rename the class TwoTextFieldsAndAButton to LoginPanel, tests should still be able to locate and verify that component. That means the test code should locate your components by their Swing supertype and by their name property, not by their concrete class. Most of the tools can do that, because they come with a flexible component-finding framework. This requirement imposes a restriction on your production code too: you need to set name properties everywhere.
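To make the idea concrete, here is a minimal sketch of such a lookup in plain Swing, without any testing library. The class `NamedComponentFinder` and the names `login.user` and `login.submit` are made up for this illustration; real tools ship their own, richer finders.

```java
import java.awt.Component;
import java.awt.Container;
import javax.swing.JButton;
import javax.swing.JPanel;
import javax.swing.JTextField;

// Hypothetical helper for illustration, not part of any testing library.
class NamedComponentFinder {

    // Depth-first search for a component of the given Swing type
    // whose name property equals the given name. Returns null if none is found.
    static <T extends Component> T find(Container root, Class<T> type, String name) {
        for (Component child : root.getComponents()) {
            if (type.isInstance(child) && name.equals(child.getName())) {
                return type.cast(child);
            }
            if (child instanceof Container) {
                T nested = find((Container) child, type, name);
                if (nested != null) {
                    return nested;
                }
            }
        }
        return null;
    }

    // Stand-in for production code: the panel class can be renamed freely,
    // because the test only relies on the name properties set here.
    static JPanel buildLoginPanel() {
        JTextField user = new JTextField(10);
        user.setName("login.user");
        JButton submit = new JButton("Log in");
        submit.setName("login.submit");

        JPanel buttonRow = new JPanel(); // extra nesting the test does not care about
        buttonRow.add(submit);

        JPanel panel = new JPanel();
        panel.add(user);
        panel.add(buttonRow);
        return panel;
    }

    public static void main(String[] args) {
        JPanel panel = buildLoginPanel();
        JButton submit = find(panel, JButton.class, "login.submit");
        System.out.println("submit button found: " + (submit != null));
    }
}
```

The test asks for “a JButton named login.submit” and never mentions the class of the panel that contains it, so renaming or re-nesting that panel does not touch the test.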
Robust GUI test code should be immune to unrelated redesigns. Say, for instance, you used a JDesktopPane as the content pane of your top-level frame. Later you decide to use a JSplitPane as the content pane and put the JDesktopPane on one side of it. Such changes should not break your test code. If they do, you should be able to fix it with very few changes. How do you do that? These are the two solutions I can think of:
- You can ignore the unimportant parts of your system. This way some things remain untested but your code becomes more robust.
- You can use fixture classes for each element in your component. This way you’ll need to modify only the fixture responsible for the changed component.
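The second option could look roughly like the sketch below. `LoginFormFixture`, `buildForm()` and the component names are invented for this example, and the nested panels in `main` merely stand in for a redesign such as the JSplitPane change described above. A real test would also run the Swing calls on the event dispatch thread.

```java
import java.awt.Component;
import java.awt.Container;
import javax.swing.JButton;
import javax.swing.JLabel;
import javax.swing.JPanel;
import javax.swing.JTextField;

// Hypothetical fixture: the only test class that knows how the login form is built.
class LoginFormFixture {
    private final Container root;

    LoginFormFixture(Container root) {
        this.root = root;
    }

    void enterUser(String user) {
        ((JTextField) byName("login.user")).setText(user);
    }

    void submit() {
        ((JButton) byName("login.submit")).doClick();
    }

    String status() {
        return ((JLabel) byName("login.status")).getText();
    }

    // Fails with a descriptive message instead of a bare NullPointerException.
    private Component byName(String name) {
        Component found = search(root, name);
        if (found == null) {
            throw new IllegalStateException("No component named '" + name + "'");
        }
        return found;
    }

    // Searches the whole tree, so extra containers around the form do not matter.
    private static Component search(Container parent, String name) {
        for (Component child : parent.getComponents()) {
            if (name.equals(child.getName())) {
                return child;
            }
            if (child instanceof Container) {
                Component nested = search((Container) child, name);
                if (nested != null) {
                    return nested;
                }
            }
        }
        return null;
    }

    // Stand-in for the production form, only here to keep the sketch self-contained.
    static JPanel buildForm() {
        JTextField user = new JTextField(10);
        user.setName("login.user");
        JLabel status = new JLabel("");
        status.setName("login.status");
        JButton submit = new JButton("Log in");
        submit.setName("login.submit");
        submit.addActionListener(e -> status.setText("Welcome, " + user.getText()));

        JPanel form = new JPanel();
        form.add(user);
        form.add(submit);
        form.add(status);
        return form;
    }

    public static void main(String[] args) {
        // Simulate a redesign: the form now sits two containers deeper.
        JPanel side = new JPanel();
        side.add(buildForm());
        JPanel redesigned = new JPanel();
        redesigned.add(side);

        LoginFormFixture login = new LoginFormFixture(redesigned);
        login.enterUser("alice");
        login.submit();
        System.out.println(login.status());
    }
}
```

Tests talk only to `enterUser`, `submit` and `status`. If the form itself changes, for example a field is renamed or replaced, only the fixture needs to follow.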
The next section compares them from another point of view:
There is a trade-off between test speed and test accuracy
When I write unit tests, a test case verifies the behavior of a class, and each test method of that test case verifies one feature of that class. The benefit is that the name of the failing test can basically tell you what went wrong. You could write your GUI tests with the same approach:
- when I click on the “add row” button, is the number of rows growing?
- when I click on the “add row” button, is the focus on the “name” field of this new row?
But it’s a waste of resources to start up your app for each micro-scenario. To make tests run faster you can add multiple assertions to a test.
So you do add multiple assertions to a test run: click here, assert that, enter that text there, assert something else, etc. OK, but what should happen when assertion #3 fails? Does it make sense to go on with the rest of the GUI actions? You’ll need to design this carefully, otherwise you introduce dependencies between tests of unrelated components.
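One possible answer, shown here only as a sketch, is to stop at the first failing step and report which step failed and how many were skipped. The `Scenario` class and its `check` helper are hypothetical, and a plain list stands in for the table rows so the example runs without a GUI.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: runs named steps in order and stops at the first failure.
class Scenario {
    private final List<String> names = new ArrayList<>();
    private final List<Runnable> actions = new ArrayList<>();

    Scenario step(String name, Runnable action) {
        names.add(name);
        actions.add(action);
        return this;
    }

    // Returns a one-line report. Steps after a failure are not executed,
    // because the GUI is in an unknown state at that point.
    String run() {
        for (int i = 0; i < actions.size(); i++) {
            try {
                actions.get(i).run();
            } catch (AssertionError | RuntimeException e) {
                int skipped = actions.size() - i - 1;
                return "FAILED at step " + (i + 1) + " (" + names.get(i) + "): "
                        + e.getMessage() + "; skipped " + skipped + " later step(s)";
            }
        }
        return "PASSED " + actions.size() + " step(s)";
    }

    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>(); // stand-in for the table model
        String report = new Scenario()
                .step("click add row", () -> rows.add("new row"))
                .step("row count is 1", () -> check(rows.size() == 1, "expected 1 row"))
                .step("row count is 5", () -> check(rows.size() == 5, "expected 5 rows"))
                .step("click add row again", () -> rows.add("new row"))
                .run();
        System.out.println(report);
    }
}
```

Stopping early keeps one failure from producing a cascade of misleading follow-up failures, at the price of learning nothing about the skipped steps in that run.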
Suppose your system under test is a little more complicated than an official Swing demo. Suppose you’re testing a tool like SoapUI, which has many different things on the screen. Say your test modifies things on the left side of the JSplitPane. Should it verify the right side of the JSplitPane? Even if the contents of the two sides depend on each other, you might need different situations to test each of them. E.g. it’s interesting how different hierarchies can be displayed on the left side, while the right panel looks the same for all of them. The right-hand side, however, does interesting things when it receives an HTTP error or a SOAP error. While you focus on one component, you will add the same assertions for the other side over and over again.
If you really want to minimize application restarts, you can add assertions for everything on the screen. Your tests will run in the shortest time possible, but they will be very convoluted: a test run will contain many unrelated actions and assertions. There is a good chance that tracing test failures back to software bugs will take a lot of time. Also, your test code might be very brittle, since a change in one component can break lots of tests. There are workarounds though: you can write a fixture for each component so you have to change the tests in only one place, and you can log useful error messages that contain the component name and a short description of how to reproduce the failure without the rest of the gigantic automated test run. I’m not an expert in acceptance testing, but my intuition says to avoid such one-to-rule-them-all tests as long as possible.
Test code should be parameterized
This one is easy: once you have written a fixture that writes “foo” and “bar” to the text fields and then presses the “post” button, you will want to run it with other inputs too, without repeating the same expressions. This is exactly the same thing as calling methods instead of copy-pasting their bodies. There are many ways to do that:
- You can feed FitNesse fixtures with your spreadsheets
- You can develop and integrate your own property files
- You can programmatically generate the input data and call the test case immediately
Each of those approaches has pros and cons. E.g. FitNesse tests are executable documentation, but FitNesse forces you to conform to a specific standard. Writing your own property files gives you more flexibility, but you’ll need to write more boilerplate code. (Listing pros and cons for the third option is an exercise for the reader.)
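As a rough sketch of the property-file approach: each entry below is one test case, and the same scenario runs once per entry. The class `ParameterizedPostTest`, the `first,second,expected` value layout and the `post` method (a stand-in for the fixture that fills two fields and presses “post”) are all assumptions made for this example. The properties are loaded from a string only to keep the sketch self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.TreeSet;

class ParameterizedPostTest {

    // Hypothetical stand-in for the fixture that types two values and presses "post".
    static String post(String first, String second) {
        return first + " " + second;
    }

    // Parses property-file syntax; in a real project this would read a file instead.
    static Properties load(String text) {
        Properties cases = new Properties();
        try {
            cases.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return cases;
    }

    // Runs the same scenario once per entry.
    // Key = case name, value = "first,second,expected" (layout assumed for this sketch).
    static List<String> runAll(Properties cases) {
        List<String> failures = new ArrayList<>();
        // TreeSet gives a stable, alphabetical order of case names.
        for (String name : new TreeSet<>(cases.stringPropertyNames())) {
            String[] parts = cases.getProperty(name).split(",", -1);
            String actual = post(parts[0], parts[1]);
            if (!actual.equals(parts[2])) {
                failures.add(name + ": expected '" + parts[2] + "' but got '" + actual + "'");
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        Properties cases = load("simple=foo,bar,foo bar\nbroken=a,b,ab\n");
        for (String failure : runAll(cases)) {
            System.out.println(failure);
        }
    }
}
```

Adding a test case now means adding one line of data, and the failure message carries the case name so you can tell which input broke.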
Test code should be synchronized with documentation
Ouch. Code and its documentation should be synchronized too! Sadly, documentation in Word documents or wiki pages is always outdated. Oh, wait. We write tests as documentation for our code. But tests are made of code too. Usually reading test code is almost as hard as reading production code. So, what now?
FEST is fancy, Jemmy is easy to use
Finally, I would like to compare two tools I find useful for writing Swing tests: FEST and Jemmy. They do basically the same thing: they provide a way to programmatically access and manipulate Swing GUI elements.
FEST looks good: it has a nice homepage where you can easily find examples. It even has a fluent interface for writing tests that read almost like English. On the other hand, it makes you write your fixtures the FEST way. Say, for instance, you need to subclass 3 different FEST classes to assert a custom TableCellEditor. Maybe it’s the Perl in my past, but I just don’t like tools which force me to solve a problem in a certain way, unless that certain way has concrete benefits.
For instance, say I want to describe test steps with FitNesse, and the fixtures need to describe user actions like “click here and assert that”. I know I can do that with FEST. I know I can learn a lot of OO patterns along the way. Still, the solution looks unnecessarily complicated to me.
Don’t get me wrong, FEST is a great tool: it can do everything it should. Many projects use it, and they are never going to abandon it. FEST is a good choice, I just like Jemmy better. Let’s see why:
Jemmy has a not-so-nice homepage. It takes a few clicks to find the downloadable jar and the first usable examples. It requires a little more research to integrate Jemmy into anything but their own Scenario-and-main-method. (Hint: both are optional.) Jemmy has a very odd syntax to check if a component appears, see this or any other sample.
But once you get past these obstacles, Jemmy is simple: you can easily implement your own ComponentChooser if you don’t find the one you need. Manipulating and verifying components is easy too: you can use the plain old Swing syntax and the calls will be translated to user actions. Thus the learning curve is pretty short. You can use a few methods like “clickButton()” to specify user actions more precisely, but you can also quickly write a workaround if you can’t find the appropriate method for a certain user action.
Acceptance tests, like any other tests, are important: they find regressions for you.
Writing tests is the same activity as writing code: test code should be maintainable, robust, easy to change, etc. You can apply most of the principles you apply for production code. There are a few caveats though that you should be aware of:
- Tests are runnable documentation for production code. This makes you use specific tools and adhere to specific standards.
- Execution time is often a big constraint on acceptance tests. You’ll need to optimize for test execution time – without going gaga.
- Failing tests should be informative and easy to repeat. Sometimes this requirement conflicts with the previous one.