Blog Archive for / testing /
Wednesday, 25 January 2017
On Monday, I attended the Software Cornwall Business Connect Event. This was a day of talks and workshops for people in the local software development community here in Cornwall. One of the talks was on TDD, and how that fitted into the software development process at one of the larger software companies in the area.
There was an interesting question asked by one of the attendees: how do you know that your tests are correct? What if you make a mistake in your tests, and then change the code to make it pass?
The answer that the presenter gave, and the one that is most common in the literature, is that the tests check the code, and the code checks the tests. There was a brief discussion around this point, but I thought that it was worth elaborating here.
One of the key parts of TDD is the idea of "baby steps". You write code incrementally, with small changes, and small, focused tests.
Consequently, when you write a new test, you're testing some small change you're about to make. If you make a mistake, this gives you a reasonable chance of spotting it.
However, that's still relying on you spotting it yourself. You just wrote the test, and it's very easy to read back what you intended to write, rather than what you actually did write. This is where "the code checks the tests" comes in.
The TDD cycle is often dubbed "Red, Green, Refactor". At this point, the relevant part is "Red" — you just wrote a new test, so it should fail. If you run it, and it passes then something is wrong, most likely a mistake in the test. This gives you a second chance to revisit the test, and ensure that it is indeed correct. Maybe you mistyped one of the values; maybe you missed a function call. Either way, this gives you a second chance to verify the test.
OK, so you wrote the test (wrong), the test fails (as expected), so you write the code to make it pass. This gives you the next opportunity to verify the test.
TDD doesn't exempt you from thinking. You don't just write a test and then blindly make it pass. You write a test, often with an idea of the code change you're going to make to make it pass.
So, you've written a failing test, and you make the desired change. You run the tests, and the new test still fails. You haven't got the "Green" outcome you desired or intended. Obviously, the first thing to check here is the code, but it also gives you another chance to check the test. If the code really does do what you intended, then the test must be wrong, and vice versa.
Of course, we don't have just the one test, unless this is the beginning of a project. We have other tests on the code. If the new test is incorrect, and thus inconsistent with the previous tests, then changing the code to make the new test pass may well cause previous tests to fail.
At this point, you've got to be certain that the test is right if you're going to change the code. Alternatively, you made exactly the same mistake in the code as you did in the test, and the test passes anyway. Either way, this is the point at which the faulty test becomes faulty code.
It's not the last chance we have to check the test, though: the final step in the TDD loop is "Refactor".
Refactoring is about simplifying your code — removing duplication, applying general algorithms rather than special cases, and so forth. If you've made a mistake in a test, then the implementation may well end up with a special case to handle that particular eventuality; the code will be resistant to simplification because that test is inconsistent with the others. Again, TDD does not exempt you from thinking. If you intended to implement a particular algorithm, and the code does not simplify in a way that is consistent with that algorithm then this is a red flag — something is not right, and needs checking. This gives you yet another chance to inspect the test, and verify that it is indeed correct.
At this point, if you still haven't fixed your test, then you have now baked the mistake into your code. However, all is not lost: "Red, Green, Refactor" is a cycle, so we begin again, with the next test.
As we add more tests, we get more and more opportunities to verify the correctness of the existing tests. As we adjust the code to make future tests pass, we can spot inconsistencies, and we can observe that changes we make break the earlier test. When we see the broken test, this gives us a further opportunity to verify that the test is indeed correct, and potentially fix it, rather than fixing the code.
Acceptance tests then provide a further level of checking, though these tend to be far courser-grained, so may work fine unless the particular inputs happen to map to those in the incorrect lower level test.
TDD is not a panacea; but it is a useful tool
So, it is possible that a mistake in a test will be matched by incorrect code, and this will make it into production. However, in my experience, this is unlikely if the incorrect test is merely a mistake. Instead, the bugs that make it through tend to fall into two categories:
- errors of omission — I didn't think to check that scenario, or
- errors due to misunderstanding — my understanding of the requirements didn't match what was wanted, and the code does exactly what I intended, but not what the client intended.
TDD is not a panacea, and won't fix all your problems, but it is a useful tool, and has the potential to greatly reduce the number of bugs in your code.
Please let me know what you think by adding a comment.
Monday, 03 December 2007
When designing websites, it is very important to check the results in multiple web browsers — something that looks fine in Internet Explorer may look disastrous in Firefox, and vice-versa. This problem is due to the different way in which each web browser interprets the HTML, XHTML and CSS standards, combined with any bugs that may be present. If you're designing a website, you have no control over which browser people will use to view it, so you need to ensure that your website displays acceptably in as many different browsers as possible.
The only way to know for sure how a website looks in a particular browser is to try it out. If you don't check it, how do you know you won't hit a bug or other display quirk? However, given the plethora of web browsers and operating systems out there, testing in all of them is just not practical, so you need to choose a subset. The question is: which subset?
Thankfully, most people use one of a few "popular" browsers, but that's still quite a few. In my experience, on Windows, the most popular browsers are Firefox, Internet Explorer and Opera, on Linux most people use Firefox, Mozilla or Netscape and on MacOS most people use Safari or Camino. Obviously, the relative proportions of users using each browser will vary depending on your website, and target niche — a website focused on non-technical users is far more likely to find users with Internet Explorer on Windows than anything else, whereas a website focused on linux kernel development will probably find the popular browser is Firefox on linux.
It's all very well having identified a few popular browsers to use for testing, but an equally crucial aspect is which version of the browser to test. Users of Firefox, Opera, Mozilla, and recent versions of Netscape might be expected to upgrade frequently, whereas users of Internet Explorer might be far less likely to upgrade, especially if they are non technical (in which case they'll stick with the version that came with their PC). Checking the logs of some the websites I maintain shows that the vast majority of Firefox users (90+%) are using some variant of Firefox 2.0 (though there are a smattering all the way back to Firefox 0.5), whereas Internet Explorer users are divided between IE7 and IE6, with the ratio varying with the site.
Don't forget a text-only browser
A text only browser such as Lynx is ideal for seeing how your site will look to a search engine spider. Not only that, but certain screen reader applications will also give the same view to their users. Consequently, it's always worth checking with a text-only browser to ensure that your site is still usable without all the pretty visuals.
Multiple Browsers on the same machine
Having chosen your browsers and versions, the simplest way to test your sites is to install all the browsers on the same machine. That way, you can just open the windows side by side, and compare the results. Of course, you can't do this if the browsers run on different platforms, but one option there is to use virtual machines to test on multiple platforms with a single physical machine. Testing multiple versions of Internet Explorer can also be difficult, but TredoSoft have a nice little package called Multiple IEs which enables you to install multiple versions of Internet Explorer on the same PC. Thanks to Multiple IEs, on my Windows XP machine I've got IE3, IE4.01, IE5.01, IE5.5, IE6 and IE7, as well as Firefox, Opera, Safari and Lynx!
Tuesday, 27 November 2007
Whilst testing on multiple platforms is important, it can be difficult to obtain access to machines running all the platforms that you wish to test on. This is where virtualization software such as VMWare comes in handy: you don't need to have a separate machine for each tested platform — you don't even need a separate partition. Instead, you set up a Virtual Machine running the target platform, which runs on top of your existing OS. This Virtual Machine is completely self-contained, running off a virtual hard disk contained in a file on your real disk, and with a virtual screen which can be shown in a window on your host desktop.
This can be incredibly useful: not only can you test on multiple platforms without repartitioning your hard disk, but you can have multiple virtual machines running simultaneously. If you're developing an application that needs to run on multiple platforms, this can be invaluable, as you can see what the application looks like on different operating systems simultaneously. It also allows you to test network communication — each virtual machine is entirely independent of the others, so you can run a server application on one and a client application on another without having to build a physical network.
Get Started with a Pre-built Virtual Machine
VMWare have a repository of pre-built virtual machines, that they call "appliances". This makes it very easy to get started, without all the hassle of installing the OS. Some appliances even come with pre-installed applications — if you want to try a Ruby on Rails app on linux, then the UbuntuWebServer appliance might be a good place to start.
Warning: Virtual Machines use Real Resources
It's worth noting that the resource use (CPU, memory, disk space) is real, even if the machines are virtual — if you run a CPU-intensive application on your virtual machine, your system will slow down; if you give 3 virtual machines 1Gb of memory each but you only have 2Gb installed, you're going to see a lot of swapping. Virtual machines are not full-time replacements for physical ones unless you have a server with a lot of resources. That said, if you do have a server with a lot of resources, running separate systems and applications in separate virtual machines can make a lot of sense: the individual systems are completely isolated from one-another, so if one application crashes or destroys its (virtual) disk, the others are unaffected. Some web hosting companies use this facility to provide each customer with root access to their own virtual machine, for example.
It's also worth noting that if you install a non-free operating system such as Microsoft Windows, you still need a valid license.
VMWare Server is currently a free download for Windows and Linux, but it's not the only product out there. VirtualBox is also free, and runs on Windows, Linux and MacOSX. One nice feature that VirtualBox has is "seamless Windows": when running Microsoft Windows as the guest operating system, you can suppress the desktop background, so that the application windows from the virtual machine appear on the host desktop.
Another alternative is QEMU which offers full-blown emulation as well as virtualization. This allows you to experiment with operating systems running on a different CPU, though the emulated hardware can be quite limited.