In the QA forums I frequent, there are often questions about how to properly load test when you don’t have access to production or an identically built environment. Most companies won’t spend the cash to build an environment that is identical to production; generally, testing environments are made up of hand-me-down servers that used to be in production. Of course, there is also the cost of test suite licensing to produce a production-level load, and the near impossibility of mimicking real production traffic.
Though a production clone would be ideal, a watered-down environment can be sufficient, and in some ways better. Bottlenecks are reached sooner, without having to push through 50 Mbps of data. Additionally, a “lesser” environment will be more sensitive to changes; your transaction may take 0.5 seconds on production-grade servers, and a defect that doubles it to 1.0 seconds is hardly noticeable, but on a lesser environment where that transaction takes 6.0 seconds, doubling it to 12.0 throws up red flags.
For a watered-down environment, try to lessen the horsepower of your system while matching the architecture. If your production environment is eight web servers that are all quad 3.2 GHz Xeons running Windows Server 2003 Web Edition, and all load balanced through a hardware load balancer, you can bring it down to two web servers with less horsepower–perhaps dual 700 MHz P3s–but the servers should still run Windows Server 2003 Web Edition and be balanced with a hardware balancer. Do not drop below two web servers because you will still want a load balanced environment, and do not switch to Windows 2000 or use Microsoft’s NLB (Network Load Balancing). If your production web environment uses Windows 2000 and NLB, obviously use that technology in your testing environment; do not switch to Windows 2003 or a hardware load balancer.
Additionally, try to reduce equally throughout your environment. Don’t drop your web servers from Pentium 4s to Pentium 3s while dropping your database servers from Pentium 4s to an old 486 desktop. Equal reductions maintain your continuity, and in the end, your sanity. Unequal reductions introduce new problems that don’t exist in production, but will still happily waste your time and money. A major bottleneck might exist on your web servers, but the defect could be masked because you were database-bound by using that old 486.
The idea behind this is that many bugs can be introduced by a specific revision of your OS (think of the problems from Windows XP SP2), by your style of network infrastructure, by the version of your graphics driver, etc. You want as many common points as possible between your testing and production environments to eliminate any surprises when you launch your application. Ideally, your testing environment is an exact replica of your production environment, but unless you are making desktop applications, that is only a fantasy, so just try to get as close as you can. Use the same OS version, including the same service pack and the same installed hot fixes. Use the same driver versions, and configure the same settings on your web server software. You are trying to create a miniature version of your production environment, like a model car or a ship in a bottle. Pay attention to the details and you will be okay. To your application, the environments should be exactly the same; one is just a little snug.
For the love of all things QA, before you launch a new application, test production!
“What? That’s stupid! Why would I want to perform a load test against production and risk an outage? That impacts my SLAs. I can’t impact my SLAs!”
Remember the number one rule of quality control: if you don’t find it, your customers will.
When you are about to launch a brand new application into your production environment, test that application against production. However, this only applies to new applications. New applications will introduce new, additional load on the environment, while existing, revised applications have already added that load to the system. Essentially, with an existing application, you already know how well the production environment can handle the additional demand generated by the application’s audience. New applications have not yet generated that load, and production has yet to prove itself.
There is no hard evidence that production can take the additional demand. Maybe your production load balancer can only handle another 5 MB/s, and your new application will demand another 7. Perhaps it is one of the switches, instead. Or, as in my recent experience, maybe it is your ISP. You will not know until you test it, until you measure it, and “if you didn’t measure it, you didn’t do it.” With a past project, my company created an intranet application for our client, and our application just happened to be hosted off-site. The off-site environment was green, and wasn’t hosting anything else, so our client had no issue with us testing this environment fully since it was going to be production, but wasn’t yet. The hosting company and their ISP rated the environment at 45 Mbps (that’s megabits–lower-case ‘b’), and based on the client’s traffic expectations, we needed about 30. It is a good thing we tested the site because we found an issue with the load balancer at about 15 Mbps, a problem with server memory when it was processing enough transactions to produce 20 Mbps, a problem with the database switches when we were generating 22 Mbps, and–this one is the kicker–a bandwidth ceiling at 28. Though all of the routers, switches, balancers, and servers were performing well, we couldn’t get more than 28 Mbps to the web servers. It turns out that the ISP didn’t ever expect anyone to use that 45 Mbps rating, and never tested to make sure they could handle it.
“If you didn’t measure it, you didn’t do it.”
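To make that bookkeeping concrete, here is a minimal sketch in C of the comparison we were doing by hand: take the throughput at which each component first misbehaved, find the lowest, and compare it to what the application needs. The struct and function names (ceiling, first_bottleneck, can_carry) are made up for illustration; this is not part of any load testing tool.

```c
#include <assert.h>
#include <stddef.h>

/* One measured limit: the throughput (in Mbps) at which a component
 * first showed a problem during the load test. */
struct ceiling {
    const char *component;
    double      mbps;
};

/* Returns the entry with the lowest measured ceiling, i.e. the first
 * bottleneck the load will hit. Returns NULL for an empty list. */
const struct ceiling *first_bottleneck(const struct ceiling *list, size_t count)
{
    const struct ceiling *lowest = NULL;
    size_t i;

    assert(list != NULL || count == 0);
    for (i = 0; i < count; i++) {
        if (lowest == NULL || list[i].mbps < lowest->mbps) {
            lowest = &list[i];
        }
    }
    return lowest;
}

/* Returns 1 only if something was measured and every measured ceiling
 * is at or above the required load; otherwise returns 0. */
int can_carry(const struct ceiling *list, size_t count, double required_mbps)
{
    const struct ceiling *lowest = first_bottleneck(list, count);
    return lowest != NULL && lowest->mbps >= required_mbps;
}
```

Fed the four ceilings from this project (15, 20, 22, and 28 Mbps) against a 30 Mbps requirement, first_bottleneck points at the load balancer and can_carry returns 0. An empty list also returns 0 on purpose: nothing measured means nothing proven.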
Through two months of midnight-to-0600 testing, we upgraded the load balancer, added more memory, put in gigabit switches, had the ISP tweak their infrastructure, pushed through all of the data we needed, and successfully proved that the off-site environment and our new application could handle the load. But the environment still wasn’t fully tested. Our client used everyone’s favorite single sign-on, SiteMinder. However, they wouldn’t let us test the application while integrating their production SiteMinder policy servers. We could only use staging, and when the staging servers couldn’t handle the load, “that’s okay because it’s staging.” But no matter how much we advocated, we couldn’t test production. We might impact the environment and the SLAs. So, we launched without testing it, and guess what happened? The policy servers failed, and they severely impacted their SLAs.
And to think, we could have tested that at 1:00 AM on a Saturday, and even if we fried the policy servers, they would have had all weekend to fix it. And most importantly, we would have identified it before the end-user did. But what really cooked their goose was the difference between production load and performance testing load: performance tests can be stopped. It is a lot harder to fix a jet engine at 30,000 ft.
The moral of the story: when launching a new application, always test production. Always.
It’s not all about Internet Explorer any more. Yet, I am surprised at the number of web houses still coding specifically to IE. Much to my dismay, even my own company does it. Though we have a little bit of an excuse—our client only supports IE in their organization, and the app is internal—it still bothers me that we are abandoning everyone else.
New figures released a week ago place IE’s market share at 89%. That means more than 1 in 10 users are not using IE. (Read the Article) By coding specifically for Microsoft’s browser, you are abandoning 11% of your potential users. That is astonishing and disturbing.
Pay particular attention to Firefox. Its user-base is growing exponentially, and doubling every 9 months. I’m a fan of the application. It is much easier to use than IE, and much more solid. I’ve converted all of my friends and almost all of my family. I even have my in-laws using Firefox. (Get Firefox)
As the IE behemoth continues to fall, you and your organization should be paying more and more attention to standards and multiple-browser testing. Check that your HTML is compliant, and test your sites in at least IE and Firefox, if not others. Don’t force your users to use a particular browser; chances are that if they can, they will just go somewhere else for their information.
When testing .Net web application forms that use postback, it is always a good idea to leave the form and come back. Postback is when a page refreshes or submits to itself; generally, it is identified by the pre- and post-submit URL being the same page. Oftentimes, the status of the form fields is saved in the .Net ViewState after a submit, rather than retrieved from the database. You might have checked the “Display me” checkbox and clicked submit. The “cached” version from the ViewState says that this control should be checked, so when the page reloads, it is. However, the value may not have been saved to the database, so when the value is loaded from the DB, the box is not checked, but you would not have known since the ViewState version was used. When testing, to make sure you are getting the actual values and not the “cached” counterparts, make sure you leave the page and come back.
My favorite part of my job is most definitely that I get paid to break things.
As a kid I had a playroom that was filled with legos. Legoland lived on two 8′x4′ sheets of plywood, and covered them both with roads, hospitals, race tracks, and restaurants. But it would be impossible to recall all of the horrible, terrible tragedies that happened to Legoland. Every weekend there was a new disaster: a high-speed police chase that would end with the perpetrator crashing into the gas station, and the ensuing explosion would level every building within 4 base plates; a tornado that blew the truck stop clear over to the other side of the Cantina; an earthquake–perfect when Legoland exists on two sheets of plywood–would split the town in half. Every weekend Legoland would get completely destroyed, leaving just an assorted pile of legos strewn across sixty-four square feet of what once was a happy little town. I would spend the next week reconstructing each building in true make-believe fashion, construction vehicles and all, just to repeat it all again come Saturday. The perpetrator was always the same guy, too, in the red helmet and the little blue dune buggy. You’d think that the Legolanders would revoke his driving privileges after the twenty-sixth time.
“Find something you love doing, and find someone that will pay you to do it.”
I break things. And, they pay me for it.
Screen Hunter 4.0 Free - www.wisdom-soft.com
Screen Capture Tool
Cost: Free
Quite possibly the most essential task for any tester is taking a snapshot of the current screen to give their developer a visual representation of the logged error. The classic Windows hotkey, [Alt] + [PrtScn], will take a screen capture of the entire active window. However, sometimes the text on a link is spelled wrong, a button uses the wrong icon, or an error message displays in the wrong style; in these scenarios an entire screen grab is overkill and often confusing. Yet there are few things that a tester can do about that short of opening up MS Paint or Macromedia Fireworks and cropping the image, completely wasting valuable time and causing pointed comments from the Project Manager about diddling in Photoshop.
Screen Hunter 4.0 Free allows you to capture the important pixels quickly and effortlessly. Tap F6 (The default hotkey, but it can be modified), and your cursor changes to a cross-hair. Click-drag a box around whatever you want to capture, and it’s done. Instantly cropped screen capture for your bug-tracking pleasure.
The developers will be happier, too.
So your wonderful little creation is finished, and it does exactly what it was designed to do. But, have you prevented it from doing what it’s not supposed to do?
Enter the forgotten art of negative testing. This is the safeguard from user error, malicious attacks, and blatant developer oversight. Negative testing is taking your calculator application and trying to add “Hello” and “Goodnight”. Negative testing is trying to supply an invalid email address–.anything@something.q–into your mailing list form. Negative testing is trying to cause a buffer overflow on your lead-developer’s computer because you were able to sneak in a script injection.
The key word here is “try.”
If everyone has done their job, you will get nowhere. Unfortunately, rarely is this job done right. In 3 minutes I could considerably alter my best friend’s blog, and he doesn’t even know it. In 10 minutes I could corrupt the online database of a Fortune 500’s web site–both company and URL to remain anonymous. And, what scares me the most, in 20 minutes I could download the entire database of a certain benefits company, including the complete identity–SSN included–of a few thousand people.
For years, I have been paid to break things as much as build them. When that calculator finally adds 2 and 2 correctly, don’t be satisfied. Try to add “Hello” and “Goodnight”. Will it give you a neatly handled error message informing you that it couldn’t complete the procedure, or did it return a fatal exception and die a miserable death because it expected a Double and you gave it a String? Optimally, it shouldn’t allow you to even type characters into the input area unless you are working in hex; even then, only A-F.
If instructions tell you to do one thing, enter the opposite. If you see a value in the URL, change it. If a field asks for an integer between 0 and 5, try 0, 2, 5, -1, 9, 3.5, and “Q”, and see how it handles “unexpected inputs.” If a querystring is “?UserID=6”, change the 6 to a 7 to see if you get information on User 7, and try invalid items like 3.5 and “Q” to see if it fails on unexpected inputs. If a client-side cookie has a value of “User”, try changing it to “Admin” or “Administrator” and see if your access level is increased.
Find the weaknesses, find the holes, and find the bugs so that they can get fixed. You are the demolition man. You get paid to blow things up. Do it. Do it with purpose. Pretend you are a hacker trying to get into the system. Pretend you are a teenager-hacker-wannabe trying to screw with the system. Pretend you are a grandma that doesn’t know what to do with the system. Do all of the things that you aren’t supposed to do to the application and do them on purpose, because, whether by ignorance or intelligence, your users will find what was missed.
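For the “integer between 0 and 5” field, this is roughly what the code on the other side of your test should be doing. It is a minimal sketch in standard C, not anyone’s production validator; the function name parse_int_in_range is made up for illustration. It accepts text only when the whole string is a base-10 integer inside the allowed range.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Accepts text only if it is a whole number within [min, max].
 * Returns 1 and stores the number on success, 0 otherwise. */
int parse_int_in_range(const char *text, long min, long max, long *out)
{
    char *end = NULL;
    long parsed;

    assert(text != NULL && out != NULL);
    errno = 0;
    parsed = strtol(text, &end, 10);

    /* Reject empty input, non-numbers such as "Q", trailing junk such
     * as the ".5" in "3.5", and values too large for a long. */
    if (end == text || *end != '\0' || errno == ERANGE) {
        return 0;
    }
    /* Reject numbers outside the advertised range, such as -1 or 9. */
    if (parsed < min || parsed > max) {
        return 0;
    }
    *out = parsed;
    return 1;
}
```

With a range of 0 to 5, the inputs 0, 2, and 5 are accepted, while -1, 9, 3.5, and “Q” are all rejected. Note that strtol skips leading whitespace, so “ 3” would pass; whether that is acceptable is a decision for your application.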
For my needs, the biggest hole in Mercury LoadRunner is its lack of page size monitoring. LoadRunner can monitor anything else imaginable, including transaction counts, transaction times, errors, and all Windows Performance Monitor metrics. However, monitoring page size, download times, and HTTP return codes is only available through programming.
The following function will monitor the page size of all responses, logging an error if it exceeds your specified limit, as well as tracking all values on the user-defined graphs.
int si_page_size_limit(int PageLimit, char *PageName, char *PageURL, long TransactionID) {
    //
    // Page Size Limit Monitor
    // Author: Jay Harris, http://www.cptloadtest.com, (c) 2004 Jason Harris
    // License: This work is licensed under a
    //          Creative Commons Attribution 3.0 United States License.
    //          http://creativecommons.org/licenses/by/3.0/us/
    //
    // Created: 10-Aug-2004
    // Last Modified: 10-May-2005, Jay Harris
    //
    // Description:
    //   Logs the result of the page size check, pass or fail, including the
    //   applicable status, if logging is enabled.
    //   Plots page size datapoint to User Defined graph.
    //
    // Inputs:
    //   int   PageLimit      Maximum page size allowed, in bytes
    //   char* PageName       Name of the page, such as the Title. For identification in logs.
    //   char* PageURL        URL of the page. For reference in logs.
    //   long  TransactionID  Transaction ID for the current request.
    //     Note: Transaction must be explicitly opened via lr_start_transaction_instance.
    //     Note: TransactionID is returned by lr_start_transaction_instance.
    //
    int iPageSize = web_get_int_property(HTTP_INFO_DOWNLOAD_SIZE);
    char DataPointName[1024] = "Response Size [";

    // Build the graph label, for example "Response Size [Home]"
    strcat(DataPointName, PageName);
    strcat(DataPointName, "]");

    // Log whether the downloaded size is within the limit
    if (PageLimit < iPageSize) {
        lr_continue_on_error(1);
        lr_debug_message(LR_MSG_CLASS_BRIEF_LOG | LR_MSG_CLASS_EXTENDED_LOG,
            "Page Size Check FAILED - %s [%s] exceeds specified page size limit of %d (Total: %d)",
            PageName, PageURL, PageLimit, iPageSize);
        lr_continue_on_error(0);
    } else {
        lr_debug_message(LR_MSG_CLASS_BRIEF_LOG | LR_MSG_CLASS_EXTENDED_LOG,
            "Page Size Check PASSED - %s [%s] meets specified page size limit of %d (Total: %d)",
            PageName, PageURL, PageLimit, iPageSize);
    }

    // Plot the page size only while the transaction is still passing
    if (lr_get_trans_instance_status(TransactionID) == LR_PASS) {
        lr_user_data_point_instance_ex(DataPointName, iPageSize, TransactionID, DP_FLAGS_EXTENDED_LOG);
    }
    return 0;
}
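One thing a negative tester would poke at in the function above: the two strcat calls write into a fixed 1024-byte buffer, so a very long PageName could overflow it. Here is a minimal sketch of a bounded alternative in standard C. The helper name build_datapoint_name is hypothetical, and I am assuming snprintf is available in your script environment; verify that before dropping it into a LoadRunner script.

```c
#include <assert.h>
#include <stdio.h>

/* Writes "Response Size [<page_name>]" into out, truncating if needed.
 * Returns 1 if the whole name fit, 0 if it was truncated. */
int build_datapoint_name(char *out, size_t out_size, const char *page_name)
{
    int written;

    assert(out != NULL && out_size > 0 && page_name != NULL);

    /* snprintf never writes more than out_size bytes, including the
     * terminating NUL, and returns the length it wanted to write. */
    written = snprintf(out, out_size, "Response Size [%s]", page_name);
    return written >= 0 && (size_t)written < out_size;
}
```

A caller could declare the same 1024-byte DataPointName buffer, pass sizeof DataPointName as out_size, and decide what to do (log it, shorten the page name) when the function returns 0.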
All too often, we forget about usability. We get so caught up in fixing functionality and enhancing performance that we forget about the most important part: how easy is it to use this thing we have just created? Sure. That new super-gigantic Humvee will go in any direction you want, can climb a 21-inch vertical, and can pull small houses with ease, but who cares? It drools at the sight of a gas station, it is impossible to parallel park, and most importantly, how do you expect grandma to climb in and out of that thing?
What would grandma do?
Imagine that poor old lady trying to raise her little leg up onto the running board, and then pull herself up into the cab with those little arms. She’s a GRANDMA! She isn’t 20 anymore. Or 50, for that matter. We forget about grandma in our software testing, too. How would she use that application you just made? How would she react to that detailed error message your creation just spit out?
My poor father; he just got his first computer, and I’ve been trying to teach him how to do instant messaging. He knows enough about Windows XP to be familiar with the big ‘X’ in the upper-right corner. Click it and everything goes away. But Trillian is different. In the upper-left of the contact list is an upside-down triangle that minimizes the window to the system tray. Right below that is a small ‘x’. Unfortunately, that ‘x’ removes your contacts from within your Trillian window. You have to play around in your ‘View’ menu to get the list to come back. However, my father doesn’t know upside-down triangles, and he certainly doesn’t know about the ‘View’ menu, yet. He just knows the ‘x’. So that’s what he does. He clicks the little ‘x’. And every time he does, his contacts go *poof*, and he has to call me to help him get his contacts back. My grandmother would do the same thing. I don’t think Trillian did any usability testing on that feature.
What would grandma do? You know that she’s going to want to click the ‘x’, no matter what, because the ‘x’ is what she knows, just like my father. So why not make the upside-down triangle an ‘x’? It can still minimize to the system tray. The ‘x’ isn’t a cast-in-stone rule that the application must quit altogether. If you don’t believe me, try the ‘x’ on your MSN Messenger window. It minimizes to the system tray. Why did Microsoft break their own tradition? I bet what they really did was a little usability testing, and discovered that new users always want to click the ‘x’. To new users, the ‘x’ is a big “CLICK HERE” sign to make that window go away. They don’t care if it closes; new users just want it to go away. And if she were still around, Violet–my grandmother–would always be a new user when it came to computers. Just make the window go away. Like clearing the dishes after dinner: it didn’t matter if you threw the plate out, just get it off the table.
So, we have this problem. Now, what do we do about it? Ask yourself: What would grandma do? “Fatal error: Userdata insert failed. Connection to database unavailable. \\jedimaster\yoda\greenlightsaber\sqlserver2000 not found.” If she saw that, what would grandma do? Stare blankly at the computer? “Unable to save your contact information. Please try again later.” Grandma can understand that. So, think of your grandma when you test that new application. Think of your grandma when you write your error messages. Think of grandma when you draw pretty graphics or design a button icon. Your program will be much more friendly, and much easier to use. Even grandma could use it. If you need help remembering, put a picture of grandma on your desk at work, right next to your monitor. And if you don’t have a grandma, substitute that sweet old lady down the street that bought all of your raffle tickets when you were 12 and baked you cookies because you were such a good little kid.
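One way to keep both audiences happy is to split the message in two: grandma gets the plain sentence, and the server names go to a log that only the support team reads. Here is a minimal sketch of that split in C. The error codes and function names (save_error, user_message, log_detail) are invented for illustration, and the wording of every message except the “Unable to save” one is my own assumption.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical error codes for a contact-saving feature. */
enum save_error {
    SAVE_OK = 0,
    SAVE_DB_UNAVAILABLE,
    SAVE_INVALID_INPUT
};

/* Message shown to the user: plain language, no server names. */
const char *user_message(enum save_error err)
{
    switch (err) {
    case SAVE_OK:
        return "Your contact information was saved.";
    case SAVE_DB_UNAVAILABLE:
        return "Unable to save your contact information. Please try again later.";
    case SAVE_INVALID_INPUT:
        return "Please check what you typed and try again.";
    }
    /* Unknown codes still get a sentence grandma can read. */
    return "Something went wrong. Please try again later.";
}

/* Technical detail goes to the log, not to the screen. */
void log_detail(FILE *log, enum save_error err, const char *detail)
{
    assert(log != NULL && detail != NULL);
    fprintf(log, "save_error=%d detail=%s\n", (int)err, detail);
}
```

When the database connection drops, the application would call log_detail with the full “Connection to database unavailable” text and show the user only what user_message(SAVE_DB_UNAVAILABLE) returns.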
Remember grandma.
What would grandma do? She’d tell you that she’s proud of you, because that’s what grandmas do.
Imagine a world where no one had a name. Instead of John Doe, you would know “the 185lb, long black haired, 5-foot-11 guy that lives on the corner of 43rd and 5th.” What would happen if that guy got a crew cut? To his close friends, he would now be “the 185lb, short black haired, 5-foot-11 guy that lives on the corner of 43rd and 5th.” But not-so-close acquaintances, ex-girlfriends, the florist, the cabbie, and the IRS would all know him by his former name. Few would know this new guy. There would be confusion when one group tries to talk to the other about this guy. What would happen if he gained a few pounds, too, and moved over to 54th? No one would know who he was, anymore.
That’s why we have names. They are a constant in a dynamic life. They are the dependable value that gives the world security when all else changes. If John dyed his hair blue, moved to Phoenix, and got a sunburn, people would still know who he is. He’s John Doe.
ID attributes do for web objects what names do for humanity. In a world where static web sites are the ones your 12-year-old makes on Geocities, automated QA tools need a little help. Do you have a link to your favorite news site? Today, your automated tool can find that link to <a href="http://www.cnn.com">CNN.com</a>, but tomorrow’s <a href="http://www.msnbc.com">MSNBC</a> link is lost; the tool is still looking for CNN. So, give the link an ID. That’s its John Doe. The tool can find your link whether it is <a id="favNewsLink" href="http://www.cnn.com">, <a id="favNewsLink" href="http://www.msnbc.com">, or any other link that floats your boat. All it has to do is look for the name. favNewsLink. John Doe.
Be nice to your automated tools: give your web objects a name. Rename your ‘Comments’ link to ‘Feedback’? John Doe. Change your image of your dog to an image of your cat? John Doe. Multilingual site with translated text? John Doe. It will help you automate your QA process, and you will spend less time retooling your automation scripts and more time downing that Corona. Just don’t forget the lime.
Happy Cinco de Mayo.
Yes, yes. I know. I should have done this long ago. People keep harassing me to put my thoughts out on some odd web site, essentially exposing myself to the world. I’m not an exhibitionist. I don’t like doing this. But, I will try it. Maybe I will like green eggs and ham.