June 2005
From Start to Finish
Spotting hurdles in shoppers' way, performance testing holds providers accountable
By Mary Wagner
In terms of site performance, it looked like FootSmart.com was firing on all cylinders. According to monitoring from the backbone (the Internet's distribution network), availability at the e-commerce site of Benchmark Brands Inc. was 100%, with full pages loading in under 1.5 seconds.
But Gavin Galtere, Benchmark's applications development and network operations director, had concerns about what was happening at the user end, given a recent site redesign that had loaded pages with a lot more data.
A switch to web performance monitoring services from Gomez Inc. that included so-called last-mile monitoring, a service Benchmark hadn't been getting from its previous provider, confirmed Galtere's suspicions within a day of implementation: Initial home page availability for visitors on dial-up connections was only 82%.
With as many as 65% of FootSmart's customers reaching the site via dial-up, that represented a significant potential loss of revenue. Last-mile monitoring also showed that visitors accessing other entry points to the site via search were successful on initial log-on only 92% of the time. The findings were enough to loosen funds from senior management to address the problem, ultimately through the services of an outside content delivery network. "I was able to take those reports, put them in front of our CEO and VP of operations, and say, 'This is what I have suspected and here's the proof,'" says Galtere.
FootSmart.com's experience is just one example of the evolution in how online retailers are evaluating their customers' experience by measuring the performance of the technology that supports it. Monitoring of availability and response time of sites has moved past measuring at the backbone to measuring from the user interface. And beyond the lead metrics of page download speed and availability, applications monitoring is tracking the performance of discrete components that affect those metrics.
Within the enterprise, other types of testing, software and services pinpoint and cut resolution time on content delivery errors that show up at the user end. Some online operators are even making the data on how their technology is supporting the user experience do double duty: They're using the benchmarked data to draft service-level agreements that set contractual standards on site performance. Those standards apply both between a company's business and IT departments and between the company and its outside technology providers.
Revealing the opportunities
FootSmart.com turned the minus revealed by its last-mile monitoring into a plus, using additional performance monitoring. "We realized we had a huge opportunity in terms of low-hanging fruit to be able to lift our top line," says Galtere.
It turned out that much of the page weight creating the problem for dial-up users was in new code written into the pages for tracking and reporting purposes. But eliminating too much of that code in an effort to lighten pages would mean losing that functionality. To find a solution, Galtere used Gomez to run a head-to-head test of the performance of web site pages on which code had been compressed against pages served up by an outside content delivery network.
The outside network, Akamai Technologies Inc., ultimately improved performance more than in-house efforts at code compression, he found. The data gave Galtere what he needed to secure funding approval to bring in Akamai, which FootSmart already was using for limited content caching, on an expanded basis.
Within three months of the full-time implementation of last-mile testing from Gomez and content delivery services from Akamai, FootSmart.com's sales increased 10%. The shopping cart abandonment rate dropped by 7%, home page dial-up downloads went from about 50 seconds to 12 seconds and home page availability on dial-up climbed from 82% to 99.6%. Galtere notes that due to other initiatives such as online marketing campaigns launched during that time, the new technology implementation can't receive credit for the entire 10% lift, but he estimates it's responsible for about 3% of the increase.
One factor affecting web site performance and the speed with which customers can call up pages and complete online transactions is the increased complexity of sites. A transaction such as an online purchase may be composed of multiple applications and, unlike in e-retail's earlier days, many of those applications may be integrated with or imported from outside providers, points out Pete Cruz, director of product management, enterprise solutions group, at performance applications testing services provider Empirix Inc.
Variety of testing
A purchase may start with a customer log-in, for example, which requires a user authentication application, Cruz notes. Another application serves up the product page the user requests. Adding an item to the shopping cart, going through checkout, requesting credit validation and executing shipping require other applications, some of which may be pulled in via web service calls to outside technology providers. The resulting process represents, Cruz says, "many points of potential failure."
That's one reason e-retailers such as the Vermont Teddy Bear Co. monitor and test site performance in a variety of ways. For its four e-commerce web sites, Vermont Teddy Bear depends on vendor AlertSite to monitor uptime and load time, and also to monitor performance of a couple of key web site applications. Alerts on any problems go to IT staff's e-mail in-boxes or pagers. "They're able to tell you whether there is content on the page served and whether users are getting an error message," says webmaster Tom Funk.
Such basic monitoring may cost Vermont Teddy Bear in the range of $100 per month, but simulated, scripted load-testing-to-order taps different vendors and may carry a ticket in the range of a few thousand dollars. So Vermont Teddy Bear saves it for special circumstances, such as testing the performance of a site in development, or testing application performance on a new e-commerce platform before going live. In April, for example, the company used the services of hardware and software testing provider KeyLabs to preview how its recently acquired Calyx and Corolla floral web site would perform when it moved the site from its existing e-commerce platform to a new one, shortly before its busiest holiday, Mother's Day.
The objective was to approximate on the site ahead of time what Funk calls "real battlefield conditions." Vermont Teddy Bear worked with KeyLabs to simulate user traffic at different levels and at different times of day to determine error rate and cycle times. The idea was that it was better to uncover problems via simulated orders than when serving actual customers, when site problems could mean lost revenue, and that's exactly what happened. "We built the new application; we thought we had it right, but it had issues when we load-tested it," says Funk. "We wanted to make sure the site could handle the traffic. But our first test showed we still needed to fix things before we were ready to go live."
Issues affecting site performance can range from load balancing to server or bandwidth capacity, to problems with the applications themselves, such as script errors, Funk notes. That means identifying a problem through testing and monitoring is only the first step toward resolution; isolating its cause among a multitude of possibilities is something else. Funk's team uses data from error messages, visual inspection of pages flagged as problems, and sifting through other variables to pinpoint the cause of problems once problems are found. "Our sites are online stores that use common elements, common templates. It's not that hard for us to diagnose issues; usually the problems float up to the surface somehow," he says.
Ignoring problems
But that can be time consuming and, at a site as complex as consumer and auto electronics retailer Crutchfield.com, downright unwieldy. About 85% of Crutchfield's applications are internally developed. The order entry system alone, developed over the last 10 years, now has almost 1 million lines of code. The site runs upwards of 350 applications in multiple languages and several thousand ASP pages. That raises the possibility of interaction problems on several fronts: between its internal applications, such as switching from one language to another, for example; between third party applications and other third party applications, or between outside applications and its own code.
That makes sifting through possible sources of trouble to find the actual culprit impractical. It's so time-consuming, in fact, that CIO Steve Weiskircher says that after weighing the programmer time required to re-create the sequence of events leading to a minor problem on its internally-facing system used by call center agents against an agent productivity loss of perhaps 10 seconds, Crutchfield chose to track down and resolve only about 15% of such problems.
But recently, Crutchfield has been using a tool that shrinks that cycle time. The AppSight Black Box from Identify Software represents a category of product that tackles the issue with software which captures actual event sequences, allowing IT staff to simply replay a sequence rather than try to re-create it.
The software captures a real-time log of user actions, system events, performance metrics and other site operations data. Among other uses, Crutchfield has tapped the software to find the source of a problem within an internally written application that was designed to shop for rates at carriers for packages awaiting shipment. Crutchfield wrote the web service that made those calls to carriers, noticed when the application appeared to be slowing down, and saw that, unresolved, it stood to slow the turnaround on customer orders. The Black Box identified the time associated with each element of the application and revealed the location of the bottleneck: internal servers weren't up to supporting the new rate-shopping application.
Crutchfield resolved the issue by boosting server capacity, an upgrade that already was in the works. "The real benefit of the troubleshooting software in this case was in the time it shaved off finding the problem," Weiskircher says: minutes, compared with hours it may have taken for IT staff to retrieve the same information by looking for it in the source code, testing and re-testing the application.
Multiple perspectives
Other technology providers such as Xaffire Inc. also provide software that monitors and replays web site user sessions to troubleshoot problems. Tower Records gets similar functionality from TeaLeaf Technology Inc.'s RealiTea. To cover all bases in ensuring that technology supporting site operations is up to standard, Tower uses monitoring and applications testing services from Keynote Systems Inc. and Digital River Inc.'s Fireclick Inc. as well. The three vendors provide feedback on site performance and site problems at different levels of detail.
TeaLeaf captures data by individual user sessions to give Tower a window on customer-specific interactions with the site, data it uses primarily for tech support and customer service. Recently, for example, a customer contacted Tower.com to complain he couldn't log onto his account and that repeated e-mails to Tower requesting help with his account password had gone unanswered. Replaying session information showed that while the customer kept hitting the "view hint" button, the site couldn't display a hint because the customer had never supplied one when setting up his account.
Session information also showed that Tower had sent a number of e-mails in response to the customer's questions. It turned out the customer never received them because they'd been mistakenly blocked as spam by his AOL account. "In that case our software was working okay, but it would have been much harder to figure out what was happening had we not been able to just look at the session," says vice president of e-commerce Kevin Ertell.
With testing and monitoring services and products getting better at identifying performance and application problems, it's no surprise that some companies are starting to quantify the accumulated data and use it as the basis of service-level agreements that guarantee technology performance. And as systems have become more reliable, some of the focus of web site monitoring and testing has shifted away from an earlier emphasis on basic site availability to site responsiveness and all the elements that affect it.
Measuring consistency
As a result, forward-thinking retailers are now looking to monitor and test site operations with a new metric in mind: consistency of response time. "In addition to expectations of functionality and ease of use, customers now carry expectations related to site performance," says Matthew Poepsel, director of business development at Gomez. Meeting them is critical for retailers, he contends: any gap between visitors' expectations of and their experience with site performance is an indicator of customers' frustration, their propensity to click off and go elsewhere, and a wasted opportunity for that online retailer.
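One way to make "consistency of response time" concrete is to look at how widely load times vary, not just at their average. The sketch below is illustrative only: the function name and the sample timings are hypothetical, and the coefficient of variation it computes is an assumed stand-in for consistency, not the method used by Gomez or any other vendor mentioned here.

```python
import statistics
from typing import Dict, List


def consistency_report(load_times: List[float]) -> Dict[str, float]:
    """Summarize how steady a set of page load times (in seconds) is.

    Assumption: consistency is approximated by the coefficient of
    variation (standard deviation divided by mean). A lower value
    means visitors see more predictable response times.
    """
    if not load_times:
        raise ValueError("need at least one load time")
    mean = statistics.mean(load_times)
    if mean <= 0:
        raise ValueError("load times must be positive")
    spread = statistics.pstdev(load_times)  # population standard deviation
    return {"mean": mean, "stdev": spread, "cv": spread / mean}


# Hypothetical samples: both sites average the same load time,
# but the second swings between fast and slow.
steady_site = [2.0, 2.0, 2.0, 2.0]
erratic_site = [1.0, 3.0, 1.0, 3.0]

steady = consistency_report(steady_site)
erratic = consistency_report(erratic_site)
```

Both hypothetical sites would look identical on an average-response-time report; only the spread shows that visitors to the second one cannot count on the same experience twice.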
Gomez and Keynote Systems already publish indices that benchmark retail site performance on availability and response times; in June, both planned to expand their indices to cover the performance of site operations in greater depth. Gomez's expanded benchmark system, the Gomez Consistency Benchmark, will measure Internet business processes and the infrastructure impacts on end-user performance. Keynote's expansion will build on recently launched service-level management services with an annual study that benchmarks 20 retailers against their peers along 10 or more performance factors compiled from as many as 40 underlying metrics.
Darmesh Thakker, senior product manager for service level agreements at Keynote, says SLAs in performance contracts between a retailer's business team and internal or outsourced operations units have two aspects. "There's applications performance-is my site going to be 99.9% available and have a response time of no more than 8 seconds?" he says. "Then there's the operational responsiveness component, which says, if site operations do exceed these thresholds, how quickly am I going to get that issue resolved?" The expanded index will provide data that will serve to expand SLAs beyond availability and responsiveness to cover components of those broader metrics, he adds.
With feedback on the technology supporting site operations available in increasing detail, expanded offerings from existing vendors and more vendors getting into the game, are IT departments charged with keeping the site and its applications humming risking information overload? It's a concern that's bubbled up in response to rapid developments in other areas of e-commerce technology. "More data than we know what to do with" is one complaint sometimes voiced by site operators trying to figure out how to apply the seemingly infinite information available from marketing analytics, for example.
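The application-performance half of such an agreement can be checked mechanically against monitoring data. The following is a minimal sketch, assuming each monitoring probe records whether the page loaded and how long it took; the Probe record and check_sla function are hypothetical names, and the default thresholds simply reuse the 99.9% and 8-second figures from Thakker's example rather than any actual contract.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Probe:
    """One monitoring check (hypothetical record layout)."""
    ok: bool          # did the page load successfully?
    seconds: float    # measured response time for this check


def check_sla(probes: List[Probe],
              min_availability: float = 0.999,
              max_seconds: float = 8.0) -> Dict[str, object]:
    """Compare probe results with availability and response-time targets."""
    if not probes:
        raise ValueError("need at least one probe")
    successes = [p for p in probes if p.ok]
    availability = len(successes) / len(probes)
    # Count successful loads that still took longer than the agreed limit.
    slow_loads = sum(1 for p in successes if p.seconds > max_seconds)
    return {
        "availability": availability,
        "availability_met": availability >= min_availability,
        "slow_loads": slow_loads,
        "response_time_met": slow_loads == 0,
    }


# Hypothetical month of checks: 990 fast loads, 5 slow loads, 5 failures.
sample = ([Probe(True, 2.5)] * 990
          + [Probe(True, 11.0)] * 5
          + [Probe(False, 0.0)] * 5)
result = check_sla(sample)
```

With this made-up sample, both targets would be missed: 995 of 1,000 checks succeed, which falls short of 99.9%, and five successful loads run past 8 seconds. The operational-responsiveness half of an SLA, meaning how quickly a breach gets resolved, would need incident timestamps that this sketch does not model.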
That seems less likely to be the case when measuring the more finite dimensions of response time, availability and the elements that affect them. In fact, as more monitoring and site operations software and services emerge, some retailers are finding that having overlapping tools doesn't muddy the water, but instead clarifies findings and results in more dependable data. "It helps to have a couple of different reporting packages," says Ertell. "When they correlate, you can believe the numbers."