Workloads and planning for growth
IBM has identified a collection of patterns for e-business, along with workload patterns that are classified by the number of page hits and the complexity of the transaction. For example, online shopping tends to involve a moderate number of page hits and a low level of complexity. Online trading, on the other hand, requires a more complex transaction. B2B Web services represent a complex transaction that might not have a lot of page hits, while publish and subscribe represents a simple transaction with a lot of page hits. See Resources for a link to the Patterns for e-business site.
Once a Web site is designed, the most commonly asked question is "Will my site scale?" Legregni says that the obvious response is to ask what the target is. Frequently, the customer wants the site to scale without having considered what the goal is. You need to consider the long-term trends and needs for your site as well as its "burstiness": there might be particular events, times of the day, or periods during the year when your site sees an uncharacteristic burst of activity. For example, a health care provider could see a burst in traffic at enrollment time. The site for the 1998 Nagano Olympic Games saw an extremely large traffic burst as local interest in a Japanese ski jumper drove traffic to the site. One strategy for responding to increased page demand is to strip the page down to its core information and remove the extras that might stress capacity, so that more people can still reach what they came for.
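The page-stripping strategy described above can be sketched as a simple decision based on current load. This is only an illustration: the function name, the "full" and "core" labels, and the 80% threshold are hypothetical, not taken from any IBM product.

```python
def select_page_variant(current_rps: float, capacity_rps: float,
                        degrade_at: float = 0.8) -> str:
    """Pick which version of a page to serve under the current load.

    current_rps  -- requests per second the site is seeing right now
    capacity_rps -- requests per second the site is sized to handle
    degrade_at   -- hypothetical utilization at which extras are dropped
    """
    if capacity_rps <= 0:
        raise ValueError("capacity_rps must be positive")
    utilization = current_rps / capacity_rps
    # Past the threshold, serve only the core information.
    return "core" if utilization >= degrade_at else "full"
```

In practice the threshold would come from load testing rather than a fixed guess, and the "core" variant would be whatever subset of the page your users most need during a burst.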
One of the best ways to plan for growth is to simulate the load before you go live. Tools such as IBM's High Volume Web Sites (HVWS) Simulator for WebSphere let you vary parameters and experiment with different types of loads, so you can see how your site would respond. This requires, of course, that you analyze the various ways your site could be stressed. How many users do you anticipate? How dynamic is your site? How many page views do you expect each day?
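Before running a simulator, it can help to turn the answers to those questions into a rough target. The following is a back-of-the-envelope sketch, not a model of the HVWS Simulator; the parameter names and the sample figures are assumptions chosen for illustration.

```python
def peak_requests_per_second(daily_page_views: float,
                             peak_hour_share: float,
                             requests_per_page: float,
                             burst_factor: float = 1.0) -> float:
    """Estimate the peak request rate a site should be tested against.

    daily_page_views  -- expected page views per day
    peak_hour_share   -- fraction of the day's views in the busiest hour
    requests_per_page -- server requests generated by one page view
    burst_factor      -- multiplier for unusual events (1.0 = none)
    """
    peak_hour_views = daily_page_views * peak_hour_share
    return peak_hour_views * requests_per_page * burst_factor / 3600


if __name__ == "__main__":
    # Hypothetical site: 1 million views a day, 10% of them in the peak
    # hour, 5 requests per page, sized for a burst of 3x the normal peak.
    print(round(peak_requests_per_second(1_000_000, 0.10, 5, 3.0)))
```

With these made-up inputs the estimate comes to roughly 417 requests per second, which gives the simulation a concrete target instead of an open-ended "will it scale?"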
Designing for scalability
As you design your site, you don't need to reinvent the wheel. Keep your focus on the business problem and let it dictate your technology needs. You should frequently reassess whether your technology goals are in line with the business value that you want to deliver. Even if you are starting small, design with scalability in mind.
You can be prepared to scale if you begin with a workload balancer on the first server you deploy. As demand for your site increases, workload balancing will become more and more important. Even from the beginning, you should build the mix of different types of machines into your site design. Consider whether your needs include a Web server, a Web application server, directory and security services, a database server, and various repositories for data and existing applications. Are you setting up a two-tiered system or a three-tiered one? You could make your decision based on performance, or you might want a third tier so you can install a second firewall. There might also be design constraints that lead you to choose three tiers, to better separate your presentation and business logic and so support reuse and consistency.
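To see why starting with a workload balancer makes later growth easier, consider this minimal round-robin sketch. It assumes the simplest possible policy; production balancers also weigh server health and load, and the class and server names here are hypothetical.

```python
class RoundRobinBalancer:
    """Hand out servers in rotation; new servers can be added at any time."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._next = 0  # count of requests dispatched so far

    def add_server(self, name):
        # Scaling out is a one-line change because callers already
        # go through the balancer instead of a fixed address.
        self._servers.append(name)

    def pick(self):
        server = self._servers[self._next % len(self._servers)]
        self._next += 1
        return server
```

With a single server, every call to pick() returns that server. Once a second server is added, requests alternate between the two without any change to the code that calls the balancer.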
When the time comes to scale, there are many options available. You can always throw more hardware at the problem, which might mean buying faster machines, buying specialized machines, or replicating the machines you already have installed. If you decide to increase the number of machines, you need to do so proportionally across tiers. You can't just increase the number of Web servers, as this might drive too much traffic to the back end of your system. If you scale up the number of Web servers, you need to scale up the supporting machines by a corresponding amount. Your basic architecture shouldn't change. You might find, however, that once you are in production, there are bottlenecks that could be addressed by moving a specific function to its own process or machine. For example, you could move authentication to a different process. An online trading site might notice that much of its traffic comes from users who just want stock quotes and aren't purchasing, so it might move the quote service to a dedicated machine to better serve both the customers making a transaction and those just seeking a quote.
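The idea of scaling every tier by a corresponding amount can be expressed as a short calculation. This sketch assumes each tier grows by the same factor and rounds up to whole machines; the tier names and counts are hypothetical, and real tiers (a database in particular) rarely scale this uniformly.

```python
import math


def scale_tiers(baseline: dict, growth_factor: float) -> dict:
    """Return machine counts per tier after growing every tier together.

    baseline      -- current machine count per tier, e.g. {"web": 2}
    growth_factor -- how much more traffic the site must handle
    """
    if growth_factor <= 0:
        raise ValueError("growth_factor must be positive")
    # Round up: a fraction of a machine still means one more machine.
    return {tier: math.ceil(count * growth_factor)
            for tier, count in baseline.items()}


if __name__ == "__main__":
    current = {"web": 2, "app": 2, "db": 1}
    print(scale_tiers(current, 2.5))
```

The point of keeping the ratio fixed is the one made above: adding only Web servers shifts the bottleneck to the back end rather than removing it.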
You can also address the load with software-centric solutions. As always, assess your needs before deciding on the proper line of attack: understand your hardware configuration and workload, then determine which components are most affected. As with any performance-improving advice, the bottlenecks are often not located where you would guess, so make sure you measure what's really going on. Then you can decide whether caching, request batching, connection management, or the aggregation of user data might improve performance.
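Measuring before optimizing can start with something as small as the following sketch, which accumulates the time spent in each component and reports the largest total. The class and component names are hypothetical, and a real site would more likely rely on its application server's monitoring tools.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class ComponentTimer:
    """Accumulate elapsed time per named component."""

    def __init__(self):
        self.totals = defaultdict(float)  # component name -> seconds

    def record(self, component, seconds):
        self.totals[component] += seconds

    @contextmanager
    def measure(self, component):
        # Time the enclosed block, even if it raises an exception.
        start = time.perf_counter()
        try:
            yield
        finally:
            self.record(component, time.perf_counter() - start)

    def slowest(self):
        if not self.totals:
            raise ValueError("nothing has been measured yet")
        return max(self.totals, key=self.totals.get)
```

Wrapping each stage of request handling in a block such as `with timer.measure("database"):` and then checking `timer.slowest()` after a test run shows where the time actually goes, which is the evidence you need before choosing among caching, batching, or connection management.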