Archive for the 'Systems Administration' Category

Feb 04 2008

Performance Testing – First Example – Notes

In our first example we did several things that I perhaps took for granted. A reader asked why I chose to use CLI load generation tools instead of surfing with a browser, and whether the initial numbers generated are even reasonable. Allow me to address both points.

There is nothing stopping a tester from performing their load test using a browser or other GUI-based load generation tool. Provided the tool offers some minimal degree of control and repeatability, feel free to use any tool you like. Should you choose to use a browser, such as Firefox, be aware that plugins can alter the behavior and characteristics of the browser considerably, and all future testing would need to be done using the exact same browser, machine, and plugin combination for the numbers to be comparable.

In short, I choose a CLI tool such as wget because it behaves the same every time, and I can wrap it in a shell script both to guarantee that behavior and to add further instrumentation.

Second, the first-cut performance numbers. When trying to rapidly determine a speed of light, it’s more important to get the right order of magnitude first and then refine than to try to get the exact answer on the first pass.

For example, we know that we cannot move more data across the network than will fit through a single network interface. Therefore, if the capacity of the interface is N bytes per second, then regardless of what we do, we will never be able to push more than N bytes per second through it. Similarly, if we know that our pages are M bytes each, then we will never be able to push more than N/M pages per second through the interface.

What this does for us is provide an upper bound for our testing. If our results indicated that we were able to do N/M + 5302 transactions per second, we would know that something was wrong with our measurement or calculation, as those last 5302 operations/second would simply not fit through the pipe. However, if our results indicated 5302, and N/M is 8192, then we know that our result is at least plausible.
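The bound check described above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this note, and the values of N and M are hypothetical figures chosen so that N/M comes out to 8192.

```python
def max_pages_per_second(link_bytes_per_sec, page_bytes):
    """Upper bound (N/M) on pages per second through one interface."""
    return link_bytes_per_sec / page_bytes


def is_plausible(measured_per_sec, link_bytes_per_sec, page_bytes):
    """True if a measured rate fits under the N/M ceiling."""
    return measured_per_sec <= max_pages_per_second(link_bytes_per_sec, page_bytes)


# Hypothetical figures, chosen only so that N/M is 8192.
N = 8192 * 4096   # assumed interface capacity in bytes per second
M = 4096          # assumed page size in bytes

print(max_pages_per_second(N, M))        # the ceiling
print(is_plausible(5302, N, M))          # a result under the ceiling
print(is_plausible(8192 + 5302, N, M))   # a result that cannot fit
```

A result that fails this check is not necessarily useless, but it does mean the measurement or the assumed bound needs a second look before anything is reported.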

It is important to obtain the speed of light at the start and then refine using the bounds we know to be true. This principle holds whether testing network, disk I/O, CPU, etc. If we know the fastest possible speed of a given component and our results report numbers faster than that, then we must question the results: either find another reason that explains them, or discard them and start over.

A few thoughts to provide a bit of context as to when an assumed speed of light may be wrong. When testing disk I/O, if the file size is sufficiently small, the file will be served from the buffer cache. This is a section of machine memory dedicated to buffering disk I/O, and it provides access times and speeds comparable to system memory rather than the disk subsystem. I.e., the speed of light for the buffer cache is based on a 50ns access time and not an 80ms access time.
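To see how far apart those two ceilings are, here is a rough sketch that turns an access time into a maximum number of strictly sequential operations per second. It assumes one operation per access with no overlap or queueing, which is a simplification, and the function name is invented for this note.

```python
def max_ops_per_second(access_time_seconds):
    """Ceiling on back-to-back operations, one access at a time."""
    return 1.0 / access_time_seconds


buffer_cache_ceiling = max_ops_per_second(50e-9)   # 50 ns per access
disk_ceiling = max_ops_per_second(80e-3)           # 80 ms per access

# How many times faster the cached case can appear to be.
ratio = buffer_cache_ceiling / disk_ceiling

print(buffer_cache_ceiling, disk_ceiling, ratio)
```

The two ceilings differ by roughly six orders of magnitude, which is why a disk test that quietly runs out of the buffer cache can report numbers the disk itself could never deliver.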

I’d like to thank the reader that brought the questions to me and invite others to comment or email me with any questions they may have. I can be reached at kmajer at karlmajer dot com.

>>> Karl


Feb 01 2008

Performance Testing – First Example

This article is the second in a series on performance and performance testing. The first article discussed the principles of the scientific method; this one details a basic, slightly simplified example of a performance task.

The developers you support as a systems administrator are considering moving the static content for the company website www.somefamoussite.com to its own webserver to free up resources for the dynamic page generation and generally speed things up. They are curious how fast an Apache server serving only static content will be able to serve requests.

In order to accomplish this task, we must examine the steps of the scientific method and see how each plays its part in providing us a sound and thorough roadmap to provide the developers with their answer.

  1. First, define the question: How fast can an Apache webserver go serving static content?
  2. Next gather information and resources:
  3. It would benefit us to know the number of static items and the size of them to determine how best to answer the question, so we ask development that question and are handed a tarball containing 256 images with an average size of 6k each.

    Since we set up the hardware that is being used, we know that the servers have gigabit ethernet cards in them. We also expand the tarball into the tree on our webserver and use find to create a logfile full of relative links to use to fetch the static content.

    find . -type f | awk '{print "/path/to/images/" $1}' > logfile.out
    cat logfile.out logfile.out logfile.out logfile.out > logfile.1k
    cat logfile.1k logfile.1k logfile.1k logfile.1k > logfile.4k
    cat logfile.4k logfile.4k logfile.4k logfile.4k > logfile.16k
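For readers who would rather build the request list in a script, the same idea can be sketched in Python. The three cat commands multiply the list by 4 x 4 x 4 = 64, so 256 paths become 16,384 lines. The function name and the sample file names below are hypothetical; only the /path/to/images/ prefix comes from the command above.

```python
def build_request_list(paths, prefix="/path/to/images/", copies=64):
    """Prefix each relative path, then repeat the whole list `copies` times."""
    links = []
    for path in paths:
        if path.startswith("./"):
            path = path[2:]          # drop the "./" that find prepends
        links.append(prefix + path)
    return links * copies


# Hypothetical file names standing in for the 256 images in the tarball.
sample_paths = ["./img%03d.png" % i for i in range(256)]
request_list = build_request_list(sample_paths)

print(len(request_list))
print(request_list[0])
```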

  4. Form the hypothesis:
  5. Based on an average size of 6KBytes, and knowing that the hardware has gigabit ethernet, we can compute that in lab conditions with perfect network, the machines can do no more than approximately 21,845 requests/second.

    (1 Gbit/sec == 128 MBytes/sec; 128 MBytes/sec / 6 KBytes avg size == 21,845.3 objects/sec)

    Our hypothesis, therefore, based solely on the network capacity of the hardware, is “A server can do no more than 21,845 operations/second.”
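The arithmetic behind the hypothesis can be reproduced with a short Python sketch. The 21,845 figure treats 1 Gbit as 2^30 bits (128 MBytes/sec); if the link is instead counted as 10^9 bits per second, the ceiling comes out somewhat lower. Both ignore protocol overhead, so they are bounds rather than predictions. The function name is made up for this note.

```python
def max_requests_per_second(link_bits_per_sec, object_bytes):
    """Network-only ceiling: bytes per second divided by bytes per object."""
    return link_bits_per_sec / 8 / object_bytes


AVG_OBJECT_BYTES = 6 * 1024   # 6 KBytes average image size

# 1 Gbit taken as 2**30 bits, i.e. 128 MBytes/sec as in the text.
binary_bound = max_requests_per_second(2**30, AVG_OBJECT_BYTES)

# 1 Gbit taken as 10**9 bits, i.e. 125,000,000 bytes/sec.
decimal_bound = max_requests_per_second(10**9, AVG_OBJECT_BYTES)

print(round(binary_bound, 1), round(decimal_bound, 1))
```

Either way the order of magnitude is the same, which is all the hypothesis needs at this stage.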

  6. Perform the experiment(s) and collect data:
  7. You’ll want to run top on the webserver to get a rough idea of how much free cpu there is.

    Copy the logfile.16k to each of the load generators. In this example there will be 4 load generators.

    Use wget on one of the load generators to mark the logfile with something we can search for later.

    wget http://www.myserver.com/images/TESTSTARTEDHERE

    Use wget on each load generator to fetch the images

    wget -i logfile.16k -o wget.out

    Fire off all four wgets at the same time and let them run.

    Watch top running on the webserver and keep rough track of our idle cpu. Time passes and the load generators will eventually run out of logs, probably within a few minutes.

    Use wget to mark the logfile again.

    wget http://www.myserver.com/images/TESTENDEDHERE

  8. Analyze the data:
  9. With each load generator having a 16k-line logfile, the four generators could issue 64k requests in total. They will not all arrive at once, however, as there is a certain amount of overhead between requests that must be accounted for. A reasonable assumption would be that each generator could generate close to 8k requests/second, so the four together would still total more than the ~22k requests/second maximum of the network.

    Using the logfile from the apache we can determine how much traffic we received.

    First use sed or perl or your language of choice to extract the portion of the logfile between the two markers.

    sed -n '/TESTSTARTEDHERE/,/TESTENDEDHERE/p' access.log > test.log

    Next determine the starting time by looking at the timestamp on the line immediately after the TESTSTARTEDHERE line in test.log.

    head -2 test.log

    Do the same for the ending time by looking at the second to last line of test.log.

    tail -2 test.log

    Determine the total test time by subtracting the two times from each other.
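Subtracting timestamps by hand is error prone, so here is a minimal Python sketch of the same step. It assumes the timestamps are in the usual CLF form, such as [01/Feb/2008:10:00:05 -0800], and that the month abbreviations are in English. The function names and the two sample log lines are invented for illustration.

```python
import re
from datetime import datetime

CLF_TIME = re.compile(r"\[([^\]]+)\]")


def clf_timestamp(line):
    """Parse the bracketed CLF timestamp out of one access log line."""
    match = CLF_TIME.search(line)
    if match is None:
        raise ValueError("no timestamp found in line")
    return datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")


def elapsed_seconds(first_line, last_line):
    """Total test time between the first and last request lines."""
    delta = clf_timestamp(last_line) - clf_timestamp(first_line)
    return delta.total_seconds()


# Made-up sample lines standing in for the second and second-to-last lines.
first = '10.0.0.1 - - [01/Feb/2008:10:00:05 -0800] "GET /images/a.png HTTP/1.1" 200 6144'
last = '10.0.0.2 - - [01/Feb/2008:10:03:35 -0800] "GET /images/b.png HTTP/1.1" 200 6144'

print(elapsed_seconds(first, last))
```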

    Determine the total number of lines in the logfile, then subtract 2 for the header and footer marker lines.

    wc -l test.log

    This will likely be 64k unless you interrupted the test prematurely.

    Now extract the successful image retrievals using grep.

    grep "HTTP/.... 200" test.log > 200.log

    Count the number of successful requests

    wc -l 200.log

    Now compute the average request speed

    Good Requests / Total Time in minutes == Average Good Requests/Minute
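The counting and averaging steps can also be sketched in Python. This assumes CLF lines in which the status code directly follows the quoted request, which is the same thing the grep pattern relies on. The names and sample lines are hypothetical.

```python
import re

# Matches e.g.: HTTP/1.1" 200  (protocol version, closing quote, status 200)
STATUS_200 = re.compile(r'HTTP/[0-9.]+" 200 ')


def count_successes(lines):
    """Number of log lines that record an HTTP 200 response."""
    return sum(1 for line in lines if STATUS_200.search(line))


def average_per_minute(good_requests, total_seconds):
    """Good Requests / Total Time in minutes."""
    return good_requests / (total_seconds / 60.0)


# Made-up sample: two successful fetches and one 404.
sample_lines = [
    '10.0.0.1 - - [01/Feb/2008:10:00:05 -0800] "GET /images/a.png HTTP/1.1" 200 6144',
    '10.0.0.1 - - [01/Feb/2008:10:00:06 -0800] "GET /images/nope.png HTTP/1.1" 404 209',
    '10.0.0.2 - - [01/Feb/2008:10:00:07 -0800] "GET /images/b.png HTTP/1.0" 200 6144',
]

good = count_successes(sample_lines)
print(good, average_per_minute(good, 120))
```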

    Next, determine the number of requests per unit. Typically a 5 minute unit works best but for simplicity we will use a 1 minute unit.
    Extract column 4 (the timestamp in a CLF apache log) from the 200.log file

    awk '{print $4}' 200.log > timestamp.out

    Truncate the seconds from the logfile using either cut or awk

    cut -d: -f 1,2,3 timestamp.out > trunctime.out

    Uniq and count the truncated timestamps to get the number occurring during each minute:

    uniq -c trunctime.out > counttime.out

    Now reverse the two columns to make the graphic easier and add a closing bracket.

    awk '{print $2 "] " $1}' counttime.out > graphdata.out
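The awk, cut, and uniq pipeline above can be approximated in a single pass with Python. This is a sketch under the assumption that column 4 of each line is the CLF timestamp; the function name and sample lines are made up. Note that the counts are per minute, so divide by 60 if you want requests per second.

```python
from collections import Counter


def requests_per_minute(lines):
    """Count log lines per minute, keyed like '[01/Feb/2008:10:00]'."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue                               # skip malformed lines
        stamp = fields[3]                          # "[01/Feb/2008:10:00:05"
        minute = stamp.rsplit(":", 1)[0] + "]"     # drop seconds, close bracket
        counts[minute] += 1
    return counts


# Made-up sample: two requests in one minute, one in the next.
sample_log = [
    '10.0.0.1 - - [01/Feb/2008:10:00:05 -0800] "GET /images/a.png HTTP/1.1" 200 6144',
    '10.0.0.2 - - [01/Feb/2008:10:00:59 -0800] "GET /images/b.png HTTP/1.1" 200 6144',
    '10.0.0.1 - - [01/Feb/2008:10:01:00 -0800] "GET /images/c.png HTTP/1.1" 200 6144',
]

per_minute = requests_per_minute(sample_log)
for minute, count in per_minute.items():
    print(minute, count)
```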

    It is now possible to look through the logfile at this time and see a rough estimate of how the webserver did, however it is more valuable if we can graph this data and examine it visually.

    Import the data into Excel, Pages, or gnuplot on the command line and plot it as a line graph.

    [Figure: Load Graph]

    The graph above was manufactured to illustrate the desired point. Note that the middle of the graph plateaus around 8500 requests/second (the per-minute counts divided by 60). The flatness of the graph suggests that we’ve hit a bottleneck of some kind. Since we know that the network is capable of nearly 22k requests/second, and the network on each load client is presumably similar, we can infer that we’re either hitting the limitations of the disk subsystem or that we’ve run the webserver out of CPU.

    If, during the test, you saw the idle CPU approach single digits, then we have reasonable confidence that we pushed the machine to its CPU limit; otherwise we may be pushing the limits of I/O.

  10. Interpret the data and draw conclusions:
  11. Now, by using the graph, we can decide on a limit for the webserver. The flat line starts at approximately 8500 req/sec. A reasonable buffer is 10% of that number, and so we would say that the max capacity of the webserver is 7650 req/sec. If you wish to be more conservative, and you should, you could leave yourself 25% capacity and call it 6375.
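The buffer arithmetic is simple enough to script, which helps when the plateau figure changes between test runs. A minimal sketch, with an invented function name and the reserve expressed as a whole percentage:

```python
def capacity_with_headroom(plateau_req_per_sec, reserve_percent):
    """Usable capacity after holding back `reserve_percent` of the plateau."""
    return plateau_req_per_sec * (100 - reserve_percent) / 100


PLATEAU = 8500   # req/sec where the graph flattens out

ten_percent_reserve = capacity_with_headroom(PLATEAU, 10)
quarter_reserve = capacity_with_headroom(PLATEAU, 25)

print(ten_percent_reserve, quarter_reserve)
```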

    As a general rule you want to leave sufficient capacity on the machines to handle any excess load from failed hosts. If you have 2 machines, each machine should be able to handle all of the load. If you have 3 machines, then each should be able to handle 50% of the total load, which means running each at no more than about 66% of its capacity in normal operation, and so forth.
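The failover rule can be written down as a small calculation. The sketch below assumes load is spread evenly and that the surviving machines must absorb everything when one host fails; the function names are made up for this note. Each survivor must be able to carry 1/(n-1) of the total load, which is the same as saying each machine should run at no more than (n-1)/n of its capacity in normal operation.

```python
def required_share_per_machine(machines, failures_tolerated=1):
    """Fraction of the total load each surviving machine must be able to carry."""
    survivors = machines - failures_tolerated
    if survivors < 1:
        raise ValueError("no machines left to carry the load")
    return 1.0 / survivors


def max_normal_utilization(machines, failures_tolerated=1):
    """Highest safe per-machine utilization while all machines are healthy."""
    survivors = machines - failures_tolerated
    if survivors < 1:
        raise ValueError("no machines left to carry the load")
    return survivors / machines


for n in (2, 3, 4):
    print(n, required_share_per_machine(n), round(max_normal_utilization(n), 2))
```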

  12. Publish Findings:
  13. With this simplistic testing done, you could approach development and tell them that, based on the preliminary testing you’ve done, you have some confidence that the webserver can do 6375 ops/sec. Additionally, you should then provide the developers with your step-by-step guide as to how to get their own numbers, both to allow them to validate your work and to enable them to do this level of testing on their own in the future.

This concludes our first example. There are several more to come.

If you like what you’ve read, please share the blog with others. If you have any questions or comments, feel free to send me email at kmajer at karlmajer dot com.

>>> Karl


Jan 30 2008

Performance Testing for the Uninitiated

What is performance?

As a working systems administrator or programmer, there will be times when you are called upon to determine the performance of a system, an application, or even an algorithm. While performance testing is frequently time consuming, finding the answer to the question “how fast is it” is neither arcane nor rocket science. All that is required is patience, a sound and repeatable methodology, and use of the principles of the scientific method.

Information detailing the scientific method is available on multiple websites. For simplicity Wikipedia’s entries will be used. Wikipedia tells us that the Classical Model of the scientific method is:

  • Characterization – Observations, definitions, measurements
  • Hypothesis – explanations of characterization, often theoretical/hypothetical
  • Prediction – reasoning from the hypothesis
  • Experiment – tests all of the above

The Hypothetico-Deductive model, perhaps more in line with the working professional, is:

  • Use your experience.
  • Conjecture an explanation.
  • Deduce a prediction from that explanation.
  • Test

And an even more modern interpretation of the classical model is:

  • Define the question
  • Gather information and resources
  • Form hypothesis
  • Perform experiment and collect data
  • Analyze data
  • Interpret data and draw conclusions that form starting point for next hypothesis
  • Publish results
  • Retest, often done by third parties

The modern definition extends the classical model with post-experiment work to analyze, publish, and retest the work done.

Using this information as background, I will be writing a series of articles on understanding performance. These articles will include a few in depth examples to help performance testers understand not only the methods and techniques used, but also the depth and thoroughness required. In the examples, both the hypothetico-deductive model and the modern interpretation of the scientific method will be used.

If you have any examples or questions you’d like covered, feel free to drop me email at kmajer at thisdomainname.com.
>>> Karl
