XS Load Testing

From OLPC
Revision as of 15:57, 10 March 2011 by Sverma (talk | contribs)

Testing the OLPC School Server

Author: Benjamin Tran

Date: 12/15/2010

Abstract: The One Laptop Per Child (OLPC) project oversees the development, construction, and deployment of an affordable educational laptop (XO) for use in developing countries. OLPC has been deployed in many countries and, as of August 2010, there are over 1.85 million XO laptops in the field. The low-cost XOs help to revolutionize the way we educate the world's children. In the school, each child's XO laptop connects to a central server that provides various networking, content, and communication services. The server also acts as the Internet gateway if an Internet connection is available. The hardware capabilities of the server depend on factors such as cost of hardware and availability of electricity. A more powerful machine means faster access to data and the capability to handle more users simultaneously, but this usually implies higher power consumption, a luxury that is unaffordable in many parts of the world. The software that runs on the server in the classroom, called the School Server (XS), is based on the Fedora distribution of the Linux operating system and provides networking infrastructure, services, as well as education and discovery tools to the XO laptops. Among the stack of customized software included in the XS, one of the most notable collaboration tools is Moodle. Moodle is a free web-based course management system. With the customized software running on limited hardware resources, how many users can reasonably connect to the server and use Moodle? What hardware combination would be ideal for use with the XS given the estimated total number of students in a school? The purpose of this project is to come up with a method to determine the approximate number of users a potential server loaded with the OLPC School Server is capable of supporting. The approach taken will be to perform functional and load testing against different hardware systems loaded with the School Server software.
The project makes use of industry-standard technologies and tools such as Selenium, Apache JMeter, and nmon for Linux. Selenium ensures Moodle functions according to specifications while JMeter simulates various loads against the server to test its capability. Nmon for Linux monitors server-side resources and presents graphs of CPU usage, free memory, and disk I/O activity, among many others. Test results from six different machines are compared. The outcome of this project is a formal process to test and measure the capability of a potential server that is being considered for deployment.


Keywords: OLPC, One Laptop Per Child, XS School Server, XO Laptop, Moodle, Selenium, Apache JMeter, Software Testing, Functional Testing, Performance Testing, Load Testing, Stress Testing


This work is the result of taking the course CSC895 Applied Research Project (Computer Science Department, San Francisco State University), with guidance and assistance from the following advisors:

Professor Sameer Verma, College of Business, San Francisco State University

Professor Dragutin Petkovic, Department of Computer Science, San Francisco State University

Professor Barry Levine, Department of Computer Science, San Francisco State University

This work is licensed under the Creative Commons Attribution-NonCommercial 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc/3.0/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.

Excerpts

Methods and Approach

The purpose of this project is to come up with a method to determine the approximate number of users a potential server loaded with the OLPC School Server is capable of supporting. The approach taken will be to perform functional and load testing against different hardware systems loaded with the School Server software. Functional testing ensures that the software works the way it was intended to. Effective test cases need to be written to cover major and common use cases of the software system. Load testing involves putting variable demands onto a system and measuring its response time. This ensures the system's hardware is capable of supporting an expected number of simultaneous users at peak times. Similarly, exemplary test cases need to be written for this type of testing. The test approach needs to be defined. The review and selection of software testing tools is also of great importance. Additionally, a comprehensive yet easy-to-use system monitoring tool must be chosen to monitor the server's resources, such as CPU and memory usage. Taking all of the required tasks into consideration, the following design objectives were finalized (with the advice of the candidate's committee members):

  1. To design and write effective test cases to ensure the program's major functionality is within the scope of specifications.
  2. To design and write exemplary test cases that can effectively resemble the activities and network traffic that occur when many users simultaneously access the system.
  3. To select a quick and efficient approach to carry out these test cases.
  4. To choose an effective functional testing framework that allows for easy creation of tests since a web application's user interface is expected to change frequently.
  5. To choose an easy-to-use yet powerful load testing tool that can simulate a heavy load with unique, dynamic data.
  6. To select one or more comprehensive system monitoring tools that can graphically show the consumption of a system's resources.
  7. To pick out and obtain several computer systems that can represent a broad spectrum of potential servers. The XS software will be loaded onto these systems and testing will be performed against them.
  8. To write a thorough test plan containing information such as the testing environment, features to be tested, features not to be tested, test approach, hardware/software setup, etc.

All of the above design objectives that served as requirements for the project were successfully satisfied.

Performance/Load/Stress Testing

Many people use the terms performance testing, load testing, and stress testing interchangeably, but they have quite different meanings [12]. Unlike functional testing, performance testing does not aim to uncover defects in the system. Instead, it is performed in the hope of eliminating bottlenecks and establishing a baseline for future regression testing. It is of crucial importance to have a set of expectations for the testing so that the results can be meaningful. For example, in the context of this project, it is necessary to know the expected number of students that will be using the School Server's services at the same time and what response time is acceptable. When the number of students reaches a level at which the response time is no longer acceptable, bottlenecks can be looked for at different levels (application, web server, database, operating system, network) and performance tuning can then be performed. Due to the many layers of complexity involved in a modern-day web application, performance tuning has to be done with care. Measurements should be collected after every modification of a variable. Changing multiple variables in a single test session can lead to complications and unexpected results. The complete process should be to run the tests, measure the performance, tune the system, and repeat the cycle until the system under test performs at an acceptable level. This process is also known as configuration testing.
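
As an illustration of the "acceptable response time" criterion described above, the following Python sketch picks the largest tested user count whose response times stay within a threshold. It is only a sketch under stated assumptions: the function names, the choice of the 90th percentile, and the sample numbers are hypothetical and are not taken from the project.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def max_supported_users(results, threshold_s, pct=90):
    """Return the largest tested user count that is still acceptable.

    results maps a user count to the response times (in seconds) measured
    at that load. User counts are scanned in increasing order and the scan
    stops at the first load whose pct-th percentile exceeds threshold_s.
    Returns 0 if even the smallest tested load is too slow.
    """
    supported = 0
    for users in sorted(results):
        if percentile(results[users], pct) > threshold_s:
            break
        supported = users
    return supported


# Hypothetical measurements: user count -> response times in seconds.
sample_results = {
    10: [0.5, 0.6, 0.7],
    20: [1.0, 1.2, 1.5],
    40: [2.0, 3.5, 6.0],
}
print(max_supported_users(sample_results, threshold_s=2.0))
```

With these made-up numbers, the load of 40 users is the first one whose 90th percentile exceeds the 2-second threshold, so that is the level at which the search for bottlenecks would begin.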

The definition of load testing is to execute the largest tasks the system under test can handle and understand the behavior of the system under an expected load [12]. Load testing is usually performed when the maximum number of supported concurrent users for a system is known in advance. Thus, the system is put under such a load to ensure it can still function properly.

The goal of stress testing is to make sure a system fails and then recovers in a graceful manner [12]. This is done by either using up all the resources of the system or taking resources away from the system. For example, stress testing a web application could involve taking the database offline or running unrelated processes that consume most, if not all, of the resources (CPU, memory, disk, etc.) on the web server.

Decisions regarding Performance/Load Testing

JMeter was selected for automating the load and performance testing in this project due to its popularity and large user base in the testing community. Similar to Selenium, JMeter offers an HTTP Proxy Recorder component that allows for quick creation of tests by recording all the requests made in a browser. Unwanted requests can be easily filtered or deleted. Shared settings such as server and port address can be extracted to a central setup location in the test script. The tool allows for testing with dynamic data; a different record from a comma-separated file can be used for each thread in a test. For example, it is possible to simulate fifty different users logging in to a system instead of reusing the same account over and over again. This matters because some systems do not allow multiple logins from the same account. Also, the built-in Linux commands and the nmon tool should be sufficient for monitoring resource usage on the School Server during test execution.

The load on a server primarily depends on the number of concurrent users and not on the total number of user accounts in the system or the number of users logged in at a given point in time. There are different definitions of concurrent users on different systems, though [39]. Generally, concurrent users are those that are causing the server to actively do something for them at the same time, such as processing a page, querying the database, or transferring a file. In other words, many similar computations are being performed simultaneously and possibly interacting with each other on the server. Many users trying to post to a forum at the same time or many users trying to watch a certain video at the same time are considered concurrent users in this case. Therefore, test cases must be designed with this fact in mind. It would take a very long time if the tester had to manually create hundreds or thousands of accounts in the system for testing. Although Moodle provides the option to perform a bulk upload of user accounts, there is still a need to easily create a CSV file with many records of account information. Therefore, a Perl script was written to perform this task. It generates a CSV file for upload to Moodle to create many user accounts at once along with various settings such as enrolling each user into an existing course. The script also generates a CSV file that can be used in a JMeter test so that each thread uses a different record in the file, thus simulating different users logging in at the same time. The script and its usage can be found in Appendix E.
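
The project's actual script is written in Perl and appears in Appendix E; it is not reproduced here. The Python sketch below only illustrates the idea of producing the two CSV files from one list of accounts. The column names (username, password, firstname, lastname, email, course1) are assumed from Moodle's bulk user upload format, and the account naming scheme, passwords, email domain, and course short name are all hypothetical.

```python
import csv
import io

# Column names assumed from Moodle's bulk user upload format.
MOODLE_FIELDS = ["username", "password", "firstname", "lastname", "email", "course1"]


def make_accounts(count, course_shortname, prefix="student"):
    """Build `count` hypothetical test accounts enrolled in one existing course."""
    accounts = []
    for i in range(1, count + 1):
        username = f"{prefix}{i:04d}"
        accounts.append({
            "username": username,
            "password": f"Xo-{i:04d}-pw",           # placeholder password scheme
            "firstname": "Test",
            "lastname": f"User{i:04d}",
            "email": f"{username}@example.invalid",  # placeholder domain
            "course1": course_shortname,
        })
    return accounts


def write_moodle_csv(accounts, out):
    """Write the upload file for Moodle: a header row, then one row per account."""
    writer = csv.DictWriter(out, fieldnames=MOODLE_FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(accounts)


def write_jmeter_csv(accounts, out):
    """Write username,password pairs with no header, one record per JMeter thread."""
    writer = csv.writer(out, lineterminator="\n")
    for account in accounts:
        writer.writerow([account["username"], account["password"]])


accounts = make_accounts(3, "SAMPLE101")  # hypothetical course short name
moodle_file, jmeter_file = io.StringIO(), io.StringIO()
write_moodle_csv(accounts, moodle_file)
write_jmeter_csv(accounts, jmeter_file)
print(moodle_file.getvalue())
print(jmeter_file.getvalue())
```

Generating both files from the same list keeps the credentials used by the load test in step with the accounts that were uploaded. In real use the `io.StringIO` buffers would be replaced by files opened with `newline=""`.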

In addition to the number of concurrent users, there are other factors that can have a significant impact on the test results. First, a course representative of the actual courses used in the field is needed. In the countries where OLPC is deployed, the course materials are different from what a typical course here in the United States would contain. Each user loads the course page in Moodle and the amount of content on the course page contributes to the overall response time. For the project, a sample course was created, which is shown below in Figure 7.a. Due to the lack of data from the field, we could only estimate what should be in the course. Also, the ramp-up time should not be overlooked. It is highly unlikely that all students try to connect to the School Server at the exact same time after they are told to do so. The amount of time over which the connection requests to the XS are spread out can greatly affect the overall response time. Based on a video of an OLPC deployment, we estimated the average ramp-up time to be 60 seconds [40]. That was the value used for the experiments in this project. In addition, to make the load more realistic, a Gaussian Random Timer was added to each JMeter test case to add a small random delay between requests. The delays therefore follow a Normal distribution. A ramp-up time of 60 seconds can be viewed as aggressive. The appropriate value really depends on the number of students in the classroom. There are lots of variations, leading to a wide range of possible values. We do not have good estimates and cannot observe the actual ramp-up time in a classroom in a remote part of Peru, for example. The effect on system performance could be significant when the load is spread out over a longer period of time. An experiment was performed to confirm this and the results can be found in Appendix I. In the remainder of this paper, we will use 60 seconds as the average ramp-up time.
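
To make the two timing parameters concrete, the Python sketch below models how a ramp-up period spreads thread start times and how a Gaussian timer perturbs the pause between requests. It assumes that JMeter starts threads at evenly spaced intervals of ramp-up time divided by thread count, and that the timer delay is a constant offset plus a standard Normal sample scaled by a deviation. The function names, the 300 ms offset, the 100 ms deviation, and the clamping of negative delays to zero are illustrative choices, not values or behavior taken from the project.

```python
import random


def thread_start_times(num_threads, ramp_up_s):
    """Start time in seconds of each simulated user, evenly spaced over the ramp-up."""
    interval = ramp_up_s / num_threads
    return [i * interval for i in range(num_threads)]


def gaussian_delay_ms(offset_ms, deviation_ms, rng):
    """Delay before the next request: offset plus a scaled standard Normal sample.

    Negative results are clamped to zero (a choice made for this sketch).
    """
    return max(0.0, offset_ms + rng.gauss(0.0, 1.0) * deviation_ms)


# Four simulated users over the 60-second ramp-up used in the experiments.
print(thread_start_times(4, 60))

# A few sample delays with a hypothetical 300 ms offset and 100 ms deviation.
rng = random.Random(2010)  # fixed seed so the sketch is repeatable
print([round(gaussian_delay_ms(300, 100, rng)) for _ in range(5)])
```

Under this model, doubling the ramp-up time for the same number of users doubles the gap between successive logins, which is why the chosen ramp-up value has such a strong effect on the measured response times.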

It would be too time-consuming to manually set up the test environment over and over again for each execution of the JMeter test on each of the machines. User accounts have to be uploaded. The sample course has to be restored to Moodle. Various IDs in Moodle need to be known and configured into JMeter. These IDs are different for each installation. The forum should be restored to its original state for each test run. These are just some of the steps needed to set up the test environment. A couple of Selenium scripts were written for the purpose of setting up and cleaning up Moodle for test execution. These scripts can be found in Appendix F.

Full document

File:Testing the OLPC School Server Benjamin Tran SFSU.pdf