
Load tests

remedrano edited this page Jul 25, 2020 · 11 revisions

DESCRIPTION

Software load tests characterize the behavior of the XRepo platform under a specific level of user concurrency over a set time range. The "Upload Sample Files" and "Search for Samples by Filters" functionalities are critical to the flow of the XRepo platform, so the load test scenarios focus on these two functionalities. The load tests are automated with a test automation tool, using test scenarios in which the number of concurrent users and the ramp-up times can be chosen. The execution results are extracted with the same tool, the corresponding analysis is carried out, and the capacity of the XRepo platform is determined with precision, specifically for the "Upload Sample Files" and "Search for Samples by Filters" functionalities.

OBJECTIVES

  1. To validate the correct operation of the functionalities to upload samples and to search for samples by filters.
  2. To execute automated software load tests on the selected functionalities.
  3. To obtain the maximum number of concurrent users who can use the sample file upload and sample search functionalities in a given time range.
  4. To analyze the results obtained from the automated technical validations.

METHODOLOGY

1. Selection of the test tool

The JMeter 5.3 test tool was selected, keeping in mind the following criteria:

a. Flexibility for setting up test scenarios. 
b. Documentation available by developer communities. 
c. Use in a large number of software testing projects.

2. Design of test scenarios

The test scenarios focus on two critical functionalities of the XRepo platform:

a.	Upload of samples files (Upload samples) – 500 concurrent users. 
b.	Samples Search (Search samples) with all fields filtered – 200 concurrent users. 
c.	Samples Search (Search samples) with some fields filtered – 300 concurrent users. 

Based on the two critical features selected, the number of expected concurrent users was established with the support of the platform's end users, according to the frequency of use of each functionality.

These scenarios were included in the JMeter tool, generating a JMeter plan, which can be downloaded at [Jmeter Plan].

3. Test environment preparation

The test environment contains the servers where the application is hosted and the remote equipment used to launch the tests in an automated way. The hardware specifications of the Azure cloud servers that hosted the XRepo platform are detailed below:

a. Servers: the servers run in the Azure cloud.

| No. | Server | Region | OS | Service Type | Machine Type | CPU | Memory | Disk |
|---|---|---|---|---|---|---|---|---|
| 1 | xrhdfsserver1.westus2.cloudapp.azure.com | West US 2 | Ubuntu 18.04 LTS | Virtual Machine | Standard D3 v2 | 4 vCPUs | 14 GiB | 30 GiB Standard SSD (OS disk), 128 GiB Standard SSD (data disks) |
| 2 | xrmongoserver1.westus2.cloudapp.azure.com | West US 2 | Ubuntu 18.04 LTS | Virtual Machine | Standard D2 v2 | 2 vCPUs | 7 GiB | 30 GiB Standard SSD (OS disk), 128 GiB Standard SSD (data disks) |
| 3 | xrepo.westus2.cloudapp.azure.com | West US 2 | Ubuntu 18.04 LTS | Virtual Machine | Standard D2 v2 | 2 vCPUs | 7 GiB | 30 GiB Standard SSD (OS disk) |
b. Desktop PC: the machine used to run the tests.

| No. | Description | OS | CPU | Memory | Disk | Network Speed |
|---|---|---|---|---|---|---|
| 1 | Desktop computer (PC) | Ubuntu 16.04 | Intel Core i7-9750H @ 2.60 GHz | 16 GiB | 500 GB SSD | 25 Mb broadband |

4. Running load tests

Execution consists of configuring the test environment and running the designed test scenarios. The selected tool returns a series of results in .csv files, which are consolidated into graphs in order to obtain the results and carry out the corresponding analysis.
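As a sketch of how the .csv results can be consolidated, the snippet below summarizes a JMeter result file per functionality (error %, success %, and response-time statistics, as reported in the results section). It assumes the default JMeter CSV result columns (`timeStamp`, `elapsed`, `label`, `success`); adjust the field names if the plan saves results with different settings.

```python
import csv
from collections import defaultdict

def summarize(path):
    """Return {label: {errors_pct, success_pct, avg_ms, max_ms, min_ms}} for a JMeter CSV."""
    rows = defaultdict(list)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            ok = row["success"].strip().lower() == "true"
            rows[row["label"]].append((ok, int(row["elapsed"])))
    report = {}
    for label, samples in rows.items():
        total = len(samples)
        ok_times = [t for ok, t in samples if ok]  # elapsed ms of successful requests
        errors = total - len(ok_times)
        report[label] = {
            "errors_pct": round(100.0 * errors / total, 2),
            "success_pct": round(100.0 * len(ok_times) / total, 2),
            "avg_ms": sum(ok_times) / len(ok_times) if ok_times else None,
            "max_ms": max(ok_times) if ok_times else None,
            "min_ms": min(ok_times) if ok_times else None,
        }
    return report
```

Running this over, e.g., UploadFieSamples.csv would reproduce the per-functionality figures shown in the Requests Results table.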

5. Analysis of results

Based on the results produced in the .csv files and the graphs obtained during execution, it is possible to establish the system capacity associated with the load received from users, the concurrency in a given time, and the behavior of the servers under concurrent users.

The file used to run the tests can be downloaded here.

TEST SCENARIOS

Design of the test scenarios is detailed below:

| No. | Scenario | Concurrent Users | Ramp-up Time |
|---|---|---|---|
| 1 | Upload File Samples | 500 | 1 second |
| 2 | Search for dates with all fields | 200 | 1 second |
| 3 | Search for dates with some fields | 300 | 1 second |
| | **Total** | 1000 simultaneous users | 3 seconds |

RESULTS

The test execution lasted 19 minutes and 17 seconds, with the test scenarios executed in parallel in order to simulate the behavior of concurrent users on the platform. From running the load tests with the selected automation tool, the following results were obtained at the functionality and server level.

1. Requests Results

| No. | Functionality | % Errors | % Success | Response Time of Successes (ms) |
|---|---|---|---|---|
| 1 | Upload File Samples | 25.60% | 74.40% | Average: 1007829.055, Max: 1126284, Min: 484918 |
| 2 | Search for dates with all fields | 71.00% | 29.00% | Average: 1124198.915, Max: 1139987, Min: 1104390 |
| 3 | Search for dates with some fields | 70.00% | 30.00% | Average: 1125359.076, Max: 1139630, Min: 1104325 |

The inflection point corresponds to the number of requests that the server processes without returning an error in the response. The maximum capacity of the server is obtained by adding up the numbers of simultaneous requests processed without generating an error. The parallel execution of the different functionalities causes higher latency, resulting in higher response times for each request.

The load received by the servers represents the functionalities used by users in scenarios close to reality; as more requests are sent, response times increase over time. Adding the inflection points for each functionality, the current tests yield a maximum capacity of 522 simultaneous requests.
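One way to estimate the per-functionality inflection point described above is to order the samples chronologically and count how many requests succeed before the first error appears. The sketch below follows that interpretation (the exact formula behind the reported 522 is not spelled out in the text) and assumes the default JMeter CSV columns.

```python
import csv

def inflection_point(path, label):
    """Count consecutive successful requests for one functionality before the first error."""
    samples = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            if row["label"] == label:
                samples.append((int(row["timeStamp"]),
                                row["success"].strip().lower() == "true"))
    samples.sort()  # chronological order by timestamp
    count = 0
    for _, ok in samples:
        if not ok:
            break  # first error marks the inflection point
        count += 1
    return count
```

Summing this count across the three functionalities would give the overall capacity figure for the run.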

a.	Upload File Samples

The sample file upload functionality consumes a large amount of disk space due to the simultaneous transfer of the files sent in the test. HTTP requests from client to server require longer execution times due to the parallel upload of files, consuming more network resources and connection time on the server. When the calculated maximum capacity of the server is exceeded, the application server begins to return responses with errors.

b.	Search for dates with all fields

The search functionality queries user-uploaded sample files, which involves big data functions and the execution of search algorithms: the greater the number of files, the greater the server response time. The behavior of response times during the execution of the scenario is as expected, since the more requests are received, the longer the server takes to transmit a response.

The search algorithms for big data functions run faster with large files, approximately 300 MB minimum, given that HDFS and MapReduce are designed to handle and execute algorithms over large volumes of information.
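The small-file overhead can be illustrated with simple arithmetic: HDFS stores each file in fixed-size blocks (128 MB by default in Hadoop 2.x/3.x, an assumption here since the cluster configuration is not stated), and MapReduce launches roughly one map task per block, so many small files mean many short-lived tasks whose startup cost dominates.

```python
import math

BLOCK_MB = 128  # assumed HDFS block size (Hadoop 2.x/3.x default)

def map_tasks(file_sizes_mb):
    """Approximate number of map tasks: at least one per file, one per HDFS block."""
    return sum(max(1, math.ceil(size / BLOCK_MB)) for size in file_sizes_mb)

# One 300 MB file occupies 3 blocks -> about 3 map tasks.
# The same 300 MB split into 100 files of 3 MB -> 100 tasks, mostly startup overhead.
```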

c.	Search for dates with some fields

In the case of filters on the search functionality, the behavior obtained is as expected: as the number of requests sent to the server increases, response time increases too. It is recommended to use as many filters as possible, which improves the retrieval of data from each file stored by the user.

2. Server Results [MONGO DB, HDFS, XREPO]

The following graphs are related to the behavior of the different servers:

| No. | Server | CPU % Used | Memory % Used | Disk I/O % Used |
|---|---|---|---|---|
| 1 | xrhdfsserver1.westus2.cloudapp.azure.com [HDFS] | Average: 10.104%, Max: 81.772%, Min: 0% | Average: 16.964%, Max: 17.623%, Min: 14.78% | Average: 0.263%, Max: 82.4%, Min: 0% |
| 2 | xrmongoserver1.westus2.cloudapp.azure.com [MONGO] | Average: 5.321%, Max: 31.658%, Min: 0% | Average: 8.161%, Max: 8.574%, Min: 7.423% | Average: 0.072%, Max: 14.44%, Min: 0% |
| 3 | xrepo.westus2.cloudapp.azure.com:8080 [XREPO] | Average: 60.508%, Max: 100%, Min: 0% | Average: 31.340%, Max: 35.574%, Min: 16.561% | Average: 0.150%, Max: 86.97%, Min: 0% |
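Assuming each Performance_Server*.csv holds one utilization sample per row (the column name used below, e.g. "CPU", is an assumption about the export format), the average/max/min figures in the table can be recomputed with a short script:

```python
import csv

def utilization_stats(path, column):
    """Average, max, and min of one numeric column in a server performance CSV."""
    values = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            values.append(float(row[column]))
    return {
        "avg": round(sum(values) / len(values), 3),
        "max": max(values),
        "min": min(values),
    }
```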

The HDFS server shows processing spikes that exceed 80% CPU, after which processing normalizes over time. RAM usage does not exceed 20%, and disk reads/writes (I/O) show various peaks as the data of each request is processed with the search algorithms within XRepo, exceeding 80% at the highest peak. The behavior in the "Illustration 5 - HDFS Server Performance" graph is due to the number of requests received in parallel during the test. As each new request arrives, a Map/Reduce execution starts over all the files uploaded by the user; the algorithm runs and generates a final results file, whose URL is sent to the user. The greater the number of files uploaded by the user, the longer the search algorithm runs. The high peaks correspond to the parallel execution of the algorithm, which creates concurrent threads within HDFS until all the algorithms finish executing. The subsequent normalization is due to the absence of pending requests, that is, no more execution requests are received on the server.

The MONGO DB server shows processing spikes that do not exceed 40% CPU usage; most of the time utilization is below 10%. RAM consumption does not exceed 10% of the total, and disk reads/writes (I/O) do not exceed 2% usage. Since the server never exceeds 40% CPU during the test, it can be established that it could process more load and a greater number of requests than those received during the testing period. In addition, the MONGO DB server does not represent a point of failure in the initially proposed big data architecture.

The XREPO server shows processing spikes that reach 100% and are sustained over long periods of time. A peak begins above 90% and later remains between 80% and 90%, causing large resource consumption on the server. RAM consumption rises during test execution but does not exceed 40%. Disk reads/writes (I/O) register 86.97% in the first milliseconds, then drop to 0% for the rest of the test.

The XRepo application server represents a blocking element in the proposed architecture: after 8 to 9 minutes of processing, the server reaches its inflection point and then blocks, with CPU consumption between 80% and 90%, which prevents processing of subsequent requests and generates errors in the responses. The inflection point is calculated at 522 requests in parallel.

The initial value of 86.97% for disk I/O reads and writes represents an outlier during test execution, associated with file accessibility by the operating system, specifically the VFS (Virtual File System) layer within the NFS (Network File System) communication architecture. Initially, the configuration calls are made by the VFS layer registered on disk, obtaining the NFS pointers and NFS mount point locations in order to perform the corresponding binding of the files. This happens at the beginning of the operation, so subsequent disk I/O reads are 0.

RESULTS FILES

The following files contain the data collected after the tests were run:

| No. | File | Description |
|---|---|---|
| 1 | UploadFieSamples.csv | Upload File Samples |
| 2 | SearchAllFields.csv | Search All Fields |
| 3 | SearchSomeFields.csv | Search Some Fields |
| 4 | Performance_ServerHDFS.csv | Performance for HDFS Server |
| 5 | Performance_ServerMongo.csv | Performance for MONGO DB Server |
| 6 | Performance_ServerXREPO.csv | Performance for Xrepo Server |
| 7 | LoginSearchAllFields.csv | Login for Search All Fields |
| 8 | LoginSearchSomeFields.csv | Login for Search Some Fields |
| 9 | LoginUploadFile.csv | Login for Upload File |

The reports generated by JMeter are below:

| No. | Functionality |
|---|---|
| 1 | Upload File Samples |
| 2 | Search All Fields |
| 3 | Search Some Fields |

CONCLUSIONS

By executing the test scenarios on the big data architecture, it is possible to establish the capacity of the platform associated with the architecture, set at 522 parallel requests as the general inflection point.

Increasing response times for each request during load test execution is expected behavior. The application server, the critical and blocking point of the architecture, does not have a load balancer and additionally runs on a single cloud machine without advanced optimization techniques. The database server can receive more requests, and the HDFS server supports the calculated inflection points and, depending on the behavior, can process more requests.

To increase the responsiveness of the server, it is necessary to include different optimization techniques and direct modifications to the architecture, which are detailed below:

  1. To adjust the architecture to work with containers, which allows processing more requests independently without blocking the platform at the machine level, increasing the capacity of the XRepo platform.
  2. To use a load balancer to manage the architecture, including service scaling policies that distribute the request load across the different servers or containers without affecting processing performance. The policies establish when to increase or decrease the number of containers that process requests.
  3. To increase the upload size of the sample files that the user uploads; the behavior of the HDFS server is optimal with large files, with files larger than 300 MB recommended for XRepo. The platform supports an initial maximum size of 1.5 GB, which can be increased depending on the user's needs.

The test execution activities were designed to simulate user behavior in real scenarios, with parallel requests and simultaneous functionalities, which yields results that approximate real user behavior, specifically the platform capacities associated with the currently implemented architecture.

For the 'Sample Files Upload' functionality, XRepo currently has a maximum capacity of 1.5 GB in .CSV format. This capacity can be increased on the XRepo platform to receive a larger number of GBs. However, the larger the sample file size, the longer the user must wait to upload files through the web user interface; additionally, if the user closes the browser while uploading a file, the transfer link with the server is lost. To solve this problem, it is recommended as future work to create a native client that the user can download, install, and run to synchronize files independently of the browser, or to use streaming techniques and batch jobs that connect devices directly to the XRepo platform. This can apply both to the current .CSV format and to the video and image formats intended for future use, facilitating real-time extraction of data from the samples of the different devices.
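The resumable-upload idea behind the recommended native client can be sketched as follows: split the file into chunks and send each one independently, so an interrupted transfer can resume at the last acknowledged chunk rather than restarting. The endpoint URL and request shape in the comment are purely hypothetical; XRepo exposes no such API today.

```python
CHUNK_SIZE = 8 * 1024 * 1024  # 8 MiB per chunk (illustrative choice)

def iter_chunks(path, chunk_size=CHUNK_SIZE):
    """Yield (index, bytes) chunks of a file, for independent upload and retry."""
    with open(path, "rb") as f:
        index = 0
        while True:
            data = f.read(chunk_size)
            if not data:
                break
            yield index, data
            index += 1

# Hypothetical client loop (not a real XRepo endpoint):
#   for i, chunk in iter_chunks("samples.csv"):
#       requests.put(f"{base_url}/upload/{file_id}/chunks/{i}", data=chunk)
# A client that records the last acknowledged index can resume after a
# closed browser or a dropped connection.
```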
