A Simple Approach to Multi-Tenant Data Testing

January 7, 2017 | Author: Melvin Laguren | Category: N/A


A Simple Approach To Multi-Tenant Data Testing By Melvin Laguren

With all the different types of testing performed on a product, the tester always has the following question in the back of his or her mind:

Have I done enough testing?

This question moves to the front of the tester’s mind when the application being tested is multi-tenant in nature, and new questions begin to form behind it:

• How can I guarantee that one customer does not see another customer’s data?
• What testing method can I use to guarantee it?

Testing this by hand and taking a screenshot of each result is a long and tedious process that is prone to user error. As the number of customers and the amount of their data increase, so do the time it takes to perform this testing activity and the chance of error. Another issue with this approach is that the process has to be repeated with each new software release. In the end, it can become a full-time job for one person. Automating the manual process is the first step in the right direction. The only drawback is that the automated code has to be maintained and updated for each software upgrade. Both methods require accountability from the tester. Whereas the manual process requires the tester to take a snapshot and store it somewhere, the automated process is designed to record the testing activity. Both methods require a lot of planning to ensure that the data is captured correctly.

Where to begin?

When creating your multi-tenant data test solution, you must not only identify the problem; you must also identify all the components that lead to a solution. Let’s start by formally identifying the problem. Borrowing from the agile development method, a good way to write the problem is in the form of a user story:

AS A PROFESSIONAL TESTER, I WOULD LIKE TO BE ABLE TO TEST THAT THE DATA CREATED IN THE APPLICATION CAN ONLY BE VIEWED BY THE CUSTOMER WHO CREATED IT, SO THAT THE CUSTOMER IS CONFIDENT THAT THEIR INFORMATION CANNOT BE VIEWED BY ANOTHER.

The “reason” from the user story explains it all. The test solution being developed must make the customer confident that their information cannot be viewed by another.

Background Information

To begin identifying the solution, the following parameters will be added to the problem:

• The multi-tenant application is an Ajax-based web application.
• There is no budget to purchase tools and there are currently none at the tester’s disposal.
• The customer works in a regulated industry, therefore the customer will perform an external audit before accepting the application.

In reviewing the identified parameters, it is easy to see that the third bullet will be important in developing the solution. The results of the test must be clearly documented to convince an auditor that thorough testing has been performed on the application to ensure data security.

Identify Possible Solutions

Now examine the first two parameters. The second parameter would imply that the tester would use a manual solution.

Manual Solution 1

1. Log into the application as a customer
2. Take screen shots of all web pages that contain customer data
3. Save screen shots in a folder
4. Log out of application
5. Log into the application as a different customer and repeat steps 2 through 4
6. Compare the data saved from the first customer to the data saved from the second customer

By testing the application manually (saving screenshots of the pages that display the unique data and visually comparing them with the corresponding pages for the other customer), this solution has addressed the three parameters that were given to the problem.

Drawbacks to Manual Solution 1

The first and foremost drawback with this solution is human error. A thorough tester would test more than just two different customers. As the number of “customers” used for testing increases, so does the chance that the tester inaccurately records the data in steps 2, 3 and 4. Another drawback is that the data being used for testing may fail to uncover real world problems. This is especially true if this is a first release of the application. Setting up test data can be time consuming and a tester may miss something, especially if data is being entered for more than two customers.

So, what now?

Analyzing the solution has introduced two new problems that need to be accounted for:

• As the number of customers used for testing increases, so does the chance of incorrectly recording data
• As the number of customers increases, there is a chance that the data used will not find a problem

Manual Solution 2

Looking at the parameters again, the third parameter could help solve the new problems discovered. The solution now includes a second set of eyes.

Manual Solution 2A

1. Log into the application as a customer
2. Take screen shots of all web pages that contain customer data
3. Save screen shots in a folder
4. Log out of application
5. Log into the application as a different customer and repeat steps 2 through 4
6. Compare the data saved from the first customer to the data saved from the second customer
7. Repeat steps 1 to 6 with another tester

Manual Solution 2B

1. Tester A creates data in Test Environment A
2. Tester B creates data in Test Environment B
3. Tester A performs Manual Solution 2A (steps 1-6) on environment B
4. Tester B performs Manual Solution 2A (steps 1-6) on environment A
5. Testers A & B switch environments and repeat Manual Solution 2A (steps 1-6)

Drawbacks to Manual Solution 2

The solution reduces the potential for error because a second set of eyes is involved. What drawbacks exist with this solution? The difference between the first and second solutions is that a second tester is involved in the process. What if this resource is not available? Part of the second parameter says that “there is no budget to purchase tools.” Applying this parameter to the equation, hiring another tester is unlikely. The second solution is definitely more reassuring to an auditor than the first. Then again, even if another tester is available to assist in the process, there is no guarantee that this second solution will ensure information security, since it does not truly solve the human error issue discussed in the first solution.

How about automation?

Automation could violate the second parameter. However, since the first parameter says that the application is web based, there are a multitude of open source applications available to automate either of the two processes mentioned above. Because automating the process requires considerable investment up front, that investment should be focused on the second solution. The reason is that after the solution has been automated, the tester will have more time to create application data to be used in the test. Automating the second solution is a very good beginning. The advantage for the tester is that the initial investment made now means that, down the road, more time can be spent creating additional data, with only minor updates to the automated scripts for future versions of the application.

Other than not decreasing the odds of finding a problem, the biggest drawback an automation tool has is that it cannot do the data comparison. This responsibility still belongs to the tester(s) to verify that the data is unique between customers. This can be troublesome when the number of comparisons increases.

Problem Solved?

So far, three possible solutions have been discussed. All three follow the same basic pattern:

1. Test the application as a customer
2. Record the data being shown on each page
3. Repeat steps 1 and 2 with a different customer
4. Compare the results to make sure that they are unique

Under a tight deadline and with limited resources, the tester’s focus will be on the functionality and performance of the application, which may lead him or her to think that as long as any of the three methods is used, the odds of a data “bleed” are small. Others involved in the software development process may feel that the architecture of the software will ensure that this “bleed” will not occur, especially when combined with one of these three testing methods. This should not put a tester’s mind at ease, and it definitely will not put the auditor’s mind at ease. So, how does the tester put this issue to rest and focus on everything else that can possibly go wrong? Getting involved early in the development process helps greatly, especially if the tester has seen the common mistakes that can be made when designing the application.

Increase Reliability

To increase the reliability of the testing, there are two items that the tester needs to get involved with long before implementing a repeatable test process. Covering these two items will lead to a much more solid solution. This will not only convince you and your colleagues that the odds of data bleeding elsewhere are very slim, but will also convince outside observers that the testing is more than adequate.

Common Errors

The first item is to understand the two most common errors that lead to the problem and how to spot them.

Missing Foreign Keys

In the design of the database that will store the information, it is very rare for a direct relationship between two tables to exist without a foreign key between the parent table and the child table. The more common mistake occurs as more relations are added to a table. To illustrate the problem, suppose one of the requirements for an application is that a contractor can return to their bid to make changes prior to finalizing it.

PROJECT • id • name • description • closing_date

BID • id • project_id • amount • final

CONTRACTOR • id • contractor_name

Figure 1

Being involved at the design phase, the tester would be able to see that the tables designed in Figure 1 will not satisfy the necessary requirements. The reason is that the bid table has no link to the contractor, so if the application displays the bid for a project, the contractor will be able to see all the bids made for the project that they are interested in. Catching this early in the design ensures that the data stored in the bid table is associated with the contractor table as well as the project table (Figure 2).

PROJECT • id • name • description • closing_date

BID • id • project_id • contractor_id • amount • final

CONTRACTOR • id • contractor_name

Figure 2

The Where Clause

Forgetting the where clause, or not including everything that is needed in the where clause, is another common error that can result in data “bleeding”. Even with the improvements made to the database in Figure 2, consider the following query:

    SELECT PROJECT.NAME, BID.AMOUNT, BID.FINAL
    FROM PROJECT
    INNER JOIN BID ON PROJECT.ID = BID.PROJECT_ID
    WHERE PROJECT.ID = 1;

When executed by the web application, the contractor would be able to see all bids made on the project. Depending on further requirements, the contractor could potentially edit or delete other contractors’ bids on the project. The tester would be able to catch this error, and the following correction would be made so that the result is also restricted to the contractor making the request:

    SELECT PROJECT.NAME, BID.AMOUNT, BID.FINAL
    FROM PROJECT
    INNER JOIN BID ON PROJECT.ID = BID.PROJECT_ID
    WHERE PROJECT.ID = 1
      AND BID.CONTRACTOR_ID = <id of the logged-in contractor>;

Being involved in the design process and catching these common mistakes will decrease the odds of customers accessing data that they should not have access to.

Mind Mapping

The second important item is to know the data model. Understanding the database tables and what is accessible from the user interface will help you identify where to focus the testing. For example, if every customer has access to the same contractor, then the test does not need to check whether the contractor is unique per customer. If the contractor works for different customers, then it is very important to check whether the application allows customers to see other customers that use the same contractor. Mind mapping is a well suited technique for drawing out what important information needs to be isolated from other users of the application. There are many free tools that can help to create mind maps. Figure 3 was created using a free tool called FreeMind.

Figure 3

With the mind map created, it is easy to see the key data that should be isolated. Now the automation tool can focus on accessing the web pages that display this information.

Developing The Automated Solution

Since the application is web based, there is an abundance of open source tools that can be used. The tool of choice depends on the overall solution being developed. The ideal approach would be a simple script that will execute the test and return a report.

[Diagram: Gather Data, Compare Data, Generate Report]

Gathering Data Using JMeter

JMeter, a functional load testing tool, is the ideal tool for accomplishing the first part of the test. As a functional load testing tool, it can automatically log into the application and navigate the various pages at once. Compared to the manual method, JMeter will log in with all N users at once instead of one at a time.

Figure 4

JMeter also provides a post processor component, “Save Responses To A File”. This component reads the response from the server for each request made by JMeter and writes it to a file. Place this component after each HTTP request that is used to call the web pages that display the customer-only data. As shown in Figure 4, JMeter can add a prefix to the file being written. It is very important to give each request its own unique prefix so that the resulting files, for example User1.xml, User2.xml, etc., can be told apart. In Figure 5, each file’s prefix is unique so that later on the comparisons can be done on similar files.

Figure 5

Finally, JMeter can be executed from a script. This will make it easier to integrate into a multi-tenant testing application, especially since getting the files is only the beginning of the problem.

Shell Scripting with Cygwin

Early on, it was noted that one of the drawbacks to the automated solution is that the automation tool could not perform the comparison. It meant that the tester would be required to perform the task. A scripting language has the capability to do the same task. One option is to use shell scripting. Since JMeter can be executed on either a Windows-based computer or a *nix-based computer, shell scripting would be the ideal language to use because of Cygwin, a Linux-like environment for Windows.

What should the shell script accomplish? Since there are various tasks that the script must do, it would be best to create several scripts to accomplish the following tasks:

1. Execute JMeter
2. Gather the files created by JMeter and group them so that it will be easy to compare
3. Remove excess information from the files for easy comparison (see note 1)
4. Perform the comparison
5. Create a response

Finally, one script can be created to execute each of the scripts created above.
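A minimal sketch of such a driver is shown below, assuming each task lives in its own executable script. The function name run_steps and the script names in the usage note are hypothetical and not part of the original solution; the only behavior illustrated is running the steps in order and stopping at the first failure.

```shell
# Run each step in order and stop at the first one that fails.
# The step names passed in are placeholders chosen by the caller.
run_steps() {
    for step in "$@"; do
        echo "running: $step"
        if ! "$step"; then
            echo "step failed: $step" >&2
            return 1
        fi
    done
    echo "all steps completed"
}
```

It might be called as `run_steps ./run_jmeter.sh ./gather.sh ./cleanup.sh ./compare.sh ./report.sh`, where every script name is an assumed placeholder for one of the five tasks.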

Run JMeter

The first script is pretty straightforward. The script will navigate to JMeter’s installation directory and execute the command ‘./jmeter -n -t multi-tenant.jmx’, where -n selects non-GUI mode, -t names the test plan, and multi-tenant.jmx is the test case created with JMeter. To make the script more robust, a simple series of checks can be performed to make sure that the conditions to run JMeter are valid. The first check makes sure that the number of threads defined by multi-tenant.jmx is less than or equal to the number of lines in the configuration csv file used for logging in. The reason is that if more threads are defined, JMeter will start back at the beginning of the csv file to get the parameters for the next thread. This will definitely result in duplicate data down the road. The second check ensures that the test is set up to run correctly. If the number of threads defined by multi-tenant.jmx is 1, then it is an invalid test. If the csv configuration file defined in multi-tenant.jmx does not exist, then there is a problem. Adding these two checks will make a stronger testing tool.
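The checks described above could be sketched as follows. This is only an illustration under stated assumptions: the thread count is passed in as an argument rather than parsed out of multi-tenant.jmx, the csv file has one login per line with no header row, and the function name check_inputs is hypothetical.

```shell
# Pre-flight checks before a JMeter run.
# $1 = number of threads configured in the test plan (supplied by the caller)
# $2 = path to the login csv file (assumed: one login per line, no header)
check_inputs() {
    threads=$1
    csv_file=$2

    if [ ! -f "$csv_file" ]; then
        echo "error: csv file not found: $csv_file" >&2
        return 1
    fi
    if [ "$threads" -lt 2 ]; then
        echo "error: at least 2 threads are needed to compare customers" >&2
        return 1
    fi
    # Count non-empty lines so a trailing blank line is not taken for a login.
    logins=$(grep -c . "$csv_file" || true)
    if [ "$threads" -gt "$logins" ]; then
        echo "error: $threads threads but only $logins logins" >&2
        return 1
    fi
    echo "ok: $threads threads, $logins logins"
}
```

A wrapper script could call `check_inputs 3 users.csv` (both values are placeholders) and only start JMeter when the function returns successfully.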

Gather the Results

Once JMeter has completed its run, the next task is to gather the saved files and place them in a single folder that identifies the test run. Within that folder should be a subfolder for each of the different data sets used for comparison.

Note 1: Since JMeter writes the response from the server to a file, additional information (headers, HTML code, etc.) would exist.

Again, to make the script more robust, a simple file count in each of the subfolders will determine whether the correct number of files exists. Even if a particular request does not return any data, a file is still created. The count can therefore be compared with the number of threads defined by ‘multi-tenant.jmx’ to confirm that the run produced the correct number of responses.
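One way to sketch this file count check is shown below. The function name check_file_counts is hypothetical, and the layout (one run folder with one subfolder per data set) is the one described above; the expected count is supplied by the caller.

```shell
# Verify that every data-set subfolder of a run folder holds the expected
# number of response files (one per JMeter thread).
# $1 = run folder, $2 = expected number of files per subfolder
check_file_counts() {
    run_dir=$1
    expected=$2
    status=0
    for dir in "$run_dir"/*/; do
        [ -d "$dir" ] || continue
        count=$(find "$dir" -maxdepth 1 -type f | wc -l)
        count=$((count))   # normalise any padding added by wc
        if [ "$count" -ne "$expected" ]; then
            echo "mismatch: $dir has $count files, expected $expected"
            status=1
        fi
    done
    return $status
}
```

The function prints one line per subfolder that is off and returns a non-zero status if any mismatch was found, so the calling script can stop before the comparison starts.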

File Clean Up

Since the parameters state that the multi-tenant application is an Ajax-based web application, it is assumed here that the data being transported from the server to the client browser is in some sort of XML format. Each of the files can now be cleaned up, so that all that remains is the XML data in question. The other important task is to make sure that each of the individual elements and their attributes are separated onto their own lines. This will make the file comparison easier. By this point, there is probably no need for any additional error checking prior to executing the clean up. If anything, a post check could be performed to see whether any files that will be used in the next script are blank.
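As a rough illustration of the line-splitting step only, the sketch below breaks a response wherever one tag is immediately followed by another. The function name split_elements is hypothetical, the approach assumes adjacent tags with no whitespace between them, and it does not strip headers or separate attributes.

```shell
# Put every XML element on its own line by breaking between adjacent tags.
# Reads standard input and writes standard output.
split_elements() {
    awk '{ gsub(/></, ">\n<"); print }'
}
```

A cleanup script might use it as `split_elements < User1.xml > User1.clean`, where both file names are placeholders.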

Comparison

Now comes the time consuming and confusing part of the test: comparing the files within each group. The more files generated for each data set, the longer it takes. To be exact, for n files the number of comparisons will be:

$$\sum_{i=1}^{n} (n - i) = \frac{n(n-1)}{2}$$

When writing the script, the first thing the script needs to know is the number of data comparisons, that is, the number of folders created by the second script. Using the mind map created back in Figure 3, a check can be performed to make sure that three folders were created for the different groups of data sets being compared.

    ls -d */ > folders.txt
    if [ "$(wc -l < folders.txt)" -ne 3 ]; then …

The next part of the script will navigate to each folder and execute the comparison. Just like earlier, a file is generated that lists the files in the folder that need to be compared. Before beginning the ordeal of executing the correct number of comparisons, the actual comparison script (referred to as datacompare.sh) should be addressed. All *nix shells are provided with a diff command. The inclination would be to use it in the shell script. This would be the wrong command to use, since diff reports where two files differ as it walks through them in order; it does not report whether line 1 of file A exists anywhere else in file B. In this case, the ever popular grep command is more than enough.

    # datacompare.sh
    # For every line of file1.txt, append the matching lines of file2.txt to results.txt.
    while IFS= read -r line
    do
        [ -n "$line" ] || continue    # an empty pattern would match every line
        grep -F -- "$line" file2.txt > response.txt    # -F: match the text literally
        cat response.txt >> results.txt
    done < file1.txt

Replace file1.txt and file2.txt with $1 and $2 respectively, and the datacompare.sh script will take the two parameters needed for the comparison to take place. Managing the data files for comparison can be handled by a double while loop and a manipulation of the number that appears before the data file’s extension, which was added by JMeter.

    # comparison_manager.sh - performed on folder XXX
    # datafiles.txt holds the number of each data file, one per line.
    while read -r i; do
        while read -r j; do
            if [ "$i" -lt "$j" ]; then    # compare each pair only once
                : > results.txt           # start every comparison with empty results
                ./datacompare.sh "User$i.xml" "User$j.xml"
                sort -u results.txt > output.txt    # uniq alone only drops adjacent duplicates
                mv output.txt "${i}_to_${j}.data"
            fi
        done < datafiles.txt
    done < datafiles.txt

Above is the basic comparison algorithm. When completed, the end result will be a set of files, each holding the result of comparing one customer to another customer.
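As an alternative to the line-by-line loop in datacompare.sh, and not the approach the scripts above take, grep can read all of its patterns from the first file in a single call. The sketch below uses the standard -F (fixed strings), -x (whole-line match) and -f (patterns from a file) options; the function name shared_lines is hypothetical.

```shell
# Print the lines of the second file that also appear, as whole lines,
# in the first file.
#   -F  treat patterns as fixed strings, not regular expressions
#   -x  match whole lines only
#   -f  read the patterns from a file
shared_lines() {
    grep -Fxf "$1" "$2"
}
```

Because -x requires the whole line to match, a line such as amount=100 in one file is not reported against amount=1000 in the other, which a substring match would flag. Note that grep returns a non-zero status when nothing is shared, which in this test is the desired outcome.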

Analyze and Report

The solution is almost complete. All that is left is to comb through the various result files to determine if there is any data in question. Just like gathering the files created by JMeter, the same technique applies to the files generated by the comparison script. Once the files have been sorted, the first thing to do is to clean up the data in the files. Why? Because the data being collected is stored in XML, the XML tags will automatically show up as a match in every comparison executed. This clean up makes it extremely easy to see if there is a problem: files that do not have any matches will be blank. A simple sed command can remove the blank lines in each of the files (see note 2):

    sed -i '/^$/d' "$1"

After the removal of XML tags and blank lines from the comparison result files, the script below will create a report file that lists only the files that may need further investigation, each followed by its contents.

    # report.sh - performed on folder XXX
    # Add every comparison result file that still contains data to the report.
    ls *.data > files.rpt
    while read -r line
    do
        if [ "$(wc -l < "$line")" -gt 0 ]; then
            echo "$line" >> XXX.rpt
            cat "$line" >> XXX.rpt
        fi
    done < files.rpt
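The tag removal itself is not spelled out above, so the following is only one possible sketch. It assumes the data of interest sits in element text rather than in attributes (attribute values are removed together with their tag), and the function name strip_tags is hypothetical. It works as a filter, so it does not rely on the ‘-i’ option.

```shell
# Remove XML tags, then drop lines that are empty or whitespace-only.
# Reads standard input and writes standard output, so 'sed -i' is not needed.
strip_tags() {
    sed -e 's/<[^>]*>//g' -e '/^[[:space:]]*$/d'
}
```

A result file could be cleaned with `strip_tags < 1_to_2.data > 1_to_2.clean`; a file that held only matching tags comes out empty, which is the signal that no customer data was shared.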

Note 2: If the ‘-i’ option is not available, then redirection should be used.

Provided that adequate testing was performed on both the JMeter test case and the shell scripts throughout the development of the solution, the report file generated will give accurate information about the application being tested.

Upgrade and Expansion

Just like any automated script, this solution will continue on in maintenance mode as the application being tested grows. This will be easy to do since the script was designed to call the following functions:

1. Execute JMeter to gather the HTTP responses from the web-based application and place them in data files.
2. Execute shell script to group data files and clean up files to leave only the data identified from the mind map.
3. Execute shell script to compare the data groups and create comparison result files.
4. Execute shell script to group the comparison result files and clean up.
5. Analysis is performed on the clean comparison result files.
6. Report generated will identify any problems.

If designed properly, this script can be adapted to any other multi-tenant web based application by making minor changes to the various shell scripts and the one JMeter test case. For non web-based applications, this method can be applied by replacing JMeter with an emulator and making the necessary modifications to the remaining scripts.
