Business Research Methods
LESSON - 1 INTRODUCTION TO BUSINESS RESEARCH METHODS
Research - Meaning - Types - Nature and scope of research - Problem formulation and statement of research objectives - Value and cost of information - Decision theory - Organizational structure of research - Research process - Research designs - Exploratory, descriptive and experimental research.
OBJECTIVE
To equip the students with a basic understanding of research methodology and provide an insight into the application of modern analytical tools and techniques for the purpose of management decision making.
STRUCTURE
· Value of Business Research
· Scope of Research
· Types of Research
· Structure of Research
LEARNING OBJECTIVES
· To understand the importance of business research as a management decision-making tool
· To define business research
· To understand the difference between basic and applied research
· To understand when business research is needed and when it should be conducted
· To identify various topics for business research
INTRODUCTION
The task of business research is to generate accurate information for use in decision making. The emphasis of business research is on shifting decision makers from intuitive information gathering, based on personal judgment, to systematic and objective investigation.
DEFINITION
Business research is defined as the systematic and objective process of gathering, recording and analyzing data to aid in making business decisions. Literally, research means to "search again". It connotes patient study and scientific investigation in which the researcher takes a more careful look in order to discover all that can be known about the subject of study.
If the data collected and analyzed are to be accurate, the research needs to be objective. Thus, the role of the researcher is to be detached and impersonal rather than to engage in a biased attempt; without objectivity the research is useless. The definition is restricted to decisions in the domain of business alone. Research generates and provides the necessary qualitative and quantitative information on which decisions are based. This information reduces the uncertainty surrounding decisions and thereby reduces the risk of making wrong ones. However, research should be an "aid" to managerial judgment, not a substitute for it. There is more to management than research; applying research remains a managerial art.
The study of research methods provides the knowledge and skills needed to solve problems and meet the challenges of a fast-paced decision-making environment. Two important factors stimulate an interest in a scientific approach to decision making:
1. The increased need for more and better information, and
2. The availability of technical tools to meet this need.
During the last decade, we have witnessed dramatic changes in the business environment. These changes have created new knowledge needs for the manager to consider when evaluating any decision. The trend toward complexity has increased the risk associated with business decisions, making it more important to have a sound information base. The following are a few reasons that make researchers look for newer and better information on which to base decisions:
· There are more variables to consider in every decision
· More knowledge exists in every field of management
· The quality of the theories and models that explain tactical and strategic results is improving
· Information is better organized and arranged
· Advances in computing have made it possible to create better databases
· The power and ease of use of today's computers give researchers the capability to analyze data to solve managerial problems
The development of the scientific method in business research lags behind similar developments in physical science research, which is more rigorous and far more advanced. Business research is of recent origin, and moreover its findings cannot be patented in the way that physical science findings can. Business research normally deals with topics such as human attitudes, behavior and performance. Even with these hindrances, business research is making strides in the scientific arena. Hence, managers who are not prepared for this scientific application in business research will be at a severe disadvantage.
Value of Business Research
The prime value of business research is that it reduces uncertainty by providing information that improves the decision-making process. The decision-making process associated with the development and implementation of a strategy involves the following:
1. Identifying problems or opportunities
Before any strategy can be developed, an organization must determine where it wants to go and how it will get there. Business research can help managers plan strategies by determining the nature of situations or by identifying the existence of problems or opportunities present in the organization. Business research may be used as a scanning activity to provide information about what is happening in the business or its environment. Once problems and opportunities are defined and indicated, managers can evaluate alternatives easily and clearly enough to make a decision.
2. Diagnosing and assessing problems and opportunities
An important aspect of business research is the provision of diagnostic information that clarifies the situation. If there is a problem, managers need to specify what happened and why. If an opportunity exists, they need to explore, clarify and refine the nature of the opportunity. This helps in developing alternative courses of action that are practical.
3. Selecting and implementing a course of action
After the alternative courses of action have been clearly identified, business research is conducted to obtain scientific information that will aid in evaluating the alternatives and selecting the best course of action.
Need for Research
When a manager is faced with two or more possible courses of action, the researcher needs to decide carefully whether or not to conduct research. The following are the determinants:
1. Time Constraints
In most business environments decisions must be made immediately, but conducting research systematically takes time, and there may not be enough time to rely on research. As a consequence, decisions are sometimes made without adequate information and a thorough understanding of the situation.
2. Availability of Data
Often managers already possess enough information to decide without research. When they lack adequate information, research must be considered. Managers should ask whether the research will be able to generate the information needed to answer the basic question about which the decision is to be taken.
3. Nature of Information
The value of research will depend upon the nature of the decision to be made. A routine decision does not require substantial information or warrant research. However, for important and strategic decisions, research is more likely to be needed.
4. Benefits vs. Costs
The decision to conduct research boils down to these important questions:
1. Will the rate of return be worth the investment?
2. Will the information improve the quality of the decision?
3. Is the research expenditure the best use of available funds?
Thus, the cost of information should not exceed its benefits, i.e. the value of the information.
What is Good Research?
Good research generates dependable data that can be used reliably for making managerial decisions. The following are the hallmarks of good research:
· Purpose clearly defined, i.e., the problem is clearly understood
· The process described in sufficient detail
· The design carefully planned to yield results
· High ethical standards carefully considered and maintained
· Limitations properly revealed
· Adequate analysis of the data using appropriate tools
· Presentation of data that is comprehensive, easily understood and unambiguous
· Conclusions based on the data obtained and justified by them
Scope of Research
The scope of research in management is limited here to business. A researcher conducting research within an organization may be referred to as a "marketing researcher" or an "organizational researcher", and although such business research is specialized, the term encompasses all the functional areas: production, finance, marketing, HR and so on. The different functional areas may investigate different phenomena, but they are comparable to one another because they use similar research methods. Many kinds of topics are examined in the business environment, such as forecasting, environmental trends, capital formation, portfolio analysis, cost analysis, risk analysis, TQM, job satisfaction, organizational effectiveness, climate and culture, market potential, segmentation, sales analysis, distribution channels, information needs analysis, social values, and so on.
Types of Research
Research is undertaken to develop and evaluate concepts and theories. In the broad sense, research can be classified as follows.
1) Basic Research or Pure Research
It does not directly involve the solution to a particular problem. Although basic research generally cannot be implemented directly, it is conducted to verify the acceptability of a given theory or to learn more about a certain concept.
2) Applied Research
It is conducted when a decision must be made about a specific real-life problem. It encompasses those studies undertaken to answer questions about specific problems or to make decisions about a particular course of action. However, the procedures and techniques used by basic and applied researchers do not differ substantially. Both employ the scientific method to answer questions. Broadly, the scientific method refers to techniques and procedures that help the researcher to know and understand business phenomena. The scientific method requires systematic analysis and logical interpretation of empirical evidence (facts from observation or experimentation) to confirm or refute prior conceptions. In basic research, the prior conceptions, assumptions or hypotheses are first tested and inferences and conclusions are then drawn. In applied research, the use of the scientific method assures objectivity in gathering facts and taking decisions.
At the outset, it may be noted that there are several ways of studying and tackling a problem; there is no single perfect design. Research designs have been classified by authors in different ways. Different types of research design have emerged on account of the different perspectives from which a problem or opportunity is viewed. Broadly, however, research designs are classified into three categories - exploratory, descriptive, and causal research. Research can also be classified on the basis of either technique or function. Experiments, surveys and observation are a few common techniques, and a technique may be qualitative or quantitative. Based on the nature of the problem or the purpose of the study, the three designs above are used invariably in management parlance.
3) Exploratory Research
The focus is mainly on the discovery of ideas. Exploratory research is generally based on secondary data that are already available. This type of study is conducted to clarify ambiguous problems. Such studies provide information to use in analyzing situations, and help to crystallize a problem and identify information needs for further research. The purpose of exploration is usually to develop hypotheses or questions for further research. Exploration may be accomplished with different techniques. Both qualitative and quantitative techniques are applicable, although exploratory studies rely more heavily on qualitative techniques such as experience surveys and focus groups.
4) Descriptive Research
The major purpose is to describe the characteristics of a population or phenomenon. Descriptive research seeks to determine the answers to who, what, when, where and how questions. Unlike exploratory studies, these studies are based on some previous understanding of the nature of the research problem. Descriptive studies can be divided into two broad categories - cross-sectional and longitudinal. The former type is more frequently used. A cross-sectional study is concerned with a sample of elements from a given population; it is carried out once and represents one point in time. Longitudinal studies are based on panel data or panel methods. A panel is a sample of representatives who are interviewed and then re-interviewed from time to time; that is, longitudinal studies are repeated over an extended period.
5) Causal Research
The main goal of causal research is to identify cause-and-effect relationships among variables. It attempts to establish that when we do one thing, another thing will follow. Normally, exploratory and descriptive studies precede causal research. In addition, based on the breadth and depth of study, another method frequently used in management is called case study research. This places more emphasis on a full contextual analysis of fewer events or conditions and their interrelations. An emphasis on detail provides valuable insight for problem solving, evaluation and strategy, and the detail is gathered from multiple sources of information.
Value of Information and Cost
Over the past decade many cultural, technological and competitive factors have created a variety of new challenges, problems and opportunities for today's decision makers in business. First, the rapid advances in interactive marketing communication technologies have increased the need for database management skills. Moreover, advancements associated with the so-called information superhighways have created a greater emphasis on secondary data collection, analysis and interpretation. Second, there is a growing movement emphasizing quality improvement, which has placed more importance on cross-sectional information than ever before. Third is the expansion of global markets, which introduces a new set of multicultural problems and questions. These three factors influence the research process and the steps taken in seeking new information from a management perspective.
There may be situations where management is sufficiently clear that no additional information is likely to change its decision. In such cases, it is obvious that the value of information is negligible. In contrast, there are situations where decision makers look for information that is not easily available. Unless the information collected leads to a change or modification of a decision, the information has no value. Generally, information is most valuable in two cases:
i) where the decision maker is unsure of what is to be done, and ii) where extreme profits or losses are involved. A pertinent question is: how much information should be collected in a given situation? Since collecting information involves a cost, it is necessary to ensure that the benefit from the information is more than the cost involved in its collection.
Decision Theory
With reference to the above discussion, an attempt is needed to see how information can be evaluated so that a limit can be set on its cost. The concept of probability is the basis of decision making under conditions of uncertainty. There are three basic sources for assigning probabilities:
1) Logic / deduction: For example, when a coin is tossed, the probability of getting a head or a tail is 0.5.
2) Past experience / empirical evidence: The experience gained in resolving similar problems in the past. On the basis of its past experience, an organization may be in a better position to estimate the probabilities attached to new decisions.
3) Subjective estimate: The most frequently used method, based on the knowledge and information available to the researcher when the probability estimates are made.
The above discussion was confined to single-stage problems, wherein the researcher is required to select the best course of action on the basis of the information available at a point in time. However, there are problems with multiple stages, wherein a sequence of decisions is involved. Each decision leads to a chance event which in turn influences the next decision. In such cases a decision tree analysis, i.e. a graphical device depicting the sequence of action-event combinations, is useful in making a choice between alternatives. Where a simple decision tree is not sufficient, a more sophisticated technique known as Bayesian analysis can be used. Here the probabilities are revised as new information becomes available, using prior, posterior and pre-posterior analysis.
Budgeting and value assessment are closely intertwined in the management decision to conduct research. An appropriate research study should help managers avoid losses and increase sales or profits; otherwise, research can be wasteful. The decision maker wants a cost estimate for a research project and an equally precise assurance that useful information will result from the research. Even if the researcher can give good cost and information estimates, the decision maker or manager must still judge whether the benefits outweigh the costs. Conceptually, the value of research information is not difficult to determine: in a business situation the research should provide added revenues or reduced expenses. The value of research information may be judged in terms of "the difference between the results of decisions made with the information and the results of decisions that would be made without it". This is simple to state, but in actual application it presents difficult measurement problems.
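To make the valuation of information concrete, here is a minimal sketch in Python; the actions, state probabilities and payoff figures are invented for illustration and are not taken from the text. It compares the expected payoff of deciding on prior probabilities alone with the expected payoff of deciding under perfect information; the difference (the expected value of perfect information) is an upper bound on what any research study could be worth in that decision.

```python
# Illustrative value-of-information calculation (all figures are assumed).
# Two courses of action are evaluated against two possible market states.

# Prior (subjective) probabilities of the states of nature
priors = {"strong_demand": 0.6, "weak_demand": 0.4}

# Payoff table: action -> state -> profit
payoffs = {
    "launch_product": {"strong_demand": 500_000, "weak_demand": -200_000},
    "do_not_launch":  {"strong_demand": 0,        "weak_demand": 0},
}

def expected_value(action):
    """Expected payoff of an action under the prior probabilities."""
    return sum(priors[s] * payoffs[action][s] for s in priors)

# 1. Best decision using prior information only
ev_without_info = max(expected_value(a) for a in payoffs)

# 2. Expected payoff if the true state were always known in advance
ev_with_perfect_info = sum(
    priors[s] * max(payoffs[a][s] for a in payoffs) for s in priors
)

# 3. Expected value of perfect information: the ceiling on what any
#    research study could be worth in this decision situation
evpi = ev_with_perfect_info - ev_without_info

print(f"EV without information : {ev_without_info:,.0f}")
print(f"EV with perfect info   : {ev_with_perfect_info:,.0f}")
print(f"Value of information   : {evpi:,.0f}")
```

If the estimated cost of a proposed study exceeds this ceiling, the cost of the information exceeds its value and the research should not be undertaken.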
Guidelines for approximating the cost-to-value of research:
1. Focus on the most important issues of the project: Identify certain issues as important and others as peripheral to the problem. Unimportant issues only drain resources.
2. Never try to do too much: There is a limit to the amount of information that can be collected. The researcher must make a trade-off between the number of issues that can be dealt with and the depth of each issue. It is therefore necessary to focus on the issues of greatest potential value.
3. Determine whether secondary information, primary information or a combination is needed: The most appropriate source must be selected to address the stated problem.
4. Analyze all potential methods of collecting information: Alternative data sources and research designs are available that allow detailed investigation of issues at relatively low cost.
5. Subjectively assess the value of the information: The researcher needs to ask some fundamental questions relating to the objectives, for example:
a) Can the information be collected at all?
b) Can the information tell us something more than what we already have?
c) Will the information provide significant insights?
d) What benefits will be delivered by this information?
Structure of Research
Business research can take many forms, but systematic inquiry is the common thread. Systematic inquiry requires an orderly investigation. Business research is a sequence of highly interrelated activities, and the steps of the research process overlap continuously. Nevertheless, research on management often follows a general pattern. The stages are:
1. Defining the problem
2. Planning a research design
3. Planning a sample
4. Collecting data
5. Analyzing the data
6. Formulating the conclusions and preparing the report
SUMMARY
This chapter outlined the importance of business research. The difference between basic and applied research has been dealt with in detail. The chapter has also given the meaning, scope, types and structure of research.
KEY TERMS
· Research
· Value of research
· Need for research
· Good research
· Scope of research
· Types of research
· Basic research
· Applied research
· Scientific method
· Exploratory research
· Descriptive research
· Cross-sectional and longitudinal studies
· Causal research
· Decision theory
· Structure
QUESTIONS
1. What are some examples of business research in your particular field of interest?
2. What do you mean by research? Explain its significance in modern times.
3. What is the difference between applied and basic research?
4. What is good research?
5. Discuss exploratory, descriptive and causal research.
6. Discuss the value of information and cost using decision theory.
7. Discuss the structure of business research.
- End of Chapter -
LESSON - 2 PROBLEM DEFINITION
OBJECTIVES
· To discuss the nature of decision makers' objectives and the role they play in defining the research
· To understand that proper problem definition is essential for effective business research
· To discuss the influence of the statement of the business problem on the specific research objectives
· To state the research problem in terms of clear and precise research objectives
STRUCTURE
· Problem definition
· Situation analysis
· Measurable symptoms
· Unit of analysis
· Hypothesis and research objectives
PROBLEM DEFINITION
Before choosing a research design, the manager and the researcher need a sense of direction for the investigation. It is extremely important to define the business problem carefully, because the definition determines the purposes of the research and ultimately the research design. A well-defined problem is a problem half solved; hence, the researcher must understand how to define the problem. The formal quantitative research process should not begin until the problem has been clearly defined. Determination of the research problem consists of three important tasks, namely:
1. Clarifying management's information needs,
2. Redefining the decision problem as a research problem, and
3. Establishing hypotheses and research objectives.
Step 1: To ensure that appropriate information is created through this process, the researcher must assist the decision maker in making sure that the problem or opportunity has been clearly defined and that the decision maker is aware of the information requirements. This includes the following activities:
i) Purpose: Here, the decision maker holds the responsibility of addressing a recognized decision problem or opportunity. The researcher begins the process by asking the decision maker to express his or her reasons for thinking there is a need to undertake research. Through this questioning process, the researcher can develop insights into what the decision maker believes the problems to be. One device that can be used to sensitize the decision maker is the iceberg principle: the dangerous part of many problems, like the submerged portion of an iceberg, is neither visible to nor understood by managers. If the submerged portion of the problem is omitted from the problem definition, the result may be less than optimal.
ii) Situation analysis, or understanding the situation: To gain a complete understanding, both parties should perform a basic situation analysis of the circumstances surrounding the problem area. A situation analysis is a popular tool that focuses on the informal gathering of background information to become familiar with the overall complexity of the decision area. It attempts to identify the events and factors that have led to the current decision problem situation. To understand the client's domain objectively (i.e., industry, competition, product line, markets, etc.), the researcher should rely not only on the information provided by the client but also on other sources. In short, the researcher must develop expertise in the client's business.
iii) Identifying and separating measurable symptoms: Once the researcher understands the overall problem situation, he or she must work with the decision maker to separate the problems from the observable and measurable symptoms that may have been initially perceived as being the decision problem.
iv) Determining the unit of analysis: The researcher must be able to specify whether data should be collected about individuals, households, organizations, departments, geographical areas, specific objects or some combination of these. The unit of analysis provides direction for later activities such as developing scale measurements and drawing an appropriate sample of respondents.
v) Determining the relevant variables: Here, the focus is on identifying the different independent and dependent variables. This involves determining the type of information needed (i.e., facts, estimates, predictions, relationships) and the specific constructs (i.e., concepts or ideas about an object, attribute or phenomenon that are worth measuring).
Step 2: Once the problem is understood and the specific information requirements are identified, the researcher must redefine the problem in more specific terms. In reframing the problems and questions as information research questions, researchers must use their scientific knowledge and expertise. Establishing research questions specific to the problems forces the decision maker to provide additional information that is relevant to the actual problems. In other situations, redefining problems as research problems can lead to the establishment of research hypotheses rather than questions.
Step 3 (Hypotheses and research objectives): A hypothesis is basically an unproven statement of a research question in a testable format. Hypothetical statements can be formulated about any variable and can express a possible relationship between two or more variables. While research questions and hypotheses are similar in their intent to express relationships, hypotheses tend to be more specific and declarative, whereas research questions are more interrogative. In other words, hypotheses are statements that can be empirically tested. Research objectives are precise statements of what a research project will attempt to achieve; in effect, they represent a blueprint of the research activities. Research objectives allow the researcher to document concise, measurable and realistic events that either increase or decrease the magnitude of management problems. More importantly, they
allow for the specification of the information required to support management's decision-making capabilities.
SUMMARY
The nature of decision makers' objectives and the role they play in defining the research have been dealt with in detail. This chapter has also set out the steps involved in defining the problem.
KEY TERMS
· Problem definition
· Iceberg principle
· Situation analysis
· Unit of analysis
· Variables
· Hypotheses
· Research objectives
QUESTIONS
1. What is the task of problem definition?
2. What is the iceberg principle?
3. State a problem in your field of interest, and list some variables that might be investigated to solve this problem.
4. What do you mean by hypothesis?
5. What is a research objective?
- End of Chapter -
LESSON - 3 RESEARCH PROCESS AND DESIGN
OBJECTIVES
· To list the stages in the business research process
· To identify and briefly discuss the various decision alternatives available to the researcher during each stage of the research process
· To classify business research as exploratory, descriptive, and causal research
· To discuss the categories of research under exploratory, descriptive and causal research
STRUCTURE
· Research process
· Research design
· Types of research designs
· Exploratory research
· Descriptive research
· Causal research
Research Process
Before discussing the phases and specific steps of the research process, it is important to emphasize the need for information and the circumstances in which research should or should not be conducted. In this context, the research process may be called the information research process, a term that is more appropriate in business parlance. The term information research reflects the evolving changes occurring within management and the rapid changes facing many decision makers in how firms conduct both internal and external activities. Hence, understanding the process of transforming raw data into usable information, and expanding the applicability of the research process to solving business problems and opportunities, is very important.
Overview: The research process has been described as consisting of anywhere from 6 to 11 standardized stages. Here, the process consists of four distinct, interrelated phases that have a logical, hierarchical ordering, as depicted below.
Diagram: Four phases of the Research Process
However, each phase should be viewed as a separate process consisting of a combination of integrated steps and specific procedures. The four phases and their corresponding steps are guided by the principles of the scientific method, which involves formalized research procedures that can be characterized as logical, objective, systematic, reliable, valid and ongoing.
Integrative steps within the research process
The following exhibit presents the interrelated steps of the four phases of the research process. Generally, researchers should follow the steps in order; however, the complexity of the problem, the level of risk involved and management's needs will determine the exact order of the process.
Exhibit
Phase 1: Determination of the Research Problem
Step 1: Determining management's information needs
Step 2: Redefining the decision problem as a research problem
Step 3: Establishing research objectives
Phase 2: Development of the Research Design
Step 4: Determining and evaluating the research design
Step 5: Determining the data sources
Step 6: Determining the sampling plan and sample size
Step 7: Determining the measurement scales
Phase 3: Execution of the Research Design
Step 8: Collecting and processing the data
Step 9: Analyzing the data
Phase 4: Communication of the Results
Step 10: Preparing and presenting the final report to management
Step 1: Determining management's information needs
Usually, before the researcher becomes involved, the decision maker has to make a formal statement of what he or she believes the issue to be. At this point, the researcher's responsibility is to make sure management has clearly and correctly specified the opportunity or question. It is important for the decision maker and the researcher to agree on the definition of the problem so that the research process will produce useful information. The researcher should also assist the decision maker in determining whether the stated problem is really a problem or just a symptom of a yet unidentified problem. Finally, the researcher lists the factors that could have a direct or indirect impact on the defined problem or opportunity.
Step 2: Redefining the decision problem as a research problem
Once the researcher and decision maker have identified the specific information needs, the researcher must redefine the problem in scientific terms, since the researcher feels more comfortable using a scientific framework. This step is critical because it influences many of the other steps. It is the researcher's responsibility to state the initial variables associated with the problem in the form of one or more question formats (how, what, where, when or why). In addition, the researcher needs to focus on determining what specific information is required (i.e., facts, estimates, predictions, relationships or some combination) and on the quality of the information, which includes its value.
Step 3: Establishing research objectives
The research objectives should follow from the definition of the research problem established in Step 2. Formally stated objectives provide the guidelines for determining the other steps to be undertaken. The underlying assumption is that, if the objectives are achieved, the decision maker will have the information needed to solve the problem.
Step 4: Determining and evaluating the research design
The research design serves as a master plan of the methods and procedures that should be used to collect and analyze the data needed by the decision maker. In this master plan, the researcher must consider the design technique (survey, observation, experiment), the sampling methodology and procedures, the schedule and the budget. Although every problem is unique, most objectives can be met using one of three types of research design: exploratory, descriptive, and causal. Exploratory research focuses on collecting either secondary or primary data and using unstructured formal or informal procedures to interpret them. It is often used simply to clarify a problem or opportunity and is not intended to provide conclusive information. Some examples of exploratory studies are focus group interviews, experience surveys and pilot studies. Descriptive studies describe existing characteristics and generally allow the researcher to draw inferences that can lead to a course of action. Causal studies are designed to collect information about cause-and-effect relationships between two or more variables.
Step 5: Determining the data sources
Data sources can be classified as either secondary or primary. Secondary data can usually be gathered faster and at less cost than primary data. Secondary data are historical data previously collected and assembled for some research problem other than the current one. In contrast, primary data are first-hand raw data that have yet to receive meaningful interpretation, and they are collected through either survey or observation.
Step 6: Determining the sampling plan and sample size
To make inferences or predictions about any phenomenon, we need to understand who or what is supplying the raw data and how representative those data are. Therefore, researchers need to identify the relevant defined target population. The researcher can
choose between a sample (a subset of the population) and a census (the entire population). To achieve the research objective, the researcher needs to develop an explicit sampling plan, which serves as a blueprint for drawing from the defined target population. Sampling plans can be classified into two general types: probability (where every element has a known chance of selection) and non-probability. Since sample size affects data quality and generalizability, researchers must think carefully about how many people to include or how many objects to investigate.
Step 7: Determining the measurement scales
This step focuses on determining the dimensions of the factors being investigated and measuring the variables that underlie the defined problem. It determines how the raw data will be collected and how much. The level of measurement (nominal, ordinal, interval or ratio), the reliability, the validity and the dimensionality (uni- versus multi-dimensional) shape the measurement process.
Step 8: Collecting and processing the data
There are two fundamental approaches to gathering raw data. One is to ask questions about variables and phenomena using trained interviewers or questionnaires. The other is to observe variables or phenomena using professional observers or high-tech mechanical devices. Self-administered surveys, personal interviews, computer simulations and telephone interviews are some of the tools used to collect data. Questioning allows a wider variety of data to be collected, covering not only past and present behaviour but also states of mind and intentions. Observation can be characterized as natural or contrived, disguised or undisguised, structured or unstructured, direct or indirect, and human or mechanical, and it uses devices such as video cameras, tape recorders, audiometers, eye cameras, psychogalvanometers or pupilometers. After the raw data are collected, a coding scheme is needed so that they can be entered into computers; coding assigns logical numerical descriptions to all response categories. The researcher must then clean the raw data of coding and data-entry errors.
Step 9: Analyzing the data
Using a variety of data analysis techniques, the researcher can create new, more complex data structures by combining two or more variables into indexes, ratios, constructs and so on. Analysis can range from simple frequency distributions (percentages) to summary statistical measures (mode, median, mean, standard deviation, standard error) to multivariate data analysis; a brief computational sketch of the basic measures follows Step 10.
Step 10: Preparing and presenting the final report to management
The final step is to prepare and present the research report to management. The report should contain an executive summary, introduction, problem definition and objectives, methodology, analysis, results and findings, and finally suggestions and recommendations, along with appendices. The researcher is expected not only to submit a well-produced written report but also to make an oral presentation.
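As a small illustration of Steps 6, 8 and 9, the sketch below uses Python's standard library; the population, sample size and variable names are hypothetical assumptions chosen only to show the mechanics of drawing a probability sample and computing the summary measures mentioned in Step 9.

```python
# Minimal sketch of sampling and basic data analysis (all data are assumed).
import random
import statistics
from collections import Counter

random.seed(42)

# Defined target population: e.g. weekly purchases (units) of 1,000 customers
population = [random.randint(0, 10) for _ in range(1_000)]

# Step 6: probability sampling plan - a simple random sample of n = 100
sample = random.sample(population, k=100)

# Step 8: "coding" is trivial here because the responses are already numeric

# Step 9: analysis, from a simple frequency distribution ...
frequency = Counter(sample)

# ... to summary statistical measures
mean = statistics.mean(sample)
median = statistics.median(sample)
mode = statistics.mode(sample)
stdev = statistics.stdev(sample)            # sample standard deviation
std_error = stdev / (len(sample) ** 0.5)    # standard error of the mean

print("Frequency distribution:", dict(sorted(frequency.items())))
print(f"Mean={mean:.2f}  Median={median}  Mode={mode}")
print(f"Std. deviation={stdev:.2f}  Std. error={std_error:.3f}")
```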
Research Design
Kerlinger, in his book "Foundations of Behavioral Research", defines research design as "the plan, structure, and strategy of investigation conceived so as to obtain answers to research questions and to control variance". The plan is the overall scheme or program of the research; it includes an outline of what the investigator will do, from writing the hypotheses and their operational implications to the final analysis of the data. The structure is the framework and the set of relations among the variables of a study. According to Green and Tull, a research design is the specification of methods and procedures for acquiring the information needed; it is the overall operational pattern or framework of the project that stipulates what information is to be collected, from which sources, and by what procedures. From these definitions it can be understood that a research design is more or less a blueprint of the research, which lays down the methods and procedures for the collection of the requisite information and its measurement and analysis, with a view to arriving at meaningful conclusions for the proposed study.
Types of Research Design
The different types of design were explained in the previous section (refer to types of research). The three frequently used classifications are given below:
I. Exploratory
II. Descriptive
III. Causal
Here the focus is on how these studies are conducted; the methods are explained below.
I. Categories of Exploratory Research
There are four general categories of exploratory research methods, each providing alternative ways of getting information.
1. Experience Surveys
An experience survey is an attempt to discuss issues and ideas with top executives and knowledgeable people who have experience in the field. Research in the form of an experience survey may be quite informal; the activity intends only to get ideas about the problems. Often an experience survey consists of interviews with a small number of carefully selected people, and the respondents are generally allowed to discuss the questions with few constraints. The purpose of consulting such experts is to help formulate the problem and clarify concepts rather than to develop conclusive evidence.
2. Secondary Data Analysis
Another quick source of background information is trade literature. Using secondary data can be equally important in applied research. Investigating data that have been compiled for some purpose other than the present one is one of the most frequent forms of exploratory research. It should also be remembered that this method is often used in descriptive analysis as well.
3. Case Studies
The case study method obtains information about one or a few situations that are similar to the present one. Its primary advantage is that an entire entity can be investigated in depth and with meticulous attention to detail. The results from this type of study should be seen as tentative; generalizing from a few cases can be dangerous, because most situations are not typical in the same sense. But even if situations are not directly comparable, a number of insights can be gained and hypotheses suggested for future research.
4. Pilot Studies
In the context of exploratory research, a pilot study implies that some aspect of the research will be carried out on a small scale. It generates primary data, usually for qualitative analysis. The major categories are discussed below.
a. Focus Group Interview
A popular method in qualitative research is the unstructured, free-flowing interview with a small group of people. It is not rigid but a flexible format that encourages discussion. The primary advantages are that focus groups are relatively brief, easy to execute, quickly analyzed and inexpensive. However, a small group will rarely be a representative sample, no matter how carefully it is selected.
b. Projective Techniques
These are indirect means of questioning that enable respondents to project beliefs and feelings onto a third party, an inanimate object or a situation. Respondents are not required to provide answers in a structured format. They are encouraged to describe a situation in their own words, with little prompting by the researcher, within the context of their own experiences, attitudes and personality, and to express opinions and emotions. The most commonly used techniques are word association, sentence completion, the Thematic Apperception Test (TAT) and role playing.
c. Depth Interview
A depth interview is similar to a focus group, but in the interviewing session the researcher asks many questions and probes for elaboration. Here the role of the researcher
(the interviewer) is more important. The interviewer must be highly skilled, able to encourage respondents to talk freely without steering the direction of the discussion. A depth interview may last several hours, and hence it is expensive.
II. Categories of Descriptive Research
In contrast to exploratory research, descriptive studies are more formalized and typically structured, with clearly stated hypotheses. When the researcher is interested in knowing the characteristics of certain groups, such as age, sex, educational level, occupation or income, a descriptive study may be necessary. Descriptive studies can be divided into two broad categories: cross-sectional and longitudinal. The following methods are used in descriptive studies:
1. Secondary data analysis
2. Primary data analysis
3. Case studies
Several methods are available for collecting the information (i.e., observation, questionnaires, and examination of records), each with its own merits and limitations; the researcher may use one or more of these methods, which are discussed in detail in later chapters. The descriptive study methods must therefore be selected keeping in view the objectives of the study and the resources available. Such a design can appropriately be referred to as a survey design using an observation or questioning process.
III. Categories of Causal Research
As the name implies, a causal design investigates the cause-and-effect relationship between two or more variables. Causal studies may be classified as informal and formal, or as quasi-experimental, true experimental and complex designs. The methods used in experimental research are discussed hereunder:
1. The one-shot case study (after-only design)
2. Before-after without control group
3. After-only with control group
4. Before-after with one control group
5. Four-group, six-study design (Solomon four-group design)
6. Time series design
7. Completely randomized design
8. Randomized block design
9. Factorial design
10. Latin square design
The first two methods are called quasi-experimental designs, the next three are called true experimental designs, and the last four are called complex designs. The following symbols are used in describing the various experimental designs:
X = exposure of a group to an experimental treatment
O = observation or measurement of the dependent variable
1. The one-shot case study design (after-only design): In the one-shot design there is no measure of what would happen when test units are not exposed to X with which to compare the measure taken when subjects are exposed to X. It is diagrammed as follows:
X    O1
2. Before-after without control group design: In this design, the researcher is likely to conclude that the difference between O2 and O1 (O2 - O1) is the measure of the influence of the experimental treatment. The design is as follows:
O1    X    O2
3. After-only with control group design: The diagram is as follows:
Experimental Group:    X    O1
Control Group:              O2
In this design, subjects are randomly selected and randomly assigned to either the experimental or the control group. The treatment is administered only to the experimental group, and the dependent variable is then measured in both groups at the same time. The treatment effect is calculated as follows: O1 - O2
4. Before-after with one control group design: This is diagrammed as follows:
Experimental Group:    O1    X    O2
Control Group:         O3         O4
As the diagram above indicates, the subjects of the experimental group are tested before and after being exposed to the treatment. The control group is tested twice, at the same times as the experimental group, but its subjects are not exposed to the treatment. The treatment effect is calculated as follows: (O2 - O1) - (O4 - O3)
5. Four-group, six-study design (Solomon four-group design): Combining the before-after with control group design and the after-only with control group design provides a means of controlling for testing effects as well as other sources of extraneous variation. The diagram is as follows:
Experimental Group 1:    O1    X    O2
Control Group 1:         O3         O4
Experimental Group 2:          X    O5
Control Group 2:                    O6
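To show how the O/X notation translates into arithmetic, the following sketch (with invented observation values) computes the treatment effect for the before-after with one control group design as (O2 - O1) - (O4 - O3), i.e. the change in the experimental group net of the change that occurred anyway in the control group, together with the simpler after-only comparison.

```python
# Illustrative effect calculations for two experimental designs (values assumed).

# Before-after with one control group:
#   Experimental group:  O1   X   O2
#   Control group:       O3        O4
O1, O2 = 40.0, 55.0   # experimental group, measured before and after treatment X
O3, O4 = 41.0, 45.0   # control group, measured at the same times, no treatment

experimental_change = O2 - O1          # includes treatment + extraneous effects
control_change = O4 - O3               # extraneous effects only
treatment_effect = experimental_change - control_change
print(f"Treatment effect = ({O2} - {O1}) - ({O4} - {O3}) = {treatment_effect}")

# After-only with control group:
#   Experimental group:  X   O1
#   Control group:           O2
after_only_effect = 55.0 - 45.0        # experimental observation minus control observation
print(f"After-only effect = {after_only_effect}")
```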
6. Time series design: When experiments are conducted over long periods of time, they are more vulnerable to history effects due to changes in population, attitudes, economic patterns and the like; hence, this is also called a quasi-experimental design. The design can be diagrammed as follows:
O1    O2    O3    X    O4    O5    O6
Several observations are taken before and after the treatment to determine whether the patterns after the treatment are similar to those before it.
7. Completely randomized design (CRD): The CRD is an experimental design that uses a random process to assign experimental units to treatments. Randomization of experimental units is used to control extraneous variables while a single independent variable, the treatment variable, is manipulated.
8. Randomized block design (RBD): The RBD is an extension of the CRD. A form of randomization is used to control for most extraneous variation; in addition, an attempt is made to isolate the effect of a single extraneous variable by blocking on it.
9. Factorial design: A factorial design allows for testing the effects of two or more treatments (factors) at various levels; that is, it allows the simultaneous manipulation of two or more variables at various levels. This design measures main effects (the influence of each independent variable on the dependent variable) as well as interaction effects.
10. Latin square design (LSD): The LSD attempts to control or block out the effects of two or more confounding extraneous factors. The design is so named because of the layout of the table that represents it.
SUMMARY
This chapter has outlined the stages in the business research process. Various types of research design have been dealt with in detail.
KEY TERMS
· Exploratory research
· Descriptive research
· Causal research
· Focus group
· Projective techniques
· Case studies
· Depth interview
· Experimental designs
· Quasi-experimental designs
· True experimental designs
· Complex designs
QUESTIONS
1. Explain the different phases of the research process.
2. Briefly describe the different steps involved in a research process.
3. What are the major types of research in business?
4. Discuss the categories of exploratory and descriptive research.
5. Explain the different experimental designs.
REFERENCES
1. Bellenger et al., Marketing Research, Homewood, Illinois, 1978.
2. Boot, John C.G. and Cox, Edwin B., Statistical Analysis for Managerial Decisions, 2nd ed., New Delhi: McGraw Hill Publishing Co. Ltd.
3. Edwards, Allen, Statistical Methods, 2nd ed., New York, 1967.
- End of Chapter -
LESSON – 4 INTRODUCTION TO STATISTICS AND POPULATION PARAMETERS
OBJECTIVES
· Meaning and definition of Statistics
· Nature of a statistical study
· Importance of Statistics in business and also its limitations
STRUCTURE
· Nature of a statistical study
· Importance of statistics in business
· Statistical quality control methods
· Limitations of statistics
INTRODUCTION At the outset, it may be noted that the word 'Statistics' is used rather curiously in two senses-plural and singular. In the plural sense, it refers to a set of figures. Thus, we speak of production and sale of textiles, television sets, and so on. In the singular sense, Statistics refers to the whole body of analytical tools that are used to collect the figures, organize and interpret them and, finally, to draw conclusions from them. It should be noted that both the aspects of Statistics are important if the quantitative data are to serve their purpose. If Statistics, as a subject, is inadequate and consists of poor methodology, we would not know the right procedure to extract from the data the information they contain. On the other hand, if our figures are defective in the sense that they are inadequate or inaccurate, we would not reach the right conclusions even though our subject is well developed. With this brief introduction, let us first see how Statistics has been defined. Statistics has been defined by various authors differently. In the initial period the role of Statistics was confined to a few activities. As such, most of the experts gave a narrow definition of it. However, over a long period of time as its role gradually expanded, Statistics came to be considered as much wider in its scope and, accordingly, the experts gave a wider definition of it. Spiegal, for instance, defines Statistics, highlighting its role in decision-making particularly under uncertainty, as follows:
"Statistics is concerned with scientific method for collecting, organising, summarising, presenting and analysing data as well as drawing valid conclusions and making reasonable decisions on the basis of such analysis." This definition covers all the aspects and then tries to link them up with decisionmaking. After all, Statistics as a subject must help one to reach a reasonable and appropriate decision on the basis of the analysis of numerical data collected earlier. Using the term 'Statistics' in the plural sense, Secrist defines Statistics as "Aggregate of facts, affected to a marked extent by multiplicity of causes, numerically expressed, enumerated or estimated according to reasonable standards of accuracy, collected in a systematic manner for a predetermined purpose, and placed in relation to each other". This definition of Secrist highlights a few major characteristics of statistics as given below: 1. Statistics are aggregates of facts. This means that a single figure is not statistics. 2. Statistics are affected by a number of factors. For example, sale of a product depends on a number of factors such as its price, quality, competition, the income of the consumers, and so on. 3. Statistics must be reasonably accurate. Wrong figures, if analysed, will lead to erroneous conclusions. Hence, it is necessary that conclusions must be based on accurate figures. 4. Statistics must be collected in a systematic manner. If data are collected in a haphazard manner, they will not be reliable and will lead to misleading conclusions. 5. Finally, statistics should be placed in relation to each other. If one collects data unrelated to each other, then such data will be confusing and will not lead to any logical conclusions. Data should be comparable over time and over space. THE NATURE OF A STATISTICAL STUDY Having briefly looked info the definition of Statistics, we should know at this stage as to what the nature of a Statistical study is. Whether a given problem pertains to business or to some other field, there are some well defined steps that need to be followed in order to reach meaningful conclusions. 1. Formulation of the Problem: To begin with, we have to formulate a problem on which a study is to be done. We should understand the problem as clearly as possible. We should know its scope so that we do not go beyond it or exclude some relevant aspect.
2. Objectives of the Study: We should know what the objectives of the proposed study are. We should ensure that the objectives are not extremely ambitious, or else the study may fail to achieve them because of limitations of time, finance or even the competence of those conducting the study.
3. Determining Sources of Data: The problem and the objectives, thus properly understood, will enable us to know what data are required to conduct the study. We have to decide whether we should collect primary data or depend exclusively on secondary data. Sometimes the study is based on both secondary and primary data. When a study is to be based on secondary data, whether partly or fully, it is necessary to ensure that the data are quite suitable and adequate for the objectives of the study.
4. Designing Data Collection Forms: Once the decision in favour of collection of primary data is taken, one has to decide the mode of their collection. The two methods available are: (i) the observational method, and (ii) the survey method. A suitable questionnaire is to be designed to collect data from respondents in a field survey.
5. Conducting the Field Survey: Alongside the design of the data collection forms, one has to decide whether a census survey or a sample survey is to be conducted. For the latter, a suitable sample design and sample size are to be chosen. The field survey is then conducted by interviewing the sample respondents. Sometimes the survey is done by mailing questionnaires to the respondents instead of contacting them personally.
6. Organising the Data: The field survey provides raw data from the respondents. It is now necessary to organise these data in the form of suitable tables and charts so that we may be aware of their salient features.
7. Analysing the Data: On the basis of the preliminary examination of the data collected, as well as the nature and scope of our problem, we have to analyse the data. As several statistical techniques are available, we should take special care to ensure that the most appropriate technique is selected for this purpose.
8. Reaching Statistical Findings: The analysis in the preceding step will bring out some statistical findings of the study. We now have to interpret these findings in terms of the concrete problem with which we started our investigation.
9. Presentation of Findings: Finally, we have to present the findings of the study, properly interpreted, in a suitable form. Here, the choice is between an oral presentation and a written one. In the case of an oral presentation, one has to be extremely selective in choosing the material, as in a limited time one has to provide a broad idea of the study as well as its major findings so that they are understood by the audience in proper perspective. In the case of a written presentation, a report has to be prepared. It should be reasonably comprehensive and should have graphs and diagrams to facilitate the reader in understanding it in all its ramifications.
IMPORTANCE OF STATISTICS IN BUSINESS
There is an increasing realisation of the importance of Statistics in various quarters. This is reflected in the increasing use of Statistics in government, industry, business, agriculture, mining, transport, education, medicine, and so on. As we are concerned here with the use of Statistics in business and industry, the description given below is confined to these areas only.
There are three major functions in which Statistics is useful to a business enterprise:
A. The planning of operations: This may relate either to special projects or to the recurring activities of a firm over a specified period.
B. The setting up of standards: This may relate to the size of employment, volume of sales, fixation of quality norms for the manufactured product, norms for daily output, and so forth.
C. The function of control: This involves comparison of the actual production achieved against the norm or target set earlier. In case production has fallen short of the target, remedial measures are suggested so that such a deficiency does not occur again.
A point worth noting here is that although these three functions - planning of operations, setting standards, and control - are separate, in practice they are very much interrelated.
Various authors have highlighted the importance of Statistics in business. For instance, Croxton and Cowden give numerous uses of Statistics in business such as project planning, budgetary planning and control, inventory planning and control, quality control, marketing, production and personnel administration. Within these also they have specified certain areas where Statistics is very relevant. Irwing W. Burr, dealing with the place of Statistics in an industrial organisation, specifies a number of areas where Statistics is extremely useful. These are: customer wants and market research, development design and specification, purchasing, production, inspection, packaging and shipping, sales and complaints, inventory and maintenance, costs, management control, industrial engineering and research. Both lists are extremely comprehensive, which clearly points out that the specific statistical problems arising in the course of business operations are multitudinous. As such, one may do no more than highlight some of the more important ones to emphasise the relevance of Statistics to the business world.
Personnel Management
This is another sphere in business where statistical methods can be used. Here, one is concerned with the fixation of wage rates, incentive norms and the performance appraisal of individual employees. The concept of productivity is very relevant here. On the basis of the measurement of productivity, the productivity bonus is awarded to the workers.
Comparisons of wages and productivity are undertaken in order to ensure increases in industrial productivity.
Seasonal Behaviour
A business firm engaged in the sale of a certain product has to decide how much stock of that product should be kept. If the product is subject to seasonal fluctuations, then it must know the nature of seasonal fluctuations in demand. For this purpose, a seasonal index of consumption may be required. If the firm can obtain such data or construct a seasonal index on its own, then it can keep a limited stock of the product in lean months and large stocks in the remaining months. In this way, it will avoid the blocking of funds in maintaining large stocks in the lean months. It will also not miss any opportunity to sell the product in the busy season by maintaining an adequate stock of the product during such a period.
Export Marketing
Developing countries have started giving considerable importance to their exports. Here, too, quality is an important factor on which exports depend. Apart from this, the concerned firm must know the probable countries to which its product can be exported. Before that, it must select the right product, one which has considerable demand in the overseas markets. This is possible by carefully analysing the statistics of imports and exports. It may also be necessary to undertake a detailed survey of overseas markets to know more precisely the export potential of a given product.
Maintenance of Cost Records
Cost is an important consideration for a business enterprise. It has to ensure that the cost of production, which includes the cost of raw materials, wages, and so forth, does not mount up, or else this would jeopardize its competitiveness in the market. This implies that it has to maintain proper cost records and undertake an analysis of cost data from time to time.
Management of Inventory
Closely related to the cost factor is the problem of inventory management. In order to ensure that the production process continues uninterrupted, the business firm has to maintain an adequate inventory. At the same time, excessive inventory means blocking of funds that could have been utilized elsewhere. Thus, the firm has to determine a magnitude of inventory that is neither excessive nor inadequate. While doing so, it has to bear in mind the probable demand for its product. All these aspects can be well looked after if proper statistics are maintained and analyzed.
Expenditure on Advertising and Sales
A number of times, business firms are interested in knowing whether there is an association between two or more variables, such as advertising expenditure and sales. In view of increasing competitiveness, business and industry spend a large amount on advertising. It is in their interest to find out whether such advertising expenditure promotes sales. Here, by using correlation and regression techniques, it can be ascertained whether the advertising expenditure is worthwhile or not.
Mutual Funds
Mutual funds, which have come into existence in recent years, provide an avenue for a person to invest his savings so that he may get a reasonably good return. Different mutual funds have different objectives, as they have varying degrees of risk involved in the companies they invest in. Here, Statistics provides certain tools or techniques to a consultant or financial adviser through which he can provide sound advice to a prospective investor.
Relevance in Banking and Insurance Institutions
Banks and insurance companies frequently use varying statistical techniques in their respective areas of operation. They have to maintain their accounts and analyze these to examine their performance over a specified period. The above discussion is only illustrative, and there are numerous other areas where the use of Statistics is so common that without it they may have to close down their operations.
STATISTICAL QUALITY CONTROL METHODS
In the sphere of production, for example, statistics can be useful in various ways to ensure the production of quality goods. This is achieved by identifying and rejecting defective or substandard goods. Sales targets can be fixed on the basis of sales forecasts, which are made by using various methods of forecasting. Analysis of sales done against the targets set earlier would indicate the deficiency in achievement, which may be on account of several causes: (i) targets were too high and unrealistic, (ii) salesmen's performance has been poor, (iii) increased competition, and (iv) poor quality of the company's product, and so on. These factors can be further investigated.
LIMITATIONS OF STATISTICS
The preceding discussion highlighting the importance of Statistics in business should not lead anyone to conclude that Statistics is free from any limitation. As we shall see here, Statistics has a number of limitations. There are certain phenomena or concepts where Statistics cannot be used. This is because these phenomena or concepts are not amenable to measurement. For example,
beauty, intelligence and courage cannot be quantified. Statistics has no place in all such cases where quantification is not possible.
1. Statistics reveal the average behaviour, the normal or the general trend. The 'average' concept, if applied to an individual or a particular situation, may lead to a wrong conclusion and sometimes may be disastrous. For example, one may be misguided when told that the average depth of a river from one bank to the other is four feet, when there may be some points in between where its depth is far more than four feet. On this understanding, one may enter the river at those points having greater depth, which may be hazardous.
2. Since Statistics are collected for a particular purpose, such data may not be relevant or useful in other situations or cases. For example, secondary data (i.e., data originally collected by someone else) may not be useful for another person.
3. Statistics is not 100 percent precise as is Mathematics or Accountancy. Those who use Statistics should be aware of this limitation.
4. In statistical surveys, sampling is generally used, as it is not physically possible to cover all the units or elements comprising the universe. The results may not be appropriate as far as the universe is concerned. Moreover, different surveys based on the same size of sample but different sample units may yield different results.
5. At times, the association or relationship between two or more variables is studied in Statistics, but such a relationship does not indicate a 'cause and effect' relationship. It simply shows the similarity or dissimilarity in the movement of the two variables. In such cases, it is the user who has to interpret the results carefully, pointing out the type of relationship obtained.
6. A major limitation of Statistics is that it does not reveal everything pertaining to a certain phenomenon.
7. There is some background information that Statistics does not cover. Similarly, there are some other aspects related to the problem on hand which are also not covered. The user of Statistics has to be well informed and should interpret Statistics keeping in mind all other aspects having relevance to the given problem.
SUMMARY
This chapter outlined the importance and growth of statistics. Various applications of statistics in the domain of management have been dealt with in detail.
KEY TERMS
· Statistics
· Statistical quality control methods
· Seasonal behaviour
IMPORTANT QUESTIONS
1. Define statistics.
2. What do you mean by statistical quality control methods?
3. Explain the application of statistics in various business domains.
- End of Chapter -

LESSON – 5 ESTIMATION OF POPULATION PARAMETERS
OBJECTIVE
· To acquire knowledge of the estimation of population parameters
STRUCTURE
· Estimation of population parameters
· Measures of central tendency
· Mean, median, mode
· Geometric mean
· Harmonic mean
A population is any entire collection of people, animals, plants or things from which we may collect data. It is the entire group we are interested in, which we wish to describe or draw conclusions about. In order to make any generalizations about a population, a sample, that is meant to be representative of the population, is often studied. For each population there are many possible samples. A sample statistic gives information about a corresponding population parameter. For example, the sample mean for a set of data would give information about the overall population mean. It is important that the investigator carefully and completely defines the population before collecting the sample, including a description of the members to be included.
Example
The population for a study of infant health might be all children born in India in the 1980s. The sample might be all babies born on 7th May in any of those years.
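The relationship between a population parameter and a sample statistic can be illustrated with a short sketch. The figures below are entirely hypothetical (a made-up population of infant birth weights); the point is simply that the mean of a random sample approximates, but rarely equals, the population mean.

import random
import statistics

random.seed(1)

# Hypothetical population: birth weights (kg) of 10,000 infants
population = [random.gauss(3.2, 0.5) for _ in range(10_000)]

# A random sample of 100 infants drawn from that population
sample = random.sample(population, 100)

print("Population mean (parameter):", round(statistics.mean(population), 3))
print("Sample mean (statistic):    ", round(statistics.mean(sample), 3))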
Population Parameters
A parameter is a value, usually unknown (and which therefore has to be estimated), used to represent a certain population characteristic. For example, the population mean is a parameter that is often used to indicate the average value of a quantity. Within a population, a parameter is a fixed value which does not vary. Each sample drawn from the population has its own value of any statistic that is used to estimate this parameter. For example, the mean of the data in a sample is used to give information about the overall mean in the population from which that sample was drawn.
MEASURES OF CENTRAL LOCATION (OR CENTRAL TENDENCY)
The most important objective of statistical analysis is to determine a single value for the entire mass of data, which describes the overall level of the group of observations and can be called a representative of the data. It tells us where the centre of the distribution of data is located on the scale that we are using. There are several such measures, but we shall discuss only those that are most commonly used. These are: the Arithmetic Mean, the Mode and the Median. These values are very useful not only in presenting the overall picture of the entire data, but also for the purpose of making comparisons among two or more sets of data.
As an example, questions like "How hot is the month of June in Mumbai?" can be answered, generally, by a single figure of the average temperature for that month. For the purpose of comparison, suppose that we want to find out if boys and girls at the age of 10 differ in height. By taking the average height of boys of that age and the average height of girls of the same age, we can compare and note the difference.
While the arithmetic mean is the most commonly used measure of central location, the mode and median are more suitable measures under certain sets of conditions and for certain types of data. However, all measures of central tendency should meet the following requisites:
· It should be easy to calculate and understand.
· It should be rigidly defined. It should have one and only one interpretation, so that the personal prejudice or bias of the investigator does not affect the value or its usefulness.
· It should be representative of the data. If it is calculated from a sample, then the sample should be random enough to accurately represent the population.
· It should have sampling stability. It should not be affected by sampling fluctuations. This means that if we pick 10 different groups of college students at random and we compute the average of each group, then we should expect to get approximately the same value from each of these groups.
· It should not be affected much by extreme values. If a few very small or very large items are present in the data, they will unduly influence the value of the average by shifting it to one side or the other, and hence the average would not be
really typical of the entire series. Hence, the average chosen should be such that it is not unduly influenced by extreme values.
Let us consider these three measures of central tendency:
MODE
In statistics, the mode is the most frequent value assumed by a random variable, or occurring in a sampling of a random variable. The term is applied both to probability distributions and to collections of experimental data. Like the statistical mean and the median, the mode is a way of capturing important information about a random variable or a population in a single quantity. The mode is in general different from the mean and median, and may be very different for strongly skewed distributions. The mode is not necessarily unique, since the same maximum frequency may be attained at different values. The worst case is given by so-called uniform distributions, in which all values are equally likely.
Mode of a probability distribution
The mode of a probability distribution is the value at which its probability density function attains its maximum value; so, informally speaking, the mode is at the peak.
Mode of a sample
The mode of a data sample is the element that occurs most often in the collection. For example, the mode of the sample [1, 3, 6, 6, 6, 6, 7, 7, 12, 12, 17] is 6. Given the list of data [1, 1, 2, 4, 4] the mode is not unique. For a sample from a continuous distribution, such as [0.935..., 1.211..., 2.430..., 3.668..., 3.874...], the concept is unusable in its raw form, since each value will occur precisely once. The usual practice is to discretise the data by assigning the values to equidistant intervals, as for making a histogram, effectively replacing the values by the midpoints of the intervals they are assigned to. The mode is then the value where the histogram reaches its peak. For small or middle-sized samples the outcome of this procedure is sensitive to the choice of interval width if chosen too narrow or too wide; typically one should have a sizable fraction of the data concentrated in a relatively small number of intervals (5 to 10), while the fraction of the data falling outside these intervals is also sizable.
Comparison of mean, median and mode
For a probability distribution, the mean is also called the expected value of the random variable. For a data sample, the mean is also called the average.
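Taking the sample [1, 3, 6, 6, 6, 6, 7, 7, 12, 12, 17] quoted above, the three measures can be compared directly; the short sketch below uses Python's standard statistics module.

import statistics

data = [1, 3, 6, 6, 6, 6, 7, 7, 12, 12, 17]

print(statistics.mean(data))    # about 7.55
print(statistics.median(data))  # 6
print(statistics.mode(data))    # 6, the most frequently occurring value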
When do these measures make sense?
Unlike the mean and median, the concept of mode also makes sense for "nominal data" (i.e., data not consisting of numerical values). For example, taking a sample of Korean family names, one might find that "Kim" occurs more often than any other name. Then "Kim" might be called the mode of the sample. However, this use is not common.
Unlike the median, the concept of mean makes sense for any random variable assuming values from a vector space, including the real numbers (a one-dimensional vector space) and the integers (which can be considered embedded in the real numbers). For example, a distribution of points in the plane will typically have a mean and a mode, but the concept of median does not apply. The median makes sense when there is a linear order on the possible values.
Uniqueness and Definedness
For the remainder, the assumption is that we have (a sample of) a real-valued random variable. For some probability distributions, the expected value may be infinite or undefined, but if defined, it is unique. The average of a (finite) sample is always defined. The median is the value such that the fractions not exceeding it and not falling below it are both at least 1/2. It is not necessarily unique, but never infinite or totally undefined. For a data sample it is the "halfway" value when the list of values is ordered in increasing value; usually, for a list of even length, the numerical average is taken of the two values closest to "halfway". Finally, as said before, the mode is not necessarily unique. Furthermore, like the mean, the mode of a probability distribution can be (plus or minus) infinity, but unlike the mean it cannot be just undefined. For a finite data sample, the mode is one (or more) of the values in the sample and is itself then finite.
Properties
Assuming definedness, and for simplicity uniqueness, the following are some of the most interesting properties. All three measures have the following property: if the random variable (or each value from the sample) is subjected to the linear or affine transformation which replaces X by aX + b, so are the mean, median and mode. However, if there is an arbitrary monotonic transformation, only the median follows; for example, if X is replaced by exp(X), the median changes from m to exp(m), but the mean and mode won't.
Except for extremely small samples, the median is totally insensitive to "outliers" (such as occasional, rare, false experimental readings). The mode is also very robust in the presence of outliers, while the mean is rather sensitive.
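This differing sensitivity is easy to demonstrate on a small, hypothetical data set: adding a single extreme reading shifts the mean substantially while leaving the median almost unchanged.

import statistics

data = [2, 3, 3, 4, 5]
with_outlier = data + [100]   # one extreme (perhaps erroneous) reading

print(statistics.mean(data), statistics.median(data))                  # 3.4  3
print(statistics.mean(with_outlier), statistics.median(with_outlier))  # 19.5 3.5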
In continuous unimodal distributions the median lies, as a rule of thumb, between the mean and the mode, about one third of the way going from mean to mode. In a formula: median ≈ (2 × mean + mode) / 3. This rule, due to Karl Pearson, is however not a hard and fast rule. It applies to distributions that resemble a normal distribution.
Example for a skewed distribution
A well-known example of a skewed distribution is personal wealth: few people are very rich, but among those some are excessively rich. However, many are rather poor.
A well-known class of distributions that can be arbitrarily skewed is given by the lognormal distribution. It is obtained by transforming a random variable X having a normal distribution into the random variable Y = exp(X). Then the logarithm of the random variable Y is normally distributed, whence the name. Taking the mean μ of X to be 0, the median of Y will be 1, independent of the standard deviation σ of X. This is so because X has a symmetric distribution, so its median is also 0. The transformation from X to Y is monotonic, and so we find the median exp(0) = 1 for Y.
When X has standard deviation σ = 0.2, the distribution of Y is not very skewed. We find (see under Log-normal distribution), with values rounded to four digits:
Mean = 1.0202
Mode = 0.9608
Indeed, the median is about one third of the way from mean to mode.
When X has a much larger standard deviation, σ = 2, the distribution of Y is strongly skewed. Now
Mean = 7.3891
Mode = 0.0183
Here, Pearson's rule of thumb fails miserably.
MEDIAN
In probability theory and statistics, a median is a number dividing the higher half of a sample, a population, or a probability distribution from the lower half. The median of a finite list of numbers can be found by arranging all the observations from lowest value to highest value and picking the middle one. If there is an even number of observations, one often takes the mean of the two middle values.
At most half the population has values less than the median, and at most half have values greater than the median. If both groups contain less than half the population, then some of the population is exactly equal to the median.
Popular explanation
The difference between the median and the mean is illustrated in a simple example. Suppose 19 paupers and one billionaire are in a room. Everyone removes all money from their pockets and puts it on a table. Each pauper puts $5 on the table; the billionaire puts $1 billion (that is, $10^9) there. The total is then $1,000,000,095. If that money is divided equally among the 20 persons, each gets $50,000,004.75. That amount is the mean (or "average") amount of money that the 20 persons brought into the room. But the median amount is $5, since one may divide the group into two groups of 10 persons each, and say that everyone in the first group brought in no more than $5, and each person in the second group brought in no less than $5. In a sense, the median is the amount that the typical person brought in. By contrast, the mean (or "average") is not at all typical, since no one present - pauper or billionaire - brought in an amount approximating $50,000,004.75.
Non-uniqueness
There may be more than one median. For example, if there is an even number of cases and the two middle values are different, then there is no unique middle value. Notice, however, that at least half the numbers in the list are less than or equal to either of the two middle values, and at least half are greater than or equal to either of the two values, and the same is true of any number between the two middle values. Thus either of the two middle values and all numbers between them are medians in that case.
Measures of statistical dispersion
When the median is used as a location parameter in descriptive statistics, there are several choices for a measure of variability: the range, the inter-quartile range, and the absolute deviation. Since the median is the same as the second quartile, its calculation proceeds in the same way as that of the quartiles. To obtain the median of an even number of numbers, find the average of the two middle terms.
Medians of particular distributions
The median of a normal distribution with mean μ and variance σ² is μ. In fact, for a normal distribution, mean = median = mode. The median of a uniform distribution on the interval [a, b] is (a + b) / 2, which is also the mean.
Medians in descriptive statistics
The median is primarily used for skewed distributions, which it summarizes differently than the arithmetic mean. Consider the multiset {1, 2, 2, 2, 3, 9}. The median is 2 in this case, as is the mode, and it might be seen as a better indication of central tendency than the arithmetic mean of 3.166...
Calculation of medians is a popular technique in summary statistics and in summarizing statistical data, since it is simple to understand and easy to calculate, while also giving a measure that is more robust in the presence of outlier values than is the mean.
MEAN
In statistics, mean has two related meanings:
· the average in ordinary English, which is also called the arithmetic mean (and is distinguished from the geometric mean or harmonic mean); for a data set this is also called the sample mean;
· the expected value of a random variable, which is also called the population mean.
Beyond statistics, means are often used in geometry and analysis; a wide range of means have been developed for these purposes, which are not much used in statistics.
The sample mean is often used as an estimator of central tendency, such as the population mean. However, other estimators are also used. For a real-valued random variable X, the mean is the expectation of X. If the expectation does not exist, then the random variable has no mean. For a data set, the mean is just the sum of all the observations divided by the number of observations.
Once we have chosen this method of describing the central tendency of a data set, we usually use the standard deviation to describe how the observations differ. The standard deviation is the square root of the average of squared deviations from the mean. The mean is the unique value about which the sum of squared deviations is a minimum. If you calculate the sum of squared deviations from any other measure of central tendency, it will be larger than for the mean. This explains why the standard deviation and the mean are usually cited together in statistical reports. An alternative measure of dispersion is the mean deviation, equivalent to the average absolute deviation from the mean. It is less sensitive to outliers, but less tractable when combining data sets.
Arithmetic Mean
The arithmetic mean is the "standard" average, often simply called the "mean".
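In symbols, for n observations x_1, x_2, ..., x_n the arithmetic mean is

\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{x_1 + x_2 + \cdots + x_n}{n}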
The mean may often be confused with the median or mode. The mean is the arithmetic average of a set of values, or distribution; however, for skewed distributions, the mean is not necessarily the same as the middle value (median) or the most likely value (mode). For example, mean income is skewed upwards by a small number of people with very large incomes, so that the majority have an income lower than the mean. By contrast, the median income is the level at which half the population is below and half is above. The mode income is the most likely income, and favors the larger number of people with lower incomes. The median or mode are often more intuitive measures for such data. That said, many skewed distributions are best described by their mean - such as the Exponential and Poisson distributions.
An amusing example…
Most people have an above-average number of legs. The mean number of legs is going to be less than 2, because there are people with one leg, people with no legs, and no people with more than two legs. So, since most people have two legs, they have an above-average number.
Geometric Mean
The geometric mean is an average that is useful for sets of numbers that are interpreted according to their product and not their sum (as is the case with the arithmetic mean) - for example, rates of growth.
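For n positive values x_1, x_2, ..., x_n the geometric mean is defined as

\text{GM} = \left( \prod_{i=1}^{n} x_i \right)^{1/n} = \sqrt[n]{x_1 x_2 \cdots x_n}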
For example, the geometric mean of 34, 27, 45, 55, 22, 34 (six values) is (34 × 27 × 45 × 55 × 22 × 34)^(1/6) = (1,699,493,400)^(1/6) ≈ 34.545.
Harmonic Mean
The harmonic mean is an average which is useful for sets of numbers which are defined in relation to some unit, for example speed (distance per unit of time).
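For n non-zero values x_1, x_2, ..., x_n the harmonic mean is defined as

\text{HM} = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}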
An example…
An experiment yields the following data: 34, 27, 45, 55, 22, 34, and we need to find their harmonic mean. The number of items is 6, therefore n = 6. The value of the denominator in the formula, the sum of the reciprocals of the data, is 0.181719152307. The reciprocal of this value is 5.50299727522. Multiplying this by n gives the harmonic mean as 33.0179836513.
Weighted Arithmetic Mean
The weighted arithmetic mean is used if one wants to combine average values from samples of the same population with different sample sizes:
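\bar{x} = \frac{\sum_{i=1}^{n} \omega_i x_i}{\sum_{i=1}^{n} \omega_i}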
The weights ωi represent the sizes of the partial samples. In other applications, they represent a measure of the reliability of the influence of the respective values upon the mean.
SUMMARY
This chapter has given the meaning of population parameters. The procedures for measuring the above population parameters are dealt with in detail in the chapter.
KEY TERMS
· Population parameters
· Mean, Mode and Median
· Arithmetic mean
· Geometric mean
· Harmonic mean
· Skewed distribution
IMPORTANT QUESTIONS
1. Explain the methods to measure the Median, Mode and Mean.
2. What are the different types of Means?
- End of Chapter -
LESSON – 6 HYPOTHESIS TESTING
OBJECTIVES
· To learn the process of Hypothesis Testing
· To find out the details of the Null hypothesis and the Alternative hypothesis
· To learn the precise meaning of the probability value
· To understand Type I and Type II errors
· To gain knowledge of various methods to test hypotheses
STRUCTURE
· Hypothesis tests
· Statistical and practical significance
· One and two tailed tests
· Type I and Type II errors
A statistical hypothesis is an assumption about a population parameter. This assumption may or may not be true. The best way to determine whether a statistical hypothesis is true is to examine the entire population. Since this is often impractical, researchers typically examine a random sample from the population. If the sample data are consistent with the statistical hypothesis, the hypothesis is accepted; if not, the hypothesis is rejected. There are two types of statistical hypotheses:
· Null hypothesis: usually the hypothesis that sample observations result purely from chance effects.
· Alternative hypothesis: the hypothesis that sample observations are influenced by some non-random cause.
For example, suppose we wanted to determine whether a coin was fair and balanced. A null hypothesis might be that half the flips would result in Heads and half in Tails. The alternative hypothesis might be that the number of Heads and Tails would be very
different. Suppose we flipped the coin 50 times, resulting in 40 Heads and 10 Tails. Given this result, we would be inclined to reject the null hypothesis and accept the alternative hypothesis.
HYPOTHESIS TESTS
Statisticians follow a formal process to determine whether to accept or reject a null hypothesis, based on sample data. This process, called hypothesis testing, consists of four steps.
1. Formulate the hypotheses. This involves stating the null and alternative hypotheses. The hypotheses are stated in such a way that they are mutually exclusive. That is, if one is true, the other must be false, and vice versa.
2. Identify the test statistic. This involves specifying the statistic (e.g., a mean score, proportion) that will be used to assess the validity of the null hypothesis.
3. Formulate a decision rule. A decision rule is a procedure that the researcher uses to decide whether to accept or reject the null hypothesis.
4. Accept or reject the null hypothesis. Use the decision rule to evaluate the test statistic. If the statistic is consistent with the null hypothesis, accept the null hypothesis; otherwise, reject the null hypothesis.
This section provides an introduction to hypothesis testing. Basic analysis involves some hypothesis testing. Examples of hypotheses generated in marketing research abound:
· The department store is being patronized by more than 10 percent of the households.
· The heavy and light users of a brand differ in terms of psychographic characteristics.
· One hotel has a more upscale image than its close competitor.
· Familiarity with a restaurant results in greater preference for that restaurant.
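Returning to the coin example above, the four-step process can be made concrete with a small calculation. The sketch below is only illustrative: it computes, under the null hypothesis of a fair coin (p = 0.5), the exact two-tailed probability of a result at least as extreme as 40 heads in 50 flips, using nothing beyond the binomial distribution.

from math import comb

n, heads = 50, 40

# P(X >= 40) when X ~ Binomial(50, 0.5), i.e. assuming the null hypothesis is true
p_upper = sum(comb(n, k) for k in range(heads, n + 1)) / 2 ** n

# The distribution is symmetric under p = 0.5, so the two-tailed value doubles it
p_two_tailed = 2 * p_upper

print(p_two_tailed)   # a very small value, far below 0.05, so H0 would be rejected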
The null hypothesis is a hypothesis about a population parameter. The purpose of hypothesis testing is to test the viability of the null hypothesis in the light of experimental data. Depending on the data, the null hypothesis either will or will not be rejected as a viable possibility. Consider a researcher interested in whether the time to respond to a tone is affected by the consumption of alcohol. The null hypothesis is that μ1 - μ2 = 0, where μ1 is the mean time to respond after consuming alcohol and μ2 is the mean time to respond otherwise. Thus, the null hypothesis concerns the parameter μ1 - μ2 and the null hypothesis is that the parameter equals zero. The null hypothesis is often the reverse of what the experimenter actually believes; it is put forward to allow the data to contradict it. In the experiment on the effect of alcohol,
the experimenter probably expects alcohol to have a harmful effect. If the experimental data show a sufficiently large effect of alcohol, then the null hypothesis that alcohol has no effect can be rejected. It should be stressed that researchers very frequently put forward a null hypothesis in the hope that they can discredit it.
For a second example, consider an educational researcher who designed a new way to teach a particular concept in science, and wanted to test experimentally whether this new method worked better than the existing method. The researcher would design an experiment comparing the two methods. Since the null hypothesis would be that there is no difference between the two methods, the researcher would be hoping to reject the null hypothesis and conclude that the method he or she developed is the better of the two. The symbol H0 is used to indicate the null hypothesis. For the example just given, the null hypothesis would be designated by the following symbols:
H0: μ1 - μ2 = 0, or by H0: μ1 = μ2
The null hypothesis is typically a hypothesis of no difference, as in this example where it is the hypothesis of no difference between population means. That is why the word "null" in "null hypothesis" is used - it is the hypothesis of no difference. Despite the "null" in "null hypothesis", there are occasions when the parameter is not hypothesized to be 0. For instance, it is possible for the null hypothesis to be that the difference between population means is a particular value. Or, the null hypothesis could be that the mean SAT score in some population is 600. The null hypothesis would then be stated as: H0: μ = 600.
Although the null hypotheses discussed so far have all involved the testing of hypotheses about one or more population means, null hypotheses can involve any parameter. An experiment investigating the correlation between job satisfaction and performance on the job would test the null hypothesis that the population correlation (ρ) is 0. Symbolically, H0: ρ = 0. Some possible null hypotheses are given below:
H0: μ = 0
H0: μ = 10
H0: μ1 - μ2 = 0
H0: π = 0.5
H0: π1 - π2 = 0
H0: μ1 = μ2 = μ3
H0: ρ1 – ρ2 = 0
Steps in hypothesis testing
1. The first step in hypothesis testing is to specify the null hypothesis (H0) and the alternative hypothesis (H1). If the research concerns whether one method of presenting pictorial stimuli leads to better recognition than another, the null hypothesis would most likely be that there is no difference between methods (H0: μ1 - μ2 = 0). The alternative hypothesis would be H1: μ1 ≠ μ2. If the research concerned the correlation between grades and SAT scores, the null hypothesis would most likely be that there is no correlation (H0: ρ = 0). The alternative hypothesis would be H1: ρ ≠ 0.
2. The next step is to select a significance level. Typically the 0.05 or the 0.01 level is used.
3. The third step is to calculate a statistic analogous to the parameter specified by the null hypothesis. If the null hypothesis were defined by the parameter μ1 - μ2, then the statistic M1 - M2 would be computed.
4. The fourth step is to calculate the probability value (often called the p-value). The p-value is the probability of obtaining a statistic as different from or more different from the parameter specified in the null hypothesis as the statistic computed from the data. The calculations are made assuming that the null hypothesis is true.
5. The probability value computed in Step 4 is compared with the significance level chosen in Step 2. If the probability is less than or equal to the significance level, then the null hypothesis is rejected; if the probability is greater than the significance level, then the null hypothesis is not rejected. When the null hypothesis is rejected, the outcome is said to be "statistically significant"; when the null hypothesis is not rejected, the outcome is said to be "not statistically significant".
6. If the outcome is statistically significant, then the null hypothesis is rejected in favor of the alternative hypothesis. If the rejected null hypothesis were that μ1 - μ2 = 0, then the alternative hypothesis would be that μ1 ≠ μ2. If M1 were greater than M2, then the researcher would naturally conclude that μ1 > μ2.
7. The final step is to describe the result and the statistical conclusion in an understandable way. Be sure to present the descriptive statistics as well as whether the effect was significant or not. For example, a significant difference between a group that received a drug and a control group might be described as follows: Subjects in the drug group scored significantly higher (M = 23) than did subjects in the control group (M = 17), t(18) = 2.4, p = 0.027. The statement that "t(18) = 2.4" has to do with how the probability value (p) was calculated.
A small minority of researchers might object to two aspects of this wording. First, some believe that the significance level rather than the probability level should be reported. The argument for reporting the probability value is presented in another section. Second, since the alternative hypothesis was stated as μ1 ≠ μ2, some might argue that it can only be concluded that the population means differ, and not that the population mean for the drug group is higher than the population mean for the control group.
This argument is misguided. Intuitively, there are strong reasons for inferring that the direction of the difference in the population is the same as the difference in the sample. There is also a more formal argument. A non-significant effect might be described as follows: Although subjects in the drug group scored higher (M = 23) than did subjects in the control group (M = 20), the difference between means was not significant, t(18) = 1.4, p = 0.179. It would not have been correct to say that there was no difference between the performances of the two groups. There was a difference. It is just that the difference was not large enough to rule out chance as an explanation of the difference. It would also have been incorrect to imply that there is no difference in the population. Be sure not to accept the null hypothesis.
The Precise Meaning of the Probability Value
There is often confusion about the precise meaning of the probability computed in a significance test. As stated in Step 4 of the steps in hypothesis testing, the null hypothesis (H0) is assumed to be true. The difference between the statistic computed in the sample and the parameter specified by H0 is computed and the probability of obtaining a difference this large or larger is calculated. This probability value is the probability of obtaining data as extreme or more extreme than the current data (assuming H0 is true). It is not the probability of the null hypothesis itself. Thus, if the probability value is 0.005, this does not mean that the probability that the null hypothesis is true is 0.005. It means that the probability of obtaining data as different or more different from the null hypothesis as those obtained in the experiment is 0.005.
The inferential step to conclude that the null hypothesis is false goes as follows: the data (or data more extreme) are very unlikely given that the null hypothesis is true. This means that: (1) a very unlikely event occurred, or (2) the null hypothesis is false. The inference usually made is that the null hypothesis is false.
To illustrate that the probability is not the probability of the hypothesis, consider a test of a person who claims to be able to predict whether a coin will come up heads or tails. One should take a rather skeptical attitude toward this claim and require strong evidence to believe in its validity. The null hypothesis is that the person can predict correctly half the time (H0: π = 0.5). In the test, a coin is flipped 20 times and the person is correct 11 times. If the person has no special ability (H0 is true), then the probability of being correct 11 or more times out of 20 is 0.41. Would someone who was originally skeptical now believe that there is only a 0.41 chance that the null hypothesis is true? They almost certainly would not, since they probably originally thought H0 had a very high probability of being true (perhaps as high as 0.9999). There is no logical reason for them to decrease their belief in the validity of the null hypothesis, since the outcome was perfectly consistent with the null hypothesis. The proper interpretation of the test is as follows:
A person made a rather extraordinary claim and should be able to provide strong evidence in support of the claim if the claim is to be believed. The test provided data consistent with the null hypothesis that the person has no special ability, since a person with no special ability would be able to predict as well or better more than 40% of the time. Therefore, there is no compelling reason to believe the extraordinary claim. However, the test does not prove the person cannot predict better than chance; it simply fails to provide evidence that he or she can. The probability that the null hypothesis is true is not determined by the statistical analysis conducted as part of hypothesis testing. Rather, the probability computed is the probability of obtaining data as different or more different from the null hypothesis (given that the null hypothesis is true) as the data actually obtained.
According to one view of hypothesis testing, the significance level should be specified before any statistical calculations are performed. Then, when the probability (p) is computed from a significance test, it is compared with the significance level. The null hypothesis is rejected if p is at or below the significance level; it is not rejected if p is above the significance level. The degree to which p ends up being above or below the significance level does not matter. The null hypothesis either is or is not rejected at the previously stated significance level. Thus, if an experimenter originally stated that he or she was using the 0.05 significance level and p was subsequently calculated to be 0.042, then the person would reject the null hypothesis at the 0.05 level. If p had been 0.0001 instead of 0.042, then the null hypothesis would still be rejected at the 0.05 level. The experimenter would not have any basis to be more confident that the null hypothesis was false with a p of 0.0001 than with a p of 0.042. Similarly, if the p had been 0.051, then the experimenter would fail to reject the null hypothesis. He or she would have no more basis to doubt the validity of the null hypothesis than if p had been 0.482. The conclusion would be that the null hypothesis could not be rejected at the 0.05 level. In short, this approach is to specify the significance level in advance and use p only to determine whether or not the null hypothesis can be rejected at the stated significance level.
Many statisticians and researchers find this approach to hypothesis testing not only too rigid, but basically illogical. Who in their right mind would not have more confidence that the null hypothesis is false with a p of 0.0001 than with a p of 0.042? The less likely the obtained results (or more extreme results) under the null hypothesis, the more confident one should be that the null hypothesis is false. The null hypothesis should not be rejected once and for all. The possibility that it was falsely rejected is always present, and, all else being equal, the lower the p-value, the lower this possibility.
Statistical and Practical Significance
It is important not to confuse the confidence with which the null hypothesis can be rejected with the size of the effect. To make this point concrete, consider a researcher assigned the task of determining whether the video display used by travel agents for booking airline reservations should be in color or in black and white. Market research had shown that travel agencies were primarily concerned with the speed with which
reservations can be made. Therefore, the question was whether color displays allow travel agents to book reservations faster. Market research had also shown that, in order to justify the higher price of color displays, they must be faster by an average of at least 10 seconds per transaction. Fifty subjects were tested with color displays and 50 subjects were tested with black and white displays. Subjects were slightly faster at making reservations on a color display (M = 504.7 seconds) than on a black and white display (M = 508.2 seconds). Although the difference is small, it was statistically significant at the 0.05 significance level. Box plots of the data are shown below.
The 95% confidence interval on the difference between means is:
-7.0 < μ colour – μ black & white ≤ -0.1
which means that the experimenter can be confident that the color display is between 0.1 and 7.0 seconds faster. Clearly, the difference is not big enough to justify the more expensive color displays. Even the upper limit of the 95% confidence interval (seven seconds) is below the minimum needed to justify the cost (10 seconds). Therefore, the experimenter could feel confident in his or her recommendation that the black and white displays should be used. The fact that the color displays were significantly faster does not mean that they were much faster. It just means that the experimenter can reject the null hypothesis that there is no difference between the displays.
The experimenter presented this conclusion to management, but management did not accept it. The color displays were so dazzling that, despite the statistical analysis, they could not believe that color did not improve performance by at least 10 seconds. The experimenter decided to do the experiment again, this time using 100 subjects for each type of display. The results of the second experiment were very similar to the first. Subjects were slightly faster at making reservations on a color display (M = 504.7 seconds) than on a black and white display (M = 508.1 seconds). This time the difference was significant at the 0.01 level rather than the 0.05 level found in the first experiment. Despite the fact that the size of the difference between means was no larger,
the difference was "more significant" due to the larger sample size used. If the population difference is zero, then a sample difference of 3.4 or larger with a sample size of 100 is less likely than a sample difference of 3.5 or larger with a sample size of 50.
The 95% confidence interval on the difference between means is:
-5.8 < μ colour – μ black & white ≤ -0.9
and the 99% interval is:
-6.6 < μ colour – μ black & white ≤ -0.1
Therefore, despite the finding of a "more significant" difference between means, the experimenter can be even more certain that the color displays are only slightly better than the black and white displays. The second experiment shows conclusively that the difference is less than 10 seconds.
This example was used to illustrate the following points: (1) an effect that is statistically significant is not necessarily large enough to be of practical significance, and (2) the smaller of two effects can be "more significant" than the larger. Be careful how you interpret findings reported in the media. If you read that a particular diet lowered cholesterol significantly, this does not necessarily mean that the diet lowered cholesterol enough to be of any health value. It means that the effect on cholesterol in the population is greater than zero.
TYPE I AND II ERRORS
There are two kinds of errors that can be made in significance testing: (1) a true null hypothesis can be incorrectly rejected, and (2) a false null hypothesis can fail to be rejected. The former error is called a Type I error and the latter error is called a Type II error. These two types of errors are summarised in the table below.
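Decision              H0 is true            H0 is false
Reject H0             Type I error (α)      Correct decision
Do not reject H0      Correct decision      Type II error (β)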
The probability of a Type I error is designated by the Greek letter alpha (α) and is called the Type I error rate; the probability of a Type II error is designated by the Greek letter beta (β) and is called the Type II error rate. A Type II error is only an error in the sense that an opportunity to reject the null hypothesis correctly was lost. It is not an error in the sense that an incorrect conclusion was drawn, since no conclusion is drawn when the null hypothesis is not rejected. A Type I error, on the other hand, is an error in every sense of the word. A conclusion is drawn that the null hypothesis is false when, in fact, it is true. Therefore, Type I errors are generally considered more serious than Type II errors. The probability of a Type I error (α) is called the significance level and is set by the experimenter. There is a tradeoff between Type I and Type II errors. The more an experimenter protects himself or herself against Type I errors by choosing a low α level, the greater the chance of a Type II error. Requiring very strong evidence to reject the null hypothesis makes it very unlikely that a true null hypothesis will be rejected. However, it increases the chance that a false null hypothesis will not be rejected, thus lowering power. The Type I error rate is almost always set at 0.05 or at 0.01, the latter being more conservative since it requires stronger evidence to reject the null hypothesis at the 0.01 level than at the 0.05 level.
One and Two Tailed Tests
In the section on "Steps in hypothesis testing", the fourth step involves calculating the probability that a statistic would differ as much or more from the parameter specified in the null hypothesis as does the statistic obtained in the experiment. This statement implies that a difference in either direction would be counted. That is, if the null hypothesis were H0: μ1 - μ2 = 0, and the value of the statistic M1 - M2 were +5, then the probability of M1 - M2 differing from zero by five or more (in either direction) would be computed. In other words, the probability value would be the probability that either M1 - M2 ≥ 5 or M1 - M2 ≤ -5. Assume that the figure shown below is the sampling distribution of M1 - M2.
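Purely for illustration, suppose the sampling distribution of M1 - M2 is normal with mean 0 and a standard error of about 2.78 (a value assumed here only because it reproduces the 0.036 tail probability quoted in the next paragraph). Under that assumption, the one-tailed and two-tailed probabilities can be computed directly:

from math import erfc, sqrt

mean, se, cutoff = 0.0, 2.78, 5.0   # assumed sampling distribution of M1 - M2

# Upper-tail probability P(M1 - M2 >= 5) for a normal distribution
p_upper = 0.5 * erfc((cutoff - mean) / (se * sqrt(2)))
p_two_tailed = 2 * p_upper          # both tails, by symmetry

print(round(p_upper, 3))       # about 0.036
print(round(p_two_tailed, 3))  # about 0.072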
The figure shows that the probability of a value of +5 or more is 0.036 and that the probability of a value of -5 or less is 0.036. Therefore the probability of a value either greater than or equal to +5 or less than or equal to -5 is 0.036 + 0.036 = 0.072. A probability computed considering differences in both directions is called a "two-tailed" probability. The name makes sense since both tails of the sampling distribution are considered.
There are situations in which an experimenter is concerned only with differences in one direction. For example, an experimenter may be concerned with whether or not μ1 - μ2 is greater than zero. However, if μ1 - μ2 is not greater than zero, the experimenter may not care whether it equals zero or is less than zero. For instance, if a new drug treatment is developed, the main issue is whether or not it is better than a placebo. If the treatment is not better than a placebo, then it will not be used. It does not really matter whether or not it is worse than the placebo. When only one direction is of concern to an experimenter, then a "one-tailed" test can be performed. If an experimenter were only concerned with whether or not μ1 - μ2 is greater than zero, then the one-tailed test would involve calculating the probability of obtaining a statistic as great as or greater than the one obtained in the experiment. In the example, the one-tailed probability would be the probability of obtaining a value of M1 - M2 greater than or equal to five, given that the difference between population means is zero.
The shaded area in the figure is the region greater than five. The figure shows that the one-tailed probability is 0.036. It is easier to reject the null hypothesis with a one-tailed than with a two-tailed test as long as the effect is in the specified direction. Therefore, one-tailed tests have lower Type II error rates and more power than do two-tailed tests. In this example, the one-tailed probability (0.036) is below the conventional significance level of 0.05, whereas the two-tailed probability (0.072) is not. Probability values for one-tailed tests are one half the value for two-tailed tests as long as the effect is in the specified direction. One-tailed and two-tailed tests have the same Type I error rate. One-tailed tests are sometimes used when the experimenter predicts the direction of the effect in advance. This use of one-tailed tests is questionable because the experimenter can only reject the
null hypothesis if the effect is in the predicted direction. If the effect is in the other direction, then the null hypothesis cannot be rejected no matter how strong the effect is. A skeptic might question whether the experimenter would really fail to reject the null hypothesis if the effect were strong enough in the wrong direction. Frequently the most interesting aspect of an effect is that it runs counter to expectations. Therefore, an experimenter who committed himself or herself to ignoring effects in one direction may be forced to choose between ignoring a potentially important finding and using the techniques of statistical inference dishonestly. One-tailed tests are not used frequently. Unless otherwise indicated, a test should be assumed to be two-tailed.
Confidence Intervals & Hypothesis Testing
There is an extremely close relationship between confidence intervals and hypothesis testing. When a 95% confidence interval is constructed, all values in the interval are considered plausible values for the parameter being estimated. Values outside the interval are rejected as relatively implausible. If the value of the parameter specified by the null hypothesis is contained in the 95% interval, then the null hypothesis cannot be rejected at the 0.05 level. If the value specified by the null hypothesis is not in the interval, then the null hypothesis can be rejected at the 0.05 level. If a 99% confidence interval is constructed, then values outside the interval are rejected at the 0.01 level.
Imagine a researcher wishing to test the null hypothesis that the mean time to respond to an auditory signal is the same as the mean time to respond to a visual signal. The null hypothesis therefore is: μ visual – μ auditory = 0. Ten subjects were tested in the visual condition and their scores (in milliseconds) were: 355, 421, 299, 460, 600, 580, 474, 511, 550, and 586. Ten subjects were tested in the auditory condition and their scores were: 275, 320, 278, 360, 430, 520, 464, 311, 529, and 326. The 95% confidence interval on the difference between means is:
9 ≤ μ visual – μ auditory ≤ 196
Therefore only values in the interval between 9 and 196 are retained as plausible values for the difference between population means. Since zero, the value specified by the null hypothesis, is not in the interval, the null hypothesis of no difference between auditory and visual presentation can be rejected at the 0.05 level. The probability value for this example is 0.034. Any time the parameter specified by a null hypothesis is not contained in the 95% confidence interval estimating that parameter, the null hypothesis can be rejected at the 0.05 level or less. Similarly, if the 99% interval does not contain the parameter, then the null hypothesis can be rejected at the 0.01 level. The null hypothesis is not rejected if the parameter value specified by the null hypothesis is in the interval, since the null hypothesis would still be plausible. However, since the null hypothesis would be only one of an infinite number of values in the confidence interval, accepting the null hypothesis is not justified. There are many
arguments against accepting the null hypothesis when it is not rejected. The null hypothesis is usually a hypothesis of no difference. Thus null hypotheses such as:
μ1 - μ2 = 0
π1 - π2 = 0
in which the hypothesized value is zero, are most common. When the hypothesized value is zero, there is a simple relationship between hypothesis testing and confidence intervals: if the interval contains zero, then the null hypothesis cannot be rejected at the stated level of confidence; if the interval does not contain zero, then the null hypothesis can be rejected. This is just a special case of the general rule stating that the null hypothesis can be rejected if the interval does not contain the hypothesized value of the parameter, and cannot be rejected if the interval contains the hypothesized value.
When zero is contained in the interval, the null hypothesis that μ1 - μ2 = 0 cannot be rejected at the 0.05 level, since zero is then one of the plausible values of μ1 - μ2. Such an interval contains both positive and negative numbers, and therefore μ1 may be either larger or smaller than μ2. None of the three possible relationships between μ1 and μ2 (μ1 - μ2 = 0, μ1 - μ2 > 0, and μ1 - μ2 < 0) can be ruled out. The data are very inconclusive. Whenever a significance test fails to reject the null hypothesis, the direction of the effect (if there is one) is unknown.
Now, consider the 95% confidence interval:
6 < μ1 - μ2 ≤ 15
Since zero is not in the interval, the null hypothesis that μ1 - μ2 = 0 can be rejected at the 0.05 level. Moreover, since all the values in the interval are positive, the direction of the effect can be inferred: μ1 > μ2. Whenever a significance test rejects the null hypothesis that a parameter is zero, the confidence interval on that parameter will not contain zero. Therefore either all the values in the interval will be positive or all the values in the interval will be negative. In either case, the direction of the effect is known.
Define the Decision Rule and the Region of Acceptance
The decision rule consists of two parts: (1) a test statistic, and (2) a range of values, called the region of acceptance. The decision rule determines whether a null hypothesis is accepted or rejected. If the test statistic falls within the region of acceptance, the null hypothesis is accepted; otherwise, it is rejected. We define the region of acceptance in such a way that the chance of making a Type I error is equal to the significance level. Here is how that is done:
♦ Given the significance level α, find the upper limit (UL) of the region of acceptance. There are three possibilities, depending on the form of the null hypothesis.
i. If the null hypothesis is μ < M: The upper limit of the region of acceptance will be equal to the value for which the cumulative probability of the sampling distribution is equal to one minus the significance level. That is, P(x < UL) = 1 - α.
ii. If the null hypothesis is μ = M: The upper limit of the region of acceptance will be equal to the value for which the cumulative probability of the sampling distribution is equal to one minus the significance level divided by 2. That is, P(x < UL) = 1 - α/2.
iii. If the null hypothesis is μ > M: The upper limit of the region of acceptance is equal to plus infinity.
♦ In a similar way, we find the lower limit (LL) of the region of acceptance. Again, there are three possibilities, depending on the form of the null hypothesis.
i. If the null hypothesis is μ < M: The lower limit of the region of acceptance is equal to minus infinity.
ii. If the null hypothesis is μ = M: The lower limit of the region of acceptance will be equal to the value for which the cumulative probability of the sampling distribution is equal to the significance level divided by 2. That is, P(x < LL) = α/2.
iii. If the null hypothesis is μ > M: The lower limit of the region of acceptance will be equal to the value for which the cumulative probability of the sampling distribution is equal to the significance level. That is, P(x < LL) = α.
The region of acceptance is defined by the range between LL and UL.
Accept or Reject the Null Hypothesis
Once the region of acceptance is defined, the null hypothesis can be tested against sample data. Simply compute the test statistic. In this case, the test statistic is the sample mean. If the sample mean falls within the region of acceptance, the null hypothesis is accepted; if not, it is rejected.
Other Considerations
Other Considerations
When one tests a hypothesis in the real world, other issues may come into play. Here are some suggestions that may be helpful.
♦ You will need to make an assumption about the sampling distribution of the mean score. If the sample is relatively large (i.e., greater than or equal to 30), you can assume, based on the central limit theorem, that the sampling distribution will be roughly normal. On the other hand, if the sample size is small (less than 30) and the population random variable is approximately normally distributed (i.e., has a bell-shaped curve), you can transform the mean score into a t-score, which will have a t-distribution.
♦ Assume that the mean of the sampling distribution is equal to the test value M specified in the null hypothesis.
♦ In some situations, you may need to compute the standard deviation of the sampling distribution, sx. If the standard deviation of the population σ is known, then sx = σ x sqrt[(1/n) - (1/N)], where n is the sample size and N is the population size. If σ is unknown, then sx = s x sqrt[(1/n) - (1/N)], where s is the sample standard deviation.
Example 1
An inventor has developed a new, energy-efficient lawn mower engine. He claims that the engine will run continuously for 5 hours (300 minutes) on a single gallon of regular gasoline. Suppose a random sample of 50 engines is tested. The engines run for an average of 295 minutes, with a standard deviation of 20 minutes. Test the null hypothesis that the mean run time is 300 minutes against the alternative hypothesis that the mean run time is not 300 minutes. Use a 0.05 level of significance.
Solution
There are four steps in conducting a hypothesis test, as described in the previous sections. We work through those steps below.
1. Formulate hypotheses
The first step is to state the null hypothesis and an alternative hypothesis.
Null hypothesis: μ = 300 minutes
Alternative hypothesis: μ ≠ 300 minutes
Note that these hypotheses constitute a two-tailed test. The null hypothesis will be rejected if the sample mean is too big or if it is too small.
2. Identify the test statistic
In this example, the test statistic is the mean run time of the 50 engines in the sample, 295 minutes.
3. Define the decision rule
The decision rule consists of two parts: (1) a test statistic and (2) a range of values, called the region of acceptance. We already know that the test statistic is a sample mean equal to 295. All that remains is to describe the region of acceptance; that is, to define the lower limit and the upper limit of the region. Here is how that is done.
a. Specify the sampling distribution. Since the sample size is large (greater than or equal to 30), we assume that the sampling distribution of the mean is normal, based on the central limit theorem.
b. Define the mean of the sampling distribution. We assume that the mean of the sampling distribution is equal to the mean value that appears in the null hypothesis: 300 minutes.
c. Compute the standard deviation of the sampling distribution. Here the standard deviation of the sampling distribution sx is:
sx = σ x sqrt[(1/n) - (1/N)]
sx = 20 x sqrt[1/50] = 2.83
where s is the sample standard deviation, n is the sample size, and N is the population size. In this example, we assume that the population size N is very large, so the term 1/N is about zero.
d. Find the lower limit of the region of acceptance. Given a two-tailed hypothesis, the lower limit (LL) is the value for which the cumulative probability of the sampling distribution equals the significance level divided by 2. That is, P(x < LL) = α/2 = 0.05/2 = 0.025. Using the normal distribution table with cumulative probability = 0.025, mean = 300, and standard deviation = 2.83, the lower limit is 294.45.
e. Find the upper limit of the region of acceptance. Given a two-tailed hypothesis, the upper limit (UL) is the value for which the cumulative probability of the sampling distribution equals one minus the significance level divided by 2. That is, P(x < UL) = 1 - α/2 = 1 - 0.025 = 0.975. Using the normal distribution table with cumulative probability = 0.975, mean = 300, and standard deviation = 2.83, the upper limit is 305.55.
Thus, we have determined that the region of acceptance is defined by the values between 294.45 and 305.55.
4. Accept or reject the null hypothesis
The sample mean in this example was 295 minutes. This value falls within the region of acceptance. Therefore, we cannot reject the null hypothesis that a new engine runs for 300 minutes on a gallon of gasoline.
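The same calculation can be reproduced with a short Python sketch; the figures are those of Example 1, and the finite-population term 1/N is dropped because N is very large.

```python
from scipy.stats import norm

n, sample_mean, s = 50, 295, 20        # sample data from Example 1
M, alpha = 300, 0.05                   # hypothesized mean and significance level

std_error = s / n ** 0.5               # about 2.83 (the 1/N term is negligible here)
LL = norm.ppf(alpha / 2, loc=M, scale=std_error)       # about 294.45
UL = norm.ppf(1 - alpha / 2, loc=M, scale=std_error)   # about 305.55

if LL <= sample_mean <= UL:
    print("sample mean inside the region of acceptance: cannot reject H0")
else:
    print("sample mean outside the region of acceptance: reject H0")
```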
Example 2
Bon Air Elementary School has 300 students. The principal of the school thinks that the average IQ of students at Bon Air is at least 110. To prove her point, she administers an IQ test to 20 randomly selected students. Among the sampled students, the average IQ is 108 with a standard deviation of 10. Based on these results, should the principal accept or reject her original hypothesis? Assume a significance level of 0.01.
Solution
There are four steps in conducting a hypothesis test, as described in the previous sections. We work through those steps below.
1. Formulate hypotheses. The first step is to state the null hypothesis and an alternative hypothesis.
Null hypothesis: μ ≥ 110
Alternative hypothesis: μ < 110
Note that these hypotheses constitute a one-tailed test. The null hypothesis will be rejected only if the sample mean is too small.
2. Identify the test statistic. In this example, the test statistic is the mean IQ score of the 20 students in the sample. Thus, the test statistic is the mean IQ score of 108.
3. Define the decision rule. The decision rule consists of two parts: (1) a test statistic and (2) a range of values, called the region of acceptance. We already know that the test statistic is a sample mean equal to 108. All that remains is to describe the region of acceptance; that is, to define the lower limit and the upper limit of the region. Here is how that is done.
a. Specify the sampling distribution. Since the sample size is small (less than 30), we assume that the sampling distribution of the mean follows a t-distribution.
b. Define the mean of the sampling distribution. We assume that the mean of the sampling distribution is equal to the mean value that appears in the null hypothesis: 110.
c. Find the lower limit of the region of acceptance. Given a one-tailed hypothesis, the lower limit (LL) is the value for which the cumulative probability of the sampling distribution equals the significance level. That is, P(x < LL) = α = 0.01. The standard deviation of the sampling distribution is s x sqrt[(1/n) - (1/N)] = 10 x sqrt[(1/20) - (1/300)] = 2.16. Using the t-distribution table with cumulative probability = 0.01, population mean = 110, standard deviation = 2.16, and degrees of freedom = 20 - 1 = 19, the lower limit is 104.51. This is the lower limit of our region of acceptance.
d. Find the upper limit of the region of acceptance. Since we have a one-tailed hypothesis in which the null hypothesis states that the mean IQ is at least 110, any large value is consistent with the null hypothesis. Therefore, the upper limit is plus infinity.
Thus, we have determined that the region of acceptance is defined by the values between 104.51 and plus infinity.
4. Accept or reject the null hypothesis. The sample mean in this example was an IQ score of 108. This value falls within the region of acceptance. Therefore, we cannot reject the null hypothesis that the average IQ score of students at Bon Air Elementary is at least 110.
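A minimal Python sketch of this small-sample test, using scipy's t distribution; the figures are those of Example 2.

```python
from math import sqrt
from scipy.stats import t

n, N = 20, 300                      # sample size and population size
sample_mean, s = 108, 10            # sample data from Example 2
M, alpha = 110, 0.01                # hypothesized mean and significance level

std_error = s * sqrt(1 / n - 1 / N)               # about 2.16
LL = t.ppf(alpha, n - 1, loc=M, scale=std_error)  # about 104.5; upper limit is +infinity

print("cannot reject H0" if sample_mean >= LL else "reject H0")
```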
Power of a Hypothesis Test
When we conduct a hypothesis test, we accept or reject the null hypothesis based on sample data. Because of the random nature of sample data, our decision can have four possible outcomes.
· We may accept the null hypothesis when it is true. This decision is correct.
· We may reject the null hypothesis when it is true. This kind of incorrect decision is called a Type I error.
· We may reject the null hypothesis when it is false. This decision is correct.
· We may accept the null hypothesis when it is false. This kind of incorrect decision is called a Type II error.
The probability of committing a Type I error is called the significance level and is denoted by α. The probability of committing a Type II error is called Beta and is denoted by β. The probability of not committing a Type II error is called the power of the test. How to Compute the Power of a Test
When a researcher designs a study to test a hypothesis, he/she should compute the power of the test (i.e., the likelihood of avoiding a Type II error). Here is how to do that: 1. Define the region of acceptance. (The process for defining the region of acceptance is described in the previous three lessons. See, for example, Hypothesis Tests of Mean Score, Hypothesis Tests of Proportion (Large Sample), and Hypothesis Tests of Proportion (Small Sample).) 2. Specify the critical value. The critical value is an alternative to the value specified in the null hypothesis. The difference between the critical value and the value from the null hypothesis is called the effect size. That is, the effect size is equal to the critical value minus the value from the null hypothesis. 3. Compute power. Assume that the true population parameter is equal to the critical value, rather than the value specified in the null hypothesis. Based on that assumption, compute the probability that the sample estimate of the population parameter will fall outside the region of acceptance. That probability is the power of the test. Example 1: Power of the Hypothesis Test of a Mean Score Two inventors have developed a new, energy-efficient lawn mower engine. One inventor says that the engine will run continuously for 5 hours (300 minutes) on a single gallon of regular gasoline. Suppose a random sample of 50 engines is tested. The engines run for an average of 295 minutes, with a standard deviation of 20 minutes. The inventor tests the null hypothesis that the mean run time is 300 minutes against the alternative hypothesis that the mean run time is not 300 minutes, using a 0.05 level of significance. The other inventor says that the new engine will run continuously for only 290 minutes on a gallon of gasoline. Find the power of the test to reject the null hypothesis, if the second inventor is correct. Solution The steps required to compute power are presented below. 1. Define the region of acceptance. Earlier, we showed that the region of acceptance for this problem consists of the values between 294.45 and 305.55. 2. Specify the critical value. The null hypothesis tests the hypothesis that the run time of the engine is 300 minutes. We are interested in determining the probability that the hypothesis test will reject the
null hypothesis, if the true run time is actually 290 minutes. Therefore, the critical value is 290.
Another way to express the critical value is through effect size. The effect size is equal to the critical value minus the hypothesized value. Thus, the effect size is 290 - 300 = -10.
3. Compute power.
The power of the test is the probability of rejecting the null hypothesis, assuming that the true population mean is equal to the critical value. Since the region of acceptance is 294.45 to 305.55, the null hypothesis will be rejected when the sampled run time is less than 294.45 or greater than 305.55. Therefore, we need to compute the probability that the sampled run time will be less than 294.45 or greater than 305.55. To do this, we make the following assumptions:
· The sampling distribution of the mean is normally distributed. (Because the sample size is relatively large, this assumption is justified by the central limit theorem.)
· The mean of the sampling distribution is the critical value, 290.
· The standard deviation of the sampling distribution is 2.83, as computed in the previous lesson.
Given these assumptions, we first assess the probability that the sample run time will be less than 294.45. This is easy to do using the normal distribution table: with value = 294.45, mean = 290, and standard deviation = 2.83, the cumulative probability is 0.94207551. This means the probability that the sample mean will be less than 294.45 is 0.942.
Next, we assess the probability that the sample mean is greater than 305.55. Again using the normal distribution, with value = 305.55, mean = 290, and standard deviation = 2.83, the probability that the sample mean is less than 305.55 (i.e., the cumulative probability) is 0.99999998. Thus, the probability that the sample mean is greater than 305.55 is 1 - 0.99999998 = 0.00000002.
The power of the test is the sum of these probabilities: 0.94207551 + 0.00000002 = 0.94207553. This means that if the true average run time of the new engine were 290 minutes, we would correctly reject the hypothesis that the run time is 300 minutes about 94.2 percent of the time. Hence, the probability of a Type II error would be very small: specifically, 1 - 0.942 = 0.058.
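The power calculation just described can be sketched in Python with scipy; the region of acceptance and standard error are the values carried over from Example 1.

```python
from scipy.stats import norm

LL, UL = 294.45, 305.55      # region of acceptance from Example 1
true_mean = 290              # critical value: the second inventor's claimed run time
std_error = 2.83

# Probability that the sample mean falls outside the region of acceptance
# when the true mean equals the critical value:
power = norm.cdf(LL, loc=true_mean, scale=std_error) + \
        (1 - norm.cdf(UL, loc=true_mean, scale=std_error))
print(round(power, 3))       # about 0.942, so beta = 1 - power is about 0.058
```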
IMPORTANT STATISTICAL DEFINITIONS OF HYPOTHESIS TESTING
Alpha - The significance level of a test of hypothesis; the probability of rejecting a null hypothesis when it is actually true. In other words, it is the probability of committing a Type I error.
Alternative hypothesis - A hypothesis that takes a value of a population parameter different from that used in the null hypothesis.
Beta - The probability of not rejecting a null hypothesis when it actually is false. In other words, it is the probability of committing a Type II error.
Critical region - The set of values of the test statistic that will cause us to reject the null hypothesis.
Critical value - The first (or 'boundary') value in the critical region.
Decision rule - If the calculated test statistic falls within the critical region, the null hypothesis H0 is rejected. If the calculated test statistic does not fall within the critical region, the null hypothesis is not rejected.
F-distribution - A continuous distribution that has two parameters (degrees of freedom for the numerator and degrees of freedom for the denominator). It is mainly used to test hypotheses concerning variances.
F-ratio - In ANOVA, the ratio of between-column variance to within-column variance.
Hypothesis - An unproven proposition or supposition that tentatively explains a phenomenon.
Null hypothesis - A statement about the status quo of a population parameter that is being tested.
One-tailed test - A statistical hypothesis test in which the alternative hypothesis is specified such that only one direction of the possible distribution of values is considered.
Power of a hypothesis test - The probability of rejecting the null hypothesis when it is false.
Significance level - The value of α that gives the probability of rejecting the null hypothesis when it is true, thereby committing a Type I error.
Test criteria - Criteria consisting of (i) specifying a level of significance α, (ii) determining a test statistic, (iii) determining the critical region(s), and (iv) determining the critical value(s).
Test statistic - The value of z or t calculated for a sample statistic such as the sample mean or the sample proportion.
Two-tailed test - A statistical hypothesis test in which the alternative hypothesis includes values of the parameter both higher and lower than the value specified in the null hypothesis.
Type I error - An error caused by rejecting a null hypothesis that is true.
Type II error - An error caused by failing to reject a null hypothesis that is false.
SUMMARY
This chapter has given a clear picture of the process of hypothesis testing and of the null hypothesis and the alternative hypothesis. The lesson has also explained the significance of, and the differences between, Type I and Type II errors. Detailed step-wise procedures were given for performing hypothesis tests, and the chapter also covered the calculation of the P value.
KEY TERMS
· Hypothesis testing
· Null hypothesis
· Alternative hypothesis
· Type I error
· Type II error
· Probability value
· One and two tailed tests
· Decision rule
· Confidence interval and statistical significance
IMPORTANT QUESTIONS
1. What do you mean by hypothesis testing?
2. Define: null hypothesis and alternative hypothesis.
3. Write down the steps to be performed in a hypothesis test.
4. What are the differences between Type I and Type II errors?
5. How will you calculate the probability value?
6. What do you understand by one-tailed and two-tailed tests?
7. Define: decision rule.
8. What is the significance of understanding confidence intervals?
- End of Chapter LESSON - 7
CHI-SQUARE TEST
OBJECTIVES
· To learn the significance of statistical tests
· To find out the details of the chi-square test, F-test and t-test
· To learn the detailed procedures for performing the above-mentioned tests
· To understand the concept of measures of association
STRUCTURE
· Bivariate tabular analysis
· Chi-square requirements
· Computing chi-square
· Measure of association
INTRODUCTION
Chi square is a non-parametric test of statistical significance for bivariate tabular analysis (also known as crossbreaks). Any appropriately performed test of statistical significance lets you know the degree of confidence you can have in accepting or rejecting a hypothesis. Typically, the hypothesis tested with chi square is whether or not two different samples (of people, texts, whatever) are different enough in some characteristic or aspect of their behavior that we can generalize from our samples that the populations from which our samples are drawn are also different in that behavior or characteristic.
A non-parametric test, like chi square, is a rough estimate of confidence; it accepts weaker, less accurate data as input than parametric tests (like t-tests and analysis of variance, for example) and therefore has less status in the pantheon of statistical tests. Nonetheless, its limitations are also its strengths; because chi square is more 'forgiving' in the data it will accept, it can be used in a wide variety of research contexts.
Chi square is used most frequently to test the statistical significance of results reported in bivariate tables, and interpreting bivariate tables is integral to interpreting the results of a chi square test, so we'll take a look at bivariate tabular (crossbreak) analysis first.
Bivariate Tabular Analysis
Bivariate tabular (crossbreak) analysis is used when you are trying to summarize the intersections of independent and dependent variables and understand the relationship (if any) between those variables. For example, if we wanted to know if there is any relationship between the biological sex of American undergraduates at a particular university and their footwear preferences, we might select 50 males and 50 females as randomly as possible and ask them, "On average, do you prefer to wear sandals, sneakers, leather shoes, boots, or something else?"
In this example, our independent variable is biological sex. (In experimental research, the independent variable is actively manipulated by the researcher; for example, whether or not a rat gets a food pellet when it pulls on a striped bar. In most sociological research, the independent variable is not actively manipulated in this way, but controlled by sampling for, e.g., males vs. females.) Put another way, the independent variable is the quality or characteristic that you hypothesize helps to predict or explain some other quality or characteristic (the dependent variable). We control the independent variable (and as much else as possible and natural) and elicit and measure the dependent variable to test our hypothesis that there is some relationship between them.
Bivariate tabular analysis is good for asking the following kinds of questions:
1. Is there a relationship between any two variables in the data?
2. How strong is the relationship in the data?
3. What is the direction and shape of the relationship in the data?
4. Is the relationship due to some intervening variable(s) in the data?
To see any patterns or systematic relationship between the biological sex of undergraduates at University of X and their reported footwear preferences, we could summarize our results in a table like this:
Table 1: Male and Female Undergraduate Footwear Preferences

          Sandals   Sneakers   Leather shoes   Boots   Other
Male
Female
Depending upon how our 50 male and 50 female subjects responded, we could make a definitive claim about the (reported) footwear preferences of those 100 people.
In constructing bivariate tables, values on the independent variable are typically arrayed on the vertical axis, while values on the dependent variable are arrayed on the horizontal axis. This allows us to read 'across' from hypothetically 'causal' values on the independent variable to their 'effects', or values on the dependent variable. How you arrange the values on each axis should be guided "iconically" by your research question/hypothesis. For example, if values on an independent variable were arranged from lowest to highest value on the variable and values on the dependent variable were arranged left to right from lowest to highest, a positive relationship would show up as a rising left-to-right line. (But remember, association does not equal causation; an observed relationship between two variables is not necessarily causal.)
Each intersection/cell - of a value on the independent variable and a value on the dependent variable - reports how many times that combination of values was chosen/observed in the sample being analyzed. (So you can see that crosstabs are structurally most suitable for analyzing relationships between nominal and ordinal variables. Interval and ratio variables will have to be grouped before they can "fit" into a bivariate table.) Each cell reports, essentially, how many subjects/observations produced that combination of independent and dependent variable values. So, for example, the top left cell of the table below answers the question: "How many male undergraduates at University of X prefer sandals?" (Answer: 6 of the 50 sampled.)
Table 1b: Male and Female Undergraduate Footwear Preferences

          Sandals   Sneakers   Leather shoes   Boots   Other
Male         6         17           13           9       5
Female      13          5            7          16       9
Reporting and interpreting crosstabs is most easily done by converting the raw frequencies (in each cell) into percentages within each value or category of the independent variable. For example, in the footwear preferences table above, total each row, then divide each cell by its row total, and multiply that fraction by 100.
Table 1c: Male and Female Undergraduate Footwear Preferences (percentages)

          Sandals   Sneakers   Leather shoes   Boots   Other    N
Male        12         34           26          18      10      50
Female      26         10           14          32      18      50
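The row-percentage conversion just described can be written as a short Python sketch; the counts are those of Table 1b.

```python
counts = {
    "Male":   {"Sandals": 6,  "Sneakers": 17, "Leather shoes": 13, "Boots": 9,  "Other": 5},
    "Female": {"Sandals": 13, "Sneakers": 5,  "Leather shoes": 7,  "Boots": 16, "Other": 9},
}

for sex, row in counts.items():
    total = sum(row.values())                                            # row total (N)
    percentages = {shoe: round(100 * freq / total) for shoe, freq in row.items()}
    print(sex, percentages, "N =", total)
```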
Percentages basically standardize cell frequencies as if there were 100 subjects/observations in each category of the independent variable. This is useful for comparing across values on the independent variable, but that usefulness comes at the price of a generalization--from the actual number of subjects/observations in that
column in your data to a hypothetical 100 subjects/observations. If the raw row total was 93, then percentages do little violence to the raw scores; but if the raw total is 9, then the generalization (made on no statistical basis, i.e., with no knowledge of sample-population representativeness) is drastic. So you should provide the total N at the end of each row/independent variable category (for replicability and to enable the reader to assess your interpretation of the table's meaning).
With this caveat in mind, you can compare the patterns of distribution of subjects/observations along the dependent variable between the values of the independent variable: e.g., compare male and female undergraduate footwear preferences. (For some data, plotting the results on a line graph can also help you interpret the results: i.e., whether there is a positive, negative, or curvilinear relationship between the variables.)
Table 1c shows that within our sample, roughly twice as many females preferred sandals and boots as males, while about three times as many men preferred sneakers as women and twice as many men preferred leather shoes. We might also infer from the 'Other' category that female students within our sample had a broader range of footwear preferences than did male students.
Generalizing from Samples to Populations
Converting raw observed values or frequencies into percentages does allow us to more easily see patterns in the data, but that is all we can see: what is in the data. Knowing with great certainty the footwear preferences of a particular group of 100 undergraduates at University of X is of limited use to us; we usually want to measure a sample in order to know something about the larger population from which the sample was drawn. On the basis of raw observed frequencies (or percentages) of a sample's behavior or characteristics, we can make claims about the sample itself, but we cannot generalize to make claims about the population from which we drew our sample unless we submit our results to a test of statistical significance. A test of statistical significance tells us how confidently we can generalize to a larger (unmeasured) population from a (measured) sample of that population.
How does chi square do this? Basically, the chi square test of statistical significance is a series of mathematical formulas which compare the actual observed frequencies of some phenomenon (in our sample) with the frequencies we would expect if there were no relationship at all between the two variables in the larger (sampled) population. That is, chi square tests our actual results against the null hypothesis and assesses whether the actual results are different enough to overcome a certain probability that they are due to sampling error. In a sense, chi square is a lot like percentages; it extrapolates a population characteristic (a parameter) from the sampling characteristic (a statistic), similarly to the way a percentage standardizes a frequency to a total column N of 100. But chi square works within the frequencies provided by the sample and does not inflate (or minimize) the column and row totals.
Chi Square Requirements
As mentioned before, chi square is a non-parametric test. It does not require the sample data to be more or less normally distributed (as parametric tests like t-tests do), although it does rely on the assumption that the sample frequencies in each category are roughly normally distributed about their expected values. But chi square does have some requirements:
1. The sample must be randomly drawn from the population.
As with any test of statistical significance, your data must come from a random sample of the population to which you wish to generalize your claims.
2. Data must be reported in raw frequencies (not percentages).
You should only use chi square when your data are in the form of raw frequency counts of things in two or more mutually exclusive and exhaustive categories. As discussed above, converting raw frequencies into percentages standardizes cell frequencies as if there were 100 subjects/observations in each category of the independent variable for comparability. Part of the chi square mathematical procedure accomplishes this standardizing, so computing the chi square of percentages would amount to standardizing an already standardized measurement.
3. Measured variables must be independent.
Any observation must fall into only one category or value on each variable. In our footwear example, our data are counts of male versus female undergraduates expressing a preference for five different categories of footwear. Each observation/subject is counted only once, as either male or female (an exhaustive typology of biological sex) and as preferring sandals, sneakers, leather shoes, boots, or other kinds of footwear. For some variables, no 'other' category may be needed, but often 'other' ensures that the variable has been exhaustively categorized. (For some kinds of analysis, you may need to include an "uncodable" category.) In any case, you must include the results for the whole sample.
4. Values/categories on independent and dependent variables must be mutually exclusive and exhaustive.
Furthermore, you should use chi square only when observations are independent: i.e., no category or response is dependent upon or influenced by another. (In linguistics, this rule is often fudged a bit. For example, if we have one dependent variable/column for linguistic feature X and another column for the number of words spoken or written (where the rows correspond to individual speakers/texts or groups of speakers/texts which are being compared), there is clearly some relation between the frequency of feature X in a text and the number of words in a text, but it is a distant, not immediate, dependency.)
5. Observed frequencies cannot be too small.
Chi-square is an approximate test of the probability of getting the frequencies you've actually observed if the null hypothesis were true. It's based on the expectation that within any category, sample frequencies are normally distributed about the expected population value. Since (logically) frequencies cannot be negative, the distribution cannot be normal when expected population values are close to zero-since the sample frequencies cannot be much below the expected frequency while they can be much above it (an asymmetric/non-normal distribution). So, when expected frequencies are large, there is no problem with the assumption of normal distribution, but the smaller the expected frequencies, the less valid are the results of the chi-square test. We'll discuss expected frequencies in greater detail later, but for now remember that expected frequencies are derived from observed frequencies. Therefore, if you have cells in your bivariate table which show very low raw observed frequencies (5 or below), your expected frequencies may also be too low for chi square to be appropriately used. In addition, because some of the mathematical formulas used in chi square use division, no cell in your table can have an observed raw frequency of 0. The following minimum frequency thresholds should be obeyed: ♦ for a 1 x 2 or 2 x 2 table, expected frequencies in each cell should be at least 5; ♦ for a 2 x 3 table, expected frequencies should be at least 2; ♦ for a 2 x 4 or 3 x 3 or larger table, if all expected frequencies but one are at least 5 and if the one small cell is at least 1, chi-square is still a good approximation. In general, the greater the degrees of freedom (i.e., the more values/categories on the independent and dependent variables), the more lenient the minimum expected frequencies threshold. (We'll discuss degrees of freedom in a moment.) Collapsing Values A brief word about collapsing values/categories on a variable is necessary. First, although categories on a variable, especially a dependent variable, may be collapsed, they cannot be excluded from a chi-square analysis. That is, you cannot arbitrarily exclude some subset of your data from your analysis. Second, a decision to collapse categories should be carefully motivated, with consideration for preserving the integrity of the data as it was originally collected. (For example, how could you collapse the footwear preference categories in our example and still preserve the integrity of the original question/data? You can't, since there's no way to know if combining, e.g., boots and leather shoes versus sandals and sneakers is true to your subjects' typology of footwear.) As a rule, you should perform a chi square on the data in its uncollapsed form; if the chi square value achieved is significant, then you may collapse categories to test subsequent refinements of your original hypothesis. Computing Chi Square
Let's walk through the process by which a chi square value is computed, using Table 1b above (renamed Table 1d below).
The first step is to determine our threshold of tolerance for error. That is, what odds are we willing to accept that we are wrong in generalizing from the results in our sample to the population it represents? Are we willing to stake a claim on a 50 percent chance that we're wrong? A 10 percent chance? A 5 percent chance? 1 percent? The answer depends largely on our research question and the consequences of being wrong. If people's lives depend on our interpretation of our results, we might want to take only 1 chance in 100,000 (or 1,000,000) that we're wrong. But if the stakes are smaller, for example, whether or not two texts use the same frequencies of some linguistic feature (assuming this is not a forensic issue in a capital murder case!), we might accept a greater probability - 1 in 100 or even 1 in 20 - that our data do not represent the population we're generalizing about. The important thing is to explicitly motivate your threshold before you perform any test of statistical significance, to minimize any temptation for post hoc compromise of scientific standards. For our footwear study, we'll set a probability-of-error threshold of 1 in 20, or p < .05.
The second step is to total all rows and columns.
Table 1d: Male and Female Undergraduate Footwear Preferences: Observed Frequencies with Row and Column Totals

          Sandals   Sneakers   Leather shoes   Boots   Other   Total
Male         6         17           13           9       5       50
Female      13          5            7          16       9       50
Total       19         22           20          25      14      100
Remember that chi square operates by comparing the actual, or observed, frequencies in each cell in the table to the frequencies we would expect if there were no relationship at all between the two variables in the populations from which the sample is drawn. In other words, chi square compares what actually happened to what hypothetically would have happened if 'all other things were equal' (basically, the null hypothesis). If our actual results are sufficiently different from the predicted null hypothesis results, we can reject the null hypothesis and claim that a statistically significant relationship exists between our variables.
Chi square derives a representation of the null hypothesis - the 'all other things being equal' scenario - in the following way. The expected frequency in each cell is the product of that cell's row total multiplied by that cell's column total, divided by the sum total of all observations. So, to derive the expected frequency of the "Males who prefer Sandals" cell, we multiply the top row total (50) by the first column total (19) and divide that product by the sum total 100: (50 x 19)/100 = 9.5. The logic of this is that we are
deriving the expected frequency of each cell from the union of the total frequencies of the relevant values on each variable (in this case, Male and Sandals), as a proportion of all observed frequencies (across all values of each variable). This calculation is performed for every cell, giving the expected frequencies shown in Table 1e below (for example, Male/Sneakers: (50 x 22)/100 = 11).
Table 1e: Male and Female Undergraduate Footwear Preferences: Expected Frequencies

          Sandals   Sneakers   Leather shoes   Boots   Other   Total
Male         9.5        11           10          12.5      7      50
Female       9.5        11           10          12.5      7      50
Total       19          22           20          25       14     100
(Notice that because we originally obtained a balanced male/female sample, our male and female expected scores are the same. This usually will not be the case.) We now have a comparison of the observed results versus the results we would expect if the null hypothesis were true. We can informally analyze this table, comparing observed and expected frequencies in each cell (Males prefer sandals less than expected), across values on the independent variable (Males prefer sneakers more than expected, Females less than expected), or across values on the dependent variable (Females prefer sandals and boots more than expected, but sneakers and shoes less than expected). But so far, the extra computation doesn't really add much more information than interpretation of the results in percentage form. We need some way to measure how different our
observed results are from the null hypothesis. Or, to put it another way, we need some way to determine whether we can reject the null hypothesis, and if we can, with what degree of confidence that we're not making a mistake in generalizing from our sample results to the larger population.
Logically, we need to measure the size of the difference between the pair of observed and expected frequencies in each cell. More specifically, we calculate the difference between the observed and expected frequency in each cell, square that difference, and then divide the squared difference by the expected frequency for that cell. The formula can be expressed as:
(O - E)² / E
Squaring the difference ensures a positive number, so that we end up with an absolute value of differences. If we didn't work with absolute values, the positive and negative differences across the entire table would always add up to 0. (You really understand the logic of chi square if you can figure out why this is true.) Dividing the squared difference by the expected frequency essentially removes the expected frequency from the equation, so that the remaining measures of observed/expected difference are comparable across all cells.
So, for example, the difference between observed and expected frequencies for the Male/Sandals preference is calculated as follows:
1. Observed (6) - Expected (9.5) = Difference (-3.5)
2. Difference (-3.5) squared = 12.25
3. Squared difference (12.25) divided by Expected (9.5) = 1.289
The sum of the results of this calculation over every cell is the total chi square value for the table. The computation of chi square for each cell is listed below:

          Sandals   Sneakers   Leather shoes   Boots   Other
Male       1.289      3.273        0.900        0.980   0.571
Female     1.289      3.273        0.900        0.980   0.571
(Again, because of our balanced male/female sample, our row totals were the same, so the male and female observed-minus-expected frequency differences were identical. This is usually not the case.)
The total chi square value for Table 1 is 14.026.
Interpreting the Chi Square Value
We now need some criterion or yardstick against which to measure the table's chi square value, to tell us whether or not it is significant. What we need to know is the probability of getting a chi square value of a given minimum size even if our variables are not related at all in the larger population from which our sample was drawn. That is, we need to know how much larger than 0 (the chi square value of the null hypothesis) our table's chi square value must be before we can confidently reject the null hypothesis. The probability we seek depends in part on the degrees of freedom of the table from which our chi square value is derived.
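Before turning to degrees of freedom, the chi square total itself can be checked with a short Python sketch that reproduces the expected frequencies and per-cell contributions from the observed counts in Table 1d.

```python
observed = [[6, 17, 13, 9, 5],      # Male row of Table 1d
            [13, 5, 7, 16, 9]]      # Female row of Table 1d

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, O in enumerate(row):
        E = row_totals[i] * col_totals[j] / grand_total   # expected frequency for the cell
        chi_square += (O - E) ** 2 / E                     # per-cell contribution

print(round(chi_square, 3))    # roughly 14.03, matching the total reported in the text
```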
Degrees of freedom
Mechanically, a table's degrees of freedom (df) can be expressed by the following formula:
df = (r - 1)(c - 1)
That is, a table's degrees of freedom equals the number of rows in the table minus one, multiplied by the number of columns in the table minus one. (For a 1 x 2 table, df = k - 1, where k = the number of values/categories on the variable.)
'Degrees of freedom' is an issue because of the way in which expected values in each cell are computed from the row and column totals. All but one of the expected values in a given row or column are free to vary (within the total observed, and therefore expected, frequency of that row or column); once the free-to-vary expected cells are specified, the last one is fixed by virtue of the fact that the expected frequencies must add up to the observed row and column totals (from which they are derived). Another way to conceive of a table's degrees of freedom is to think of one row and one column in the table as fixed, with the remaining cells free to vary. Consider the following examples (where X = fixed):
For a 3 x 2 table, with one row and one column fixed, two cells remain free to vary:
df = (r - 1)(c - 1) = (3 - 1)(2 - 1) = 2 x 1 = 2
For a 5 x 3 table, with one row and one column fixed, eight cells remain free to vary:
df = (r - 1)(c - 1) = (5 - 1)(3 - 1) = 4 x 2 = 8
So, for our Table 1:
Table 1: Male and Female Undergraduate Footwear Preferences (X = fixed)

          Sandals   Sneakers   Leather shoes   Boots   Other
Male                                                     X
Female       X          X            X            X      X

df = (2 - 1)(5 - 1) = 1 x 4 = 4
In a statistics book, the sampling distribution of chi square (also known as the critical values of chi square) is typically listed in an appendix. You read down the column representing your previously chosen probability-of-error threshold (e.g., p < .05) and across the row representing the degrees of freedom in your table. If your chi square value is larger than the critical value in that cell, your data present a statistically significant relationship between the variables in your table. Table 1's chi square value of 14.026, with 4 degrees of freedom, handily clears the relevant critical value of 9.49, so we can reject the null hypothesis and affirm the claim that male and female undergraduates at University of X differ in their (self-reported) footwear preferences.
Statistical significance does not help you to interpret the nature or explanation of that relationship; that must be done by other means (including bivariate tabular analysis of the data). But a statistically significant chi square value does denote the degree of confidence you may hold that the relationship between the variables described in your results is systematic in the larger population and not attributable to random error.
Statistical significance also does not ensure substantive significance. A large enough sample may demonstrate a statistically significant relationship between two variables, but that relationship may be a trivially weak one. Statistical significance means only that the pattern of distribution and relationship between the variables which is found in the data from a sample can be confidently generalized to the larger population from which the sample was randomly drawn. By itself, it does not ensure that the relationship is theoretically or practically important, or even very large.
Measures of Association
While the issue of theoretical or practical importance of a statistically significant result cannot be quantified, the relative magnitude of a statistically significant relationship can be measured. Chi square allows you to make decisions about whether there is a relationship between two or more variables; if the null hypothesis is rejected, we conclude that there is a statistically significant relationship between the variables. But we frequently want a measure of the strength of that relationship - an index of the degree of correlation, a measure of the degree of association between the variables represented in our table (and data). Luckily, several measures of association can be derived from a table's chi square value.
For tables larger than 2 x 2 (like our Table 1), a measure called Cramer's phi is derived by the following formula (where N is the total number of observations, and k is the smaller of the number of rows or columns):
Cramer's phi = sqrt[chi square / (N x (k - 1))]
So, for our Table 1 (2 x 5), Cramer's phi is computed as follows:
1. N x (k - 1) = 100 x (2 - 1) = 100
2. Chi square / 100 = 14.026 / 100 = 0.14
3. Square root of 0.14 = 0.37
The result is interpreted as a Pearson r (that is, as a correlation coefficient). (For 2 x 2 tables, a measure called phi is derived by dividing the table's chi square value by N (the total number of observations) and then taking the square root of the result. Phi is also interpreted as a Pearson r.)
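The chi square value, the critical value, and Cramer's phi can all be checked with a few lines of Python using scipy; the figures are those reported for Table 1 above.

```python
from math import sqrt
from scipy.stats import chi2

chi_square, N, k, df = 14.026, 100, 2, 4     # values from Table 1 above

cramers_phi = sqrt(chi_square / (N * (k - 1)))
critical_value = chi2.ppf(0.95, df)          # about 9.49 for p < .05 with df = 4

print(round(cramers_phi, 2))                 # about 0.37
print(chi_square > critical_value)           # True: the relationship is significant
```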
A complete account of how to interpret correlation coefficients is unnecessary for present purposes. It will suffice to say that r² is a measure called shared variance. Shared variance is the portion of the total behavior (or distribution) of the variables measured in the sample data which is accounted for by the relationship we've already detected with our chi square. For Table 1, r² = 0.137, so approximately 14% of the total footwear preference story is explained/predicted by biological sex.
Computing a measure of association like phi or Cramer's phi is rarely done in quantitative linguistic analyses, but it is an important benchmark of just 'how much' of the phenomenon under investigation has been explained. For example, Table 1's Cramer's phi of 0.37 (r² = 0.137) means that there are one or more variables still undetected which, cumulatively, account for and predict the remaining 86% of footwear preferences. This measure, of course, doesn't begin to address the nature of the relation(s) between these variables, which is a crucial part of any adequate explanation or theory.
SUMMARY
The above chapter has given the framework for performing a key statistical test, the chi-square test. Chi-square is a non-parametric test of statistical significance for bivariate tabular analysis.
KEY WORDS
· Bivariate tabular analysis
· Chi-square test
· Measure of association
· Degrees of freedom
IMPORTANT QUESTIONS
1. What do you mean by bivariate tabular analysis?
2. What are the statistical applications of the chi-square test?
3. How will you calculate the degrees of freedom?
- End of Chapter LESSON - 8 T-TEST, F-TEST
OBJECTIVES
· To know the significance of statistical tests
· To find out the details of the F-test and t-test
· To learn the detailed procedures for performing the above-mentioned tests
STRUCTURE
· F-test
· Cumulative probability and the F distribution
· F-test for equality of two standard deviations
· t-test
· Two-sample t-test for equal means
F-TEST
The f statistic, also known as an f value, is a random variable that has an F distribution.
The F Distribution
The distribution of all possible values of the f statistic is called an F distribution, with v1 = n1 - 1 and v2 = n2 - 1 degrees of freedom. The curve of the F distribution depends on the degrees of freedom, v1 and v2. When describing an F distribution, the number of degrees of freedom associated with the standard deviation in the numerator of the f statistic is always stated first. Thus, f(5, 9) would refer to an F distribution with v1 = 5 and v2 = 9 degrees of freedom, whereas f(9, 5) would refer to an F distribution with v1 = 9 and v2 = 5 degrees of freedom. Note that the curve represented by f(5, 9) would differ from the curve represented by f(9, 5).
The F distribution has the following properties:
♦ The mean of the distribution is equal to v2 / (v2 - 2), for v2 > 2.
♦ The variance is equal to 2v2²(v1 + v2 - 2) / [v1(v2 - 2)²(v2 - 4)], for v2 > 4.
Cumulative Probability and the F Distribution
Every f statistic can be associated with a unique cumulative probability. This cumulative probability represents the likelihood that the f statistic is less than or equal to a specified value.
Here are the steps required to compute an f statistic:
1. Select a random sample of size n1 from a normal population having a standard deviation equal to σ1.
2. Select an independent random sample of size n2 from a normal population having a standard deviation equal to σ2.
The f statistic is the ratio of s1²/σ1² and s2²/σ2². The following equivalent equations are commonly used to compute an f statistic:
f = [s1² / σ1²] / [s2² / σ2²]
f = [s1² x σ2²] / [s2² x σ1²]
f = [X1² / v1] / [X2² / v2]
f = [X1² x v2] / [X2² x v1]
where
σ1 = standard deviation of population 1
s1 = standard deviation of the sample drawn from population 1
σ2 = standard deviation of population 2
s2 = standard deviation of the sample drawn from population 2
X1² = chi-square statistic for the sample drawn from population 1
v1 = degrees of freedom for X1²
X2² = chi-square statistic for the sample drawn from population 2
v2 = degrees of freedom for X2²
Degrees of freedom v1 = n1 - 1, and degrees of freedom v2 = n2 - 1.
Sample Problems
Example
Suppose you randomly select 7 women from a population of women and 12 men from a population of men. The table below shows the standard deviation in each sample and in each population.

Population   Population standard deviation   Sample standard deviation
Women                     30                              35
Men                       50                              45
Compute the f statistic.
Solution
The f statistic can be computed from the population and sample standard deviations, using the following equation:
f = [s1² / σ1²] / [s2² / σ2²]
where σ1 is the standard deviation of population 1, s1 is the standard deviation of the sample drawn from population 1, σ2 is the standard deviation of population 2, and s2 is the standard deviation of the sample drawn from population 2.
As you can see from the equation, there are actually two ways to compute an f statistic from these data. If the women's data appear in the numerator, we can calculate an f statistic as follows:
f = (35² / 30²) / (45² / 50²) = 1.361 / 0.81 = 1.68
On the other hand, if the men's data appear in the numerator, we can calculate an f statistic as follows:
f = (45² / 50²) / (35² / 30²) = 0.81 / 1.361 = 0.60
This example illustrates the importance of specifying the degrees of freedom associated with the numerator and denominator of an f statistic. This topic is continued in the next example.
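A minimal Python sketch of the same computation, using the population and sample standard deviations from the table above:

```python
def f_statistic(s1, sigma1, s2, sigma2):
    """f = (s1^2 / sigma1^2) / (s2^2 / sigma2^2)."""
    return (s1 ** 2 / sigma1 ** 2) / (s2 ** 2 / sigma2 ** 2)

# Women's data in the numerator: v1 = 7 - 1 = 6, v2 = 12 - 1 = 11 degrees of freedom
print(round(f_statistic(35, 30, 45, 50), 2))   # about 1.68
# Men's data in the numerator: v1 = 11, v2 = 6 degrees of freedom
print(round(f_statistic(45, 50, 35, 30), 2))   # about 0.60
```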
F-Test for Equality of Two Standard Deviations
An F-test (Snedecor and Cochran, 1983) is used to test whether the standard deviations of two populations are equal. This test can be a two-tailed test or a one-tailed test. The two-tailed version tests against the alternative that the standard deviations are not equal. The one-tailed version only tests in one direction: that the standard deviation of the first population is either greater than or less than (but not both) the second population standard deviation. The choice is determined by the problem. For example, if we are testing a new process, we may only be interested in knowing whether the new process is less variable than the old process.
We are testing the hypothesis that the standard deviations for sample one and sample two are equal. The output is divided into four sections.
1. The first section prints the sample statistics for sample one used in the computation of the F-test.
2. The second section prints the sample statistics for sample two used in the computation of the F-test.
3. The third section prints the numerator and denominator standard deviations, the F-test statistic value, the degrees of freedom, and the cumulative distribution function (cdf) value of the F-test statistic. The F-test statistic cdf value is an alternative way of expressing the critical value. This cdf value is compared to the acceptance interval printed in section four. The acceptance interval for a two-tailed test is (0, 1 - α).
4. The fourth section prints the conclusions for a 95% test, since this is the most common case. Results are printed for an upper one-tailed test. The acceptance interval column is stated in terms of the cdf value printed in section three. The last column specifies whether the null hypothesis is accepted or rejected. For a different significance level, the appropriate conclusion can be drawn from the F-test statistic cdf value printed in section three. For example, for a significance level of 0.10, the corresponding acceptance interval becomes (0.000, 0.900).
The F-test can be used to answer the following questions:
1. Do two samples come from populations with equal standard deviations?
2. Does a new process, treatment, or test reduce the variability of the current process?
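Such a test can be sketched in Python using scipy's F distribution; the two samples below are hypothetical, and the function is an illustration of the variance-ratio test rather than the exact procedure behind the output described above.

```python
from scipy.stats import f

def f_test_equal_sd(sample1, sample2, alpha=0.05):
    """Two-tailed F-test of H0: the two population standard deviations are equal."""
    n1, n2 = len(sample1), len(sample2)
    mean1, mean2 = sum(sample1) / n1, sum(sample2) / n2
    var1 = sum((x - mean1) ** 2 for x in sample1) / (n1 - 1)   # sample variance 1
    var2 = sum((x - mean2) ** 2 for x in sample2) / (n2 - 1)   # sample variance 2
    f_stat = var1 / var2
    cdf = f.cdf(f_stat, n1 - 1, n2 - 1)
    p_value = 2 * min(cdf, 1 - cdf)            # two-tailed p-value
    return f_stat, p_value, p_value < alpha    # last value True means reject H0

# Hypothetical measurements from an old and a new process
old_process = [10.2, 9.8, 10.5, 10.1, 9.6, 10.4, 9.9, 10.3]
new_process = [10.0, 10.1, 9.9, 10.0, 10.1, 9.9, 10.0, 10.1]
print(f_test_equal_sd(old_process, new_process))
```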
T-TEST
We have seen that the central limit theorem can be used to describe the sampling distribution of the mean, as long as two conditions are met:
1. The sample size must be sufficiently large (at least 30).
2. We need to know the standard deviation of the population, which is denoted by σ.
But sample sizes are sometimes small, and often we do not know the true population standard deviation. When either of these problems occurs, statisticians rely on the distribution of the t statistic (the t-score), whose values are given by:
t = [x - μ] / [s / sqrt(n)]
where
x = sample mean
μ = population mean
s = standard deviation of the sample
n = sample size
The distribution of this t statistic is called the t-distribution or the Student t-distribution. The t-distribution can be used whenever samples are drawn from populations possessing a bell-shaped distribution (i.e., approximately normal). The t-distribution has the following properties:
· The mean of the distribution is equal to 0.
· The variance is equal to v / (v - 2), where v is the degrees of freedom (see next section) and v > 2.
Parametric tests provide inferences for making statements about the means of parent populations. A t-test is commonly used for this purpose. This test is based on the Student's t statistic. The t statistic assumes that the variable is normally distributed, the mean is known (or assumed to be known), and the population variance is estimated from the sample.
Assume that the random variable X is normally distributed, with mean μ and unknown population variance σ², which is estimated by the sample variance s². Recall that the standard deviation of the sample mean, X, is estimated as sx = s/sqrt(n). Then t = (X - μ) / sx is distributed with n - 1 degrees of freedom.
The t-distribution is similar to the normal distribution in appearance. Both distributions are bell-shaped and symmetric. However, compared with the normal distribution, the t distribution has more area in the tails and less in the center. This is because the population variance σ² is unknown and is estimated by the sample variance s². Given the uncertainty in the value of s², the observed values of t are more variable than those of z. Thus, we must go a larger number of standard deviations from 0 to encompass a certain percentage of values from the t distribution than is the case with the normal distribution. Yet, as the number of degrees of freedom increases, the t distribution approaches the normal distribution. In large samples of 120 or more, the t distribution and the normal distribution are virtually indistinguishable. Table 4 in the statistical appendix shows
selected percentiles of the t distribution. Although normality is assumed, the t-test is quite robust to departures from normality.
The procedure for hypothesis testing, for the special case when the t statistic is used, is as follows:
1. Formulate the null (H0) and the alternative (H1) hypotheses.
2. Select the appropriate formula for the t statistic.
3. Select a significance level, α, for testing H0. Typically, the 0.05 level is selected.
4. Take one or two samples and compute the mean and standard deviation for each sample.
5. Calculate the t statistic assuming H0 is true.
6. Calculate the degrees of freedom and estimate the probability of getting a more extreme value of the statistic from Table 4. (Alternatively, calculate the critical value of the statistic.)
7. If the probability computed in step 6 is smaller than the significance level selected in step 3, reject H0. If the probability is larger, do not reject H0. (Alternatively, if the value of the calculated t statistic in step 5 is larger than the critical value determined in step 6, reject H0. If the calculated value is smaller than the critical value, do not reject H0.) Failure to reject H0 does not necessarily imply that H0 is true. It only means that the result is not significantly different from that assumed by H0.
8. Express the conclusion reached by the t-test in terms of the marketing research problem.
Degrees of Freedom
There are actually many different t distributions. The particular form of the t distribution is determined by its degrees of freedom. The degrees of freedom refer to the number of independent observations in a sample. The number of independent observations is often equal to the sample size minus one. Hence, the distribution of the t statistic from samples of size 8 would be described by a t distribution having 8 - 1 = 7 degrees of freedom. Similarly, a t-distribution having 15 degrees of freedom would be used with a sample of size 16.
The t-distribution is symmetrical with a mean of zero. Its standard deviation is always greater than 1, although it is close to 1 when there are many degrees of freedom. With infinite degrees of freedom, the t distribution is the same as the standard normal distribution.
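Steps 1 to 7 can be carried out directly with scipy's one-sample t-test, as sketched below; the sample of ratings and the hypothesized mean of 4.0 are hypothetical.

```python
from scipy.stats import ttest_1samp

# Hypothetical sample of satisfaction ratings; H0: the population mean is 4.0
ratings = [4.2, 3.8, 4.5, 4.0, 3.6, 4.3, 4.1, 3.9, 4.4, 4.2]
alpha = 0.05

t_stat, p_value = ttest_1samp(ratings, popmean=4.0)
print(round(t_stat, 3), round(p_value, 3))
print("reject H0" if p_value < alpha else "do not reject H0")
```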
Sampling Distribution of the Mean
When the sample size is small (< 30), the mean and standard deviation of the sampling distribution can be described as follows:
μx = μ and σx = s x sqrt[(1/n) - (1/N)]
where
μx = mean of the sampling distribution
μ = mean of the population
σx = standard error (i.e., the standard deviation of the sampling distribution)
s = standard deviation of the sample
n = sample size
N = population size
Probability and the Student t Distribution
When a sample of size n is drawn from a population having a normal (or nearly normal) distribution, the sample mean can be transformed into a t score, using the equation presented at the beginning of this lesson. We repeat that equation here:
t = [x - μ] / [s / sqrt(n)]
where
x = sample mean
μ = population mean
s = standard deviation of the sample
n = sample size
The degrees of freedom = n - 1.
Every t score can be associated with a unique cumulative probability. This cumulative probability represents the likelihood of finding a sample mean less than or equal to x, given a random sample of size n.
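As a sketch of this transformation, the following Python lines convert a sample mean into a t score and look up its cumulative probability; the numbers reuse the IQ example from earlier in this lesson (and, for simplicity, ignore the finite-population correction).

```python
from math import sqrt
from scipy.stats import t

sample_mean, mu, s, n = 108, 110, 10, 20        # figures from the IQ example above

t_score = (sample_mean - mu) / (s / sqrt(n))    # about -0.89
cumulative_prob = t.cdf(t_score, df=n - 1)      # P(T <= t_score), roughly 0.19
print(round(t_score, 2), round(cumulative_prob, 3))
```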
Two-sample t-Test for Equal Means

The two-sample t-test (Snedecor and Cochran, 1989) is used to determine if two population means are equal. A common application of this is to test if a new process or treatment is superior to a current process or treatment. There are several variations on this test:

1. The data may either be paired or not paired. By paired, we mean that there is a one-to-one correspondence between the values in the two samples. That is, if X1, X2, ..., Xn and Y1, Y2, ..., Yn are the two samples, then Xi corresponds to Yi. For paired samples, the difference Xi - Yi is usually calculated. For unpaired samples, the sample sizes for the two samples may or may not be equal. The formulas for paired data are somewhat simpler than the formulas for unpaired data.

2. The variances of the two samples may be assumed to be equal or unequal. Equal variances yield somewhat simpler formulas, although with computers this is no longer a significant issue.

In some applications, you may want to adopt a new process or treatment only if it exceeds the current treatment by some threshold. In this case, we can state the null hypothesis in the form that the difference between the two population means is equal to some constant (μ1 – μ2 = d0), where the constant is the desired threshold.

Interpretation of Output

1. We are testing the hypothesis that the population mean is equal for the two samples. The output is divided into five sections.
2. The first section prints the sample statistics for sample one used in the computation of the t-test.
3. The second section prints the sample statistics for sample two used in the computation of the t-test.
4. The third section prints the pooled standard deviation, the difference in the means, the t-test statistic value, the degrees of freedom, and the cumulative distribution function (cdf) value of the t-test statistic under the assumption that the standard deviations are equal. The t-test statistic cdf value is an alternative way of expressing the critical value. This cdf value is compared to the acceptance intervals printed in section five. For an upper one-tailed test, the acceptance interval is (0, 1-α); the acceptance interval for a two-tailed test is (α/2, 1-α/2); and the acceptance interval for a lower one-tailed test is (α, 1).
5. The fourth section prints the pooled standard deviation, the difference in the means, the t-test statistic value, the degrees of freedom, and the cumulative distribution function (cdf) value of the t-test statistic under the assumption that the standard deviations are not equal. The t-test statistic cdf value is an alternative way of expressing the critical value. This cdf value is compared to the acceptance intervals printed in section five. For an upper one-tailed test, the alternative hypothesis acceptance interval is (1-α, 1); the alternative hypothesis acceptance interval for a lower one-tailed test is (0, α); and the alternative hypothesis acceptance interval for a two-tailed test is (1-α/2, 1) or (0, α/2). Note that accepting the alternative hypothesis is equivalent to rejecting the null hypothesis.
6. The fifth section prints the conclusions for a 95% test under the assumption that the standard deviations are not equal, since a 95% test is the most common case. Results are given in terms of the alternative hypothesis for the two-tailed test and for the one-tailed test in both directions. The alternative hypothesis acceptance interval column is stated in terms of the cdf value printed in section four. The last column specifies whether the alternative hypothesis is accepted or rejected. For a different significance level, the appropriate conclusion can be drawn from the t-test statistic cdf value printed in section four. For example, for a significance level of 0.10, the corresponding alternative hypothesis acceptance intervals are (0, 0.05) and (0.95, 1); (0, 0.10); and (0.90, 1).

Two-sample t-tests can be used to answer the following questions:
1. Is process 1 equivalent to process 2?
2. Is the new process better than the current process?
3. Is the new process better than the current process by at least some pre-determined threshold amount?

Matrices of t-tests

t-tests for dependent samples can be calculated for long lists of variables, and reviewed in the form of matrices produced with casewise or pairwise deletion of missing data, much like correlation matrices. Thus, the precautions discussed in the context of correlations also apply to t-test matrices:
a. the issue of artifacts caused by the pairwise deletion of missing data in t-tests, and
b. the issue of "randomly" significant test values.
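The two-sample comparison described in this section can be run as in the following minimal sketch. It assumes Python with the scipy library and uses two small hypothetical, unpaired samples; the process data are invented for illustration.

from scipy import stats

# Hypothetical unpaired measurements from two processes
process_1 = [10.2, 9.8, 10.5, 10.1, 9.9, 10.3]
process_2 = [10.8, 10.6, 11.0, 10.7, 10.9, 10.5]

# Assuming equal variances (pooled two-sample t-test)
t_pooled, p_pooled = stats.ttest_ind(process_1, process_2, equal_var=True)

# Without assuming equal variances (Welch's t-test)
t_welch, p_welch = stats.ttest_ind(process_1, process_2, equal_var=False)

print("pooled :", round(t_pooled, 3), round(p_pooled, 4))
print("unequal:", round(t_welch, 3), round(p_welch, 4))

# At the 0.05 level, reject H0 (equal means) when the p value is below 0.05

The version with equal_var=False corresponds to the unequal-variance case discussed in the fourth section of the output above.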
SUMMARY

The above chapter has given the framework for performing key statistical tests such as the F-test and the t-test. The t-test and the F-test are parametric tests. A t-test is any statistical hypothesis test in which the test statistic has a Student's t distribution if the null hypothesis is true.

KEY TERMS
· Degrees of freedom
· t-test
· t distribution
· F distribution
· Sampling distribution
IMPORTANT QUESTIONS
1. How will you calculate the degrees of freedom?
2. Explain the procedure for performing a t-test.
3. What are matrices of t-tests?
4. Explain the procedure for performing an F-test.

REFERENCE
1. Sumathi, S. and Saravanavel, P., Marketing Research and Consumer Behaviour.
2. Ferber, R. and Verdoorn, P.J., Research Methods in Business, New York: The Macmillan Company, 1962.
3. Ferber, Robert (ed.), Handbook of Marketing Research, New York: McGraw-Hill, Inc., 1948.
- End of Chapter LESSON – 9 METHODS OF DATA COLLECTION
OBJECTIVES
· To know the different types of data and their sources
· To learn the different data collection methods and their merits and demerits
· To understand the difference between a questionnaire and an interview schedule
· To apply the suitable data collection method for the research

STRUCTURE
· Primary data
· Secondary data
· Interview
· Questionnaire
· Observation
INTRODUCTION

After defining the research problem and drawing up the research design, the important task of the researcher is data collection. While deciding on the research method, the method of data collection to be used for the study should also be planned. The source of information and the manner in which data are collected could well make a big difference to the effectiveness of the research project.

In data collection, the researcher should be very clear on what type of data is to be used for the research. There are two types of data, namely primary data and secondary data. The methods of collecting primary and secondary data differ, since primary data are to be originally collected, while in the case of secondary data it is merely a compilation of the available data.

PRIMARY DATA

Primary data are, generally, information gathered by the researcher for the purpose of the project at hand. When the data are collected for the first time using experiments and surveys, the data are known as primary data. So, in the case of primary data, it is always the responsibility of the researcher to decide on further processing of the data. There are several methods of data collection, each with its advantages and disadvantages. The data collection methods include the following:

1. Interview - face-to-face interview, telephone interview, computer-assisted interview, interviews through electronic media
2. Questionnaire - personally administered, sent through the mail, or electronically administered
3. Observation - of individuals and events, with or without videotaping or audio recording

Hence interviews, questionnaires and observation methods are the three main data collection methods. Some of the other methods used to collect primary data are:
1. Warranty cards
2. Distributor Audits
3. Pantry Audits
4. Consumer Panels

SECONDARY DATA

As already mentioned, it is the researcher who decides whether to use secondary data for his research, which can be collected through various sources. In the case of secondary data, the researcher may not face the severe problems that are usually associated with primary data collection. Secondary data may be either published or unpublished. Published data may be available from the following sources:
1. Various publications of the central, state or local governments.
2. Various publications of foreign governments or of international bodies.
3. Technical and trade journals.
4. Books, magazines, newspapers.
5. Reports and publications from various associations connected with industry and business.
6. Public records and statistics.
7. Historical documents.

Though there are various sources of secondary data, it is the responsibility of the researcher to make a minute scrutiny of the data in order to ensure that the data are suitable and adequate for the study.

INTERVIEWS

An interview is a purposeful discussion between two or more people. Interviews can help to gather valid and reliable data. There are several types of interviews.

Types of Interviews

Interviews may be conducted in a very formal manner, using a structured and standardized questionnaire for each respondent. Interviews may also be conducted informally through unstructured conversation. Based on their formal nature and structure, interviews are classified as follows:

1. Structured Interviews: These interviews involve the use of a set of predetermined questions and of highly standardized techniques of recording. So, in this type of interview a rigid, predetermined procedure is followed.
2. Semi-structured Interviews: These interviews may use a structured questionnaire, but the technique of interviewing does not follow a rigid, standardized procedure. These interviews have more scope for discussion and for recording the respondent's opinions and views.

3. Unstructured Interviews: These interviews neither follow a system of predetermined questions nor a standardized technique of recording information. However, unstructured interviews require in-depth knowledge and greater skill on the part of the interviewers.

All three types of interviews may be conducted by the interviewer asking questions, generally in face-to-face contact. These interviews may take the form of direct personal investigation or indirect oral investigation. In the case of direct personal investigation, the interviewer has to collect the information personally. So, it is the duty of the interviewer to be on the spot to meet the respondents and collect data. When this is not possible, the interviewer may have to cross-examine others who are supposed to have knowledge about the problem, and their information may be recorded. Example: Commissions and committees appointed by the Government.

Depending on the approach of the interviewer, interviews may be classified as:

1. Non-directive interviews: In these types of interviews, the interviewer is very much free to arrange the form and order of the questions. The questionnaire for these kinds of interviews may also contain open-ended questions, wherein the respondent also feels free to respond to the questions.

2. Directive interviews: This is also a type of structured interview. In this method a predetermined questionnaire is used and the respondent is expected to limit the answers only to the questions asked. Market surveys and interviews by newspaper correspondents are suitable examples.

3. Focused interviews: These methods of interviewing are in between directive and non-directive. Here the methods are neither fully standardized nor non-standardized. The objective is to focus the attention of the respondents on a specific issue or point. Example: A detective questioning a person regarding a crime committed in an area.

4. In-depth interviews: In these interview methods, the respondents are encouraged to express their thoughts on the topic of the study. In-depth interviews are conducted to get at important aspects of psycho-social situations which are otherwise not readily evident. The major strength of these kinds of interviews is their capacity to uncover basic and complete answers to the questions asked.

Advantages of interviews
Despite the variations in interview techniques, the following are the advantages of interviews:
1. Interviews help to collect more information, and also in-depth information.
2. Unstructured interviews are more advantageous, since there is always an opportunity for the interviewer to restructure the questions.
3. Since the respondents are contacted personally for the information, there is a greater opportunity to create rapport and to collect personal information as well.
4. Interviews help the researcher to collect all the necessary information; here the incidence of non-response will be very low.
5. It is also possible for the interviewer to collect additional information about the environment and about the behaviour and attitude of the respondents.

Disadvantages of interviews
1. Interviews are expensive methods, especially when widely spread geographical samples are taken.
2. There is a possibility of bias on the part of both the interviewer and the respondent.
3. These methods are also time-consuming, especially when large samples are taken for the study.
4. There is a possibility that the respondent may hide his or her real opinion, so genuine data may not be collected.
5. Sometimes there will be great difficulty in adopting the interview method because fixing an appointment with the respondent itself may not be possible.

Hence, for successful implementation of the interview method, the interviewer should be carefully selected, trained and briefed.

How to make interviews successful?
1. As mentioned above, the interviewers must be carefully selected and trained properly.
2. The interviewer should have the knowledge and skill to probe and collect the needed information from the respondent.
3. The honesty and integrity of the interviewer also determine the outcome of the interview.
4. Rapport with the respondent should be created by the interviewer.
5. Qualities such as politeness, courtesy, friendliness and a conversational manner are necessary to make the interview successful.

Telephonic Interviews

Apart from all the above, telephonic interviews are also conducted to collect data. Respondents are contacted over the phone to gather data. Telephonic interviews are more flexible in timing; they are faster than other methods, and this method is also cheaper. For these sorts of interviews no field staff is required, and the information can be recorded without causing embarrassment to respondents, especially when very personal questions are asked. But this method is restricted to respondents who have a telephone facility. The possibility of biased replies is relatively high and, since there is no personal touch, there is a greater possibility of unanswered questions.

QUESTIONNAIRES

Questionnaires are widely used for data collection in social science research, particularly in surveys. The questionnaire has been accepted as a reliable tool for gathering data from large, diverse and scattered social groups. Bogardus describes the questionnaire as a list of questions sent to a number of persons for their answers, which obtains standardized results that can be tabulated and treated statistically. There are two types of questionnaires: structured and unstructured. The design of the questionnaire may vary based on the way it is administered. The questionnaire method is most extensively used in economic and business surveys.

Structured questionnaire

These contain definite, concrete and predetermined questions. The answers collected using a structured questionnaire are very precise, and there is no vagueness or ambiguity. The structured questionnaire may be of the following types:

1. Closed-form questionnaire: Questions are set in such a way as to leave only a few alternative answers. Example: Yes/No type questions.

2. Open-ended questionnaire: Here the respondent has the choice of using his own style, expression of language, length and perception. The respondents are not restricted in their replies.

Unstructured questionnaire
The questions in this questionnaire are not structured in advance. These sorts of questionnaires give more scope for a variety of answers. These types of questionnaires are mainly used in interviews where different responses are expected. The researcher should be very clear on when to use a questionnaire. Questionnaires can mostly be used in descriptive and explanatory types of research. Example: Questionnaires on attitudes, opinions and organizational practices enable the researcher to identify and describe the variability.

Advantages of the Questionnaire
· The cost involved is low, even for a widely spread geographical sample.
· It is more appreciated because it is free from the subjectivity of the interviewer.
· Respondents may also find adequate time to give well-thought-out answers.
· It is more advantageous when respondents are not easily reachable.
But the rate of return of questionnaires and the fulfilment of the data needed for the study may be doubtful. This method can be used only when the respondents are educated and are able to read the language in which the questionnaire is prepared. The possibilities of ambiguous replies and omission of replies are greater, and this method is more time consuming. Before sending the final questionnaire to the respondents, it is always important to conduct a pilot study for testing the questionnaire. A pilot study is just a rehearsal of the main survey; such a survey, conducted with the help of experts, brings more strength to the questionnaire.

Data collection through schedules

This method is very much like data collection through questionnaires. Schedules are proformas containing sets of questions which are filled in by enumerators who are specially appointed for this purpose. In this method the enumerators are expected to perform well; they must be knowledgeable and must possess the capacity for cross-examination in order to find the truth. These methods are usually expensive and are conducted by larger and Government organizations.

OBSERVATION

Observation is one of the cheaper and more effective techniques of data collection. Observation is understood as a systematic process of recording the behavioural patterns of people, objects and occurrences as they are witnessed. Using this observational method of data collection, data can be collected on movements, work habits, statements made and meetings conducted by human beings, as well as facial expressions, body language and other emotions such as joy, anger and sorrow.
Other environmental factors, including layout, workflow patterns and physical arrangements, can also be noted. In this method of data collection the investigator collects the data without interacting with the respondents. Example: Instead of asking respondents about the brands of shirts they wear or the programmes they watch, the researcher simply observes their behaviour.

Observation can be classified, based on the role of the researcher, as:

1. Non-participant observation
The role of the investigator here is that of an external agent who sits in a corner without interacting with the sample which is to be observed.

2. Participant observation
The researcher joins the group and works along with them in the way the work is done, but may not ask questions related to the research/investigation.

It can also be classified, based on the method, as:

1. Structured observation
In this case the researcher has a predetermined set of categories of activities or phenomena planned to be studied. Example: Observing the behaviour pattern of an individual when he/she goes shopping, planned in such a way that the frequency of purchasing, the interest shown during the purchase and the way goods are preferred/selected are recorded. Any such researcher has a plan for the observations to be made.

2. Unstructured observation
The researcher does not carry out the data collection based on a specific, predetermined plan. These sorts of methods are used more in qualitative research studies.

Merits of observational studies
· The data collected by this method are generally reliable, and they are free from respondent bias.
· It is easier to observe busy people than to meet them and collect the data.

Demerits of observational studies
· The observer has to be present in the situation where the data are to be collected.
· This method is very slow and expensive.
· Since the observer collects the data by spending considerable time observing the sample, there is a possibility of biased information.
Survey Method

A survey is a popular method of "fact-finding" study which involves the collection of data directly from the population at a particular time. A survey can be defined as a research technique in which information is gathered from a sample of people by use of a questionnaire or interview. A survey is considered a field study which is conducted in a natural setting and which collects information directly from the respondents. Surveys are conducted for many purposes - population censuses, socio-economic surveys, expenditure surveys and marketing surveys. The purpose of these surveys is to provide information to government planners or business enterprises.

Surveys are also conducted to explain phenomena wherein the causal relationship between two variables is to be assessed. Surveys are used to compare demographic groups, for example to compare high and low income groups, or to compare preferences based on age. Surveys are conducted in the case of descriptive research and focus on large samples. Surveys are most appropriate in the social and behavioural sciences. Surveys are more concerned with formulating hypotheses and testing the relationships between non-manipulated variables. Survey research requires skillful workers to gather data.

The subjects for surveys may be classified as:

1. Social surveys, which include
· Demographic characteristics of a group of people
· The social environment of people
· People's opinions and attitudes

2. Economic surveys, which include
· Economic conditions of people
· Operations of the economic system
Important stages in survey methods

Selecting the universe of the study
↓
Choosing samples from the universe
↓
Deciding on tools used for data collection
↓
Analyzing the data

Other methods of Data Collection

Warranty cards: A type of postcard with a few focused, typed questions may be used by dealers/retailers to collect information from customers. The dealers/researchers may request customers to fill in the required data.

Distributor Audits: These sorts of data may be collected by distributors to estimate the market size, market share and seasonal purchasing pattern. This information is collected by observational methods. Example: Auditing provision stores and collecting data on inventories by copying information from store records.

Pantry Audits: This method is used to estimate the consumption of a basket of goods at the consumer level. Here the researcher collects an inventory of the types, quantities and prices of commodities consumed. Thus pantry audit data are recorded from the contents of the consumer's pantry. An important limitation of the pantry audit is that sometimes the audit data alone may not be sufficient to identify consumers' preferences.

Consumer Panels: An extension of the pantry audit approach on a regular basis is known as a Consumer Panel. A set of consumers is arranged to come to an understanding to maintain daily records of their consumption, and the same is made available to the researcher on demand. In other words, a Consumer Panel is a sample of consumers interviewed repeatedly over a period of time.

Field work: To collect primary data, a researcher or investigator may use different methods, going door to door or using the telephone to collect the data.

SUMMARY

Since there are various methods of data collection, the researcher must select the appropriate one. Hence, the following factors are to be kept in mind by the researcher:
1. Nature, scope and object of the enquiry
2. Availability of funds
3. Time factor
4. Precision required
KEY TERMS
· Primary data
· Secondary data
· Interview
· Questionnaire
· Schedule
· Observation
· Survey
· Warranty cards
· Pantry Audits
· Distributor Audits
· Consumer Panels
QUESTIONS
1. Describe the different data sources, explaining their advantages and disadvantages.
2. What is bias, and how can it be reduced during interviews?
3. "Every data collection method has its own built-in biases. Therefore resorting to multiple methods of data collection is only going to compound the biases." How would you evaluate this statement?
4. Discuss the role of technology in data collection.
5. What is your view on using warranty cards and distributor audits in data collection?
6. Differentiate between questionnaires and interview schedules to decide which is best.
7. Discuss the main purposes for which survey methods are used.
- End of Chapter LESSON – 10 SAMPLING METHODS
OBJECTIVES
· To define the terms Sampling, Sample and Population
· To describe the various sampling designs
· To discuss the importance of Confidence and Precision
· To identify and use the appropriate sampling designs for different research purposes

STRUCTURE
· Universe/population
· Sampling
· Sampling techniques
· Importance of sample design and sample size
INTRODUCTION

Sampling is the process of selecting a sufficient number of elements from the population. It is also understood as the process of obtaining information about an entire population by examining only a part of it. So a sample can be defined as a subset of the population. In other words, some, but not all, elements of the population form the sample. Basically, sampling helps a researcher in a variety of ways, as follows:
· It saves time and money. A sample study is usually less expensive than a population survey.
· It also helps the researcher to obtain accurate results.
· Sampling is the only way when the population is very large in size.
· It enables the researcher to estimate the sampling error; this assists in obtaining information concerning the characteristics of the population.
To understand the sampling process, the researcher should also understand the following terms:

1. Universe / Population
In any research, the interest of the researcher lies mainly in studying the various characteristics relating to the items or individuals belonging to a particular group. This group of individuals under study is known as the population or universe.

2. Sampling
A finite subset selected from a population with the objective of investigating its properties is called a "sample". The number of units in the sample is known as the "sample size". Sampling plays an important role in any research, since it enables the researcher to draw conclusions about the characteristics of the population.

3. Parameters & Statistics
Statistical constants such as the mean (μ), variance (σ²), skewness (β1), kurtosis (β2) and correlation (γ) describe the population and are known as parameters; the corresponding constants computed for the sample drawn from the population are known as statistics and are used for further analysis of the data collected.

Sampling is an important part of any research, preceding data collection. So the sampling process should be done in a careful way to obtain the exact samples and sample size of the population on which the research is to be done. Example: A researcher who would like to study customer satisfaction with a health drink, namely Horlicks, should identify the population who are consuming Horlicks. If the consumers vary in age and gender all over the state or country, he should be able to decide which particular consumers are going to be focused on. Again, if the number to be surveyed is large, he has to decide how many individuals he will target for his study.

Hence an effective sampling process should have the following steps:

Define the population (elements, units, extent, and time)
↓
Specify the sampling frame (the means of representing the elements of the population, e.g., a map or city directory)
↓
Specify the sampling unit (the unit containing one or more population elements)
↓
Specify the sampling method (the method by which the sampling units are to be selected)
↓
Determine the sample size (the number of elements of the population to be decided upon)
↓
Sampling plan (the procedure for selecting the sampling units)
↓
Select the sample (the actual selection of the sample in the field)

Hence a sample design is a definite plan for obtaining a sample from a given population. So whenever samples have to be decided for the study, the following can be considered:
- Outline the universe
- Define a sampling unit
- Define the sampling frame
- Decide the size of the sample

Sampling Techniques

Sampling techniques can be divided into two types:

1. Probability or representative sampling
a. Simple random sampling
b. Stratified random sampling
c. Systematic sampling
d. Cluster sampling
e. Multistage sampling

2. Non-probability or judgmental sampling
a. Quota sampling
b. Purposive sampling

Other methods
· Snowball sampling
· Spatial sampling
· Saturation sampling
1. Probability Sampling

This is a scientific technique of drawing samples from the population according to some law of chance, under which each unit in the universe has a definite pre-assigned probability of being selected in the sample.

Simple Random Sampling

In this technique, the sample is drawn in such a way that every element or unit in the population has an equal and independent chance of being included in the sample. If the unit selected in any draw is not replaced in the population before making the next draw, the plan is known as simple random sampling without replacement. If the unit is replaced back before making the next draw, the sampling plan is called simple random sampling with replacement.

Stratified Random Sampling

When the population is heterogeneous with respect to the variable or characteristic under study, this sampling method is used. Stratification means division into homogeneous layers or groups. Stratified random sampling involves stratifying the given population into a number of sub-groups or sub-populations known as strata. The characteristics of stratified samples are as follows:
· The units within each stratum are as homogeneous as possible.
· The differences between the various strata are as marked as possible.
· Each and every unit in the population belongs to one and only one stratum.
The population can be stratified according to geographical, sociological or economic characteristics. Some of the commonly used stratifying factors are age, sex, income, occupation, education level, geographic area and economic status. The number of sample items drawn from each stratum is kept proportional to the size of the stratum. Example: If pi represents the proportion of the population included in stratum i, and n represents the total sample size, then the number of elements selected from stratum i is n × pi.

Example: Suppose we need a sample of size n = 30 to be drawn from a population of size N = 6000 which is divided into three strata of sizes N1 = 3000, N2 = 1800 and N3 = 1200.
Total population = 3000+1800+1200 = 6000
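The proportional allocation for this example can be worked out as in the following sketch. This is an illustrative Python fragment only; the stratum labels and the use of the random module for the within-stratum draws are assumptions added for illustration.

import random

random.seed(1)

# Strata sizes from the example above: N1 = 3000, N2 = 1800, N3 = 1200
strata_sizes = {"stratum_1": 3000, "stratum_2": 1800, "stratum_3": 1200}
n = 30
N = sum(strata_sizes.values())                                    # 6000

# Proportional allocation: n_i = n * (N_i / N)
allocation = {name: round(n * size / N) for name, size in strata_sizes.items()}
print(allocation)                                                 # {'stratum_1': 15, 'stratum_2': 9, 'stratum_3': 6}

# Within each stratum, the allocated number of units is then drawn at random,
# e.g. from a list (frame) of that stratum's unit labels:
stratum_1_frame = list(range(1, strata_sizes["stratum_1"] + 1))
stratum_1_sample = random.sample(stratum_1_frame, allocation["stratum_1"])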
Hence, using proportional allocation, the sample sizes for the different strata are 15, 9 and 6, which are proportional to the strata sizes of 3000, 1800 and 1200.

Systematic Sampling

This sampling is a slight variation of simple random sampling, in which only the first sample unit is selected at random while the remaining units are selected automatically in a definite sequence, at equal spacing from one another. This kind of sampling is recommended only when a complete and up-to-date list of the sampling units is available and the units are arranged in some systematic order, such as alphabetical, chronological or geographical order. Systematic sampling can be taken as an improvement over simple random sampling, since the sample spreads more evenly over the entire population. This method is one of the easier and less costly methods of sampling and can be conveniently used in the case of large populations.
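As an illustration, a systematic sample can be drawn as in the following sketch, assuming Python; the frame size of 600 and the sample size of 30 are hypothetical values chosen purely for the example.

import random

random.seed(7)

N, n = 600, 30                       # hypothetical frame size and required sample size
k = N // n                           # sampling interval: every k-th unit, here 20

start = random.randint(1, k)         # only the first unit is chosen at random
systematic_sample = list(range(start, N + 1, k))

print(len(systematic_sample), systematic_sample[:5])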
Cluster Sampling

If the total area of interest happens to be a big one, a convenient way in which a sample can be taken is to divide the area into a number of smaller, non-overlapping areas and then to randomly select a number of these smaller areas. In cluster sampling, the total population is divided into a number of relatively small subdivisions, which are themselves clusters of still smaller units, and some of these clusters are randomly selected for inclusion in the overall sample. Cluster sampling reduces the cost by concentrating the survey in the selected clusters. But this type of sampling is less accurate than random sampling.

Multistage Sampling

This is a further development of the principle of cluster sampling. In the case of investigating the working efficiency of nationalized banks in India, a sample of a few banks may be taken for the purpose of investigation. Here, as a first step, the main states in the country are selected, and from these states the banks to be included in the study are selected. This represents two-stage sampling. Going further, certain towns may be selected from the districts within those states, and the banks selected from these towns; this represents three-stage sampling. Even thereafter, instead of taking a census of all the banks in all the towns selected, banks may once again be selected randomly for the survey. Such random selection at all the various levels is known as a multistage random sampling design.

Sequential Sampling

This is one of the more complex sampling designs. The ultimate size of the sample in this technique is not fixed in advance but is determined according to mathematical decision rules on the basis of information yielded as the survey progresses. This method is adopted when the sampling plan is accepted in the context of Statistical Quality Control. Example: When a lot is to be accepted or rejected on the basis of a single sample, it is known as single sampling; when the decision is to be taken on the basis of two samples, it is known as double sampling; and when the decision is based on more than two samples, but the number of samples is certain and decided in advance, the sampling is known as multiple sampling. When the number of samples is more than two but is neither certain nor decided in advance, the plan is referred to as Sequential Sampling. So, in the case of Sequential Sampling, one can go on taking samples one after another as long as one desires to do so.

2. Non-probability Sampling

Quota Sampling

This is stratified-cum-purposive or judgment sampling and thus enjoys the benefits of both. It aims at making the best use of stratification without incurring the high costs involved in probabilistic methods. There is considerable saving in time and money, as the sample units may be selected so that they are close together. If carried out by skilled and experienced investigators who are aware of the limitations of judgment sampling, and if proper controls are imposed on the investigators, this sampling method may give reliable results.
Purposive or Judgment Sampling

A desired number of sampling units are selected deliberately, so that only important items representing the true characteristics of the population are included in the sample. A major disadvantage of this sampling method is that it is highly subjective, since the selection of the sample depends entirely on the personal convenience and beliefs of the researcher. Example: In the case of a socio-economic survey on the standard of living of people in Chennai, if the researcher wants to show that the standard has gone down, he may include only individuals from the low-income stratum of society in the sample and exclude people from rich areas.

Other Forms of Sampling

Snowball Sampling: This method is used in cases where information about units in the population is not available. If a researcher wants to study the problems of the weavers in a particular region, he may contact the weavers who are known to him. From them, he may collect the addresses of other weavers in various parts of the region he has selected. From these weavers, again, he may collect information on other weavers known to them. By repeating this process several times, he will be able to identify and contact the majority of weavers in the selected region. He could then draw a sample from this group. This method is useful only when individuals in the target group have contact with one another and are willing to reveal the names of others in the group.

Spatial Sampling: Some populations are not static; they move from place to place but stay at one place while an event is taking place. In such cases the whole population in a particular place is taken into the sample and studied. Example: The number of people living in Dubai may vary depending on many factors.

Saturation Sampling: Sometimes all members of a population need to be studied so as to get a picture of the entire population. The sampling method that requires a study of the entire population is called Saturation Sampling. This technique is more familiar in sociometric studies, where distorted results would be produced even if one person were left out. Example: In the case of analyzing the behaviour of the students of one particular classroom, all the students in the classroom must be examined.

From the above discussion of sampling methods, one may normally resort to simple random sampling, since bias is generally eliminated in this type of sampling. At the same time, purposive sampling is considered more appropriate when the universe happens to be small and a known characteristic of it is to be studied intensively. In situations where random sampling is not possible, it is advisable to use a sampling design other than random sampling.

Determination of Sample Size
Determination of the appropriate sample size is a crucial part of any business research. The decision on the proper sample size requires the use of statistical theory. When a business research report is being evaluated, the evaluation often starts with the question, "How big is the sample?" Having discussed various sampling designs, it is important to focus attention on sample size. Suppose we select a sample of size 30 from a population of 3000 through a simple random sampling procedure; will we be able to generalize the findings to the population with confidence? What, then, is the sample size that would be required to carry out the research?

It is generally held that the larger the sample size, the more accurate the research is. Statistically, increasing the sample size decreases the width of the confidence interval at a given confidence level. When the standard deviation of the population is unknown, a confidence interval is calculated by using the formula:

Confidence interval, μ = X ± K Sx, where Sx = S / √n

In sum, choosing the appropriate sampling plan is one of the important research design decisions the researcher has to make. The choice of a specific design will depend broadly on the goal of the research, the characteristics of the population, and considerations of cost.

Issues of Precision and Confidence in Determining Sample Size

We now need to focus attention on the second aspect of the sampling design issue: the sample size. Suppose we select 30 people from a population of 3,000 through a simple random sampling procedure. Will we be able to generalize our findings to the population with confidence? What is the sample size that would be required to make reasonably precise generalizations with confidence? What do precision and confidence mean?

A reliable and valid sample should enable us to generalize the findings from the sample to the population under investigation. No sample statistic (X, for instance) is going to be exactly the same as the corresponding population parameter (μ), no matter how sophisticated the probability sampling design is. Remember that the very reason for a probability design is to increase the probability that the sample statistics will be as close as possible to the population parameters.

Precision
Precision refers to how close our estimate is to the true population characteristic. Usually, we estimate the population parameter to fall within a range, based on the sample estimate.

Example: From a study of a simple random sample of 50 of the total 300 employees in a workshop, we find that the average daily production rate per person is 50 pieces of a particular product (X = 50). We might then (by doing certain calculations, as we shall see later) be able to say that the true average daily production of the product (μ) would lie anywhere between 40 and 60 for the population of employees in the workshop. In saying this, we offer an interval estimate within which we expect the true population mean production to lie (μ = 50 ± 10). The narrower this interval, the greater the precision. For instance, if we are able to estimate that the population mean would fall anywhere between 45 and 55 pieces of production (μ = 50 ± 5) rather than between 40 and 60 (μ = 50 ± 10), then we have more precision. That is, we would now estimate the mean to lie within a narrower range, which in turn means that we estimate it with greater exactitude or precision.

Precision is a function of the range of variability in the sampling distribution of the sample mean. That is, if we take a number of different samples from a population and take the mean of each of these, we will usually find that the means are all different, are normally distributed, and have a dispersion associated with them. Even if we take only one sample of 30 subjects from the population, we will still be able to estimate the variability of the sampling distribution of the sample mean. This variability is called the standard error, denoted by Sx. The standard error is calculated by the following formula:

Sx = S / √n

where,
S = standard deviation of the sample
n = sample size
Sx = standard error, or the extent of precision offered by the sample

In sum, the closer we want our sample results to reflect the population characteristics, the greater will be the precision we aim at. The greater the precision required, the larger the sample size needed, especially when the variability in the population itself is large.

Confidence

Whereas precision denotes how closely we estimate the population parameter based on the sample statistic, confidence denotes how certain we are that our estimates will really hold true for the population. In the previous example of the production rate, we know we are more precise when we estimate the true mean production (μ) to fall somewhere between 45 and 55 pieces than somewhere between 40 and 60.
In essence, confidence reflects the level of certainty with which we can state that our estimates of the population parameters, based on our sample statistics, will hold true. The level of confidence can range from 0 to 100%. A 95% confidence level is the conventionally accepted level for most business research, most commonly expressed by denoting the significance level as p = .05. In other words, we say that at least 95 times out of 100 our estimate will reflect the true population characteristic.

In sum, the sample size, n, is a function of:
1. The variability in the population
2. The precision or accuracy needed
3. The confidence level desired
4. The type of sampling plan used, for example, simple random sampling versus stratified random sampling

It thus becomes necessary for researchers to consider at least four points while making decisions on the sample size needed for the research:
(1) How much precision is really needed in estimating the population characteristics of interest, that is, what is the margin of allowable error?
(2) How much confidence is really needed, i.e., how much chance can we take of making errors in estimating the population parameters?
(3) To what extent is there variability in the population on the characteristics investigated?
(4) What is the cost-benefit analysis of increasing the sample size?

Determining the Sample Size

Now that we are aware of the fact that the sample size is governed by the extent of precision and confidence desired, how do we determine the sample size required for our research? The procedure can be illustrated through an example:

Suppose a manager wants to be 95% confident that the expected withdrawals in a bank will be within a confidence interval of ±$500. A sample of clients indicates that the average withdrawals made by them have a standard deviation of $3,500. What would be the sample size needed in this case?

We noted earlier that the population mean can be estimated by using the formula:

μ = X ± K Sx
Given S = 3,500 and a required confidence level of 95%, the applicable K value is 1.96 (t table). The interval estimate of ±$500 will have to encompass a dispersion of (1.96 × standard error). That is,

500 = 1.96 × Sx, so that Sx = 500 / 1.96 = 255.10

Since Sx = S / √n, we have √n = 3,500 / 255.10 = 13.72, and hence n = (13.72)² ≈ 188.
The sample size needed in the above case is 188. Suppose, however, that this bank has a total clientele of only 185. This means we cannot sample 188 clients. We can, in this case, apply the correction formula and see what sample size would be needed to have the same level of precision and confidence, given the fact that we have a total of only 185 clients. The correction formula is as follows:

Sx = (S / √n) × √((N – n) / (N – 1))
where,
N = total number of elements in the population = 185
n = sample size to be estimated = ?
Sx = standard error of the estimate of the mean = 255.10
S = standard deviation of the sample = 3,500

Applying the correction formula, we find that

255.10 = (3,500 / √n) × √((185 – n) / 184)

which gives a value of n of approximately 94. We would now sample 94 of the total 185 clients.
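The two calculations above can be reproduced with the following sketch. It assumes Python; the figures are those of the bank example, 1.96 is taken as the K value for 95% confidence, and the rounding mirrors the text.

S = 3500.0                 # standard deviation of withdrawals (from the sample)
precision = 500.0          # desired interval estimate of +/- $500
K = 1.96                   # K value for 95% confidence

Sx = precision / K                      # required standard error, about 255.10
n_uncorrected = (S / Sx) ** 2           # about 188 when the population is treated as very large

# Correction for a finite clientele of N = 185, obtained by solving
# Sx = (S / sqrt(n)) * sqrt((N - n) / (N - 1)) for n
N = 185
n_corrected = (N * S ** 2) / ((N - 1) * Sx ** 2 + S ** 2)

print(round(Sx, 2), round(n_uncorrected), round(n_corrected))    # 255.1 188 94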
To understand the impact of precision and/or confidence on the sample size, let us try changing the confidence level required in the bank withdrawal exercise, which needed a sample size of 188 for a confidence level of 95%. Let us say that the bank manager now wants to be 99% sure that the expected monthly withdrawals will be within the interval of ±$500. What will be the sample size now needed? For 99% confidence the applicable K value is 2.576, so Sx = 500 / 2.576 = 194.10, √n = 3,500 / 194.10 = 18.03, and n = (18.03)² ≈ 325.
The sample now has to be increased 1.73 times (from 188 to 325) to raise the confidence level from 95% to 99%. It is hence a good idea to think through how much precision and confidence one really needs before determining the sample size for the research project. So far we have discussed sample size in the context of precision and confidence with respect to one variable only. However, in research, the theoretical framework has several variables of interest, and the question arises how one should arrive at a sample size when all the factors are taken into account. Krejcie and Morgan (1970) greatly simplified the sample size decision by providing a table that ensures a good decision model. The table provides a generalized scientific guideline for sample size decisions. The interested student is advised to read Krejcie and Morgan (1970) as well as Cohen (1969) for decisions on sample size.
Importance of Sampling Design and Sample Size It is now possible to see how both sampling design and the sample size are important to establish the representativeness of the sample for generality. If the appropriate sampling design is not used, a large sample size will not, in itself, allow the findings to be generalized to the population. Likewise, unless the sample size is adequate for the desired level of precision and confidence, no sampling design, however sophisticated, can be useful to the researcher in meeting the objectives of the study. Hence, sampling decisions should consider both the sampling design and the sample size. Too large a sample size, however (say, over 500) could also become a problem in as
much as we would be prone to committing Type II errors. Hence, neither too large nor too small a sample size helps research projects.

Roscoe (1975) proposes the following rules of thumb for determining sample size:
1. Sample sizes larger than 30 and less than 500 are appropriate for most research.
2. Where samples are to be broken into subsamples (males/females, juniors/seniors, etc.), a minimum sample size of 30 for each category is necessary.
3. In multivariate research (including multiple regression analyses), the sample size should be several times (preferably 10 times or more) as large as the number of variables in the study.
4. For simple experimental research with tight experimental controls (matched pairs, etc.), successful research is possible with samples as small as 10 to 20 in size.

QUESTIONS
1. What is a sample design? What are the points to be considered in developing a sample design?
2. Explain the various sampling methods under probability sampling.
3. Discuss the non-probability sampling methods.
4. What is the importance of sample size and sampling design?
5. Discuss the other sampling methods.
6. Explain why cluster sampling is a probability sampling design.
7. What are the advantages and disadvantages of cluster sampling?
8. Explain what precision and confidence are and how they influence sample size.
9. "The use of a convenience sample in organizational research is justified because all members share the same organizational stimuli and go through almost the same kinds of experiences in their organizational life." Comment.
10. "The use of a sample of 5,000 is not necessarily better than one of 500." How would you react to this statement?
11. "Non-probability sampling designs ought to be preferred to probability sampling designs in some cases." Explain with an example.
- End of Chapter LESSON – 11 THE NATURE OF FIELD WORK
OBJECTIVES
· To recognize that field work can be performed by many different parties
· To understand the importance of training for new interviewers
· To understand the principal tactics of asking questions

STRUCTURE
· Interviewing
· Training for interviewing
· The major principles for asking questions
· Probing
· Recording the response
INTRODUCTION

A personal interviewer administering a questionnaire door to door, a telephone interviewer calling from a central location, an observer counting pedestrians in a shopping mall, and others involved in the collection of data and the supervision of that process are all fieldworkers. The activities of fieldworkers may vary in nature. This lesson helps in understanding the interview methods used in the data collection process and the management of field work.

Who conducts the field work?

Data collection is rarely carried out by the person who designs the research project. However, the data collection stage is crucial, because the research project is no better than the data collected in the field. Therefore, it is important that the research administrator select capable people who may be entrusted to collect the data. An irony of business research is that highly educated and trained individuals design the research, but the people who collect the data typically have little research training or experience. Knowing the vital importance of data collected in the field, research administrators must concentrate on carefully selecting fieldworkers.

INTERVIEWING
An important first step in the interviewing process is establishing rapport with the respondent. Interviewer bias may enter in if the fieldworker's clothing or physical appearance is unattractive or unusual. Suppose that a male interviewer, wearing a dirty T-shirt, interviews subjects in an upper-income neighborhood. Respondents may consider the interviewer slovenly and be less cooperative than they would be with a person dressed more appropriately.

Interviewers and other fieldworkers are generally paid an hourly rate or a per-interview fee. Often interviewers are part-time workers - housewives, graduate students, secondary school teachers - from diverse backgrounds. Primary and secondary school teachers are an excellent source of temporary interviewers during the summer, especially when they conduct interviews outside the school districts where they teach. Teachers' educational backgrounds and experience with the public make them excellent candidates for fieldwork.

TRAINING FOR INTERVIEWERS

The objective of training is to ensure that the data collection instrument is administered uniformly by all field investigators. The goal of training sessions is to ensure that each respondent is provided with common information. If the data are collected in a uniform manner from all respondents, the training session will have been a success.

After personnel are recruited and selected, they must be trained. Example: A woman who has just sent her youngest child off to first grade is hired by an interviewing firm. She has decided to become a working mother by becoming a professional interviewer. The training that she will receive after being selected by a company may vary from virtually no training to a 3-day program if she is selected by one of the larger survey research agencies. Almost always there will be a briefing session on the particular project. Typically, the recruits will record answers on a practice questionnaire during a simulated training interview.

More extensive training programs are likely to cover the following topics:
1. How to make initial contact with the respondent and secure the interview
2. How to ask survey questions
3. How to probe
4. How to record responses
5. How to terminate the interview

Making Initial Contact and Securing the Interview

Interviewers are trained to make appropriate opening remarks that will convince the person that his or her cooperation is important.
Example: "Good afternoon, my name is _____ and I'm from a national survey research company. We are conducting a survey concerning. I would like to get a few of your ideas". Much fieldwork is conducted by research suppliers who specialize in data collection. When a second party is employed, the job of the study designed by the parent firm is not only to hire a research supplier but also to establish supervisory controls over the field service. In some cases a third party is employed. For example, a firm may contact a survey research firm, which in turn subcontracts the fieldwork to a field se vice. Under these circumstances it is still desirable to know the problems that might occur in the field and the managerial practices that can minimize them. Asking the Questions The purpose of the interview is, of course, to have the interviewer ask questions and record the respondent's answers. Training in the art of stating questions can be extremely beneficial, because interviewer bias can be a source of considerable error in survey research. There are five major principles for asking questions: i. Ask the questions exactly as they are worded in the questionnaire. ii. Read each question very slowly. iii. Ask the questions in the order in which they are presented in the questionnaire. iv. Ask every question specified in the questionnaire. v. Repeat questions those are misunderstood or misinterpreted. Although interviewers are generally trained in these procedures, when working in the field many interviewers do not follow them exactly. Inexperienced interviewers may not understand the importance of strict adherence to the instructions. Even professional interviewers take shortcuts when the task becomes monotonous. Interviewers may shorten questions or rephrase unconsciously when they rely on their memory of the question rather than reading the question as it is worded. Even the slightest change in wording can distort the meaning of the question and cause some bias to enter into a study. By reading the question, the interviewer may be reminded to concentrate on avoiding slight variations in tone of voice on particular words phases in the question. PROBING General training of interviewers should include instructions on how to probe when respondents give no answer, incomplete answers, or answers that require clarification.
Probing may be needed in two types of situations. First, it is necessary when the respondent must be motivated to enlarge on, clarify or explain his or her answer. It is the interviewer's job to probe for complete, unambiguous answers. The interviewer must encourage the respondent to clarify or expand on answers by providing a stimulus that will not suggest the interviewer's own ideas or attitudes. The ability to probe with neutral stimuli is the mark of an experienced interviewer. Second, probing may be necessary in situations in which the respondent begins to ramble or lose track of the question. In such cases the respondent must be led to focus on the specific content of the interview and to avoid irrelevant and unnecessary information.

The interviewer has several possible probing tactics to choose from, depending on the situation:

i. Repetition of the question: The respondent who remains completely silent may not have understood the question or may not have decided how to answer it. Mere repetition may encourage the respondent to answer in such cases. For example, if the question is "What is there that you do not like about your supervisor?" and the respondent does not answer, the interviewer may probe: "Just to check, is there anything you do not like about your supervisor?"

ii. An expectant pause: If the interviewer believes the respondent has more to say, the "silent probe", accompanied by an expectant look, may motivate the respondent to gather his or her thoughts and give a complete response. Of course, the interviewer must be sensitive to the respondent so that the silent probe does not become an embarrassed silence.

iii. Repetition of the respondent's reply: Sometimes the interviewer may repeat the respondent's reply verbatim. This may help the respondent to expand on the answer.

RECORDING THE RESPONSES

The analyst who fails to instruct fieldworkers in the techniques of recording answers for one study rarely forgets to do so in the second study. Although the concept of recording an answer seems extremely simple, mistakes can be made in the recording phase of the research. All fieldworkers should use the same mechanics of recording.

Example: It may appear insignificant to the interviewer whether she uses a pen or a pencil, but to the editor who must erase and rewrite illegible words, the use of a pencil is extremely important.

The rules for recording responses to closed questionnaires vary with the specific questionnaire. The general rule, however, is to place a check in the box that correctly reflects the respondent's answer. All too often, interviewers do not bother recording the answer to a filter question because they believe that the subsequent answer will make the answer to the filter question obvious. However, editors and coders do not know how the respondent actually answered the question.
The general instruction for recording answers to open-ended-response questions is to record the answer verbatim, a task that is difficult for most people. Inexperienced interviewers should be given the opportunity to practice. The Interviewer's Manual of the Survey Research Center provides instructions on the recording of interviews. Some of its suggestions for recording answers to open-ended-response questions follow:

· Record the responses during the interview
· Use the respondent's own words
· Do not summarize or paraphrase the respondent's answer
· Include everything that pertains to the question objectives
· Include all of your probes
The Basics of Effective Interviewing

Interviewing is a skilled occupation; not everyone can do it, and even fewer can do it extremely well. A good interviewer observes the following principles:

1. Have integrity and be honest.
2. Have patience and tact.
3. Pay attention to accuracy and detail.
4. Exhibit a real interest in the inquiry at hand, but keep your own opinions to yourself.
5. Be a good listener.
6. Keep the inquiry and the respondent's responses confidential. Respect others' rights.

Terminating the Interview

The final aspect of training deals with instructing the interviewers on how to close the interview. Fieldworkers should not close the interview before all pertinent information has been secured. An interviewer whose departure is hasty will not be able to record the spontaneous comments respondents sometimes offer after all the formal questions have been asked. Avoiding hasty departures is also a matter of courtesy. Fieldworkers should also answer, to the best of their ability, any questions the respondent has concerning the nature and purpose of the study. Because the fieldworker may be required to re-interview the respondent at some future time, he or she should leave the respondent with a positive feeling about having cooperated in a worthwhile undertaking. It is extremely important to thank the respondent for his or her cooperation.

FIELDWORK MANAGEMENT
Managers of the field operation select, train, supervise, and control fieldworkers. Our discussion of fieldwork principles mentioned selection and training. This section investigates the tasks of the fieldwork manager in greater detail.

Briefing Session for Experienced Interviewers

Even after interviewers have been trained in the fundamentals, and even when they have become experienced, it is always necessary to inform workers about the individual project. Both experienced and inexperienced fieldworkers must be instructed on the background of the sponsoring organization, sampling techniques, asking questions, callback procedures, and other matters specific to the project. If there are special instructions - for example, about using show cards or video equipment, or about restricted interviewing times - these should also be covered during the briefing session. Instructions for handling certain key questions are always important. For example, special fieldworker instructions of this kind appeared in a survey of institutional investors who make buy-and-sell decisions about stocks for banks, pension funds, and the like.

A briefing session for experienced interviewers might proceed as follows. All interviewers report to the central office, where the background of the firm and the general aims of the study are briefly explained. Interviewers are not provided with too much information about the purpose of the study, thus ensuring that they will not transmit any preconceived notions to respondents. For example, in a survey about the banks in a community, the interviewers would be told that the research is a banking study, but not the name of the sponsoring bank. To train the interviewers on the questionnaire, a field supervisor conducts an interview with another field supervisor who acts as a respondent. The trainees observe the interviewing process, after which they each interview and record the responses of another field supervisor. Additional instructions are given to the trainees after the practice interview.

Training to Avoid Procedural Errors in Sample Selection

The briefing session also covers the sampling procedure. A number of research projects allow the interviewer to be at least partially responsible for selection of the sample. When the fieldworker has some discretion in the selection of respondents, the potential for selection bias exists. This is obvious in the case of quota sampling, but less obvious in other cases.

Example: In probability sampling where every nth house is selected, the fieldworker uses his or her discretion in identifying housing units. Avoiding selection error may not be as simple as it sounds.

Example: In an older, exclusive neighborhood, a mansion's coach house or servant's quarters may have been converted into an apartment that should be identified as a housing unit. This type of dwelling and other unusual housing units (apartments with alley entrances only, lake cottages, rooming houses) may be overlooked, giving rise to
selection error. Errors may also occur in the selection of random digit dialing samples. Considerable effort should be expended in training and supervisory control to minimize these errors.

The activities involved in collecting data in the field may be performed by the organization needing the information, by research suppliers, or by third-party field service organizations. Proper execution of fieldwork is essential for producing research results without substantial error. Proper control of fieldwork begins with interviewer selection. Fieldworkers should generally be healthy, outgoing, and well groomed. New fieldworkers must be trained in opening the interview, asking the questions, probing for additional information, recording the responses, and terminating the interview. Experienced fieldworkers are briefed for each new project so that they are familiar with its specific requirements. A particular concern of the briefing session is reminding fieldworkers to adhere closely to the prescribed sampling procedures.

Careful supervision of fieldworkers is also necessary. Supervisors gather and edit questionnaires each day. They check to see that field procedures are properly followed and that interviews are on schedule. They also check to be sure that the proper sampling units are used and that the proper people are responding in the study. Finally, supervisors check for interviewer cheating and verify a portion of the interviews by re-interviewing a certain percentage of each fieldworker's respondents.

SUMMARY

This lesson outlined the importance of training for new interviewers and dealt in detail with the five major principles for asking questions.

KEY TERMS

· Fieldworker
· Probing
· Field interviewing
· Briefing session
· Training
· Interview
· Reinterviewing
QUESTIONS 1. What qualities should a field worker possess? 2. What is the proper method of asking questions? 3. When should an interviewer probe? Give examples of how probing should be done? 4. How should an Interviewer terminate the interview?
5. What are the qualities of the interviewer that make him more effective? REFERENCES 1. Ramanuj Majumdar, Marketing Research, Wiley Eastern Limited New Delhi (1991) 2. Cochran, W.G., Sampling Techniques, 2nd ed. New York: John Wiley and Sons. 3. Chaturvedi, J.C., Mathematical Statistics, Agra: Nok Jhonk Karyalaya, 1953.
- End of Chapter -

LESSON – 12

SOURCES OF DATA
Sources of data - primary and secondary data - Questionnaire design: attitude measurement techniques - motivational research techniques - selection of appropriate statistical techniques - correlation research.

OBJECTIVES

· To explain the difference between secondary and primary data
· To discuss the advantages and disadvantages of secondary data
· To learn the nature of secondary data
· To understand the evaluation of secondary data sources
· To learn the sources of secondary data
STRUCTURE

· Value of secondary data
· Disadvantages of secondary data
· Nature and scope of secondary data
· Sources of secondary data
INTRODUCTION

Data sources are essential for solving a research problem, and there are many ways in which data can be collected. The task of data collection begins after the research problem has been defined and the research design has been prepared. The data to be collected can be classified as either secondary or primary. The determination of the data source is based on three fundamental dimensions, as given below:
1. The extent to which the data already exist in some form,
2. The degree to which the data have been interpreted by someone, and
3. The extent to which the researcher or decision maker understands why and how the data were collected.

Primary data are data gathered and assembled specifically for the project at hand. They are "firsthand" raw data that have yet to receive any type of meaningful interpretation. Primary data are fresh and, since they are collected for the first time, original in character. Secondary data, on the other hand, have already been collected by someone else and have passed through statistical processing and interpretation. Secondary data are historical data structures of variables previously collected and assembled for some research problem other than the current situation.

The sources of primary data tend to be the output of conducting some type of exploratory, descriptive or causal research that employs surveys, experiments and/or observation as techniques of collecting the needed data. The insights underlying primary data are discussed in the chapter "Methods of Data Collection", where the pros and cons of primary data are also discussed with reference to the various techniques involved. Sources of secondary data can be found inside a company, at public libraries and universities, and on World Wide Web (www) sites, or can be purchased from a firm specializing in providing secondary information. Here, the evaluation and sources of secondary data are discussed.

THE VALUE / ADVANTAGES OF SECONDARY DATA

More and more companies are interested in using existing data as a major tool in management decisions. As more such data become available, many companies are realizing that they can be used to make sound decisions. Data of this nature are more readily available, often more highly valid and usually less expensive to secure than primary data. "Nowhere in science do we start from scratch" - this quote explains the value of secondary data. Researchers are able to build on past research - a body of business knowledge - and to use others' experience and data when they are available as secondary data. The primary advantage of secondary data is that they are almost always less expensive to obtain and can usually be obtained rapidly. The major advantages and disadvantages are discussed below.

Advantages of secondary data:

1. It is more economical, as the cost of collecting original data is saved. The collection of primary data requires a good deal of effort: preparing, designing and printing data collection forms, appointing persons to collect the data (which in turn involves travel), verifying the data and finally tabulating them. All of this requires large funds, which can be utilized elsewhere if secondary data can serve the purpose.

2. The use of secondary data saves much of the researcher's time, which also leads to prompt completion of the research project.

3. Secondary data are helpful not only in themselves; familiarity with them also reveals deficiencies and gaps. As a result, the researcher can make the primary data collection more specific and more relevant to the study.

4. Secondary data also help in gaining new insights into the problem, which can then be used to fine-tune the research hypothesis and objectives.

5. Finally, secondary data can be used as a basis of comparison with the primary data collected for the study.

DISADVANTAGES OF SECONDARY DATA

An inherent disadvantage of secondary data is that they are not designed specifically to meet the researcher's needs. Secondary data quickly become outdated in our rapidly changing environments; since the purpose of most studies is to predict the future, secondary data must be timely. The most common problems with secondary data are:

1. Outdated information.

2. Variation in the definition of terms or classifications. The units of measurement may cause problems if they are not identical to the researcher's needs. Even when the original units were comparable, aggregated or adjusted units of measurement may not be suitable for the present study. When the data are reported in a format that does not exactly meet the researcher's needs, data conversion may be necessary.

3. The user has no control over the accuracy of secondary data; even though they are timely and pertinent, they may be inaccurate.

THE NATURE AND SCOPE OF SECONDARY DATA

Focusing on the particular business or management problem, the researcher needs to determine whether useful information already exists and, if it exists, how relevant the information is. Existing information is more widespread than one might expect. Secondary data exist in three forms:

1. Internal secondary data: Data collected by the individual company for some purpose and reported periodically. These are also called primary sources. Primary sources are original works of research or raw data without interpretation or pronouncements that represent an official opinion or position - memos, complete interviews, speeches, laws, regulations, court decisions, standards, and most
government data, including census, economic and labor data. Primary sources are always the most authoritative because the information is not filtered. Internal secondary data also include inventory records, personnel records, process charts and similar data.

2. External secondary data: Data collected by outside agencies such as government bodies, trade associations or periodicals. These are called secondary sources. Encyclopedias, textbooks, handbooks, magazine and newspaper articles and most newscasts are considered secondary sources. Indeed, all reference materials fall into this category.

3. Computerized data sources: These include internal and external data, usually collected by specialized companies and offered as online information sources. They can be called tertiary sources and are represented by indexes, bibliographies and other finding aids, e.g., internet search engines.

Evaluation of Secondary Data Sources

The usefulness of secondary data increases if a set of procedures is established to evaluate the quality of the information obtained from secondary data sources. Specifically, if secondary data are to be used to assist in the decision process, they should be assessed according to the following principles.

1. Purpose: Since most secondary data are collected for purposes other than the one at hand, the data must be carefully evaluated on how they relate to the current research objectives. Many times the original collection of the data is not consistent with the particular research study. These inconsistencies usually result from the methods and units of measurement.

2. Accuracy: When observing secondary data, researchers need to keep in mind what was actually measured. For example, if actual purchases in a test market were measured, did they measure first-time trial purchases or repeat purchases? Researchers must also assess the generality of the data.

3. Questions such as the following should be asked: i) Were the data collected from certain groups only, or randomly? ii) Was the measure developed properly? iii) Were the data presented as the total of responses from all respondents, or were they categorized by age, sex or socioeconomic status?

4. In addition to the above dimensions, researchers must assess when the data were collected. Outdated data not only lose accuracy but may also be useless for interpretation. Researchers must also keep in mind that flaws in the original research design and methods will carry over into the current research.

5. Consistency: When evaluating any source of secondary data, a good strategy is to seek out multiple sources of the same data to assure consistency. For example, when evaluating the economic characteristics of a foreign market, a researcher may try to gather the same information from government sources, private business publications and specialty import or export trade publications.

6. Credibility: Researchers should always question the credibility of the secondary data source. Technical competence, service quality, reputation, and the training and expertise of the personnel representing the organization are some measures of credibility.

7. Methodology: The quality of secondary data is only as good as the methodology employed to gather them. Flaws in methodological procedures could produce results that are invalid, unreliable or not generalizable beyond the study itself. Therefore, researchers must evaluate the size and description of the sample, the response rate, the questionnaire, and the overall procedure for collecting the data (telephone, mail, or personal interview).

8. Bias: Researchers must try to determine the underlying motivation or hidden agenda, if any, behind the secondary data. It is not uncommon to find secondary data sources published to advance the interests of commercial, political or other interest groups. Researchers should try to determine whether the organization publishing the report is motivated by a particular purpose.
SOURCES OF SECONDARY DATA
A. Internal sources Generally, internal data consists of sales or cost information. Data of this kind is found in internal accounting or financial records. The two most useful sources of information are sales invoices and accounts receivable reports; quarterly sales reports and sales activity reports are also useful.
The major sources of internal secondary data are given below:

1. Sales invoices
2. Accounts receivable reports
3. Quarterly sales reports
4. Sales activity reports
5. Other types:
   a. Customer letters
   b. Customer comment cards
   c. Mail order forms
   d. Credit applications
   e. Cash register receipts
   f. Salesperson expense reports
   g. Employee exit interviews
   h. Warranty cards
   i. Past marketing research studies

B. External Sources

When undertaking the search for secondary data, researchers must remember that the number of resources is extremely large, and they need to connect the sources by a common theme. The key variables most often sought by researchers are given below:

1. Demographic dimensions
2. Employment characteristics
3. Economic characteristics
4. Competitive characteristics
5. Supply characteristics
6. Regulatory characteristics
7. International market characteristics

External secondary data do not originate in the firm and are obtained from outside sources. It may be noted that secondary data can be collected from the originating source or from a secondary source. For example, the Office of the Economic Advisor, GOI, is the originating source for wholesale prices; in contrast, a publication such as the RBI Bulletin reporting wholesale prices is a secondary source. These data may be available through government publications, non-governmental publications or syndicated services. Some examples are given below:

Government publications

1. Census - Registrar General of India
2. National Income - Central Statistical Organisation, which also publishes the Statistical Abstract and the Annual Survey of Industries
3. Foreign Trade - Director General of Commercial Intelligence
4. Wholesale Price Index - Office of the Economic Advisor
5. Economic Survey - Department of Economic Affairs
6. RBI Bulletin - RBI
7. Agricultural Situation in India - Ministry of Agriculture
8. Indian Labour Year Book - Labour Bureau
9. National Sample Survey - Ministry of Planning

Non-governmental publications

Besides official agencies, a number of private organizations bring out statistics in one form or another on a periodical basis. Industry and trade associations are particularly important, for example:

1. Indian Cotton Mills Federation or Confederation of Indian Textile Industry - textile industry statistics
2. Bombay Mill Owners Association - statistics on mill workers
3. Bombay Stock Exchange - financial accounts and ratios
4. Coffee Board - coffee statistics
5. Coir Board - coir and coir goods
6. Rubber Board - rubber statistics
7. Federation of Indian Chambers of Commerce & Industry (FICCI)

Syndicated services

Syndicated services are provided by organizations that collect and tabulate information on a continuous basis. Reports based on this marketing information are sent periodically to subscribers. A number of research agencies also offer customized research services to their clients, such as consumer research, advertising research, etc.

Publications by international organizations

There are several international organizations that publish statistics in their respective areas.

SUMMARY

This lesson outlined the importance of secondary data, dealt in detail with its advantages and disadvantages, and described the major sources of secondary data.

KEY TERMS

· Primary data
· Secondary data
· Advantages and disadvantages
· Evaluation of secondary data
· Sources of secondary data
· Governmental publications
· Syndicated services
QUESTIONS 1. Discuss the difference between primary and secondary data. 2. Explain the advantages and disadvantages of secondary data. 3. Write short notes on nature and scope of secondary data. 4. How will you evaluate the secondary data sources?
5. Discuss the internal and external sources of secondary data.
- End of Chapter -

LESSON - 13

QUESTIONNAIRE DESIGN
OBJECTIVES

· To recognize the importance and relevance of questionnaire design
· To recognize that the type of information sought will influence the structure of the questionnaire
· To understand the role of the data collection method in designing the questionnaire
· To understand how to plan and design a questionnaire without mistakes, and how to improve its layout
· To know the importance of pretesting
STRUCTURE

· Questionnaire design
· Phrasing questions
· The art of asking questions
· Layout of traditional questionnaires
Many experts in survey research believe that improving the wording of questions can contribute far more to accuracy than can improvements in sampling. Experiments have shown that the range of error due to vague questions or use of imprecise words may be as high as 20 or 30 percent. Consider the following example, which illustrates the critical importance of selecting the word with the right meaning. The following questions differ only in the use of the words should, could, and might:

· Do you think anything should be done to make it easier for people to pay doctor or hospital bills?
· Do you think anything could be done to make it easier for people to pay doctor or hospital bills?
· Do you think anything might be done to make it easier for people to pay doctor or hospital bills?
The results from the matched samples: 82 percent replied something should be done, 77 percent replied something could be done, and 63 percent replied something might be done. Thus, a 19 percentage point difference occurred between the two extremes, "should" and "might". Ironically, this is the same percentage point error as in the Literary Digest poll, a frequently cited example of error associated with sampling.
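To see why a wording gap of this size matters, the sketch below runs a standard two-proportion z-test on the "should" versus "might" results. The sample sizes are hypothetical (the study above does not report them); the point is simply that, for any realistically sized matched samples, a 19-point spread is far larger than ordinary sampling error.

```python
from math import sqrt

# Hypothetical sample sizes; the original experiment does not report them.
n1 = n2 = 500
p1, p2 = 0.82, 0.63   # agreement with the "should" and "might" wordings

# Pooled two-proportion z-test.
pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se

print(f"Difference: {(p1 - p2) * 100:.0f} percentage points, z = {z:.2f}")
# |z| > 1.96 would be significant at the 5 percent level; here z is about 6.7,
# so a difference this large cannot plausibly be attributed to sampling error alone.
```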
The chapter outlines the procedure for questionnaire design and illustrates that a little bit of research knowledge can be a dangerous thing.

A Survey Is Only As Good As the Questions It Asks

Each stage of the business research process is important because of its interdependence with the other stages, but a survey is only as good as the questions it asks. The importance of question wording is easily overlooked, yet questionnaire design is one of the most critical stages in the survey research process. "A good questionnaire appears as easy to compose as does a good poem. But it is usually the result of long, painstaking work."

Business people who are inexperienced in business research frequently believe that constructing a questionnaire is a matter of a few hours. Unfortunately, newcomers who naively believe that common sense and good grammar are all that are needed generally learn that their hasty efforts are inadequate. While common sense and good grammar are important in question writing, more is required in the art of questionnaire design. To assume that people will understand the questions is a common error. People simply may not know what is being asked. They may be unaware of the product or topic of interest, they may confuse the subject with something else, or the question may not mean the same thing to everyone interviewed. Respondents may refuse to answer personal questions. Properly wording the questionnaire is therefore crucial, as many of these problems can be minimized or avoided altogether if a skilled researcher composes the questions.

QUESTIONNAIRE DESIGN: AN OVERVIEW OF THE MAJOR DECISIONS

Relevance and accuracy are the two basic criteria a questionnaire must meet if it is to achieve the researcher's purpose. To achieve these ends, a researcher who systematically plans a questionnaire's design will be required to make several decisions - typically, but not necessarily, in the order listed below:

1. What should be asked?
2. How should each question be phrased?
3. In what sequence should the questions be arranged?
4. What questionnaire layout will best serve the research objectives?
5. How should the questionnaire be pretested? Does the questionnaire need to be revised?

What Should Be Asked?
During the early stages of the research process, certain decisions will have been made that influence the questionnaire design. The preceding chapters stressed the need for a good problem definition and clear objectives for the study. The problem definition indicates which type of information must be collected to answer the manager's questions, and different types of questions may be better at obtaining certain types of information than others. Further, the communication medium used for data collection - telephone interview, personal interview, or self-administered survey - will have been determined. This decision is another forward linkage that influences the structure and content of the questionnaire. The specific questions to be asked will be a function of these previous decisions. Later stages of the research process also have an important impact on questionnaire wording. For example, the choice of questions will be influenced by the requirements for data analysis; as the questionnaire is being designed, the researcher should be thinking about the types of statistical analysis that will be conducted.

Questionnaire Relevancy

A questionnaire is relevant if no unnecessary information is collected and if the information that is needed to solve the business problem is obtained. Asking a wrong or irrelevant question is a pitfall to be avoided. If the task is to pinpoint compensation problems, for example, questions asking for general information about morale may be inappropriate. To ensure information relevancy, the researcher must be specific about data needs, and there should be a rationale for each item of information.

After conducting surveys, many disappointed researchers have discovered that some important questions were omitted. Thus, when planning the questionnaire design, it is essential to think about possible omissions. Is information being collected on the relevant demographic and psychographic variables? Are there any questions that might clarify the answers to other questions? Will the results of the study provide the solution to the manager's problem?

Questionnaire Accuracy

Once the researcher has decided what should be asked, the criterion of accuracy becomes the primary concern. Accuracy means that the information is reliable and valid. While experienced researchers generally believe that one should use simple, understandable, unbiased, unambiguous, nonirritating words, no step-by-step procedure for ensuring accuracy in question writing can be generalized across projects. Obtaining accurate answers from respondents is strongly influenced by the researcher's ability to design a questionnaire that facilitates recall and that motivates the respondent to cooperate. Respondents tend to be most cooperative when the subject of the research is interesting. Also, if questions are not lengthy, difficult to answer, or ego threatening, there is a higher probability of obtaining unbiased answers. Question wording and sequence also substantially influence accuracy. These topics are treated in subsequent sections of this chapter.
PHRASING QUESTIONS

There are many ways to phrase questions, and many standard question formats have been developed in previous research studies. This section presents a classification of question types and provides some helpful guidelines to researchers who must write questions.

Open-Ended Response versus Fixed-Alternative Questions

Questions may be categorized as one of two basic types, according to the amount of freedom respondents are given in answering them. Open-ended response questions pose some problem or topic and ask the respondent to answer in his or her own words. For example:

What things do you like most about your job?
What names of local banks can you think of offhand?
What comes to mind when you look at this advertisement?
Do you think that there are some ways in which life in the United States is getting worse? How is that?

If the question is asked in a personal interview, the interviewer may probe for more information by asking such questions as: Anything else? or Could you tell me more about your thinking on that?

Open-ended response questions are free-answer questions. They may be contrasted with the fixed-alternative question, sometimes called a "closed question", in which the respondent is given specific, limited-alternative responses and asked to choose the one closest to his or her own viewpoint. For example:

Did you work overtime or at more than one job last week?
Yes ____ No _____

Compared to ten years ago, would you say that the quality of most products made in Japan is higher, about the same, or not as good?
Higher ____ About the same _____ Not as good _____

Open-ended response questions are most beneficial when the researcher is conducting exploratory research, especially if the range of responses is not known. Open-ended questions can be used to learn what words and phrases people spontaneously give in their free responses. Respondents are free to answer with whatever is uppermost in their thinking. By gaining free and uninhibited responses, a researcher may find some unanticipated reaction toward the topic. As the responses have the "flavor" of the conversational language that people use in talking about products or jobs, responses to these questions may be a source of effective communication.
Open-ended response questions are especially valuable at the beginning of an interview. They are good first questions because they allow respondents to warm up to the questioning process.

The cost of open-ended response questions is substantially greater than that of fixed-alternative questions, because the job of coding, editing and analyzing the data is quite extensive. As each respondent's answer is somewhat unique, there is some difficulty in categorizing and summarizing the answers. The process requires an editor to go over a sample of questions to classify the responses into some sort of scheme, and then all the answers are reviewed and coded according to the classification scheme.

Another potential disadvantage of the open-ended response question is that the interviewer may influence the responses. While most instructions state that the interviewer is to record answers verbatim, rarely can even the best interviewer get every word spoken by the respondent. There is a tendency for interviewers to take shortcuts in recording answers - but changing even a few of the respondent's words may substantially influence the results. Thus, the final answer often is a combination of the respondent's and the interviewer's ideas rather than the respondent's ideas alone.

The simple-dichotomy or dichotomous-alternative question requires the respondent to choose one of two alternatives. The answer can be a simple "yes" or "no" or a choice between "this" and "that". For example:

Did you make any long-distance calls last week?
Yes _____
No _____
Several types of questions provide the respondent with multiple-choice alternatives. The determinant-choice questions require the respondent to choose one and only one response from among several possible alternatives. For example: Please give us some information about your flight. In which section of the aircraft did you sit? First Class _______ Business Class ______
Coach Class ______
The frequency-determination question is a determinant-choice question that asks for an answer about general frequency of occurrence. For example: How frequently do you watch the MTV television channel? __ Every day __ 5-6 times a week __ 2-4 times a week __ Once a week
__ Less than once a week
__ Never

Attitude rating scales, such as the Likert scale, semantic differential, and Stapel scale, are also fixed-alternative questions.

The checklist question allows the respondent to provide multiple answers to a single question. The respondent indicates past experience, preference, and the like merely by checking off items. In many cases the choices are adjectives that describe a particular object. A typical checklist follows:

Please check which of the following sources of information about investments you regularly use, if any.
__ Personal advice of your broker(s)
__ Brokerage newsletters
__ Brokerage research reports
__ Investment advisory service(s)
__ Conversations with other investors
__ Reports on the internet
__ None of these
__ Other (please specify)

Most questionnaires include a mixture of open-ended and closed questions. Each form has unique benefits; in addition, a change of pace can eliminate respondent boredom and fatigue.

Phrasing Questions for Self-Administered, Telephone, and Personal Interview Surveys

The means of data collection (personal interview, telephone, mail, or Internet questionnaire) will influence the question format and question phrasing. In general, questions for mail and telephone surveys must be less complex than those utilized in personal interviews. Questionnaires for telephone and personal interviews should be written in a conversational style. Consider the following question from a personal interview:

There has been a lot of discussion about the potential health threat to nonsmokers from tobacco smoke in public buildings, restaurants, and business offices. How serious a
health threat to you personally is the inhaling of this secondhand smoke, often called passive smoking: Is it a very serious health threat, somewhat serious, not too serious, or not serious at all?

1. Very serious
2. Somewhat serious
3. Not too serious
4. Not serious at all
5. Don't know

THE ART OF ASKING QUESTIONS

In developing a questionnaire, there are no hard-and-fast rules. Fortunately, however, some guidelines that help to prevent the most common mistakes have been developed from research experience.

1. Avoid Complexity: Use Simple, Conversational Language

Words used in questionnaires should be readily understandable to all respondents. The researcher usually has the difficult task of adopting the conversational language of people from the lower educational levels without talking down to better-educated respondents. Remember, not all people have the vocabulary of a college student; a substantial number of Americans never go beyond high school. Respondents can probably tell an interviewer whether they are married, single, divorced, separated, or widowed, but providing their "marital status" may present a problem. Also, the technical jargon of corporate executives should be avoided when surveying retailers, factory employees, or industrial users. "Marginal analysis", "decision support systems", and other words from the language of the corporate staff will not have the same meaning to - or be understood by - a store owner/operator in a retail survey. The vocabulary in the following question (from an attitude survey on social problems) is probably confusing for many respondents:

When effluents from a paper mill can be drunk, and exhaust from factory smokestacks can be breathed, then humankind will have done a good job in saving the environment... Don't you agree that what we want is zero toxicity and no effluents?

This lengthy question is also a leading question.

2. Avoid Leading and Loaded Questions
Leading and loaded questions are a major source of bias in question wording. Leading Questions suggest or imply certain answers. In a study of the dry-cleaning industry, this question was asked: Many people are using dry cleaning less because of improved wash-and-wear clothes. How do you feel wash-and-wear clothes have affected your use of dry-cleaning facilities in the past 4 years? _______Use less
______No change
__________Use more
The potential "bandwagon effect" implied in this question threatens the study's validity. Loaded questions suggest a socially desirable answer or are emotionally charged. Consider the following: In light of today’s farm crisis, it would be in the public's best interest to have the federal government require labeling of imported meat. _____Strongly ____Agree ____Uncertain Disagree ____Strongly disagree
____
Answers might be different if the loaded portion of the statement, "farm crisis", had been worded to suggest a problem of less magnitude than a crisis.

A television station produced the following 10-second spot asking for viewer feedback:

We are happy when you like programs on Channel 7. We are sad when you dislike programs on Channel 7. Write to us and let us know what you think of our programming.

Most people do not wish to make others sad, so this question is likely to elicit only positive comments. Some answers to certain questions are also more socially desirable than others. For example, a truthful answer to the following classification question might be painful:

Where did you rank academically in your high school graduating class?
___Top quarter ___2nd quarter
___3rd quarter
___4th quarter
When taking personality tests, respondents frequently are able to determine which answers are most socially acceptable, even though those answers do not portray their true feelings.

3. Avoid Ambiguity: Be as Specific as Possible

Items on questionnaires are often ambiguous because they are too general. Consider indefinite words such as often, usually, regularly, frequently, many, good, fair, and poor. Each of these words has many meanings. For one person, frequent reading of Fortune magazine may be reading six or seven issues a year; for another it may be two issues a
year. The word fair has a great variety of meanings; the same is true for many indefinite words. Questions such as the following should be interpreted with care: How often do you feel that you can consider all of the alternatives before making a decision to follow a specific course of action? ___Always
___Fairly often
___Occasionally
___Seldom
___Never
In addition to using words like occasionally, this question asks respondents to generalize about their decision-making behavior. The question is not specific: what does consider mean? Respondents may have a tendency to provide stereotyped "good" management responses rather than to describe their actual behavior. People's memories are not perfect; we tend to remember the good and forget the bad.

4. Avoid Double-Barreled Items

A question covering several issues at once is referred to as double-barreled and should always be avoided. It is easy to make the mistake of asking two questions rather than one. For example:

Please indicate if you agree or disagree with the following statement: "I have called in sick or left work to golf".

Which reason is it - calling in sick or leaving work (perhaps with permission) to play golf? When multiple questions are asked in one question, the results may be exceedingly difficult to interpret. For example, consider the following question from a magazine survey entitled "How Do You Feel about Being a Woman?":

Between you and your husband, who does the housework (cleaning, cooking, dishwashing, laundry) over and above that done by any hired help?
I do all of it
I do almost all of it
I do over half of it
We split the work fifty-fifty
My husband does over half of it

The answers to this question do not tell us if the wife cooks and the husband dries the dishes.

5. Avoid Making Assumptions
Consider the following question: Should Mary's continue its excellent gift-wrapping program? ___Yes
___ No
The question contains the implicit assumption that people believe the gift-wrapping program is excellent. By answering yes, the respondent implies that the program is, in fact, excellent and that things are just fine as they are. By answering no, he or she implies that the store should discontinue the gift wrapping. The researcher should not place the respondent in that sort of bind by including an implicit assumption in the question.

6. Avoid Burdensome Questions That May Tax the Respondent's Memory

A simple fact of human life is that people forget. Researchers writing questions about past behavior or events should recognize that certain questions may make serious demands on the respondent's memory. Writing questions about prior events requires a conscientious attempt to minimize the problems associated with forgetting. In many situations, respondents cannot recall the answer to a question. For example, a telephone survey conducted during the 24-hour period following the airing of the Super Bowl might establish whether the respondent watched the Super Bowl and then ask: "Do you recall any commercials on that program?" If the answer is positive, the interviewer might ask: "What brands were advertised?" These two questions measure unaided recall, because they give the respondent no clue as to the brand of interest.

What Is the Best Question Sequence?

The order of questions, or the question sequence, may serve several functions for the researcher. If the opening questions are interesting, simple to comprehend, and easy to answer, respondents' cooperation and involvement can be maintained throughout the questionnaire. Asking easy-to-answer questions teaches respondents their role and builds their confidence; they know this is a researcher and not another salesperson posing as an interviewer. If respondents' curiosity is not aroused at the outset, they can become disinterested and terminate the interview. A mail research expert reports that a mail survey among department store buyers drew an extremely poor return. However, when some introductory questions related to the advisability of congressional action on pending legislation of great importance to these buyers were placed first on the questionnaire, a substantial improvement in the response rate occurred. Respondents completed all the questions, not only those in the opening section.

In their attempts to "warm up" respondents toward the questionnaire, researchers frequently ask demographic or classification questions at the beginning of the questionnaire. This is generally not advisable, as it may embarrass or threaten respondents. It is generally better to ask embarrassing questions in the middle or at the end of the questionnaire, after rapport has been established between respondent and interviewer.

Sequencing specific questions before asking about broader issues is a common cause of order bias. For example, bias may arise if questions about a specific clothing store are asked prior to those concerning the general criteria for selecting a clothing store. Suppose a respondent indicates in the first portion of a questionnaire that she shops at a store where parking needs to be improved. Later in the questionnaire, to avoid appearing inconsistent, she may state that parking is less important a factor than she really believes it is. Specific questions may thus influence the more general ones. Therefore, it is advisable to ask general questions before specific questions to obtain the freest open-ended responses. This procedure, known as the funnel technique, allows the researcher to understand the respondent's frame of reference before asking more specific questions about the level of the respondent's information and the intensity of his or her opinions.

One advantage of internet surveys is the ability to reduce order bias by having the computer randomly order questions and/or response alternatives. With complete randomization, question order is random and respondents see response alternatives in random positions.

Asking a question that does not apply to the respondent or that the respondent is not qualified to answer may be irritating or may cause a biased response, because the respondent wishes to please the interviewer or to avoid embarrassment. Including a filter question minimizes the chance of asking questions that are inapplicable. Asking "Where do you generally have cheque-cashing problems in Delhi?" may elicit a response even though the respondent has not had any cheque-cashing problems and may simply wish to please the interviewer with an answer. A filter question such as:

Do you ever have a problem cashing a cheque in Delhi?
___Yes
___No
would screen out the people who are not qualified to answer.

Another form of filter question, the pivot question, can be used to obtain income information and other data that respondents may be reluctant to provide. For example, a respondent is asked:

"Is your total family income over Rs.75,000?" IF NO, ASK: "Is it over or under Rs.50,000?" IF UNDER, ASK: "Is it over or under Rs.25,000?"

So the recorded options are:
1. Over Rs.75,000
2. Rs.50,001 - Rs.75,000
3. Rs.25,001 - Rs.50,000
4. Rs.25,000 or under

This branching logic is straightforward to script, as the sketch below illustrates.
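As an illustration only, the following minimal sketch shows how the pivot-question sequence above could be scripted. The `ask` helper is a hypothetical stand-in for however responses are actually captured (an interviewer's keyboard, a web form, a CATI system); the bracket labels mirror the four options listed above.

```python
def ask(question):
    """Hypothetical stand-in for the survey front end: returns True for a 'yes' answer."""
    return input(question + " (yes/no): ").strip().lower().startswith("y")

def pivot_income_bracket():
    """Walk the pivot-question sequence and return the recorded income bracket."""
    if ask("Is your total family income over Rs.75,000?"):
        return "Over Rs.75,000"
    if ask("Is it over Rs.50,000?"):
        return "Rs.50,001 - Rs.75,000"
    if ask("Is it over Rs.25,000?"):
        return "Rs.25,001 - Rs.50,000"
    return "Rs.25,000 or under"

if __name__ == "__main__":
    print("Recorded bracket:", pivot_income_bracket())
```

Note that only respondents who answer "no" to the opening question receive the follow-ups, so the sensitive figure itself is never requested directly.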
Structuring the order of questions so that they are logical will help to ensure the respondent's cooperation and eliminate confusion or indecision. The researcher maintains legitimacy by making sure that the respondent can comprehend the relationship between a given question and the overall purpose of the study.

What Is the Best Layout?

Good layout and physical attractiveness are crucial in mail, Internet, and other self-administered questionnaires. For different reasons, it is also important to have a good layout in questionnaires designed for personal and telephone interviews.

LAYOUT OF TRADITIONAL QUESTIONNAIRES

The layout should be neat and attractive, and the instructions for the interviewer should be clear. Money spent on improving the attractiveness and quality of the questionnaire is often money well spent. Mail questionnaires should never be overcrowded. Margins should be of decent size, white space should be used to separate blocks of print, and any unavoidable columns of multiple boxes should be kept to a minimum. Instructions printed in boldface capital letters should be easy to follow. Questionnaires should be designed to appear as brief and small as possible. Sometimes it is advisable to use a booklet form of questionnaire rather than a large number of pages stapled together.

In situations where it is necessary to conserve space on the questionnaire or to facilitate entering or tabulating the data, a multiple-grid layout may be used. In this type of layout, a question is followed by corresponding response alternatives arranged in a grid or matrix format.

Experienced researchers have found that it pays to phrase the title of the questionnaire carefully. In self-administered and mail questionnaires a carefully constructed title may by itself capture the respondent's interest, underline the importance of the research ("Nationwide Study of Blood Donors"), emphasize the interesting nature of the study ("Study of Internet Usage"), appeal to the respondent's ego ("Survey among Top Executives"), or emphasize the confidential nature of the study ("A Confidential Survey among..."). The researcher should take steps to ensure that the wording of the title will not bias the respondent in the same way that a leading question might.

When an interviewer is to administer the questionnaire, the analyst can design the questionnaire to make the job of following interconnected questions much easier by
utilizing instructions, directional arrows, special question formats, and other tricks of the trade.

SUMMARY

Many novelists write, rewrite, and revise certain chapters, paragraphs, and even sentences of their books. The research analyst lives in a similar world. Rarely does one write only a first draft of a questionnaire. Usually, the questionnaire is tried out on a group that is selected on a convenience basis and that is similar in makeup to the one that ultimately will be sampled. Researchers should select a group that is not too divergent from the actual respondents (e.g., business students as surrogates for business people), but it is not necessary to get a statistical sample for pretesting. The pretesting process allows the researcher to determine whether respondents have any difficulty understanding the questionnaire and whether there are any ambiguous or biased questions. This process is exceedingly beneficial: making a mistake with 25 or 50 subjects can avert the disaster of administering an invalid questionnaire to several hundred individuals.

KEY TERMS

· Open-ended response questions
· Fixed-alternative questions
· Leading question
· Loaded question
· Double-barreled question
· Funnel technique
· Filter question
· Pretesting
QUESTIONS

1. What is the difference between a leading question and a loaded question?
2. Design an open-ended question to measure reactions to a particular advertisement.
3. Design a complete questionnaire to evaluate job satisfaction.
4. Develop a checklist of points to consider in questionnaire construction.
- End of Chapter -

LESSON – 14
MEASUREMENT
OBJECTIVES

· To know what is to be measured
· To define operational definitions and scale measurement
· To distinguish among nominal, ordinal, interval, and ratio scales
· To understand the criteria of good measurement
· To discuss the various methods of determining reliability
· To discuss the various methods of assessing validity
STRUCTURE

· Measurement
· Scale measurement
· Types of scales
· Criteria for good measurement
MEASUREMENT

Measurement is an integral part of the modern world. We have progressed in the physical sciences to such an extent that we are now able to measure the rotation of a distant star, distances to within micro-inches, and so on; such precise physical measurement is critical today. In many business situations, however, the majority of measurements are applied to things that are much more abstract than physical properties such as time. Accurate measurement is essential for effective decision making. The purpose of this chapter is to provide a basic understanding of the measurement process and the rules needed for developing sound scale measurements.

In management research, measurement is viewed as the integrative process of determining the amount (intensity) of information about constructs, concepts or objects of interest and their relationship to a defined problem or opportunity. It is important to understand the two aspects of measurement. One is construct development, which provides the necessary, precise definitions; it begins at the stage of the research process called problem definition and in turn determines what specific data should be collected. The other is scale measurement, which determines how the information about a construct is collected. In other words, the goal of construct development is to identify and define precisely what is to be measured, including its dimensions, while the goal of scale measurement is to determine how to measure those constructs precisely.

Regardless of whether the researcher is attempting to collect primary data or secondary data, all data can be logically classified as follows.

a) State-of-Being Data
When the problem requires collecting responses that are pertinent to the physical, demographic or socioeconomic characteristics of individuals, objects or organizations, the resulting raw data are considered state-of-being data. These data represent factual characteristics that can be verified through several sources other than the person providing the information.

b) State-of-Mind Data

These represent the mental attributes of individuals that are not directly observable or available through some other external source; they exist only within the minds of people. The researcher has to ask a person to respond to the stated questions. Examples are personality traits, attitudes, feelings, perceptions, beliefs, awareness levels, preferences, images, etc.

c) State-of-Behavior Data

These represent an individual's or organization's current observable actions or reactions, or recorded past actions. A person may also be asked directly about past behavior. Such data can be checked using external secondary sources, but that is a very difficult process in terms of time, effort and accuracy.

d) State-of-Intention Data

These represent an individual's or organization's expressed plans of future behavior. Again, these data are collected by asking carefully defined questions. Like the above data, they are also very difficult to verify through external secondary sources, but verification is possible.

With this background information about the types of data that are collected, the following pages will be useful in understanding the concepts of scale measurement.

SCALE MEASUREMENT

Scale measurement can be defined as the process of assigning a set of descriptors to represent the range of possible responses to a question about a particular object or construct. Scale measurement directly determines the amount of raw data that can be ascertained from a given questioning or observation method. It attempts to assign designated degrees of intensity to the responses, which are commonly referred to as scale points. The researcher can control the amount of raw data that can be obtained from asking questions by incorporating scaling properties, or assumptions, into the scale points. There are four scaling properties that a researcher can use in developing scales, namely assignment, order, distance and origin.

1. Assignment (also referred to as the description or category property) is the researcher's employment of unique descriptors to identify each object within a set, e.g., the use of numbers, colors, or yes and no responses.
2. Order refers to the relative magnitude between the raw responses. It establishes hierarchical rank-order relationships among objects, e.g., 1st place is better than 4th place.

3. Distance is the property that expresses the exact difference between two responses. It allows the researcher and respondent to identify, understand, and accurately express the absolute difference between objects, e.g., Family A has 3 children and Family B has 6 children.

4. Origin refers to the use of a unique starting point designated as a "true zero", e.g., asking a respondent his or her weight or current age, or the market share of a specific brand.

TYPES OF SCALES

While scaling properties determine the amount of raw data that can be obtained from any scale design, all questions and scale measurements can be logically and accurately classified as one of four basic scale types: nominal, ordinal, interval or ratio. A scale may be defined as any series of items that are arranged progressively according to value or magnitude, into which an item can be placed according to its quantification. The following table represents the relationship between the types of scales and the scaling properties:

Scale type    Assignment    Order    Distance    Origin
Nominal       Yes           No       No          No
Ordinal       Yes           Yes      No          No
Interval      Yes           Yes      Yes         No
Ratio         Yes           Yes      Yes         Yes
1. Nominal Scale

In business research, nominal data are probably more widely collected than any other. The nominal scale is the simplest and most basic of the four types of scale designs. In such a scale, the numbers serve as labels to identify persons, objects or events. Nominal scales are the least powerful of the four types: they suggest no order or distance relationship and have no arithmetic origin. This scale allows the researcher only to categorize the raw responses into mutually exclusive and collectively exhaustive categories. With a nominal scale, the only permissible operation is counting the number of responses in each group. An example of a typical nominal scale in business research is the coding of males as 1 and females as 2.
Example 1: Please indicate your current marital status.
___ Married
___ Single
___ Never married
___ Widowed
Example 2: How do you classify yourself?
___ Indian
___ American
___ Asian
___ Black

2. Ordinal Scale
As the name implies, these are ranking scales. Ordinal data include the characteristics of nominal data plus an indicator of order; that is, they activate both the assignment and the order scaling properties. The researcher can rank-order the raw responses into a hierarchical pattern. The use of an ordinal scale implies a statement of "greater than" or "less than" without stating how much greater or less. Examples of ordinal data include opinion and preference scales. A typical ordinal scale in business research asks respondents to rate career opportunities, brands, companies, etc., as excellent, good, fair or poor.
Example 1: Which one of the following categories best describes your knowledge about computers?
1) Excellent
2) Good
3) Basic
4) Little
5) No knowledge
Example 2: From the list below, please indicate your top three preferences, using 1, 2 and 3 in the spaces provided:
___ By post
___ By courier
___ By telephone
___ By speed post
___ By internet
___ In person
It should also be noted that individual rankings can be combined to obtain a collective ranking for a group.

3. Interval Scales
The structure of this scale activates not only the assignment and order properties but also the distance property. With an interval scale, researchers can identify not only some type of hierarchical order among the raw data but also the specific differences between the data. The classic example of this scale is the Fahrenheit temperature scale. If the temperature is 80 degrees, it cannot be said to be twice as hot as 40 degrees, because 0 degrees does not represent the absence of temperature but simply a relative point on the Fahrenheit scale. Similarly, when an interval scale is used to measure psychological attributes, the researcher can comment on the magnitude of differences or compare average differences but cannot determine the actual strength of the attitude toward an object. Nevertheless, many attitude scales are presumed to be interval scales. Interval scales are more powerful than nominal and ordinal scales; they are also quicker for respondents to complete and convenient for the researcher.
Example 1: Into which of the following categories does your income fall?
1. Below 5,000
2. 5,000 - 10,000
3. 10,000 - 15,000
4. 15,000 - 25,000
5. Above 25,000
Example 2: Approximately how long have you lived at your current address?
1. Less than 1 year
2. 1-3 years
3. 4-6 years
4. More than 6 years

4. Ratio Scales
This is the only scale that simultaneously activates all four scaling properties. A ratio scale tends to be the most sophisticated scale in the sense that it allows the researcher not only to identify the absolute differences between scale points but also to make absolute comparisons. Examples of ratio scales are the commonly used physical dimensions such as height, weight, distance, money value and population counts. It is necessary to remember that ratio scale structures are designed to allow a "zero" or "true state of nothing" to be a valid raw response to the question. Normally, the ratio scale asks respondents to give a specific, singular numerical value as their response, regardless of whether or not a set of scale points is used. The following are examples of ratio scales:
Example 1: Please circle the number of children under 18 years of age in your household.
0
1
2
3
4
5
Example 2: In the past seven days, how many times did you go to a retail shop?
____ number of times

Mathematical and Statistical Analysis of Scales
The type of scale used in business research determines the form of statistical analysis that is permissible; certain operations can be conducted only if a scale is of a particular nature. The following shows the relationship between scale types and measures of central tendency and dispersion.
In the original table (not reproduced here), A = Appropriate, More A = More appropriate, Most A = Most appropriate, and IA = Inappropriate.

Criteria for good measurement
There are four major criteria for evaluating measurement: reliability, validity, sensitivity and practicality.

1. Reliability
Reliability refers to the extent to which a scale can reproduce the same measurement results in repeated trials. Reliability applies to a measure when similar results are obtained over time and across situations. Broadly defined, reliability is the degree to which measures are free from error and therefore yield consistent results. As discussed in the earlier chapter, error in scale measurement lowers scale reliability. Two dimensions underlie the concept of reliability: one is repeatability and the other is internal consistency.

First, the test-retest method involves administering the same scale or measure to the same respondents at two separate times to test for stability. If the measure is stable over time, the test, administered under the same conditions each time, should obtain similar results; a high correlation between the measures at time 1 and time 2 indicates a high degree of reliability.

The second dimension of reliability concerns the homogeneity of the measure. The split-half technique can be used when the measuring tool has many similar questions or statements to which subjects can respond. The instrument is administered and the items are separated into even- and odd-numbered (or randomly selected) halves. The two halves are then correlated; if the correlation is high, the instrument is said to have high internal-consistency reliability. The Spearman-Brown correction formula is used to adjust for the effect of test length and to estimate the reliability of the whole set. However, this approach may be influenced by the way in which the test is split. To overcome this, the Kuder-Richardson Formula (KR-20) and Cronbach's Coefficient Alpha are two frequently used alternatives. KR-20 is the method from which alpha was generalized and is used to estimate reliability for dichotomous items; Cronbach's alpha has the most utility for multi-item scales at the interval level of measurement.
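As a rough illustration of the internal-consistency idea, the following is a minimal sketch (in Python, with hypothetical ratings) of Cronbach's coefficient alpha computed directly from its textbook definition, alpha = (k/(k-1)) x (1 - sum of item variances / variance of total scores). In practice a statistics package would normally be used.

```python
# Minimal sketch (hypothetical 5-point ratings): Cronbach's coefficient alpha
# from its definitional formula.
def cronbach_alpha(items):
    """items: list of equal-length lists, one list of scores per scale item."""
    k = len(items)                                   # number of items
    n = len(items[0])                                # number of respondents

    def var(xs):                                     # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    sum_item_vars = sum(var(item) for item in items)
    totals = [sum(item[i] for item in items) for i in range(n)]
    return (k / (k - 1)) * (1 - sum_item_vars / var(totals))

# Three items answered by five respondents (hypothetical raw data)
ratings = [
    [4, 5, 3, 4, 2],
    [4, 4, 3, 5, 2],
    [5, 4, 2, 4, 3],
]
print(round(cronbach_alpha(ratings), 2))             # 0.86 for these data
```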
The third perspective on reliability considers how much error may be introduced by different investigators or by different samples of items; here the researcher creates two similar, yet different, scale measurements (alternative forms) for the given construct. An example of investigator-related error is the scoring of Olympic skaters by a panel of judges.

2. Validity
The purpose of measurement is to measure what is intended to be measured, but this is not as simple as it sounds. Validity is the ability of a measure to measure what it is supposed to measure; if it does not, there will be problems. There are several ways to assess validity, which are discussed below.

Face validity (or content validity) refers to the subjective agreement among professionals that a scale logically appears to reflect accurately what it is intended to measure. When it appears evident to experts that the measure provides adequate coverage of the concept, the measure has face validity.

Criterion-related validity reflects the success of measures used for prediction or estimation. Criterion validity may be classified as either concurrent validity or predictive validity, depending on the time sequence in which the new measurement scale and the criterion measure are correlated. If the new measure is taken at the same time as the criterion measure and shown to be valid, it has concurrent validity; predictive validity is established when the new measure predicts a future event. The two differ only on the basis of time.

Construct validity is established by the degree to which a measure confirms the hypotheses generated from the theory on which the concepts are based; it links the empirical evidence generated by a measure with the underlying theoretical logic. To achieve this validity, the researcher may look for convergent validity (the measure should converge with similar measures) or discriminant validity (the measure should have low correlation with measures of dissimilar concepts).

3. Sensitivity
Sensitivity is an important concept, particularly when changes in attitudes or other hypothetical constructs are under investigation. It refers to an instrument's ability to accurately measure variability in stimuli or responses. A dichotomous response category such as "agree" or "disagree" cannot capture fine gradations of attitude change, whereas a scale running from "strongly agree" through "agree", "neither agree nor disagree" and "disagree" to "strongly disagree" increases sensitivity.

4. Practicality
Practicality can be defined in terms of economy, convenience and interpretability. The scientific requirements call for the measurement process to be reliable and valid, while the
operational requirements call for it to be practical, that is, economical, convenient and interpretable.

SUMMARY
This lesson outlined the importance of measurement, dealt with the different types of scales in detail, and gave the criteria for good measurement.

KEY TERMS
· Nominal scale
· Ordinal scale
· Interval scale
· Ratio scale
· Reliability
· Split-half technique
· Spearman-Brown correction formula
· Kuder-Richardson Formula (KR-20)
· Cronbach's Coefficient Alpha
· Validity
· Face validity
· Content validity
· Criterion-related validity - concurrent validity and predictive validity
· Construct validity
· Sensitivity
· Practicality
QUESTIONS
1. What different types of data could be collected in attitude measurement?
2. Discuss the scaling properties of measurement.
3. Explain the different scales of measurement.
4. Does the statistical analysis depend on the type of scale? Explain.
5. What do you mean by a good measurement?
6. Explain the various methods of assessing reliability and validity.
- End of chapter LESSON – 15
ATTITUDE MEASUREMENT AND SCALING TECHNIQUES
OBJECTIVES
· To understand the definition of attitude
· To learn the techniques for measuring attitudes
STRUCTURE
· Techniques for measuring attitude
· Physiological measures of attitude
· Summated rating method
· Numerical scale
· Graphic rating scale
ATTITUDE DEFINED
There are many definitions of the term attitude. An attitude is usually viewed as an enduring disposition to respond consistently in a given manner to various aspects of the world, including persons, events and objects. One conception of attitude is reflected in this brief statement: "Sally loves working at Sam's. She believes it's clean, conveniently located, and has the best wages in town. She intends to work there until she retires." In this short description are the three components of attitude: the affective, the cognitive and the behavioral.

The affective component reflects an individual's general feelings or emotions toward an object. Statements such as "I love my job", "I liked that book, A Corporate Bestiary", and "I hate apple juice" reflect the emotional character of attitudes. The way one feels about a product, a person, or an object is usually tied to one's beliefs or cognitions. The cognitive component represents one's awareness of and knowledge about an object. A woman might feel happy about her job because she "believes that the pay is great" or because she knows "that my job is the biggest challenge in India." The third component of an attitude is the behavioral component. Intentions and behavioral expectations are included in this component, which therefore reflects a predisposition to action.

Techniques for Measuring Attitudes
A remarkable variety of techniques have been devised to measure attitudes. In part, this diversity stems from the lack of consensus about the exact definition of the concept. Further, the affective, cognitive, and behavioral components of an attitude may be measured by different means. For example, sympathetic nervous system responses may
be recorded using physiological measures to assess affect, but they are not good measures of behavioral intentions. Direct verbal statements concerning affect, belief, or behavior are used to measure the corresponding components. Attitudes may also be measured indirectly using qualitative, exploratory techniques.

Obtaining verbal statements from respondents generally requires that the respondent perform a task such as ranking, rating, sorting, or making a choice or a comparison. A ranking task requires the respondents to rank-order a small number of items on the basis of overall preference or some characteristic of the stimulus. Rating asks the respondents to estimate the magnitude of a characteristic or quality that an object possesses. Quantitative scores, along a continuum supplied to the respondents, are used to estimate the strength of the attitude or belief; in other words, the respondents indicate the position on a scale at which they would rate the object. A sorting technique might present respondents with several product concepts, printed on cards, and require that they arrange the cards into a number of piles or otherwise classify the product concepts. The choice technique, choosing one of two or more alternatives, is another type of attitude measurement: if a respondent chooses one object over another, the researcher can assume that the respondent prefers the chosen object. The most popular techniques for measuring attitudes are presented in this chapter.

Physiological Measures of Attitudes
Measures of galvanic skin response, blood pressure, pupil dilation and other physiological responses may be used to assess the affective component of attitudes. They provide a means of measuring attitudes without verbally questioning the respondent. In general, they can provide a gross measure of like or dislike, but they are not sensitive enough to identify gradients of an attitude.

Attitude Rating Scales
Using rating scales to measure attitudes is perhaps the most common practice in business research. This section discusses many rating scales designed to enable respondents to report the intensity of their attitudes.

Simple Attitude Scales
In its most basic form, attitude scaling requires that an individual agree or disagree with a statement or respond to a single question. For example, respondents in a political poll may be asked whether they agree or disagree with the statement "The president should run for re-election", or an individual might be asked to indicate whether he likes or dislikes labor unions. Because this type of self-rating scale merely classifies respondents into one of two categories, it has only the properties of a nominal scale, which limits the type of mathematical analysis that may be used. Despite these disadvantages, simple attitude scaling may be used when
questionnaires are extremely long, when respondents have little education, or for other specific reasons.

Most attitude theorists believe that attitudes vary along continua. An early attitude researcher pioneered the view that the task of attitude scaling is to measure the distance from "good" to "bad", "low" to "high", "like" to "dislike", and so on. Thus the purpose of an attitude scale is to find an individual's position on the continuum. Simple scales do not allow for fine distinctions in attitudes; several scales have been developed to make more precise measurements.

Category Scales
Some rating scales have only two response categories: agree and disagree. Expanding the response categories gives the respondent more flexibility in the rating task. Even more information is provided if the categories are ordered according to a descriptive or evaluative dimension. Consider the question below:
How often is your supervisor courteous and friendly to you?
___ Never
___ Rarely
___ Often
___ Very often
Each of these category scales is a more sensitive measure than a scale with only two response categories, and each provides more information. Wording is an extremely important factor in the usefulness of these scales. Exhibit 14.1 shows some common wordings for category scales.
Summated Ratings Method: The Likert Scale
Business researchers' adaptation of the summated ratings method, developed by Rensis Likert, is extremely popular for measuring attitudes because the method is simple to administer. With the Likert scale, respondents indicate their attitudes by checking how strongly they agree or disagree with carefully constructed statements that range from very positive to very negative toward the attitudinal object. Individuals generally choose from five alternatives: strongly agree, agree, uncertain, disagree, and strongly disagree; but the number of alternatives may range from three to nine. Consider the following example from a study on mergers and acquisitions:
Mergers and acquisitions provide a faster means of growth than internal expansion.
Strongly Disagree (1)   Disagree (2)   Uncertain (3)   Agree (4)   Strongly Agree (5)
To measure the attitude, researchers assign scores or weights to the alternative responses. In this example, weights of 5, 4, 3, 2, and 1 are assigned to the answers (the weights, shown in parentheses, would not be printed on the questionnaire). Because the example statement is positive toward the attitudinal object, strong agreement indicates the most favorable attitude and is assigned a weight of 5. If a statement negative toward the object (such as "Your access to copy machines is limited") were used, the weights would be reversed and "strongly disagree" would be assigned the weight of 5.

A single scale item on a summated rating scale is an ordinal scale. A Likert scale may include several scale items to form an index, each statement being assumed to represent an aspect of a common attitudinal domain. For example, Exhibit 14.2 shows the items in a Likert scale to measure attitudes toward a management by objectives program. The total score is the summation of the weights assigned to an individual's responses. For example:

Here are some statements that describe how employees might feel about MBO (management by objectives) as a form of management. Please indicate your agreement or disagreement with each statement by encircling the appropriate number:
1 - Strongly Agree
2 – Agree
3 – Neutral
4 – Disagree
5 - Strongly Disagree
Circle one and only one answer for each statement. There are no right or wrong answers to these questions:
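The statements of Exhibit 14.2 are not reproduced here. As an illustration only, the following is a minimal sketch (in Python, with hypothetical MBO statements and answers) of how a summated score is obtained: positively worded items are scored so that agreement earns the high weight, while negatively worded items are reverse-scored before the weights are added.

```python
# Minimal sketch (hypothetical statements and answers): summated-rating (Likert)
# scoring.  On the questionnaire, 1 = strongly agree ... 5 = strongly disagree.
responses = {
    "MBO clarifies my work goals": 1,               # hypothetical, positively worded
    "MBO creates unnecessary paperwork": 4,         # hypothetical, negatively worded
    "MBO improves communication with my boss": 2,   # hypothetical, positively worded
}
negatively_worded = {"MBO creates unnecessary paperwork"}

def item_score(statement, answer):
    if statement in negatively_worded:
        return answer          # disagreeing with a negative statement is favourable
    return 6 - answer          # agreeing with a positive statement is favourable

total = sum(item_score(s, a) for s, a in responses.items())
print("Summated attitude score:", total)            # 13 out of a possible 15
```

With these hypothetical answers the summated score is 13 out of a possible 15, indicating a generally favourable attitude toward MBO.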
In Likert's original procedure, a large number of statements are generated and then an item analysis is performed. The purpose of the item analysis is to ensure that the final items evoke a wide response and discriminate among those with positive and negative attitudes. Items that are poor because they lack clarity or elicit mixed response patterns are eliminated from the final statement list. However, many business researchers do not follow the exact procedure prescribed by Likert. Hence, a disadvantage of the Likert-type summated rating method is that it is difficult to know what a single summated score means: many patterns of response to the various statements can produce the same total score, so identical total scores may reflect different "attitudes" because respondents endorsed different combinations of statements.

Semantic Differential
The semantic differential is a series of attitude scales. This popular attitude-measurement technique consists of presenting an identification of a company, product, brand, job, or other concept, followed by a series of seven-point bipolar rating scales. Bipolar adjectives, such as "good" and "bad", "modern" and "old-fashioned", or "clean" and "dirty", anchor the beginning and end (or poles) of the scale.
Modern _____ : _____ : _____ : _____ : _____ : _____ : _____ Old-Fashioned
The subject makes repeated judgments of the concept under investigation on each of the scales. The scoring of the semantic differential can be illustrated with the scale bounded by the anchors "modern" and "old-fashioned." Respondents are instructed to check the place that indicates the nearest appropriate adjective. From left to right, the scale intervals are interpreted as extremely modern, very modern, slightly modern, both modern and old-fashioned, slightly old-fashioned, very old-fashioned, and extremely old-fashioned. A weight is assigned to each position on the rating scale. Traditionally, scores are 7, 6, 5, 4, 3, 2, 1, or +3, +2, +1, 0, -1, -2, -3.
Many researchers find it desirable to assume that the semantic differential provides interval data. This assumption, although widely accepted, has its critics, who argue that the data have only ordinal properties because the weights are arbitrary. Depending on whether the data are assumed to be interval or ordinal, the arithmetic mean or the median is used to plot the profile of one concept, product or unit compared with another. The semantic differential technique was originally developed by Charles Osgood and others as a method for measuring the meaning of objects or the "semantic space" of interpersonal experience. Business researchers have found the semantic differential versatile and have modified it for business applications.

Numerical Scales
Numerical scales have numbers, rather than "semantic space" or verbal descriptions, as response options to identify categories (response positions). If the scale items have five response positions, the scale is called a 5-point numerical scale; with seven response positions, it is called a 7-point numerical scale, and so on. Consider the following numerical scale:
Now that you've had your automobile for about one year, please tell us how satisfied you are with your Ford Ikon:
Extremely Satisfied   7   6   5   4   3   2   1   Extremely Dissatisfied
This numerical scale uses bipolar adjectives in the same manner as the semantic differential.

Constant-Sum Scale
Suppose a parcel service company wishes to determine the importance of the attributes of accurate invoicing, delivery as promised, and price to organizations that use its service in business-to-business marketing. Respondents might be asked to divide a constant sum to indicate the relative importance of the attributes. For example:
Divide 100 points among the following characteristics of a delivery service according to how important each characteristic is to you when selecting a delivery company.
Accurate invoicing ___
Delivery as promised ___
Lower price ___
The constant-sum scale works best with respondents who have high educational levels. If respondents follow the instructions correctly, the results approximate interval measures. As in the paired-comparison method, this technique becomes more complex as the number of stimuli increases.

Stapel Scale
The Stapel scale was originally developed in the 1950s to measure the direction and intensity of an attitude simultaneously. Modern versions of the scale use a single adjective as a substitute for the semantic differential when it is difficult to create pairs of bipolar adjectives. The modified Stapel scale places a single adjective in the center of an even number of numerical values (for example, ranging from +3 to -3). It measures how close to or how distant from the adjective a given stimulus is perceived to be. The advantages and disadvantages of the Stapel scale are very similar to those of the semantic differential. However, the Stapel scale is markedly easier to administer, especially over the telephone, and because it does not require bipolar adjectives it is easier to construct. Research comparing the semantic differential with the Stapel scale indicates that results from the two techniques are largely the same.

Graphic Rating Scale
A graphic rating scale presents respondents with a graphic continuum. The respondents are allowed to choose any point on the continuum to indicate their attitudes. Typically, a respondent's score is determined by measuring the length (in millimeters) from one end of the graphic continuum to the point marked by the respondent. Many researchers believe that scoring in this manner strengthens the assumption that graphic rating scales of this type are interval scales. Alternatively, the researcher may divide the line into predetermined scoring categories (lengths) and record respondents' marks accordingly. In other words, the graphic rating scale has the advantage of allowing the researcher to choose any interval desired for scoring purposes. Its disadvantage is that there are no standard answers.

Thurstone Equal-Appearing Interval Scale
In 1927, Louis Thurstone, an early pioneer in attitude research, developed the concept that attitudes vary along continua and should be measured accordingly. Construction of a Thurstone scale is a rather complex process that requires two stages. The first stage is a ranking operation, performed by judges who assign scale values to attitudinal statements. The second stage consists of asking subjects to respond to the attitudinal statements. The Thurstone method is time-consuming and costly; it is of historical value but is rarely used in applied business research today.
Scales Measuring Behavioral Intentions and Expectations
The behavioral component of an attitude involves the behavioral expectations of an individual toward an attitudinal object. Typically, this represents an intention or a tendency to seek additional information. Category scales that measure the behavioral component of an attitude attempt to determine a respondent's likelihood of action or intention to perform some future action, as in the following examples:
How likely is it that you will change jobs in the next six months?
· I definitely will change.
· I probably will change.
· I might change.
· I probably will not change.
· I definitely will not change.
I would write a letter to my congressman or other government official in support of this company if it were in a dispute with the government.
· Extremely likely
· Very likely
· Somewhat likely
· Likely, about a 50-50 chance
· Somewhat unlikely
· Very unlikely
· Extremely unlikely
Behavioral Differential
A general instrument, the behavioral differential, has been developed to measure the behavioral intentions of subjects toward an object or category of objects. As in the semantic differential, a description of the object to be judged is placed at the top of a sheet, and the subjects indicate their behavioral intentions toward this object on a series of scales. For example, one item might be:
A 25-year-old female commodity broker
Would ____ : ____ : ____ : ____ : ____ : ____ Would not
ask this person for advice.

Ranking
People often rank-order their preferences. An ordinal scale may be developed by asking respondents to rank order (from most preferred to least preferred) a set of objects or attributes. It is not difficult for respondents to understand the task of rank ordering the importance of fringe benefits or arranging a set of job tasks according to preference.

Paired Comparisons
The following question is a typical format for asking about paired comparisons.
I would like to know your overall opinion of two brands of adhesive bandages, Curad and Band-Aid. Overall, which of these two brands - Curad or Band-Aid - do you think is the better one? Or are both the same?
Curad is better ____
Band-Aid is better ____
They are the same ____
Ranking objects with respect to one attribute is not difficult if only a few concepts or items are compared. As the number of items increases, however, the number of comparisons grows rapidly (see the sketch below). If the number of comparisons is too great, respondents may become fatigued and no longer carefully discriminate among them.
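As a quick illustration of how the burden grows, the number of paired comparisons for n objects is n(n - 1)/2; the following minimal sketch (Python) tabulates it for a few values of n.

```python
# Minimal sketch: the number of paired comparisons grows as n(n-1)/2,
# which is why the method becomes tiring when many objects are compared.
def n_pairs(n):
    return n * (n - 1) // 2

for n in (3, 5, 10, 20):
    print(n, "objects ->", n_pairs(n), "comparisons")   # 3, 10, 45, 190
```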
Sorting
Sorting tasks require respondents to indicate their attitudes or beliefs by arranging items.

SUMMARY
This chapter described the concept of attitude and the techniques for measuring it.

KEY TERMS
· Attitude
· Affective component
· Cognitive component
· Behavioral component
· Ranking
· Rating
· Category scale
· Likert scale
· Semantic differential scale
· Numerical scale
· Constant sum scale
· Stapel scale
· Graphic rating scale
· Paired comparison
QUESTIONS
1. What is an attitude?
2. Distinguish between rating and ranking. Which is the better approach to attitude measurement? Why?
3. Describe the different methods of scale construction, pointing out the merits and demerits of each.
4. What advantages do numerical scales have over semantic differential scales?
- End of chapter LESSON – 16 STATISTICAL TECHNIQUES
OBJECTIVES
· Know the nature of Statistical study
· Recognize the importance of Statistics as also its limitations
· Differentiate descriptive Statistics from inferential Statistics.
STRUCTURE
· Major characteristics of statistics
· Descriptive statistics
· Inferential statistics
· Central tendency of data
· Uses of different averages
· Types of frequency distribution
· Measures of dispersion
INTRODUCTION
Business researchers edit and code data to provide input that results in tabulated information that will answer the research questions. With this input, the results can be produced statistically and logically. A sound grasp of Statistics is important if the quantitative data are to serve their purpose: if Statistics, as a subject, is inadequate and consists of poor methodology, we would not know the right procedure for extracting from the data the information they contain. On the other hand, if our figures are defective in the sense that they are inadequate or inaccurate, we would not reach the right conclusions even though our subject is well developed. With this brief introduction, let us first see how Statistics has been defined.

Major characteristics of statistics:
1. Statistics are aggregates of facts. A single figure is not Statistics; for example, the national income of a country for a single year is not Statistics, but the same for two or more years is.
2. Statistics are affected by a number of factors. For example, the sale of a product depends on a number of factors such as its price, quality, competition, the income of the consumers, and so on.
3. Statistics must be reasonably accurate. Wrong figures, if analyzed, will lead to erroneous conclusions; hence conclusions must be based on accurate figures.
4. Statistics must be collected in a systematic manner. If data are collected in a haphazard manner, they will not be reliable and will lead to misleading conclusions.
5. Finally, Statistics should be placed in relation to each other. If one collects data unrelated to one another, such data will be confusing and will not lead to any logical conclusions. Data should be comparable over time and over space.

Subdivisions in Statistics
Statisticians commonly classify the subject into two broad categories: descriptive statistics and inferential statistics.

1. Descriptive Statistics
As the name suggests, descriptive statistics includes any treatment designed to describe or summarize the given data, bringing out their important features. These statistics do not go beyond this; no attempt is made to infer anything that pertains to more than the data themselves. Thus, if someone compiles the necessary data and reports that during the financial year 2000-2001 there were 1,500 public limited companies in India, of which 1,215 earned profits and the remaining 285 sustained losses, the study belongs to the domain of descriptive Statistics. He may further calculate the average profit earned per company as well as the average loss sustained per company; this set of calculations, too, is a part of descriptive statistics. The methods used in descriptive statistics may be called descriptive methods; under them we study frequency distributions, measures of central tendency (that is, averages), and measures of dispersion and skewness.

2. Inferential Statistics
Although descriptive Statistics is an important branch of Statistics and continues to be so, its recent growth indicates a shift in emphasis towards the methods of Statistical inference. A few examples may be given here. The methods of Statistical inference are required to predict the demand for a product such as tea or coffee for a company for a specified year or years. Inferential Statistics is also necessary when comparing the effectiveness of a given medicine in the treatment of a disease. Again, when determining the nature and extent of the relationship between two or more variables, such as the number of hours studied by students and their performance in examinations, one has to take recourse to inferential Statistics. Each of these examples is subject to uncertainty on account of partial, incomplete, or indirect information. In such cases, the Statistician has to judge the merits of all possible alternatives in order to make the most realistic prediction, to suggest the most effective medicine, or to establish a dependable relationship and the reasons for it. In this text, we shall first discuss various aspects of descriptive Statistics, followed by a discussion of different topics in inferential Statistics; the latter will understandably be far more comprehensive than the former.

CENTRAL TENDENCY OF DATA
In many frequency distributions, the tabulated values show small frequencies at the beginning and at the end and very high frequencies in the middle of the distribution. This indicates that the typical values of the variable lie near the central part of the distribution and that the other values cluster around these central values. This behavior of the data, the concentration of values in the central part of the distribution, is called the central tendency of the data. We measure this central tendency with the help of mathematical quantities. A central value which enables us to comprehend in a single effort the significance of the whole is known as a Statistical Average, or simply an Average. In fact, an average of a statistical series is the value of the variable which is representative of the entire distribution and therefore gives a measure of central tendency.

Measures of Central Tendency
There are three common measures of central tendency:
I. Mean
II. Median
III. Mode
The most common and useful measure of central tendency is, however, the Mean. In the following articles the methods of calculating the various measures of central tendency will be discussed. In all such discussions we need a very useful notation known as Summation.

Choice of a Suitable Average
The different statistical averages have different characteristics; there is no all-purpose average. The choice of a particular average is usually determined by the purpose of the investigation. Within the framework of descriptive statistics, the main requirement is to know what each average means and then select the one that fulfils the purpose at hand. The nature of the distribution also determines the type of average to be used. Generally, the following points should be kept in mind while choosing an average:

1. Object
The average should be chosen according to the object of the enquiry. If all the values in a series are to be given equal importance, the arithmetic mean will be a suitable choice. To determine the most stylish or most frequently occurring item, the mode should be found. If the object is to determine an average that indicates its position or ranking in relation to all the values, the median should be the choice. If small items are to be given greater importance than large items, the geometric mean is the best choice.

2. Representativeness
The average chosen should be such that it represents the basic characteristics of the distribution.

3. Nature and form of the data
If the frequency distribution is symmetrical or nearly symmetrical, the mean, median or mode may be used almost interchangeably. If there are open-end class intervals, the mean cannot be calculated definitely. In a closed frequency distribution with unequal class intervals, it is impossible to determine the mode accurately, and if there are only a few values it may not be possible to determine the mode at all. The mean will not give a representative picture if there are a few extremely large or small values at either end of the array while the great majority of the values concentrate around a narrow band. For a variable of non-continuous type, the median or mode may give a value that actually exists in the data.

Davis' Test: The arithmetic mean is considered an appropriate average for data which have a symmetrical distribution, or even a moderate degree of asymmetry. Prof. George Davis has devised a test which is:
If this coefficient works out to be more than +0.20, the distribution is symmetrical enough to use the arithmetic mean.

4. Characteristics of the Average
While choosing a suitable average for a purpose, the merits and demerits of the various averages should always be considered, and the average which best fits the purpose should be preferred over the others. The following points should be given due consideration in the process of selecting an average:
(i) In certain commonly encountered applications, the mean is subject to less sampling variability than the median or mode.
(ii) Given only the original observations, the median is sometimes the easiest to calculate. When there is no strong advantage for the mean, this alone may be enough to indicate the use of the median.
(iii) Once a frequency distribution has been formed, the mode and the median are more quickly calculated than the mean. Moreover, when some classes are open-ended, the mean cannot be calculated from the frequency distribution.
(iv) The median is not a good measure when there are very few possible values for the observations, as with the number of children or the size of a family.
(v) The mode and the median are relatively little affected by 'extreme' observations.
(vi) Calculation of the geometric mean and harmonic mean is difficult, as it involves the knowledge of logarithms and reciprocals.
Hence "the justification of employing them (averages) must be determined by an appeal to all the facts and in the light of the peculiar characteristics of the different types".

Uses of Different Averages
Different averages, because of their inherent characteristics, are appropriate in different circumstances; their use is guided by the purpose at hand or by the circumstances in which one finds oneself. A brief discussion of the uses of the different statistical averages follows:

1. Arithmetic Average
The arithmetic average is used in the study of social, economic and commercial problems like production, income, price, imports, exports, etc. The central tendency of these phenomena can best be studied by taking an arithmetic average. Whenever we talk of an 'average income', 'average production' or 'average price', we always mean the arithmetic average of these things, and whenever there is no indication of the type of average to be used, the arithmetic average is computed.

2. Weighted Arithmetic Average
When it is desirable to give relative importance to the different items of a series, the weighted arithmetic average is computed. If it is desired to compute the per capita consumption of a family, due weights should be assigned to children, males and females. This average is also useful in constructing index numbers. The weighted average should be used in the following cases:
a) When it is desired to have an average of a whole group which is divided into a number of sub-classes widely divergent from each other.
b) When the items falling in the various sub-classes change in such a way that the proportion which the items bear among themselves also changes.
c) When a combined average has to be computed.
d) When it is desired to find an average of ratios, percentages or rates.

3. Median
The median is especially applicable to cases which are not capable of precise quantitative study, such as intelligence, honesty, etc. It is less applicable in economic or business statistics, because there is a lack of stability in such data.
4. Mode
The utility of the mode is being appreciated more and more. In the sciences of Biology and Meteorology it has been found to be of great value, and in commerce and industry it is gaining importance. Whenever a shopkeeper wants to stock the goods he sells, he looks to the modal size of those goods; the modal size of shoes, for instance, is of great importance to a businessman dealing in ready-made garments or shoes. Many problems of production are related to the mode, and many business establishments keep statistics of their sales to ascertain the particulars of the modal articles sold.

5. Geometric Mean
The geometric mean can be used to advantage in the construction of index numbers. It makes the index numbers reversible and gives equal weight to equal ratios of change. This average is also useful in measuring the growth of population, because population increases in geometric progression. When there is wide dispersion in a series, the geometric mean is a useful average.

6. Harmonic Mean
This average is useful in cases where time, rates and prices are involved. When it is desired to give the largest weight to the smallest item, this average is used.

Summation Notation (∑)
The symbol ∑ (read: sigma) means summation. If x1, x2, x3, …, xn be the n values of a variable x, then their sum x1 + x2 + x3 + … + xn is written in short as ∑x (or ∑xi, i = 1 to n).
Similarly, the sum w1x1 + w2x2 + … + wnxn is denoted by ∑wx (or ∑wixi).
Some important results
There are three types of mean:
1. Arithmetic Mean (AM)
2. Geometric Mean (GM)
3. Harmonic Mean (HM)
Of the three means, the Arithmetic Mean is the most commonly used. In fact, if no specific mention is made, by Mean we shall always refer to the Arithmetic Mean (AM) and calculate accordingly.

Simple Arithmetic Mean
Definition: The Arithmetic Mean (x̄) of a given series of values, say x1, x2, x3, …, xn, is defined as the sum of these values divided by their total number; thus
x̄ = (x1 + x2 + … + xn) / n = ∑x / n
Example 1: Find the AM of 3, 6, 24, and 48. Solution: AM = (3 + 6 + 24 + 48) / 4 = 81/4 = 20.25
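As a quick check, the following minimal sketch (in Python) computes the arithmetic, geometric and harmonic means of the same four numbers; for positive values AM ≥ GM ≥ HM always holds.

```python
# Minimal sketch: the three means for the numbers used in Example 1.
from math import prod

values = [3, 6, 24, 48]
n = len(values)

am = sum(values) / n                          # arithmetic mean
gm = prod(values) ** (1 / n)                  # geometric mean
hm = n / sum(1 / x for x in values)           # harmonic mean

print(am, round(gm, 2), round(hm, 2))         # 20.25  12.0  7.11
```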
Weighted Arithmetic Mean
Definition: Let x1, x2, x3, …, xn be n values of a variable x, and let f1, f2, …, fn be their respective weights (or frequencies); then the weighted arithmetic mean is defined by
x̄ = ∑fx / N
where N = ∑f = total frequency.

SKEWNESS
A frequency distribution is said to be symmetrical when values of the variable equidistant from the mean have equal frequencies. If a frequency distribution is not symmetrical, it is said to be asymmetrical or skewed; any deviation from symmetry is called skewness. In the words of Riggleman and Frisbee: "Skewness is the lack of symmetry. When a frequency distribution is plotted on a chart, skewness present in the items tends to be dispersed more on one side of the mean than on the other."

Skewness may be positive or negative. A distribution is said to be positively skewed if the frequency curve has a longer tail towards the higher values of x, i.e., if the frequency curve gradually slopes down towards the high values of x. For a positively skewed distribution,
Mean (M) > Median (Me) > Mode (Mo)
A distribution is said to be negatively skewed if the frequency curve has a longer tail towards the lower values of x. For a negatively skewed distribution,
Mean (M) < Median (Me) < Mode (Mo)
For a symmetrical distribution, Mean (M) = Median (Me) = Mode (Mo)
Measures of Skewness
The degree of skewness is measured by its coefficient. The common measures of skewness are:
1. Pearson's first measure
Skewness = (Mean – Mode) / Standard Deviation
2. Pearson's second measure
Skewness = 3 x (Mean – Median) / Standard Deviation
3. Bowley's Measure
Skewness = (Q3 + Q1 – 2Q2) / (Q3 – Q1)
where Q1, Q2, Q3 are the first, second and third quartiles respectively.
4. Moment Measure
Skewness = m3 / σ³ = m3 / m2^(3/2)
where m2 and m3 are the second and third central moments and σ is the S.D. All four measures of skewness defined above are independent of the units of measurement.
Example: Calculate Pearson's measure of skewness on the basis of the mean, mode, and standard deviation.
Solution: According to Pearson's first measure, Skewness = (Mean – Mode) / Standard Deviation. Here the mid-values of the class intervals are given. Assuming a continuous series, we construct the following table:
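The worked table of the original is not reproduced here. As an illustration of the same calculation, the following is a minimal sketch (in Python) that applies Pearson's first measure to a hypothetical grouped distribution, taking the mid-value of the most frequent class as a rough mode.

```python
# Minimal sketch (hypothetical grouped data): Pearson's first measure of
# skewness = (mean - mode) / standard deviation.
from math import sqrt

mid_values = [10, 20, 30, 40, 50]        # mid-points of the class intervals
freqs      = [ 4, 18, 12,  7,  4]        # frequencies

n    = sum(freqs)
mean = sum(m * f for m, f in zip(mid_values, freqs)) / n
mode = mid_values[freqs.index(max(freqs))]            # crude modal mid-value
var  = sum(f * (m - mean) ** 2 for m, f in zip(mid_values, freqs)) / n
sd   = sqrt(var)

skewness = (mean - mode) / sd
print(round(mean, 2), mode, round(sd, 2), round(skewness, 3))   # ~27.56 20 10.99 0.688
```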
Types of Frequency Distributions
In general, frequency distributions that form a balanced pattern are called symmetrical distributions, and those that form an unbalanced pattern are called skewed or asymmetrical distributions. In a symmetrical distribution, frequencies go on increasing up to a point and then begin to decrease in the same fashion. A special kind of symmetrical distribution is the normal distribution; the pattern formed is not only symmetrical but also has the shape of a bell. In a symmetrical distribution, the mean, median and mode coincide and lie at the centre of the distribution. As the distribution departs from symmetry, these three values are pulled apart. A distribution in which more than half of the area under the curve is to the right side of the mode is a positively skewed distribution; its right tail is longer than its left tail. Under such a distribution the mean is
greater than the median, and the median is greater than the mode (M > Me > Mo), and the difference between the upper quartile and the median is greater than the difference between the median and the lower quartile (Q3 – Me > Me – Q1). In a negatively skewed distribution, more than half of the area under the distribution curve is to the left side of the mode. In such a distribution the elongated tail is to the left, the mean is less than the median and the median is less than the mode (M < Me < Mo), and the difference between the upper quartile and the median is less than the difference between the median and the lower quartile (Q3 – Me < Me – Q1). The positions of the averages in the various distributions can also be shown in figures and in tabular form.
Test of Skewness
In order to find out whether a distribution is symmetrical or skewed, the following facts should be noticed:
1. Relationship between averages
If in a distribution the mean, median and mode are not identical, then it is a skewed distribution. The greater the difference between the mean and the mode, the greater the skewness in the distribution.
2. Total of deviations
If the sum of the positive deviations from the median or mode is equal to the sum of the negative deviations, there is no skewness in the distribution. The extent of the difference between the sums of positive and negative deviations from the median or mode determines the extent of skewness in the data.
3. The distance of partition values from the median
In a symmetrical distribution Q1 and Q3, D1 and D9, and P10 and P90 are equidistant from the median. In an asymmetrical distribution they are not.
4. The frequencies on either side of the mode
In an asymmetrical distribution, the frequencies on either side of the mode are not equal.
5. The curve
When the data are plotted on graph paper, the curve will not be bell-shaped, or, when cut along a vertical line through the centre, the two halves will not be identical. Conversely stated, in the absence of skewness in the distribution:
(i) The values of the mean, median and mode will coincide.
(ii) The sum of the positive deviations from the median or mode will be equal to the sum of the negative deviations.
(iii) The two quartiles, deciles one and nine, and percentiles ten and ninety will be equidistant from the median.
(iv) Frequencies on either side of the mode will be equal.
(v) Data, when plotted on graph paper, will take a bell-shaped form.

Measures of Skewness
To find out the direction and extent of asymmetry in a series, statistical measures of skewness are calculated; these measures can be absolute or relative. Absolute measures of skewness tell us the extent of the asymmetry and whether it is positive or negative. Absolute skewness can be found by taking the difference between the mean and the mode. Symbolically,
Absolute SK = X̄ – Mo
If the value of the mean is greater than the mode (M > Mo), skewness will be positive; if the value of the mean is less than the mode (M < Mo), skewness will be negative. The greater the amount of skewness, the more the mean and mode differ because of the influence of extreme items. The reason why the difference between the mean and the mode is taken as the measure of skewness is that in a symmetrical distribution both values, along with the median, coincide, but in an asymmetrical distribution there is a difference between the mean and the mode. Thus the difference between the mean and the mode, whether positive or negative, indicates that the distribution is asymmetrical. However, such an absolute measure of skewness is unsatisfactory, because:
(1) It cannot be used to compare the skewness of two distributions expressed in different units, because the difference between the mean and the mode will be in the units of the distribution.
(2) The difference between the mean and the mode may be more in one series and less in another, yet the frequency curves of the two distributions may be similarly skewed.
For comparison, the absolute measures of skewness are converted to relative measures, which are called coefficients of skewness. There are four measures of relative skewness:
1. The Karl Pearson's Coefficient of Skewness
2. The Bowley's Coefficient of Skewness
3. The Kelly's Coefficient of Skewness
4. The measure of skewness based on moments.

Measures of Relative Skewness
1. The Karl Pearson's Coefficient of Skewness
Karl Pearson has given a formula for the relative measure of skewness, known as Karl Pearson's Coefficient of Skewness or the Pearsonian Coefficient of Skewness. The formula is the difference between the mean and the mode divided by the standard deviation. The coefficient, usually denoted Sk, is:
Sk = (Mean – Mode) / Standard Deviation
If in a particular frequency distribution the mode is ill-defined, the coefficient of skewness can be determined by the following modified formula:
Sk = 3 x (Mean – Median) / Standard Deviation
This is based on the relationship between the different averages in a moderately asymmetrical distribution. In such a distribution:
Mode = 3 Median – 2 Mean
The Pearsonian coefficient of skewness has the interesting characteristic that it will be positive when the mean is larger than the mode or median, and negative when the arithmetic mean is smaller than the mode or median. In a symmetrical distribution, the value of the Pearsonian coefficient of skewness will be zero. There is no rigid theoretical limit to this measure; in practice, however, the value given by this formula is rarely very high and usually lies between +1 and -1. The direction of the skewness is given by the algebraic sign of the measure: if it is plus, the skewness is positive; if it is minus, the skewness is negative. The degree of skewness is given by the numerical value, such as 0.9, 0.4, etc. Thus this formula gives both the direction and the degree of skewness. There is another relative measure of skewness, also based on the position of the averages, in which the difference between two averages is divided by the mean deviation. The formula is:
These formulas are not much used in practice because of the demerits of the mean deviation.

DISPERSION
Measures of Dispersion
An average may give a good idea of the type of data, but it alone cannot reveal all the characteristics of the data; it cannot tell us in what manner the values of the variable are scattered or dispersed about the average.

Meaning of Dispersion
The variation, scattering or deviation of the different values of a variable from their average is known as dispersion. Dispersion indicates the extent to which the values vary among themselves. Prof. W.I. King defines the term 'dispersion' as being used to indicate
the fact that within a given group the items differ from one another in size; in other words, there is a lack of uniformity in their sizes. The extent of variability in a given set of data is measured by comparing the individual values of the variable with the average of all the values and then calculating the average of all the individual differences.

Objectives of Measuring Variation
1. To serve as a basis for control of the variability itself.
2. To gauge the reliability of an average.

Types of Measures of Dispersion
There are two types of measures of dispersion. The first, which may be referred to as distance measures, describes the spread of data in terms of the distance between the values of selected observations. The second are those which are expressed in terms of an average deviation from some measure of central tendency.

Absolute and Relative Measures of Dispersion
Measures of absolute dispersion are in the same units as the data whose scatter they measure. For example, the dispersion of salaries about an average is measured in rupees, and the variation of the time required for workers to do a job is measured in minutes or hours. Measures of absolute dispersion cannot be used to compare the scatter in one distribution with that in another when the averages of the distributions differ in size or the units of measure differ in kind. Measures of relative dispersion express the scatter as a percentage or coefficient of the absolute measure of dispersion; they are generally used to compare the scatter in one distribution with the scatter in another. A relative measure of dispersion is called a coefficient of dispersion.

Methods of Measuring Dispersion
There are two meanings of dispersion, as explained above, and on the basis of these two meanings there are two mathematical methods of finding dispersion, i.e., methods of limits and methods of moments. Dispersion can also be studied graphically. Thus, the following are the methods of measuring dispersion:
I. Numerical Methods
1. Methods of Limits
i. The Range
ii. The Inter-Quartile Range
iii. The Percentile Range
2. Methods of Moments
i. The first moment of dispersion, or mean deviation
ii. The second moment of dispersion, from which the standard deviation is computed
iii. The third moment of dispersion
3. Quartile Deviation
II. Graphic Method
Lorenz Curve

Range
The simplest measure of dispersion is the range of the data. The range is determined by the two extreme values of the observations; it is simply the difference between the largest and the smallest value in a distribution. Symbolically,
Range (R) = Largest value (L) – Smallest value (S)
Coefficient of Range (CR) = (Largest value – Smallest value) / (Largest value + Smallest value)
Quartile Deviation or Semi-Interquartile Range
Definition: The Quartile Deviation (Q) is an absolute measure of dispersion and is defined by the formula:
Q = (Q3 – Q1) / 2
where Q1 and Q3 are the first (or lower) and the third (or upper) quartiles respectively. Here Q3 – Q1 is the interquartile range, and hence the quartile deviation is also called the semi-interquartile range. As it is based only on Q1 and Q3, it does not take into account the variability of all the values and hence is not very much used for practical purposes.
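The calculation can also be done programmatically. The sketch below (Python) applies the usual linear-interpolation rule for quartiles of a grouped distribution to the same data as the worked example that follows.

```python
# Minimal sketch: Q1, Q3 and the quartile deviation by simple linear
# interpolation within a grouped frequency distribution.
def quartile(boundaries, freqs, which):
    """which = 1 for Q1, 3 for Q3; boundaries = (lower, upper) class limits."""
    n = sum(freqs)
    target = which * n / 4
    cum = 0
    for (lower, upper), f in zip(boundaries, freqs):
        if cum + f >= target:
            return lower + (target - cum) / f * (upper - lower)
        cum += f

classes = [(10, 15), (15, 20), (20, 25), (25, 30), (30, 35)]
workers = [6, 12, 18, 10, 4]

q1 = quartile(classes, workers, 1)
q3 = quartile(classes, workers, 3)
print(round(q1, 2), round(q3, 2), round((q3 - q1) / 2, 2))   # 17.71  25.75  4.02
```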
Example: Find the quartile deviation of the following frequency distribution.
Daily wages:      10-15   15-20   20-25   25-30   30-35
No. of workers:     6      12      18      10       4
Solution:
The cumulative frequencies are 6, 18, 36, 46 and 50, so N = 50, N/4 = 50/4 = 12.5 and 3N/4 = 37.5. By simple interpolation,
(Q1 – 15) / (20 – 15) = (12.5 – 6) / (18 – 6)
(Q1 – 15) / 5 = 6.5 / 12
Q1 = (6.5 / 12) x 5 + 15
Q1 = 17.71
Similarly,
(Q3 – 25) / (30 – 25) = (37.5 – 36) / (46 – 36)
(Q3 – 25) / 5 = 1.5 / 10
Q3 = 0.15 x 5 + 25
Q3 = 25.75
Hence, Quartile Deviation,
(Q3 – Q1) / 2 = (25.75 – 17.71) / 2 = 4.02

MEAN DEVIATION (or Average Deviation or Mean Absolute Deviation)
Definition: The Mean Deviation of a series of values of a variable is the arithmetic mean of all the absolute deviations (i.e., differences without regard to sign) from any one of its averages (Mean, Median or Mode, but usually the Mean or Median). It is an absolute measure of dispersion. The mean deviation of a set of n values x1, x2, …, xn about their AM is defined by
MD = ∑|x – x̄| / n
where x = the values (or mid-values, according as the data are ungrouped or grouped) and x̄ = the Mean. The Mean Deviation about the Median is similarly defined by
MD = ∑|d| / n
where M = Median and d= x – M = Value (or mid value) – Median. Similarly, we can define Mean Deviation about Mode.
Note: The expression |d| is read as "mod d" and gives only the numerical or absolute value of d without regard to sign. Thus |-3| = 3, |-4| = 4, |0.56| = 0.56. The reason for taking only the absolute and not the algebraic values of the deviations is that the algebraic sum of the deviations of the values from their mean is zero.
Example: Find the Mean Deviation about the Arithmetic Mean of the numbers 31, 35, 29, 63, 55, 72, 37.
Solution: Arithmetic Mean = (31 + 35 + 29 + 63 + 55 + 72 + 37) / 7 = 322/7 = 46
The absolute deviations |d| = |x – 46| are 15, 11, 17, 17, 9, 26 and 9, so that ∑|d| = 104.
The required Mean Deviation about the Mean = ∑|d| / n = 104 / 7 = 14.86

Advantages and Disadvantages
The mean deviation is based on all the values of the variable and sometimes gives fairly good results as a measure of dispersion. However, the practice of neglecting signs and taking
absolute deviations for the calculation of the Mean Deviation seems rather unjustified, and this makes algebraic treatment difficult.

STANDARD DEVIATION
It is the most important absolute measure of dispersion. The Standard Deviation of a set of values of a variable is defined as the positive square root of the arithmetic mean of the squares of all deviations of the values from their arithmetic mean. In short, it is the square root of the mean of the squared deviations from the mean. If x1, x2, ..., xn is a series of values of a variable and x̄ (x bar) their AM, then the S.D. (σ) is defined by

σ = √[ ∑(x – x̄)² / n ]
The square of the Standard Deviation is known as the Variance, i.e., Variance = σ² = (SD)². The SD is often defined as the positive square root of the Variance.
Example: Find the standard deviation of the following numbers: 1, 2, 3, 4, 5, 6, 7, 8, and 9.
Solution: The deviations of the numbers from the AM 5 are respectively -4, -3, -2, -1, 0, 1, 2, 3, and 4. The squares of the deviations from the AM are 16, 9, 4, 1, 0, 1, 4, 9, and 16. Therefore,
σ² = (16 + 9 + 4 + 1 + 0 + 1 + 4 + 9 + 16) / 9 = 60/9 = 6.67, and σ = √6.67 = 2.58

Advantages and Disadvantages
Standard Deviation is the most important and widely used of the measures of dispersion, and it possesses almost all the requisites of a good measure of dispersion. It is rigidly defined and based on all the values of the variable. It is suitable for algebraic treatment. The SD is less affected by sampling fluctuations than any other absolute measure of dispersion. On the other hand, the SD is comparatively difficult to understand: the process of squaring the deviations from the AM and then taking the square root of the mean of these squared deviations is a complicated affair. The calculation of the SD can be made easier by changing the origin and the scale conveniently.

Relative Measures of Dispersion
An absolute measure expressed as a percentage of a measure of central tendency gives a relative measure of dispersion. Relative measures are independent of the units of measurement and hence are used for comparing the dispersion of two or more distributions given in different units.

Co-efficient of Variation
The co-efficient of variation is the first important relative measure of dispersion and is defined by the following formula:
Co-efficient of Variation = (Standard Deviation / Mean) x 100
The co-efficient of variation is thus the ratio of the standard deviation to the mean, expressed as a percentage. In the words of Karl Pearson, the co-efficient of variation is the percentage variation in the mean.
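As a quick check of the standard deviation and coefficient of variation formulas, a short Python sketch using the nine numbers from the example above (the population convention of dividing by n is assumed, as in the text):

import math

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
n = len(data)
mean = sum(data) / n                                   # AM = 5

variance = sum((x - mean) ** 2 for x in data) / n      # 60 / 9 = 6.67
sd = math.sqrt(variance)                               # sigma = 2.58
coeff_of_variation = sd / mean * 100                   # (SD / Mean) x 100

print("Variance =", round(variance, 2))                            # 6.67
print("Standard deviation =", round(sd, 2))                        # 2.58
print("Coefficient of variation =", round(coeff_of_variation, 1), "%")  # about 51.6%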
Coefficient of Quartile Deviation Co-efficient of quartile deviation is a relative measure of dispersion and is defined by Coefficient of Quartile Deviation = Quartile Deviation / Median x 100
Coefficient of Mean Deviation
It is a relative measure of dispersion. The Coefficient of Mean Deviation is defined by
Coefficient of Mean Deviation = [Mean Deviation / (Mean or Median)] x 100
Example: Find the Mean Deviation about the Median of the following numbers: 46, 79, 26, 85, 39, 65, 99, 29, 56, and 72. Find also the Coefficient of Mean Deviation.
Solution: Arranging the given numbers in ascending order of magnitude, we obtain 26, 29, 39, 46, 56, 65, 72, 79, 85, and 99.
Median = [(n+1)/2]th value = [11/2]th value = 5.5th value = (5th value + 6th value) / 2 = (56 + 65)/2 = 60.5
The absolute deviations of the values from the Median 60.5 are respectively 34.5, 31.5, 21.5, 14.5, 4.5, 4.5, 11.5, 18.5, 24.5, and 38.5. Therefore,
Mean Deviation (MD) about the Median = (34.5 + 31.5 + 21.5 + 14.5 + 4.5 + 4.5 + 11.5 + 18.5 + 24.5 + 38.5) / 10 = 204/10 = 20.4
Coefficient of MD = (MD / Median) x 100 = (20.4 / 60.5) x 100 = 33.72%

KURTOSIS
Kurtosis in Greek means 'bulginess'. The degree of kurtosis of a distribution is measured relative to the peakedness of a normal curve. The measure of kurtosis indicates whether the curve of the frequency distribution is flat or peaked. Kurtosis is the peakedness of a frequency curve. Of two or more distributions having the same average, dispersion and skewness, one may have a higher concentration of values near the mode, and its frequency curve will show a sharper peak than the others. This characteristic of a frequency distribution is known as kurtosis. Kurtosis is measured by the coefficient β2, defined by the formula
β2 = m4 / m2² = m4 / σ⁴, or by γ2 = β2 – 3,
where m2 and m4 are the 2nd and 4th central moments and σ = SD. A distribution is said to be platykurtic, mesokurtic or leptokurtic according as β2 < 3, β2 = 3, or β2 > 3. Accordingly,
It is platykurtic if m4 / σ⁴ < 3
It’s Meso-kurtic, if m4 / σ4 = 3 It’s Lepto-kurtic, if m4 / σ4 > 3 Karl person in 1905 introduced the terms MESOKURTIC, LEPTOKURTIC, and PLATYKURTIC. A peaked curve is called "Leptokurtic" and a flat topped curve is termed "Platykurtic". These are evaluated by comparison with intermediate peaked curve. These three curves differ widely in regard to convexity. Example: Calculate the measures of the following distribution:
Solution:
V1 = ∑fd / N = 0/100 = 0
V2 = ∑fd² / N = 446/100 = 4.46
V3 = ∑fd³ / N = -36/100 = -0.36
V4 = ∑fd⁴ / N = 4574/100 = 45.74
μ4 = V4 – 4V1V3 + 6V1²V2 – 3V1⁴ = 45.74 – 4(0)(-0.36) + 6(0)²(4.46) – 3(0)⁴ = 45.74
β2 = μ4 / μ2² = 45.74 / (4.46)² = 2.3
The value of β2 is less than 3; hence the curve is platykurtic.

SUMMARY
This chapter helps us to know the nature of the statistical study. It recognizes the importance of statistics and also its limitations. The differences between descriptive statistics and inferential statistics are dealt with in detail.

KEY WORDS
· Descriptive Statistics
· Inferential Statistics
· Mean
· Median
· Mode
· Arithmetic Mean
· Geometric Mean
· Averages
· Dispersion
· Standard Deviation
· Mean Deviation
· Skewness
· Kurtosis
REVIEW QUESTIONS
1. Explain the special features of measures of central tendency.
2. How will you choose an 'average'?
3. What is dispersion? State its objectives. Explain the various types of measures of dispersion.
4. Explain the various methods of measuring dispersion.
5. Differentiate standard deviation from mean deviation.
6. Define 'skewness'. How will you measure it?
7. Explain the application of averages in research.
- End of Chapter -

LESSON – 17 MEASURES OF RELATIONSHIP
OBJECTIVES To study simple, partial and multiple correlation and their application in research
STRUCTURE
· Measure of relationship
· Correlation
· Properties of correlation co-efficient
· Methods of studying correlation
· Application of correlation
MEASURES OF RELATIONSHIP
The following statistical tools measure the relationship between the variables analyzed in social science research:
(i) Correlation
· Simple correlation
· Partial correlation
· Multiple correlation
(ii) Regression
· Simple regression
· Multiple regression
(iii) Association of attributes

CORRELATION
Correlation measures the relationship (positive or negative, perfect or otherwise) between two variables. Regression analysis considers the relationship between variables and estimates the value of one variable from the known value of another. Association of attributes attempts to ascertain the extent of association between two attributes.
Aggarwal Y.P., in his book 'Statistical Methods', has defined the coefficient of correlation as "a single number that tells us to what extent two variables or things are related and to what extent variations in one variable go with variations in the other".
Richard Levin, in his book 'Statistics for Management', has defined correlation analysis as "the statistical tool that we can use to describe the degree to which one variable is linearly related to another". He has further stated that "frequently correlation analysis is used in conjunction with regression analysis to measure how well the regression line explains the variation of the dependent variable. Correlation can also be used by itself, however, to measure the degree of association between two variables".
Srivastava U.K., Shenoy G.V. and Sharma S.C., in their book 'Quantitative Techniques for Managerial Decision', have stated that "correlation analysis is the statistical technique that is used to describe the degree to which one variable is related to another.
Frequently correlation analysis is also used along with regression analysis to measure how well the regression line explains the variations of the dependent variable. The correlation coefficient is the statistical tool that is used to measure the mutual relationship between the two variables".
The coefficient of correlation is denoted by 'r'. The sign of 'r' shows the direction of the relationship between the two variables X and Y: a positive value of r indicates that the two variables move in the same direction, while a negative value indicates an inverse relationship. Levin states that if an inverse relationship exists — that is, if Y decreases as X increases — then 'r' will fall between 0 and -1. Likewise, if there is a direct relationship (if Y increases as X increases), then 'r' will be a value within the range of 0 to 1.
Aggarwal Y.P. has highlighted, in his book 'Statistical Methods', the properties of correlation and the factors influencing the size of the correlation coefficient. The details are given below.

PROPERTIES OF THE CORRELATION COEFFICIENT
The range of the correlation coefficient is from -1 through 0 to +1. The values r = -1 and r = +1 reveal a case of perfect relationship, though the direction of the relationship is negative in the first case and positive in the second.
The correlation coefficient can be interpreted in terms of r², known as the 'coefficient of determination'. It may be considered as the variance interpretation of r². Example: if r = 0.5, then r² = 0.5 x 0.5 = 0.25; in percentage terms, 0.25 x 100% = 25%. This means that 25 percent of the variance in the Y scores has been accounted for by the variance in X.
The correlation coefficient does not change if every score in either or both distributions is increased by a constant or multiplied by a positive constant.
Causality cannot be inferred solely on the basis of a correlation between two variables. It can be inferred only after conducting controlled experiments.
The direction of the relationship is indicated by the sign (+ or -) of the correlation. The degree of relationship is indicated by the numerical value of the correlation: a value near 1 indicates a nearly perfect relationship, and a value near 0 indicates no relationship.
In a positive relationship both variables tend to change in the same direction: as X increases, Y also tends to increase. The Pearson correlation measures a linear (straight-line) relationship. A correlation between X and Y should not be interpreted as a cause-and-effect relationship; two variables can be related without one having a direct effect on the other.

FACTORS INFLUENCING THE SIZE OF THE CORRELATION COEFFICIENT
1. The size of r is very much dependent upon the variability of the measured values in the correlated sample. The greater the variability, the higher will be the correlation, everything else being equal.
2. The size of r is altered when researchers select extreme groups of subjects in order to compare these groups with respect to certain behaviors. Selecting extreme groups on one variable increases the size of r over what would be obtained with more random sampling.
3. Combining two groups which differ in their mean values on one of the variables is not likely to faithfully represent the true situation as far as the correlation is concerned.
4. Addition of an extreme case (and, conversely, dropping of an extreme case) can lead to changes in the amount of correlation. Dropping such a case leads to a reduction in the correlation, while the converse is also true.

TYPES OF CORRELATION
a) Positive or Negative
b) Simple, Partial and Multiple
c) Linear and Non-linear

a) Positive and Negative Correlations
In positive correlation both variables (X and Y) vary in the same direction: if variable X increases, variable Y also increases; if variable X decreases, variable Y also decreases. In negative correlation, the variables vary in opposite directions: if one variable increases, the other decreases.

b) Simple, Partial and Multiple Correlations
In simple correlation, the relationship between two variables is studied. In partial and multiple correlations, three or more variables are studied. In multiple correlation, three or more variables are studied simultaneously. In partial correlation, more than two
variables are studied, but the effect of one variable is kept constant and the relationship between the other two variables is studied.

c) Linear and Non-Linear Correlations
The distinction depends upon the constancy of the ratio of change between the variables. In linear correlation, the amount of change in one variable bears a constant ratio to the amount of change in the other; this is not so in non-linear correlation.

METHODS OF STUDYING CORRELATION
a) Scatter Diagram Method
b) Graphic Method
c) Karl Pearson's Coefficient of Correlation
d) Concurrent Deviation Method
e) Method of Least Squares

Karl Pearson's Coefficient of Correlation
Procedure
i. Compute the mean of the X series data.
ii. Compute the mean of the Y series data.
iii. Compute the deviations of the X series from the mean of X; these are denoted as x.
iv. Square the deviations; these are denoted as x².
v. Compute the deviations of the Y series from the mean of Y; these are denoted as y.
vi. Square the deviations; these are denoted as y².
vii. Multiply the corresponding deviations of the X and Y series and compute the total; this is denoted as ∑xy.
The above values can be applied in the formula and the correlation can be computed:
Karl Pearson's Coefficient of Correlation (r) = ∑xy / √(∑x² × ∑y²)
When deviations are taken from an assumed mean, the short-cut formula is
r = [N∑dxdy – (∑dx)(∑dy)] / √{[N∑dx² – (∑dx)²] [N∑dy² – (∑dy)²]}
where
dx = deviations of the X series from the assumed mean
dy = deviations of the Y series from the assumed mean
∑dxdy = total of the products of the deviations of the X and Y series
∑dx² = sum of the squared deviations of the X series from its assumed mean
∑dy² = sum of the squared deviations of the Y series from its assumed mean
N = number of items
The above values can be applied in the above formula and the correlation can be computed. Correlation for grouped data can be computed with the help of the same formula, with each deviation weighted by its frequency:
r = [N∑fdxdy – (∑fdx)(∑fdy)] / √{[N∑fdx² – (∑fdx)²] [N∑fdy² – (∑fdy)²]}
In the above formula, deviations are multiplied by the frequencies. Other steps are the same. CALCULATION OF CORRELATION ► Raw Score Method
r = 0.7
► Deviation Score Method (using Actual Mean) Calculate Karl Pearson’s Coefficient of Correlation from the following data:
Solution: Calculation of Karl Pearson's Coefficient of Correlation

Year   Index of Production (X)   x = (X - Xmean)    x²    No. of unemployed (Y)   y = (Y - Ymean)    y²    xy
1985           100                     -4            16            15                    0            0      0
1986           102                     -2             4            12                   -3            9     +6
1987           104                      0             0            13                   -2            4      0
1988           107                     +3             9            11                   -4           16    -12
1989           105                     +1             1            12                   -3            9     -3
1990           112                     +8            64            12                   -3            9    -24
1991           103                     -1             1            19                   +4           16     -4
1992            99                     -5            25            26                  +11          121    -55
        ∑X = 832                   ∑x = 0      ∑x² = 120      ∑Y = 120              ∑y = 0    ∑y² = 184   ∑xy = -92

Xmean = ∑X / N = 832/8 = 104
Ymean = ∑Y / N = 120/8 = 15
r = ∑xy / √(∑x² × ∑y²) = -92 / √(120 × 184) = -0.619
The correlation between the index of production and the number of unemployed is negative.
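The same figure can be reproduced in software; a brief sketch using Python's scipy (assuming it is installed) with the production and unemployment data from the table above:

from scipy.stats import pearsonr

# Index of production (X) and number of unemployed (Y), 1985-1992, from the table above.
x = [100, 102, 104, 107, 105, 112, 103, 99]
y = [15, 12, 13, 11, 12, 12, 19, 26]

r, p_value = pearsonr(x, y)
print("r =", round(r, 3))   # about -0.619, matching the manual calculation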
► Calculation of Correlation Coefficient (using Assumed Mean) Calculate Coefficient of Correlation from the following data:
Solution
RANK CORRELATION
Rank correlation is a method of ascertaining the co-variability, or the lack of it, between two variables. The rank correlation method was developed by the British psychologist Charles Edward Spearman in 1904. Gupta S.P. has stated that "the rank correlation method is used when quantitative measures for certain factors cannot be fixed, but individuals in the group can be arranged in order, thereby obtaining for each individual a number indicating his/her rank in the group".
The formula for Rank Correlation is
ρ = 1 – [6∑D² / N(N² – 1)]
where D is the difference between the two ranks of each individual and N is the number of individuals.
Rank – Difference Coefficient of Correlation (case of no ties in ranks)
Rank Correlation, with ∑D² = 38 and N = 5,
= 1 – (6 x 38) / [5 x (5² – 1)] = 1 – 228/(5 x 24) = 1 – 1.9 = -0.9
The relationship between X and Y is very high and inverse; that is, the relationship between the scores on Test I and Test II is very high and inverse.
Procedure for Assigning Ranks
The first rank is given to the student who secured the highest score. For example, in Test I, student F is given the first rank, as his score is the highest. The second rank is given to the next highest score; in Test I, student E is given the second rank. Students A and G have identical scores of 20 each and between them occupy the 6th and 7th ranks. Instead of giving either the 6th or the 7th rank to both students, the average of the two ranks [average of 6 and 7 = 6.5] is given to each of them. The same procedure is followed to assign ranks to the scores secured by the students in Test II.
Calculation of Rank Correlation when ranks are tied
Rank – Difference Coefficient of Correlation (in case of ties in ranks)
= 1 – [(6 x 24) / 10(10² – 1)] = 1 – [144/990] = 0.855

APPLICATION OF CORRELATION
The Karl Pearson Coefficient of Correlation can be used to assess the extent of the relationship between the motivation provided by export incentive schemes and the utilization of such schemes by exporters.
Motivation and Utilization of Export Incentive Schemes – Correlation Analysis
Opinion scores of various categories of exporters towards motivation and utilization of export incentive schemes can be recorded and correlated by using Karl Pearson Coefficient of Correlation and appropriate interpretation may be given based on the value of correlation.
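Rank correlations of this kind can also be computed in software; a minimal sketch using scipy's spearmanr (the scores below are hypothetical, not the data from the text), where tied scores are automatically assigned average ranks:

from scipy.stats import spearmanr

# Hypothetical scores of seven students on two tests; note the tie (20, 20) in Test I.
test1 = [20, 35, 28, 44, 50, 62, 20]
test2 = [31, 40, 35, 49, 60, 70, 33]

rho, p_value = spearmanr(test1, test2)
print("Spearman rank correlation =", round(rho, 3))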
Testing of Correlation
The 't' test is used to test the significance of a correlation coefficient. The heights and weights of a random sample of six adults are given.
It is reasonable to assume that these variables are normally distributed, so the Karl Pearson correlation coefficient is the appropriate measure of the degree of association between height and weight. Here r = 0.875.
Hypothesis test for Pearson's population correlation coefficient (ρ):
H0: ρ = 0 - this implies no correlation between the variables in the population
H1: ρ > 0 - this implies that there is positive correlation in the population (increasing height is associated with increasing weight)
Significance level: 5%
t = r√(n – 2) / √(1 – r²) = 0.875 x √(6 – 2) / √(1 – 0.875²) = (0.875 x 2) / √0.234 = 3.61
The table value of t at the 5% significance level with (n – 2) = (6 – 2) = 4 degrees of freedom is 2.132. The calculated value is more than the table value, so the null hypothesis is rejected: there is a significant positive correlation between height and weight.

Partial Correlation
Partial correlation is used in situations where three or more variables are involved. Suppose three variables such as age, height and weight are given; here partial correlation is applied. The correlation between height and weight can be computed keeping age constant, since age may be an important factor influencing the strength of the relationship between height and weight. Partial correlation is used to hold the effect of age constant. The effect of one
variable is partialled out from the correlation between the other two variables. This statistical technique is known as partial correlation. The correlation between variables x and y is denoted as rxy. Partial correlation is denoted by the symbol r12.3; this is the correlation between variables 1 and 2, keeping the 3rd variable constant. It is computed as
r12.3 = (r12 – r13 r23) / √[(1 – r13²)(1 – r23²)]
where
r12.3 = partial correlation between variables 1 and 2, keeping variable 3 constant
r12 = correlation between variables 1 and 2
r13 = correlation between variables 1 and 3
r23 = correlation between variables 2 and 3
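The formula translates directly into code; a minimal sketch (the correlation values used below are illustrative, not taken from the text):

import math

def partial_correlation(r12, r13, r23):
    # First-order partial correlation r12.3: correlation between
    # variables 1 and 2 with variable 3 held constant.
    return (r12 - r13 * r23) / math.sqrt((1 - r13 ** 2) * (1 - r23 ** 2))

# Illustrative values: e.g. height-weight, height-age and weight-age correlations.
print(round(partial_correlation(r12=0.80, r13=0.60, r23=0.70), 3))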
Multiple Correlation
Three or more variables are involved in multiple correlation. The dependent variable is denoted by X1 and the other variables are denoted by X2, X3, etc. Gupta S.P. has expressed that "the coefficient of multiple linear correlation is represented by R1 and it is common to add subscripts designating the variables involved. Thus R1.234 would represent the coefficient of multiple linear correlation between X1 on the one hand and X2, X3 and X4 on the other. The subscript of the dependent variable is always to the left of the point". The coefficient of multiple correlation in terms of r12, r13 and r23 can be expressed as follows:
R1.23 = √[(r12² + r13² – 2 r12 r13 r23) / (1 – r23²)]
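A sketch of the corresponding computation, again with illustrative pairwise correlations rather than values from the text:

import math

def multiple_correlation(r12, r13, r23):
    # Coefficient of multiple correlation R1.23 of X1 on X2 and X3.
    return math.sqrt((r12 ** 2 + r13 ** 2 - 2 * r12 * r13 * r23) / (1 - r23 ** 2))

print(round(multiple_correlation(r12=0.70, r13=0.60, r23=0.40), 3))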
The coefficient of multiple correlation R1.23 is the same as R1.32. A coefficient of multiple correlation lies between 0 and 1. If the coefficient of multiple correlation is 1, the correlation is perfect; if it is 0, there is no linear relationship between the variables. Coefficients of multiple correlation are always positive in sign and range from 0 to +1. The coefficient of multiple determination can be obtained by squaring R1.23. Multiple correlation analysis measures the relationship between the given variables: it measures the degree of association between one variable, considered as the dependent variable, and a group of other variables, considered as the independent variables.

SUMMARY
This chapter outlined the significance of measuring relationships, discussed the factors affecting correlation, and dealt in detail with the different applications of correlation.

KEY WORDS
· Measures of Relationship
· Correlation
· Simple correlation
· Partial correlation
· Multiple correlation
· Regression
· Simple regression
· Multiple regression
· Association of Attributes
· Scatter Diagram Method
· Graphic Method
· Karl Pearson's Coefficient of Correlation
· Concurrent Deviation Method
· Method of Least Squares
REVIEW QUESTIONS
1. What are the different measures of relationship, and what is their significance?
2. Discuss the factors affecting correlation.
3. What are the applications of correlation?
4. Discuss in detail the different types of correlation.

REFERENCE BOOKS
1. Robert Ferber, Marketing Research, New York: McGraw Hill Inc., 1976.
2. Chaturvedhi, J.C., Mathematical Statistics, Agra: Nok Jhonk Karyalaya, 1953.
3. Emory, C. William, Business Research Methods, Homewood, Illinois: Irwin, 1976.
- End of Chapter -

LESSON – 18 TABULATION OF DATA
STRUCTURE
· Table
· Relative frequency table
· Cross tabulation and stub-and-banner tables
· Guidelines for cross tabulation
INTRODUCTION
To get meaningful information from data, they are arranged in tabular form. Frequency tables and histograms are simple forms of such presentation.
Frequency Tables Frequency table or frequency distribution is a better way to arrange data. It helps in compressing data. Though some information is lost, compressed data show a pattern clearly. For constructing a frequency table, the data are divided into groups of similar values (class) and then record the number of observations that fall in each group. Table 1: Frequency table on age-wise classification of respondents
The collected data are presented as a frequency table. The number of classes can be increased by reducing the size of the class. The choice of class intervals is mostly guided by practical considerations rather than by rules. Class intervals are made in such a way that measurements are uniformly distributed over the class and the interval is not very large; otherwise, the mid value will either overestimate or underestimate the measurement.

Relative frequency tables
The frequency of a class is the total number of data points that fall within that class. The frequency of each value can also be expressed as a fraction or percentage of the total number of observations. Frequencies expressed in percentage terms are known as relative frequencies. A relative frequency distribution is presented in the table below.
Table 2: Relative frequency table on occupation-wise classification of respondents
It may be observed that the sum of all relative frequencies is 1.00 or 100% because frequency of each class has been expressed as a percentage of the total data.
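Frequency and relative-frequency tables of this kind can be produced with a few lines of code; a sketch using pandas (the occupation responses below are hypothetical, not the survey data of the text):

import pandas as pd

# Hypothetical occupation responses from a small survey.
occupation = pd.Series(
    ["Business", "Salaried", "Business", "Agriculture",
     "Salaried", "Business", "Professional", "Salaried"]
)

frequency = occupation.value_counts()                 # absolute frequencies
relative = occupation.value_counts(normalize=True)    # relative frequencies (sum to 1.00)

table = pd.DataFrame({"Frequency": frequency, "Relative frequency": relative})
print(table)
print("Sum of relative frequencies:", relative.sum())  # 1.0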
Cumulative frequency tables Frequency or one-way tables represent the simplest method for analyzing categorical data. They are often used as one of the exploratory procedures to review how different categories of values are distributed in the sample. For example, in a survey of spectator interest in different sports, we could summarize the respondents' interest in watching football in a frequency table as follows: Table 3: Cumulative frequency table on Statistics about football watchers
The table above shows the number, proportion, and cumulative proportion of respondents who characterized their interest in watching football as either (1) Always interested, (2) Usually interested, (3) Sometimes interested, or (4) Never interested. Applications In practically every research project, a first "look" at the data usually includes frequency tables. For example, in survey research, frequency tables can show the number of males and females who participated in the survey, the number of respondents from particular ethnic and racial backgrounds, and so on. Responses on some labeled attitude measurement scales (e.g., interest in watching football) can also be nicely summarized via the frequency table. In medical research, one may tabulate the number of patients displaying specific symptoms; in industrial research one may tabulate the frequency of different causes leading to catastrophic failure of products during stress tests (e.g., which parts are actually responsible for the complete malfunction of television sets under extreme temperatures?). Customarily, if a data set includes any categorical data, then one of the first steps in the data analysis is to compute a frequency table for those categorical variables. Cross Tabulation and Stub-and-Banner Tables Managers and researchers frequently are interested in gaining a better understanding of the differences that exist between two or more subgroups. Whenever they try to identify characteristics common to one subgroup but not common to other subgroups, (i.e. they
are trying to explain the differences between the subgroups), cross tables are used. Cross tabulation is a combination of two (or more) frequency tables arranged such that each cell in the resulting table represents a unique combination of specific values of the cross tabulated variables. Thus, cross tabulation allows us to examine frequencies of observations that belong to specific categories on more than one variable. By examining these frequencies, we can identify relations between cross tabulated variables. Only categorical variables, or variables with a relatively small number of different meaningful values, should be cross tabulated. Note that in cases where we do want to include a continuous variable in a cross tabulation (e.g., income), we can first recode it into a particular number of distinct ranges (e.g. low, medium, high).

Guidelines for Cross Tabulation
The most commonly used method of data analysis is cross tabulation. The following guidelines will help in designing a proper cross tabulation.
1. The data should be in categorical form
Cross tabulation is applicable to data in which both the dependent and the independent variables appear in categorical form. There are two types of categorical data. One type (say type A) consists of variables that can be measured only in classes or categories; variables like marital status, gender and occupation can be measured only in categories that are not quantifiable (i.e. they have no measurable number). The other type (say type B) consists of variables which can be measured in numbers, such as age and income; for this type, the different categories are associated with quantifiable numbers that show a progression from smaller values to larger values. Cross tabulation is used on both types of categorical variables. However, when a cross tabulation is constructed using type B categorical variables, researchers find it helpful to use several special steps to make such cross tabulations more effective analysis tools. If a certain variable is believed to be influenced by some other variable, the former can be considered the dependent variable and the latter is called the independent variable.
2. Cross tabulate an important dependent variable with one or more 'explaining' independent variables
Researchers typically cross tabulate a dependent variable of importance to the objectives of the research project (such as heavy user versus light user, or positive attitude versus negative attitude) with one or more independent variables that the researchers believe
can help explain the variation observed in the dependent variable. Any two variables can be used in a cross tabulation so long as they are both in categorical form and they appear to be logically related to one another as dependent and independent variables, consistent with the purpose and objectives of the research project.
3. Show percentages in a cross tabulation
In a cross tabulation, researchers typically show the percentages as well as the actual counts of the number of responses falling into the different cells of the table. The percentages more effectively reveal the relative sizes of the actual counts associated with the different cells and make it easier for researchers to visualize the patterns of differences that exist in the data.
Constructing and Interpreting a Cross Tabulation
After drawing the cross table, interpretations have to be drawn from it; these should convey the meaning of, and the findings from, the table. In management research, interpretation has particular value, since managers take decisions based on the interpretations and findings.
2x2 Tables
The simplest form of cross tabulation is the 2 by 2 table, where two variables are "crossed" and each variable has only two distinct values. For example, suppose we conduct a simple study in which males and females are asked to choose one of two different brands of soda pop (brand A and brand B); the data file can be arranged like this:
The resulting cross tabulation could look as follows.
Each cell represents a unique combination of values of the two cross tabulated variables (row variable Gender and column variable Soda), and the numbers in each cell tell us how many observations fall into each combination of values. In general, this table shows us that more females than males chose the soda pop brand A, and that more males than females chose soda B. Thus, gender and preference for a particular brand of soda may be related (later we will see how this relationship can be measured). Marginal Frequencies The values in the margins of the table are simply one-way (frequency) tables for all values in the table. They are important in that they help us to evaluate the arrangement of frequencies in individual columns or rows. For example, the frequencies of 40% and 60% of males and females (respectively) who chose soda A (see the first column of the above table), would not indicate any relationship between Gender and Soda if the marginal frequencies for Gender were also 40% and 60%; in that case they would simply reflect the different proportions of males and females in the study. Thus, the differences between the distributions of frequencies in individual rows (or columns) and in the respective margins inform us about the relationship between the cross tabulated variables. Column, Row, and total Percentages. The example in the previous paragraph demonstrates that in order to evaluate relationships between cross tabulated variables we need to compare the proportions of marginal and individual column or row frequencies. Such comparisons are easiest to perform when the frequencies are presented as percentages. Evaluating the Cross Table Researchers find it useful to answer the following three questions when evaluating cross tabulation that appears to explain differences in a dependent variable. 1. Does the cross tabulation show a valid or a spurious relationship? 2. How many independent variables should be used in the cross tabulation? 3. Are the differences seen in the cross tabulation statistically significant, or could they have occurred by chance due to sampling variation?
Each of these is discussed below.
Does the cross tabulation show a valid explanation? If it is logical to believe that changes in the independent variables can cause changes in the dependent variable, then the explanation revealed by the cross tabulation is thought to be a valid one.
Does the cross tabulation show a valid or a spurious relationship? An explanation is thought to be spurious if the implied relationship between the dependent and independent variables does not seem to be logical. For example, family size and income appear to be logically related to the household consumption of certain basic food products. However, it may not be logical to relate the number of automobiles owned with the brand of toothpaste preferred, or to relate the type of family pet with the occupation of the head of the family. If the independent variable does not logically have an effect or influence on the dependent variable, the relationship that a cross tabulation seems to show may not be a valid cause-and-effect relationship, and therefore may be a spurious relationship.
How many independent variables should be used? When cross tabulating an independent variable that seems logically related to the dependent variable, what should researchers do if the results do not reveal a clear-cut relationship? Two possible courses of action are available.
1. Try another cross tabulation, but this time using one of the other independent variables hypothesized to be important when the study was designed.
2. A preferred course of action is to introduce each additional independent variable simultaneously with, rather than as an alternative to, the first independent variable tried in the cross tabulation. By doing so it is possible to study the interrelationship between the dependent variable and two or more independent variables.

SUMMARY
Data can be summarized in the form of tables. Cross tables give meaningful information from the raw data. The way cross tables are constructed, interpreted and evaluated is very important.

KEY WORDS
· Class
· Frequency
· Relative frequency
· Cumulative frequency
· Marginal frequency
· Cross table
REVIEW QUESTIONS
1. Why do we use cross tables?
2. How do you evaluate a cross table?
3. Describe the guidelines for constructing a cross table.
- End of Chapter -

LESSON – 19 STATISTICAL SOFTWARE
OBJECTIVES
· To learn the application of various statistical packages used in the management research process
· To understand the procedures for performing the tests using SPSS

STRUCTURE
· Statistical packages
· Statistical analysis using SPSS
· t-test, F-test, chi-square test, ANOVA
· Factor analysis
Statistical Packages
The following statistical software packages are widely used:
· STATA
· SPSS
· SAS

STATA
Stata, created in 1985 by StataCorp, is a statistical program used by many businesses and academic institutions around the world. Most of its users work in research, especially in the fields of economics, sociology, political science, and epidemiology. Stata's full range of capabilities includes:
· Data management
· Statistical analysis
· Graphics
· Simulations
· Custom programming
SPSS
The computer program SPSS (originally, Statistical Package for the Social Sciences) was released in its first version in 1968 and is among the most widely used programs for statistical analysis in social science. It is used by market researchers, health researchers, survey companies, government, education researchers and others. In addition to statistical analysis, data management (case selection, file reshaping, creating derived data) and data documentation are features of the base software. The many features of SPSS are accessible via pull-down menus or can be programmed with a proprietary 4GL "command syntax language". Command syntax programming has the benefits of reproducibility and of handling complex data manipulations and analyses.
Business and research problems can be solved using SPSS for Windows, a statistical and data management package for analysts and researchers. SPSS for Windows provides a broad range of capabilities for the entire analytical process. With SPSS, you can generate decision-making information quickly using powerful statistics, understand and effectively present the results with high-quality tabular and graphical output, and share the results with others using a variety of reporting methods, including secure Web publishing. Results from the data analysis enable you to make smarter decisions more quickly by uncovering key facts, patterns, and trends. An optional server version delivers enterprise-strength scalability, additional tools, security, and enhanced performance. SPSS for Windows can be used in a variety of areas, including:
· Survey and market research and direct marketing
· Academia
· Administrative research
· Medical, scientific, clinical, and social science research
· Planning and forecasting
· Quality improvement
· Reporting and ad hoc decision making
· Enterprise-level analytic application development
In particular, SPSS statistics software can be applied to gain greater insight into the actions, attributes, and attitudes of people - the customers, employees, students, or citizens.
Add more functionality as you need it
SPSS for Windows is a modular, tightly integrated, full-featured product line for the analytical process: planning, data collection, data access, data management and preparation, data analysis, reporting, and deployment. Using a combination of add-on modules and stand-alone software that work seamlessly with SPSS Base enhances the capabilities of this statistics software. The intuitive interface makes it easy to use, yet it gives you all of the data management, statistics, and reporting methods you need to do a wide range of analysis.
Gain unlimited programming capabilities
Dramatically increase the power and capabilities of SPSS for Windows by using the SPSS Programmability Extension. This feature enables analytic and application developers to extend the SPSS command syntax language to create procedures and applications, and to perform even the most complex jobs within SPSS. The SPSS Programmability Extension is included with SPSS Base, making this statistics software an even more powerful solution.
Maximize market opportunities
The more competitive and challenging the business environment, the more you need market research. Market research is the systematic and objective gathering, analysis, and interpretation of information. It helps the organization identify problems and opportunities and allows for better-informed, lower-risk decisions. For decades, solutions from SPSS Inc. have added value for those involved in market research. SPSS solutions support the efficient gathering of market research information through many different methods, and make it easier to analyze and interpret this information and provide it to decision makers. Solutions are offered both to companies that specialize in providing market research services and to organizations that conduct their own market research. SPSS market research solutions help you:
· Understand the market perception of the brand
· Conduct effective category management
· Confidently develop product features
· Perform competitive analysis
With this insight, you or the clients can confidently make decisions about developing and marketing the products and enhancing the brand.
The heart of the SPSS market research solution is the Dimensions product family. Through Dimensions, the organization can centralize the creation and fielding of surveys in any mode and in any language, as well as the analysis and reporting phases of the research. Dimensions data can be directly accessed using SPSS for Windows, which enables analysts to use SPSS' advanced statistical and graphing capabilities to explore the survey data. Add-on modules and integrated stand-alone products extend SPSS' analytical and reporting capabilities. For example, analyze responses to open-ended survey questions with SPSS Text Analysis for Surveys.
Maximize the value the organization receives from its Dimensions data by using an enterprise feedback management (EFM) solution from SPSS. EFM provides a continuous means of incorporating regular customer insight into business operations. Engage with current or prospective customers through targeted feedback programs or by asking questions during naturally occurring events, then use the resulting insights to drive business improvement across the organization. SPSS' EFM solution also enables you to integrate the survey data with transactional and operational data, so you gain a more accurate, complete understanding of customer preferences, motivations, and intentions. Thanks to the integration among SPSS offerings, you can incorporate insights gained through survey research in the predictive models created by the data mining tools. You can then deploy predictive insight and recommendations to people and to automated systems through any of the predictive analytics applications.

SAS
The SAS System, originally Statistical Analysis System, is an integrated system of software products provided by SAS Institute that enables the programmer to perform:
· Data entry, retrieval, management, and mining
· Report writing and graphics
· Statistical and mathematical analysis
· Business planning, forecasting, and decision support
· Operations research and project management
· Quality improvement
· Applications development
· Warehousing (extract, transform, load)
· Platform-independent and remote computing
In addition, the SAS System integrates with many SAS business solutions that enable large scale software solutions for areas such as human resource management, financial management, business intelligence, customer relationship management and more. Statistical analyses using SPSS Introduction
This section shows how to perform a number of statistical tests using SPSS. Each sub-section gives a brief description of the aim of the statistical test, when it is used, an example showing the SPSS commands, and the SPSS (often abbreviated) output with a brief interpretation. In deciding which test is appropriate to use, it is important to consider the type of variables that you have (i.e., whether your variables are categorical, ordinal or interval, and whether they are normally distributed).
Statistical methods using SPSS
One sample t-test
A one sample t-test allows us to test whether a sample mean (of a normally distributed interval variable) significantly differs from a hypothesized value. For example, using the data file, say we wish to test whether the average writing score (write) differs significantly from 50. We can do this as shown below:
t-test /testval = 50 /variable = write.
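For readers working outside SPSS, an equivalent one sample t-test can be sketched in Python with scipy; the writing scores below are hypothetical stand-ins for the variable write:

from scipy.stats import ttest_1samp

# Hypothetical writing scores standing in for the SPSS variable 'write'.
write = [52, 59, 33, 44, 52, 52, 59, 46, 57, 55, 46, 65, 60, 63, 57, 49]

t_stat, p_value = ttest_1samp(write, popmean=50)
print("t =", round(t_stat, 3), " p =", round(p_value, 3))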
The mean of the variable write for this particular sample of students is 52.775, which is statistically significantly different from the test value of 50. We would conclude that this group of students has a significantly higher mean on the writing test than 50. One sample median test
A one sample median test allows us to test whether a sample median differs significantly from a hypothesized value. We will use the same variable, write, as we did in the one sample t-test example above, but we do not need to assume that it is interval and normally distributed (we only need to assume that write is an ordinal variable). However, we are unaware of how to perform this test in SPSS.
Binomial test
A one sample binomial test allows us to test whether the proportion of successes on a two-level categorical dependent variable significantly differs from a hypothesized value. For example, using the data file, say we wish to test whether the proportion of females (female) differs significantly from 50%, i.e., from 0.5. We can do this as shown below:
npar tests /binomial (.5) = female.
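An equivalent check can be sketched outside SPSS with scipy's binomtest (available in scipy 1.7 and later); the counts below are hypothetical:

from scipy.stats import binomtest

# Hypothetical sample: 91 females observed out of 200 respondents.
result = binomtest(k=91, n=200, p=0.5)
print("p-value =", round(result.pvalue, 3))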
a. Based on Z Approximation. The results indicate that there is no statistically significant difference (p =0.229). In other words, the proportion of females in this sample does not significantly differ from the hypothesized value of 50%. Chi-square goodness of fit A chi-square goodness of fit test allows us to test whether the observed proportions for a categorical variable differ from hypothesized proportions. For example, let's suppose that we believe that the general population consists of 10% Hispanic, 10% Asian, 10% African American and 70% White folks. We want to test whether the observed proportions from our sample differ significantly from these hypothesized proportions. npar test /chisquare = race /expected = 10 10 10 70.
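A parallel goodness-of-fit check can be sketched with scipy; the observed counts below are not shown in the text but were chosen to be consistent with the output reported (they give chi-square = 5.029 with the 10/10/10/70 expected proportions):

from scipy.stats import chisquare

observed = [24, 11, 20, 145]   # illustrative counts: Hispanic, Asian, African American, White
total = sum(observed)
expected = [0.10 * total, 0.10 * total, 0.10 * total, 0.70 * total]

chi2, p_value = chisquare(f_obs=observed, f_exp=expected)
print("chi-square =", round(chi2, 3), " p =", round(p_value, 3))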
a. 0 cells (0%) have expected count less than 5. The minimum expected cell frequency is 20.0.
These results show that the racial composition in our sample does not differ significantly from the hypothesized values that we supplied (chi-square with three degrees of freedom = 5.029, p = 0.170).
Two independent samples t-test
An independent samples t-test is used when you want to compare the means of a normally distributed interval dependent variable for two independent groups. For example, using the data file, say we wish to test whether the mean for write is the same for males and females.
t-test groups = female(0 1) /variables = write.
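A sketch of the equivalent independent samples test in scipy, using hypothetical score lists for the two groups:

from scipy.stats import ttest_ind

# Hypothetical writing scores for male and female students.
write_male = [52, 49, 33, 44, 52, 41, 59, 46, 57, 55]
write_female = [62, 59, 53, 54, 52, 61, 59, 66, 57, 55]

t_stat, p_value = ttest_ind(write_male, write_female)   # equal variances assumed by default
print("t =", round(t_stat, 3), " p =", round(p_value, 3))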
The results indicate that there is a statistically significant difference between the mean writing scores for males and females (t = -3.734, p = 0.000). In other words, females have a statistically significantly higher mean score on writing (54.99) than males (50.12).
Chi-square test
A chi-square test is used when you want to see if there is a relationship between two categorical variables. In SPSS, the chi-square statistic is requested through the crosstabs command (with the statistics=chisq option), which produces the test statistic and its associated p-value. Let's see if there is a relationship between the type of school attended (schtyp) and student's gender (female). Remember that the chi-square test assumes that the expected value for each cell is five or higher. This assumption is easily met in the examples below. However, if this assumption is not met in your data, please see the section on Fisher's exact test below.
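Outside SPSS, the same test of independence can be run on the cell counts of a contingency table; a sketch with scipy (the counts below are hypothetical):

from scipy.stats import chi2_contingency

# Rows: public, private school; columns: male, female (hypothetical counts).
table = [[77, 91],
         [14, 18]]

# correction=False gives the ordinary Pearson chi-square, without Yates' continuity correction.
chi2, p_value, dof, expected = chi2_contingency(table, correction=False)
print("chi-square =", round(chi2, 3), " df =", dof, " p =", round(p_value, 3))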
a. Computed only for a 2x2 table b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 14.56.
These results indicate that there is no statistically significant relationship between the type of school attended and gender (chi-square with one degree of freedom = 0.047, p = 0.828). Let's look at another example, this time looking at the linear relationship between gender (female) and socio-economic status (ses). The point of this example is that one (or both) variables may have more than two levels, and that the variables do not have to have the same number of levels. In this example, female has two levels (male and female) and ses has three levels (low, medium and high).
a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 21.39.
Again we find that there is no statistically significant relationship between the variables (chi-square with two degrees of freedom = 4.577, p = 0.101). Fisher's exact test The Fisher's exact test is used when you want to conduct a chi-square test but one or more of your cells have an expected frequency of five or less. Remember that the chisquare test assumes that each cell has an expected frequency of five or more, but the Fisher's exact test has no such assumption and can be used regardless of how small the expected frequency is. In SPSS unless you have the SPSS Exact Test Module, you can only perform a Fisher's exact test on a 2x2 table, and these results are presented by default. Please see the results from the chi squared example above. One-way ANOVA
A one-way analysis of variance (ANOVA) is used when you have a categorical independent variable (with two or more categories) and a normally distributed interval dependent variable and you wish to test for differences in the means of the dependent variable broken down by the levels of the independent variable. For example, using the data file, say we wish to test whether the mean of write differs between the three program types (prog).
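The corresponding test can be sketched outside SPSS with scipy's f_oneway; the writing scores for the three program types below are hypothetical:

from scipy.stats import f_oneway

# Hypothetical writing scores by program type.
general = [48, 52, 44, 57, 50, 46, 53]
academic = [58, 62, 55, 60, 63, 57, 61]
vocational = [42, 47, 39, 45, 48, 41, 44]

f_stat, p_value = f_oneway(general, academic, vocational)
print("F =", round(f_stat, 3), " p =", round(p_value, 4))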
The mean of the dependent variable differs significantly among the levels of program type. However, we do not know if the difference is between only two of the levels or all three of the levels. (The F test for the Model is the same as the F test for prog because prog was the only variable entered into the model. If other variables had also been entered, the F test for the Model would have been different from prog.) To see the mean of write for each level of program type:
From this we can see that the students in the academic program have the highest mean writing score, while students in the vocational program have the lowest.
Discriminant analysis
Discriminant analysis is used when you have one or more normally distributed interval independent variables and a categorical dependent variable. It is a multivariate technique that considers the latent dimensions in the independent variables for predicting group membership in the categorical dependent variable.
Pooled within-groups correlations between discriminating variables and standardized canonical discriminant functions. Variables ordered by absolute size of correlation within function. *Largest absolute correlation between each variable and any discriminant function
Clearly, the SPSS output for this procedure is quite lengthy, and it is beyond the scope of this page to explain all of it. However, the main point is that two canonical variables are identified by the analysis, the first of which seems to be more related to program type than the second. Factor analysis Factor analysis is a form of exploratory multivariate analysis that is used to either reduce the number of variables in a model or to detect relationships among variables. All variables involved in the factor analysis need to be interval and are assumed to be normally distributed. The goal of the analysis is to try to identify factors which underlie the variables. There may be fewer factors than variables, but there may not be more factors than variables. For our example, let's suppose that we think that there are some common factors underlying the various test scores. We will include subcommands for varimax rotation and a plot of the eigenvalues. We will use a principal components extraction and will retain two factors. (Using these options will make our results compatible with those from SAS and Stata and are not necessarily the options that you will want to use).
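A comparable analysis can be sketched in Python with scikit-learn; this assumes a recent scikit-learn (0.24 or later), whose FactorAnalysis supports varimax rotation, and uses randomly generated scores in place of the five test variables in the data file:

import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
# Stand-in for five test scores (e.g. read, write, math, science, socst) on 200 students.
scores = rng.normal(loc=50, scale=10, size=(200, 5))

fa = FactorAnalysis(n_components=2, rotation="varimax")
fa.fit(scores)

print("Factor loadings (5 variables x 2 factors):")
print(np.round(fa.components_.T, 3))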
Communality (which is the opposite of uniqueness) is the proportion of variance of a variable (e.g., read) that is accounted for by all of the factors taken together, and a very low communality can indicate that a variable may not belong with any of the factors. The scree plot may be useful in determining how many factors to retain. From the component matrix table, we can see that all five of the test scores load onto the first factor, while all five tend to load not so heavily on the second factor. The purpose of rotating the factors is to get the variables to load either very high or very low on each factor. In this example, because all of the variables loaded onto factor 1 and not on factor 2, the rotation did not aid in the interpretation; instead, it made the results even more difficult to interpret.

SUMMARY
The statistical packages applied in the management research process are SPSS, SAS and STATA. This software makes the research process effective and reduces the time taken for analysis; large data sets can also be analyzed easily. The chapter also gave detailed procedures for, and interpretation of, SPSS output for the statistical tests.

KEY TERMS
· SPSS
· SAS
· STATA
· Dependent variable
· Independent variable
REVIEW QUESTIONS
1. Define SPSS.
2. What do you mean by STATA and SAS?
3. List out the applications of statistical software in market research.
4. Define dependent variable.
5. Define independent variable.
6. Differentiate relative frequency from cumulative frequency with suitable examples.
- End of Chapter -
LESSON – 20 ADVANCED DATA TECHNIQUES
OBJECTIVES
To understand the procedures and applications of the following statistical analyses:
o Discriminant analysis
o ANOVA
o Multi-dimensional scaling
o Cluster analysis

STRUCTURE
· Analysis of variance
· Conditions for ANOVA
· ANOVA model
· Discriminant analysis
· Factor analysis
· Cluster analysis
· Ward's method
Analysis of Variance (ANOVA) Analysis of variance (ANOVA) is used to test hypotheses about differences between two or more means. The t-test based on the standard error of the difference between two
means can only be used to test differences between two means. When there are more than two means, it is possible to compare each mean with each other mean using t-tests. However, conducting multiple t-tests can lead to severe inflation of the Type I error rate. Analysis of variance can be used to test differences among several means for significance without increasing the Type I error rate. This chapter covers designs with between-subject variables. The next chapter covers designs with within-subject variables.
The statistical method for testing the null hypothesis that the means of several populations are equal is analysis of variance. It uses a single-factor, fixed-effects model to compare the effects of one factor (brands of coffee, varieties of residential housing, types of retail stores) on a continuous dependent variable. In a fixed-effects model, the levels of the factor are established in advance and the results are not generalizable to other levels of treatment.
Consider a hypothetical experiment on the effect of the intensity of distracting background noise on reading comprehension. Subjects were randomly assigned to one of three groups. Subjects in Group 1 were given 30 minutes to read a story without any background noise. Subjects in Group 2 read the story with moderate background noise, and subjects in Group 3 read the story in the presence of loud background noise. The first question the experimenter was interested in was whether background noise has any effect at all; that is, whether the null hypothesis μ1 = μ2 = μ3 is true, where μ1 is the population mean for the "no noise" condition, μ2 is the population mean for the "moderate noise" condition, and μ3 is the population mean for the "loud noise" condition. The experimental design therefore has one factor (noise intensity) and this factor has three levels: no noise, moderate noise, and loud noise. Analysis of variance can be used to provide a significance test of the null hypothesis that these three population means are equal. If the test is significant, then the null hypothesis can be rejected and it can be concluded that background noise has an effect.
In a one-factor between-subjects ANOVA, the letter "a" is used to indicate the number of levels of the factor (a = 3 for the noise intensity example). The number of subjects assigned to condition 1 is designated as n1; the number of subjects assigned to condition 2 is designated by n2, etc. If the sample size is the same for all of the treatment groups, then the letter "n" (without a subscript) is used to indicate the number of subjects in each group. The total number of subjects across all groups is indicated by "N". If the sample sizes are equal, then N = (a)(n); otherwise, N = n1 + n2 + ... + na.
Some experiments have more than one between-subjects factor. For instance, consider a hypothetical experiment in which two age groups (8-year olds and 12-year olds) are asked to perform a task either with or without distracting background noise. The two factors are age and distraction.
Assumptions
Analysis of variance assumes normal distributions and homogeneity of variance. Therefore, in a one-factor ANOVA, it is assumed that each of the populations is normally distributed with the same variance (σ²). In between-subjects analyses, it is assumed that each score is sampled randomly and independently. Research has shown that ANOVA is "robust" to violations of its assumptions: the probability values computed in an ANOVA are satisfactorily accurate even if the assumptions are violated. Moreover, ANOVA tends to be conservative when its assumptions are violated, meaning that although power is decreased, the probability of a Type I error is as low as or lower than it would be if its assumptions were met. There are exceptions to this rule. For example, a combination of unequal sample sizes and a violation of the assumption of homogeneity of variance can lead to an inflated Type I error rate.
Conditions for ANOVA
1. The samples must be randomly selected from normal populations.
2. The populations should have equal variances.
3. The distance from one value to its group's mean should be independent of the distances of other values to that mean (independence of error).
4. Minor variations from normality and equal variances are tolerable. Nevertheless, the analyst should check the assumptions with diagnostic techniques.
Analysis of variance breaks down, or partitions, total variability into component parts. Unlike the 't' test, which uses the sample standard deviations, ANOVA uses squared deviations (the variance), so the distances of the individual data points from their own mean or from the grand mean can be summed. In the ANOVA model, each group has its own mean and values that deviate from that mean. Similarly, all the data points from all of the groups produce an overall grand mean. The total deviation is the sum of the squared differences between each data point and the overall grand mean. The total deviation of any particular data point may be partitioned into between-groups variance and within-groups variance. The between-groups variance represents the effect of the treatment or factor: differences among the group means imply that each group was treated differently, and the treatment will appear as deviations of the sample means from the grand mean. Even if this were not so, there would still be some natural variability among subjects and some variability attributable to sampling. The within-groups variance describes the deviations of the data points within each group from the sample mean. It results from variability among subjects and from random variation, and is often called error.
When the variability attributable to the treatment exceeds the variability arising from error and random fluctuations, the viability of the null hypothesis begins to diminish. This is exactly how the test statistic for analysis of variance works. The test statistic for ANOVA is the F ratio, which compares the variance from these two sources:

F = between-groups variance / within-groups variance = Mean square between / Mean square within
To compute the F ratio, the sums of the squared deviations for the numerator and denominator are divided by their respective degrees of freedom. Dividing by the degrees of freedom computes each variance as an average, hence the term mean square. The number of degrees of freedom for the numerator, the mean square between groups, is one less than the number of groups (k - 1). The degrees of freedom for the denominator, the mean square within groups, is the total number of observations minus the number of groups (n - k).

If the null hypothesis is true, there should be no difference between the populations, and the ratio should be close to 1. If the population means are not equal, the numerator should reflect this difference and the F ratio should be greater than 1. The F distribution determines the size of the ratio necessary to reject the null hypothesis for a particular sample size and level of significance.

ANOVA model

To illustrate one-way ANOVA, consider the following hypothetical example. To find out the number one business school in India, 20 business magnates were randomly selected and asked to rate the top 3 B-schools. The ratings are given below:

Data
Let's apply the one-way ANOVA test to this example.

Step 1: Null hypothesis
H0: μA1 = μA2 = μA3
HA: the population means are not all equal (at least one pair differs)

Step 2: Statistical test
The F test is chosen because the samples are independent, the assumptions of analysis of variance are accepted, and the data are interval.

Step 3: Significance level
Let α = 0.05, with degrees of freedom [numerator (k - 1) = (3 - 1) = 2] and [denominator (n - k) = (60 - 3) = 57], i.e. (2, 57).
Step 4: Calculated value
F = Mean square between / Mean square within
F = 5822.017 / 205.695 = 28.304, with degrees of freedom (2, 57)

Step 5: Critical test value
From the F-distribution table with degrees of freedom (2, 57) and α = 0.05, the critical value is 3.16.

Step 6: Decision
Since the calculated value is greater than the critical value (28.3 > 3.16), the null hypothesis is rejected. The conclusion is that there is a statistically significant difference between at least one pair of means. The following table shows that the p value equals 0.0001. Since the p value (0.0001) is less than the significance level (0.05), this provides a second way of rejecting the null hypothesis.

The ANOVA summary given in the following table is the standard way of presenting the results of an analysis of variance. It contains the sources of variation, degrees of freedom, sums of squares, mean squares and the calculated F value. The probability value column reports the exact significance level for the F ratio being tested.
S = Significantly different at this level. Significance level: 0.05
All data are hypothetical. [Figures: one-way analysis of variance plots]
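The figures in steps 4-6 above can be reproduced from the reported mean squares. A minimal sketch follows (the use of scipy is an assumption; the lesson itself relies on F tables).

from scipy import stats

ms_between, ms_within = 5822.017, 205.695   # mean squares from step 4
df_between, df_within = 2, 57               # (k - 1) and (n - k)

f_ratio = ms_between / ms_within                         # about 28.3
f_critical = stats.f.ppf(0.95, df_between, df_within)    # about 3.16 at alpha = 0.05
p_value = stats.f.sf(f_ratio, df_between, df_within)     # far below 0.05

print(round(f_ratio, 3), round(f_critical, 2), p_value)
# Reject H0 either because f_ratio > f_critical or because p_value < 0.05.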
Discriminant analysis
Discriminant function analysis is used to determine which variables discriminate between two or more naturally occurring groups. For example, an educational researcher may want to investigate which variables discriminate between high school graduates who decide (1) to go to college, (2) to attend a trade or professional school, or (3) to seek no further training or education. For that purpose the researcher could collect data on numerous variables prior to students’ graduation. After graduation, most students will naturally fall into one of the three categories. Discriminant Analysis could then be used to determine which variable(s) are the best predictors of students’ subsequent educational choice. A medical researcher may record different variables relating to patients’ backgrounds in order to learn which variables best predict whether a patient is likely to recover completely (group 1), partially (group 2), or not at all (group 3). A biologist could record different characteristics of similar types (groups) of flowers, and then perform a Discriminant function analysis to determine the set of characteristics that allows for the best discrimination between the types. Computational Approach Let us consider a simple example. Suppose we measure height in a random sample of 50 males and 50 females. Females are, on the average, not as tall as males, and this difference will be reflected in the difference in means (for the variable Height). Therefore, variable height allows us to discriminate between males and females with a better than chance probability: if a person is tall, it is likely to be a male, if a person is short, it is likely to be a female. We can generalize this reasoning to groups and variables that are less "trivial." For example, suppose we have two groups of high school graduates: Those who choose to attend college after graduation and those who do not. We could have measured students' stated intention to continue on to college one year prior to graduation. If the means for the two groups (those who actually went to college and those who did not) are different, then we can say that intention to attend college as stated one year prior to graduation allows us to discriminate between those who are and are not college bound (and this information may be used by career counselors to provide the appropriate guidance to the respective students). To summarize the discussion so far, the basic idea underlying discriminant function analysis is to determine whether groups differ with regard to the mean of a variable, and then to use that variable to predict group membership (e.g., of new cases). Stepwise Discriminant Analysis Probably the most common application of Discriminant function analysis is to include many measures in the study, in order to determine the ones that discriminate between groups. For example, an educational researcher interested in predicting high school graduates' choices for further education would probably include as many measures of
personality, achievement motivation, academic performance, etc. as possible in order to learn which one(s) offer the best prediction.

Model. Put another way, we want to build a "model" of how we can best predict to which group a case belongs. In the following discussion we will use the term "in the model" to refer to variables that are included in the prediction of group membership, and we will refer to variables as being "not in the model" if they are not included.

Forward stepwise analysis. In stepwise discriminant function analysis, a model of discrimination is built step by step. Specifically, at each step all variables are reviewed and evaluated to determine which one will contribute most to the discrimination between groups. That variable is then included in the model, and the process starts again.

Backward stepwise analysis. One can also step backwards; in that case all variables are included in the model and then, at each step, the variable that contributes least to the prediction of group membership is eliminated. Thus, as the result of a successful discriminant function analysis, one would keep only the "important" variables in the model, that is, those variables that contribute the most to the discrimination between groups.

F to enter, F to remove. The stepwise procedure is "guided" by the respective F to enter and F to remove values. The F value for a variable indicates its statistical significance in the discrimination between groups; that is, it is a measure of the extent to which a variable makes a unique contribution to the prediction of group membership.

Capitalizing on chance. A common misinterpretation of the results of stepwise discriminant analysis is to take statistical significance levels at face value. By their nature, stepwise procedures capitalize on chance because they "pick and choose" the variables to be included in the model so as to yield maximum discrimination. Thus, when using the stepwise approach the researcher should be aware that the significance levels do not reflect the true alpha error rate, that is, the probability of erroneously rejecting H0 (the null hypothesis that there is no discrimination between groups).

Multi Dimensional Scaling

Multidimensional scaling (MDS) is a set of related statistical techniques often used in data visualisation for exploring similarities or dissimilarities in data. An MDS algorithm starts with a matrix of item-item similarities and then assigns a location to each item in a low-dimensional space, suitable for graphing or 3D visualisation.

Categorization of MDS

MDS algorithms fall into a taxonomy, depending on the meaning of the input matrix:
♦ Classical multidimensional scaling also often called Metric multidimensional scaling -- assumes the input matrix is just an item-item distance matrix. Analogous to Principal components analysis, an eigenvector problem is solved to find the locations that minimize distortions to the distance matrix. Its goal is to find a Euclidean distance approximating a given distance. It can be generalized to handle 3-way distance problems (the generalization is known as DISTATIS). ♦ Metric multidimensional scaling -- A superset of classical MDS that assumes a known parametric relationship between the elements of the item-to-item dissimilarity matrix and the Euclidean distance between the items. ♦ Generalized multidimensional scaling (GMDS) -- A superset of metric MDS that allows for the target distances to be non-Euclidean. ♦ Non-metric multidimensional scaling -- In contrast to metric MDS, non-metric MDS both finds a non-parametric monotonic relationship between the dissimilarities in the item-item matrix and the Euclidean distance between items, and the location of each item in the low-dimensional space. The relationship is typically found using isotonic regression. Multidimensional Scaling Procedure There are several steps in conducting MDS research: 1. Formulating the problem - What brands do you want to compare? How many brands do you want to compare? More than 20 would be cumbersome. Less than 8 (4 pairs) will not give valid results. What purpose is the study to be used for? 2. Obtaining Input Data - Respondents are asked a series of questions. For each product pair they are asked to rate similarity (usually on a 7 point Likert scale from very similar to very dissimilar). The first question could be for Coke/Pepsi for example, the next for Coke/Hires rootbeer, the next for Pepsi/Dr Pepper, the next for Dr Pepper/Hires rootbeer, etc. The number of questions is a function of the number of brands and can be calculated as Q = N (N - 1) / 2 where Q is the number of questions and N is the number of brands. This approach is referred to as the "Perception data: direct approach". There are two other approaches. There is the "Perception data: derived approach" in which products are decomposed into attributes which are rated on a semantic differential scale. The other is the "Preference data approach" in which respondents are asked their preference rather than similarity. 3. Running the MDS statistical program - Software for running the procedure is available in most of the better statistical applications programs. Often there is a choice between Metric MDS (which deals with interval or ratio level data), and Non-metric MDS (which deals with ordinal data). The researchers must decide on the number of dimensions they want the computer to create. The more dimensions, the better the statistical fit, but the more difficult it is to interpret the results.
4. Mapping the results and defining the dimensions - The statistical program (or a related module) will map the results. The map will plot each product (usually in two-dimensional space). The proximity of products to each other indicates either how similar they are or how preferred they are, depending on which approach was used. The dimensions must be labelled by the researcher. This requires subjective judgment and is often very challenging. The results must be interpreted.

5. Test the results for reliability and validity - Compute R-squared to determine what proportion of the variance of the scaled data can be accounted for by the MDS procedure. An R-squared of 0.6 is considered the minimum acceptable level. Other possible tests are Kruskal's stress, split-data tests, data stability tests (i.e. eliminating one brand), and test-retest reliability.

Input Data

The input to MDS is a square, symmetric 1-mode matrix indicating relationships among a set of items. By convention, such matrices are categorized as either similarities or dissimilarities, which are opposite poles of the same continuum. A matrix is a similarity matrix if larger numbers indicate more similarity between items, rather than less. A matrix is a dissimilarity matrix if larger numbers indicate less similarity. The distinction is somewhat misleading, however, because similarity is not the only relationship among items that can be measured and analyzed using MDS. Hence, many input matrices are neither similarities nor dissimilarities. However, the distinction is still used as a means of indicating whether larger numbers in the input data should mean that a given pair of items should be placed near each other on the map, or far apart. Calling the data "similarities" indicates a negative or descending relationship between input values and corresponding map distances, while calling the data "dissimilarities" or "distances" indicates a positive or ascending relationship.

A typical example of an input matrix is the aggregate proximity matrix derived from a pile-sort task. Each cell xij of such a matrix records the number (or proportion) of respondents who placed items i and j into the same pile. It is assumed that the number of respondents placing two items into the same pile is an indicator of the degree to which they are similar. An MDS map of such data would put items close together which were often sorted into the same piles.

Another typical example of an input matrix is a matrix of correlations among variables. Treating these data as similarities (as one normally would) would cause the MDS program to put variables with high positive correlations near each other, and variables with strong negative correlations far apart.

Another type of input matrix is a flow matrix. For example, a dataset might consist of the number of business transactions occurring during a given period between a set of corporations. Running these data through MDS might reveal clusters of corporations whose members trade more heavily with one another than with outsiders.
Although technically neither similarities nor dissimilarities, these data should be classified as similarities in order to have companies that trade heavily with each other show up close to each other on the map.

Dimensionality

Normally, MDS is used to provide a visual representation of a complex set of relationships that can be scanned at a glance. Since maps on paper are two-dimensional objects, this translates technically to finding an optimal configuration of points in two-dimensional space. However, the best possible configuration in two dimensions may be a very poor, highly distorted representation of your data. If so, this will be reflected in a high stress value. When this happens, you have two choices: you can either abandon MDS as a method of representing your data, or you can increase the number of dimensions.

There are two difficulties with increasing the number of dimensions. The first is that even three dimensions are difficult to display on paper and are significantly more difficult to comprehend; four or more dimensions render MDS virtually useless as a method of making complex data more accessible to the human mind. The second problem is that with increasing dimensions you must estimate an increasing number of parameters to obtain a decreasing improvement in stress. The result is a model of the data that is nearly as complex as the data itself.

On the other hand, there are some applications of MDS for which high dimensionality is not a problem. For instance, MDS can be viewed as a mathematical operation that converts an item-by-item matrix into an item-by-variable matrix. Suppose, for example, that you have a person-by-person matrix of similarities in attitudes. You would like to explain the pattern of similarities in terms of simple personal characteristics such as age, sex, income and education. The trouble is that these two kinds of data are not conformable: the person-by-person matrix is not the sort of data you can use in a regression to predict age (or vice versa). However, if you run the data through MDS (using very high dimensionality in order to achieve perfect stress), you can create a person-by-dimension matrix which is similar to the person-by-demographics matrix that you are trying to compare it to.

MDS and Factor Analysis

Even though there are similarities in the type of research questions to which these two procedures can be applied, MDS and factor analysis are fundamentally different methods. Factor analysis requires that the underlying data be distributed as multivariate normal and that the relationships be linear. MDS imposes no such restrictions. As long as the rank ordering of distances (or similarities) in the matrix is meaningful, MDS can be used. In terms of resulting differences, factor analysis tends to extract more factors (dimensions) than MDS; as a result, MDS often yields more readily
interpretable solutions. Most importantly, however, MDS can be applied to any kind of distances or similarities, while factor analysis requires us to first compute a correlation matrix. MDS can be based on subjects' direct assessment of similarities between stimuli, while factor analysis requires subjects to rate those stimuli on some list of attributes (on which the factor analysis is performed). In summary, MDS methods are applicable to a wide variety of research designs because distance measures can be obtained in any number of ways.

Applications

Marketing

In marketing, MDS is a statistical technique for taking the preferences and perceptions of respondents and representing them on a visual grid. These grids, called perceptual maps, are usually two-dimensional, but they can represent more than two dimensions. Potential customers are asked to compare pairs of products and make judgments about their similarity. Whereas other techniques obtain underlying dimensions from responses to product attributes identified by the researcher, MDS obtains the underlying dimensions from respondents' judgments about the similarity of products. This is an important advantage: it does not depend on researchers' judgments, and it does not require a list of attributes to be shown to the respondents. The underlying dimensions come from respondents' judgments about pairs of products. Because of these advantages, MDS is the most common technique used in perceptual mapping.

The "beauty" of MDS is that we can analyze any kind of distance or similarity matrix. These similarities can represent people's ratings of similarities between objects, the percent agreement between judges, the number of times a subject fails to discriminate between stimuli, and so on. For example, MDS methods used to be very popular in psychological research on person perception, where similarities between trait descriptors were analyzed to uncover the underlying dimensionality of people's perceptions of traits (see, for example, Rosenberg, 1977). They are also very popular in marketing research, for detecting the number and nature of dimensions underlying the perceptions of different brands or products (Green & Carmone, 1970). In general, MDS methods allow the researcher to ask relatively unobtrusive questions ("how similar is brand A to brand B?") and to derive from those questions underlying dimensions without the respondents ever knowing what the researcher's real interest is.

Cluster Analysis

The term cluster analysis (first used by Tryon, 1939) encompasses a number of different algorithms and methods for grouping objects of similar kind into respective categories. A general question facing researchers in many areas of inquiry is how to organize observed data into meaningful structures, that is, to develop taxonomies. In other words, cluster analysis is an exploratory data analysis tool which aims at sorting different
objects into groups in such a way that the degree of association between two objects is maximal if they belong to the same group and minimal otherwise. Given the above, cluster analysis can be used to discover structures in data without providing an explanation or interpretation. In other words, cluster analysis simply discovers structures in data without explaining why they exist.

We deal with clustering in almost every aspect of daily life. For example, a group of diners sharing the same table in a restaurant may be regarded as a cluster of people. In food stores, items of a similar nature, such as different types of meat or vegetables, are displayed in the same or nearby locations. There are countless examples in which clustering plays an important role. For instance, biologists have to organize the different species of animals before a meaningful description of the differences between animals is possible. According to the modern system employed in biology, man belongs to the primates, the mammals, the amniotes, the vertebrates, and the animals. Note how in this classification, the higher the level of aggregation, the less similar are the members in the respective class. Man has more in common with all other primates (e.g., apes) than he does with the more "distant" members of the mammals (e.g., dogs), and so on. For a review of the general categories of cluster analysis methods, see Joining (Tree Clustering), Two-way Joining (Block Clustering), and k-Means Clustering.

Cluster analysis (CA) is a classification method that is used to arrange a set of cases into clusters. The aim is to establish a set of clusters such that cases within a cluster are more similar to each other than they are to cases in other clusters. Cluster analysis is an exploratory data analysis tool for solving classification problems. Its object is to sort cases (people, things, events, etc.) into groups, or clusters, so that the degree of association is strong between members of the same cluster and weak between members of different clusters. Each cluster thus describes, in terms of the data collected, the class to which its members belong; and this description may be abstracted through use from the particular to the general class or type.

Cluster analysis is thus a tool of discovery. It may reveal associations and structure in data which, though not previously evident, nevertheless are sensible and useful once found. The results of cluster analysis may contribute to the definition of a formal classification scheme, such as a taxonomy for related animals, insects or plants; suggest statistical models with which to describe populations; indicate rules for assigning new cases to classes for identification and diagnostic purposes; provide measures of definition, size and change in what previously were only broad concepts; or find exemplars to represent classes. Whatever business you are in, the chances are that sooner or later you will run into a classification problem. Cluster analysis might provide the methodology to help you solve it.

Procedure for Cluster Analysis
1. Formulate the problem - select the variables that you wish to apply the clustering technique to 2. Select a distance measure - various ways of computing distance: · · ·
Squared Euclidean distance - the sum of the squared differences in value for each variable (the ordinary Euclidean distance is the square root of this sum)
Manhattan distance - the sum of the absolute differences in value for each variable
Chebychev distance - the maximum absolute difference in values for any variable
(A short computational sketch of these measures appears after step 6 of this procedure.)
3. Select a clustering procedure (see below) 4. Decide on the number of clusters 5. Map and interpret clusters - draw conclusions - illustrative techniques like perceptual maps, icicle plots, and dendrograms are useful 6. Assess reliability and validity - various methods: · · · · ·
repeat analysis but use different distance measure repeat analysis but use different clustering technique split the data randomly into two halves and analyze each part separately repeat analysis several times, deleting one variable each time repeat analysis several times, using a different order each time
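As referenced in step 2 above, the three distance measures can be computed directly. The snippet below is a minimal sketch; scipy and the invented case values are assumptions, not part of the lesson.

import numpy as np
from scipy.spatial import distance

# Two hypothetical cases described on three standardized variables
case_a = np.array([1.2, 0.4, -0.8])
case_b = np.array([0.5, -0.3, 0.6])

print(distance.sqeuclidean(case_a, case_b))  # squared Euclidean: sum of squared differences
print(distance.euclidean(case_a, case_b))    # Euclidean: square root of the above
print(distance.cityblock(case_a, case_b))    # Manhattan: sum of absolute differences
print(distance.chebyshev(case_a, case_b))    # Chebychev: largest absolute difference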
Figure showing three pairs of clusters: ♦ (AB) ♦ (DE) ♦ (FG)
Beyond these we can see that (AB) & (C) and (DE) are more similar to each other than to (FG). Hence we could construct the following dendrogram (hierarchical classification).
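The dendrogram figure itself is not reproduced here. As a rough sketch of how such a dendrogram could be produced, the snippet below clusters seven hypothetical points A-G with scipy; the coordinates and the choice of single linkage are illustrative assumptions.

import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram

labels = list("ABCDEFG")
# Invented 2-D coordinates in which (A,B), (D,E) and (F,G) form close pairs
# and C lies nearest to the (A,B) pair.
points = np.array([[1.0, 1.0], [1.2, 1.1], [2.0, 1.6],
                   [4.0, 4.0], [4.1, 4.2],
                   [8.0, 0.5], [8.2, 0.7]])

# Agglomerative clustering on Euclidean distances, single (nearest-neighbour) linkage
fusion = linkage(points, method="single", metric="euclidean")

dendrogram(fusion, labels=labels)
plt.ylabel("Dissimilarity at fusion")
plt.show()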
Note that the clusters are joined (fused) at increasing levels of 'dissimilarity'. The actual measure of dissimilarity will depend upon the method used; it may be a similarity measure or a distance measure. Distances between points can be calculated using an extension of Pythagoras' theorem (these are Euclidean distances). These measures of 'dissimilarity' can be extended to more than two variables (dimensions) without difficulty.

Clustering Algorithms

Having selected how we will measure similarity (the distance measure), we must now choose the clustering algorithm, i.e. the rules which govern between which points distances are measured to determine cluster membership. There are many methods available; the criteria they use differ, and hence different classifications may be obtained for the same data. This is important, since it tells us that although cluster analysis may provide an objective method for the clustering of cases, there can be subjectivity in the choice of method. Five algorithms, available within SPSS, are described below.
1. 2. 3. 4. 5.
Average Linkage Clustering Complete Linkage Clustering Single Linkage Clustering Within Groups Clustering Ward's Method
1. Average Linkage Clustering The dissimilarity between clusters is calculated using cluster average values; of course there are many ways of calculating an average. The most common (and recommended if there is no reason for using other methods) is UPGMA - Unweighted Pair Groups
Method Average. SPSS also provides two other methods based on averages, CENTROID and MEDIAN. Centroid, or UPGMC (Unweighted Pair Groups Method Centroid), uses the group centroid as the average; the centroid is defined as the centre of a cloud of points. A problem with the centroid method is that some switching and reversal may take place: as the agglomeration proceeds, some cases may need to be switched from their original clusters.

2. Complete Linkage Clustering (maximum or furthest-neighbour method): The dissimilarity between two groups is equal to the greatest dissimilarity between a member of cluster i and a member of cluster j. This method tends to produce very tight clusters of similar cases.

3. Single Linkage Clustering (minimum or nearest-neighbour method): The dissimilarity between two clusters is the minimum dissimilarity between members of the two clusters. This method produces long chains which form loose, straggly clusters. It has been widely used in numerical taxonomy.
4. Within Groups Clustering This is similar to UPGMA except clusters are fused so that within cluster variance is minimized. This tends to produce tighter clusters than the UPGMA method. 5. Ward's Method Cluster membership is assessed by calculating the total sum of squared deviations from the mean of a cluster. The criterion for fusion is that it should produce the smallest possible increase in the error sum of squares. Cluster analysis is the statistical method of partitioning a sample into homogeneous classes to produce an operational classification. Such a classification may help: · ·
Formulate hypotheses concerning the origin of the sample, e.g. in evolution studies Describe a sample in terms of a typology, e.g. for market analysis or administrative purposes
· · · ·
Predict the future behaviour of population types. e.g. in modeling economic prospects for different industry sectors Optimize functional processes, e.g. business site locations or product design Assist in identification, e.g. in diagnosing diseases Measure the different effects of treatments on classes within the population, e.g. with analysis of variance
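Because the choice of algorithm introduces subjectivity, it can be instructive to compare the linkage rules described above on the same data. The following is a hedged sketch: the data are invented, scipy is assumed, and SPSS's within-groups method has no direct scipy equivalent.

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(42)
# 30 hypothetical cases measured on 4 standardized variables
data = rng.standard_normal((30, 4))

for method in ("average", "complete", "single", "ward"):
    tree = linkage(data, method=method)
    members = fcluster(tree, t=3, criterion="maxclust")  # cut each tree into 3 clusters
    print(method, members)
# Different membership vectors for the same cases illustrate how the
# classification depends on the clustering algorithm chosen.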
SUMMARY The complete process of generalized hierarchical clustering can be summarized as follows: 1. Calculate the distance between all initial clusters. In most analyses initial clusters will be made up of individual cases. 2. Fuse the two most similar clusters and recalculate the distances. 3. Repeat step 2 until all cases are in one cluster. One of the biggest problems with this Cluster Analysis is identifying the optimum number of clusters. As the fusion process continues increasingly dissimilar clusters must be fused, i.e. the classification becomes increasingly artificial. Deciding upon the optimum number of clusters is largely subjective, although looking at a graph of the level of similarity at fusion versus number of clusters may help. There will be sudden jumps in the level of similarity as dissimilar groups are fused. KEY TERMS · · · · · · · · ·
SPSS Tabulation Cross-tabulation ANOVA Discriminant analysis Factor analysis Conjoint analysis MDS Cluster analysis
REVIEW QUESTIONS 1. What do you mean by cross-tabulation? 2. Write short notes on statistical packages 3. Explain the step wise procedure for doing Discriminant Analysis. 4. Write short notes on ANOVA.
5. Explain the application of Factor analysis in Marketing. 6. What do you mean by conjoint analysis? 7. Explain the procedure of performing Multi Dimensional Scaling. 8. What are the applications of MDS? 9. Describe the different types of cluster analysis. 10. Explain the marketing situations in which the above said tools will be used.
- End of Chapter LESSON – 21 FACTOR ANALYSIS
OBJECTIVES · · ·
To learn the basic concepts of factor analysis.
To understand the procedure for performing factor analysis.
To identify the applications of factor analysis.
STRUCTURE · · · · ·
Evolution of factor analysis
Steps involved in conducting factor analysis
Process involved in factor analysis
Output of factor analysis
Limitations of factor analysis
INTRODUCTION Factor analysis is a general name denoting a class of procedures primarily used for data reduction and summarization. In marketing research, there may be a large number of variables, most of which are correlated and which must be reduced to a manageable level. Relationships among sets of many interrelated variables are examined and represented in terms of a few underlying factors. For example, store image may be measured by asking respondents to evaluate stores on a series of items on a semantic
differential scale. The item evaluations may then be analyzed to determine the factors underlying store image.

In analysis of variance, multiple regression, and discriminant analysis, one variable is considered as the dependent or criterion variable and the others as independent or predictor variables. However, no such distinction is made in factor analysis. Rather, factor analysis is an interdependence technique in that an entire set of interdependent relationships is examined. Factor analysis is used in the following circumstances:

- To identify underlying dimensions, or factors, that explain the correlations among a set of variables. For example, a set of lifestyle statements may be used to measure the psychographic profiles of consumers. These statements may then be factor analyzed to identify the underlying psychographic factors, as illustrated in the department store example.

- To identify a new, smaller set of uncorrelated variables to replace the original set of correlated variables in subsequent multivariate analysis (regression or discriminant analysis). For example, the psychographic factors identified may be used as independent variables in explaining the differences between loyal and non-loyal consumers.

- To identify a smaller set of salient variables from a larger set for use in subsequent multivariate analysis. For example, a few of the original lifestyle statements that correlate highly with the identified factors may be used as independent variables to explain the differences between loyal and non-loyal users.

Definition

Factor analysis is a class of procedures primarily used for data reduction and summarization. Factor analysis is an interdependence technique in that an entire set of interdependent relationships is examined. A factor is defined as an underlying dimension that explains the correlations among a set of variables.

Evolution of Factor Analysis

Charles Spearman first used factor analysis as a technique of indirect measurement. When psychologists test human personality and intelligence, a set of questions and tests is developed for the purpose. They believe that a person given this set of questions and tests would respond on the basis of some structure that exists in his mind, and thus his responses would form a certain pattern. This approach is based on the assumption that the underlying structure in answering the questions would be the same for different respondents. Even though factor analysis had its beginning in the field of psychology, it has since been applied to problems in different areas, including marketing. Its use has
become far more frequent as a result of the introduction of specialized software packages such as SPSS and SAS.

Application of Factor Analysis

·
·
·
·
It can be used in market segmentation for identifying the underlying variables on which to group the customers. New-car buyers might be grouped based on the relative emphasis placed on economy, convenience, performance, comfort, and luxury. This might result in five segments: economy seekers, convenience seekers, performance seekers, comfort seekers, and luxury seekers.
In product research, factor analysis can be employed to determine the brand attributes that influence consumer choice. Toothpaste brands might be evaluated in terms of protection against cavities, whiteness of teeth, taste, fresh breath, and price.
In advertising studies, factor analysis can be used to understand the media consumption habits of the target market. The users of frozen foods may be heavy viewers of cable TV, go to a lot of movies, and listen to country music.
In pricing studies, it can be used to identify the characteristics of price-sensitive consumers. For example, these consumers might be methodical, economy minded, and home centered.
It can bring out the hidden or latent dimensions relevant in the relationships among product preferences. Factor analysis is typically used to study a complex product or service in order to identify the major characteristics (or factors) considered to be important by consumers of the product or service.

Example: Researchers for an automobile (two-wheeler) company may ask a large sample of potential buyers to report (using rating scales) the extent of their agreement or disagreement with a number of statements such as "A motorbike's brakes are its most crucial part" and "Seats should be comfortable for two members". Researchers apply factor analysis to such a set of data to identify which factors, such as "safety", "exterior styling" and "economy of operation", are considered important by potential customers. If this information is available, it can be used to guide the overall characteristics to be designed into the product or to identify advertising themes that potential buyers would consider important.

Steps Involved in Conducting Factor Analysis:
Statistics Associated with Factor Analysis

The key statistics associated with factor analysis are as follows:

» Bartlett's test of sphericity: A test statistic used to examine the hypothesis that the variables are uncorrelated in the population. In other words, the population correlation matrix is an identity matrix; each variable correlates perfectly with itself (r = 1) but has no correlation with the other variables (r = 0).

» Correlation matrix: A lower triangular matrix showing the simple correlations, r, between all possible pairs of variables included in the analysis. The diagonal elements, which are all 1, are usually omitted.

» Communality: The amount of variance a variable shares with all the other variables being considered. This is also the proportion of variance explained by the common factors.
» Eigenvalue: Represents the total variance explained by each factor.

» Factor loadings: Simple correlations between the variables and the factors.

» Factor loading plot: A plot of the original variables using the factor loadings as coordinates.

» Factor matrix: Contains the factor loadings of all the variables on all the factors extracted.

» Factor scores: Composite scores estimated for each respondent on the derived factors.

» Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy: An index used to examine the appropriateness of factor analysis. High values (between 0.5 and 1.0) indicate that factor analysis is appropriate; values below 0.5 imply that factor analysis may not be appropriate.

» Percentage of variance: The percentage of the total variance attributed to each factor.

» Residuals: The differences between the observed correlations, as given in the input correlation matrix, and the reproduced correlations, as estimated from the factor matrix.

» Scree plot: A plot of the eigenvalues against the number of factors in order of extraction.

We describe the uses of these statistics in the next section, in the context of the procedure for conducting factor analysis.

Process Involved in Factor Analysis

Factor analysis applies an advanced form of correlation analysis to the responses to a number of statements. The purpose of this analysis is to determine whether the responses to several of the statements are highly correlated. If the responses to three or more statements are highly correlated, it is believed that the statements measure some factor common to all of them. The statements in any one set are highly correlated with each other but are not highly correlated with the statements in any of the other sets. For each set of highly correlated statements, the researchers use their own judgment to determine what the single "theme" or "factor" is that ties the statements together in the minds of the respondents. For example, regarding the automobile study mentioned above, researchers may find high correlations among the responses to the following three statements:
i. Mileage per liter should be high;
ii. Maintenance cost should be low;
iii. Mileage should be consistent in all types of roads.

The researcher may then make the judgment that agreement with this set of statements indicates an underlying concern with the factor of "economy of operation".

Determine the Method of Factor Analysis

Once it has been determined that factor analysis is an appropriate technique for analyzing the data, an appropriate method must be selected. The approach used to derive the weights, or factor score coefficients, differentiates the various methods of factor analysis. The two basic approaches are principal components analysis and common factor analysis.

In principal components analysis, the total variance in the data is considered. The diagonal of the correlation matrix consists of unities, and the full variance is brought into the factor matrix. Principal components analysis is recommended when the primary concern is to determine the minimum number of factors that will account for maximum variance in the data for use in subsequent multivariate analysis. The factors are called principal components.

In common factor analysis, the factors are estimated based only on the common variance. Communalities are inserted in the diagonal of the correlation matrix. This method is appropriate when the primary concern is to identify the underlying dimensions and the common variance is of interest. This method is also known as principal axis factoring. Other approaches for estimating the common factors are also available, including unweighted least squares, generalized least squares, maximum likelihood, the alpha method, and image factoring. These methods are complex and are not recommended for inexperienced users.

Determine the Number of Factors

It is possible to compute as many principal components as there are variables, but in doing so, no parsimony is gained. In order to summarize the information contained in the original variables, a smaller number of factors should be extracted. The question is, how many? Several procedures have been suggested for determining the number of factors. These include a priori determination and approaches based on eigenvalues, scree plot, percentage of variance accounted for, split-half reliability, and significance tests.

A Priori Determination. Sometimes, because of prior knowledge, the researcher knows how many factors to expect and can thus specify the number of factors to be extracted beforehand. The extraction of factors ceases when the desired number of factors has been extracted. Most computer programs allow the user to specify the number of factors, allowing for an easy implementation of this approach.
Determination Based on Eigenvalues. In this approach, only factors with eigenvalues greater than 1.0 are retained; the other factors are not included in the model. An eigenvalue represents the amount of variance associated with the factor. Hence, only factors with a variance greater than 1.0 are included. Factors with variance less than 1.0 are no better than a single variable, because, due to standardization, each variable has a variance of 1.0. If the number of variables is less than 20, this approach will result in a conservative number of factors.

Determination Based on Scree Plot. A scree plot is a plot of the eigenvalues against the number of factors in order of extraction. The shape of the plot is used to determine the number of factors. Typically, the plot has a distinct break between the steep slope of factors with large eigenvalues and a gradual trailing off associated with the rest of the factors. This gradual trailing off is referred to as the scree. Experimental evidence indicates that the point at which the scree begins denotes the true number of factors. Generally, the number of factors determined by a scree plot will be one or a few more than that determined by the eigenvalue criterion.

Determination Based on Percentage of Variance. In this approach the number of factors extracted is determined so that the cumulative percentage of variance extracted by the factors reaches a satisfactory level. What level of variance is satisfactory depends upon the problem. However, it is recommended that the factors extracted should account for at least 60 percent of the variance.

Determination Based on Split-Half Reliability. The sample is split in half and factor analysis is performed on each half. Only factors with high correspondence of factor loadings across the two subsamples are retained.

Determination Based on Significance Tests. It is possible to determine the statistical significance of the separate eigenvalues and retain only those factors that are statistically significant. A drawback is that with large samples (size greater than 200), many factors are likely to be statistically significant, although from a practical viewpoint many of them account for only a small proportion of the total variance.

Illustration

A manufacturer of motorcycles wanted to know which motorcycle characteristics were considered very important by customers. The company identified 100 statements relating to all the characteristics of motorcycles that it believed important. 300 potential customers of motorcycles were selected on a probability basis and were asked to rate the 100 statements, five of which are listed below. They were asked to report on a 5-point scale the extent to which they agreed or disagreed with each statement.
· · · · ·
Brakes are the most important parts for motorcycles.
Appearance of motorcycle should be masculine.
Mileage per liter should be high.
Maintenance cost should be low.
Mileage should be consistent in all types of roads.
This resulted in a set of data in which each of 300 individuals gave a response to each of 100 statements. For any given statement, some individuals were found to agree strongly, some were found to disagree slightly, some neither agreed nor disagreed with the statement, and so on. Thus, for each statement, there was a distribution of 300 responses on a 5-point scale. Three Important Measures There are three important measures used in the factor analysis: 1. Variance 2. Standardized scores of an individual's responses 3. Correlation coefficient. 1. Variance A factor analysis is somewhat like regression analysis in that it tries to "best fit" factors to a scatter diagram of the data in such a way that the factors explain the variance associated with the responses to each statement. 2. Standardized Scores of an Individual's Responses To facilitate comparisons of the responses from such different scales, researchers standardize all of the answers from all of the respondents on all statements and questions.
Thus, an individual's standardized score is nothing more than an actual response measured in terms of the number of standard deviations (+ or -) it lies away from the mean. Therefore, each standardized score is likely to be a value somewhere in the range of -3.00 to +3.00, with +3.00 typically being equated to the "agree very strongly" response and -3.00 typically being equated to the "disagree very strongly" response.

3. Correlation Coefficient

The third measure used is the correlation coefficient associated with the standardized scores of the responses to each pair of statements. The matrix of correlation coefficients is a very important part of factor analysis.

The factor analysis searches through a large set of data to locate two or more sets of statements that have highly correlated responses. The responses to the statements in
one set will all be highly correlated with each other, but they will be quite uncorrelated with the responses to the statements in other sets. Since the different sets of statements are relatively uncorrelated with each other, a separate and distinct factor relative to motorcycles is associated with each set.

As already noted, variance is one of the three important measures used in factor analysis, applied to the standardized responses to each statement used in the study. Factor analysis selects one factor at a time, using procedures that "best fit" each factor to the data. The first factor selected is the one that fits the data in such a way that it explains more of the variance in the entire set of standardized scores than any other possible factor. Each factor selected after the first must be uncorrelated with the factors already selected. This process continues until the procedures cannot find additional factors that significantly reduce the unexplained variance in the standardized scores.

Output of Factor Analysis

Here, only the following six statements (or variables) will be used, for simplicity, to explain the output of a factor analysis:

X1 = Mileage per liter should be high
X2 = Maintenance cost should be low
X3 = Mileage should be consistent in all types of roads
X4 = Appearance of motorcycle should be masculine
X5 = Multiple colors should be available
X6 = Brakes are the most important parts for motorcycles

The results of a factor analysis of these six statements will appear in the form shown in the following table, which can be used to illustrate the three important outputs from a factor analysis and how they can be of use to researchers.

Table 1: Factor Analysis Output of the Motorcycle Study
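Table 1 itself is not reproduced here. As a rough, hedged illustration of how loadings, communalities and explained variance of the kind discussed below might be computed, the sketch uses scikit-learn on invented standardized responses; the random data and the varimax rotation option are assumptions rather than the lesson's own procedure.

import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
# Placeholder for the real 300 x 6 matrix of standardized responses to X1..X6
responses = rng.standard_normal((300, 6))

fa = FactorAnalysis(n_components=3, rotation="varimax")  # rotation requires scikit-learn >= 0.24
fa.fit(responses)

loadings = fa.components_.T                         # 6 statements x 3 factors (F1, F2, F3)
communalities = (loadings ** 2).sum(axis=1)         # variance in each statement explained by the factors
share_explained = (loadings ** 2).sum(axis=0) / 6   # analogous to eigenvalue / number of statements

print(np.round(loadings, 2))
print(np.round(communalities, 2))
print(np.round(share_explained, 2))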
Factor Loadings

The top six rows of the table are associated with the six statements listed above. The table shows that the factor analysis has identified three factors (F1, F2 and F3), and the first three columns are associated with those factors. For example, the first factor can be written as

F1 = 0.86X1 + 0.84X2 + 0.68X3 + 0.10X4 + 0.06X5 + 0.12X6

The 18 numbers located in the six rows and three columns are called factor loadings, and they are one of the three useful outputs obtained from a factor analysis. The factor loading associated with a specific factor and a specific statement is simply the correlation between that factor and that statement's standardized response scores. Thus, Table 1 shows that factor 1 is highly correlated with the responses to statement 1 (0.86 correlation) and also with the responses to statement 2 (0.84 correlation), while statements 1 and 2 are not highly correlated with factor 2 (0.12 and 0.18, respectively). A factor loading is therefore a measure of how well the factor "fits" the standardized responses to a statement.

Naming the Factors

From the table, it is clear that factor F1 is a good fit on the data from statements 1, 2 and 3, but a poor fit on the other statements. This indicates that statements 1, 2 and 3 are probably measuring the same basic attitude or value system, and it is this finding that provides the researchers with evidence that the factor exists.
By using their knowledge of the industry and the contents of statements 1, 2 and 3, researchers from the motorcycle company subjectively concluded from these results that "economy of operation" was the factor that tied these statements together in the minds of the respondents.

The researchers next wanted to know whether the 300 respondents participating in the study mostly agreed or disagreed with statements 1, 2 and 3. To answer this question, the researchers had to look at the 300 standardized responses to each of statements 1, 2 and 3. They found that the means of these responses were +0.97, +1.32, and +1.18, respectively, for statements 1, 2 and 3, indicating that most respondents agreed with the three statements (as per the above discussion of "standardized scores"). Since a majority of respondents had agreed with these statements, the researchers also concluded that the factor of "economy of operation" was important in the minds of potential motorcycle customers.

The table also shows that F2 is a good fit on statements 4 and 5 but a poor fit on the other statements; this factor is clearly measuring something different from statements 1, 2, 3 and 6. Factor F3 is a good fit only on statement 6, and so it is clearly measuring something not being measured by statements 1 to 5. The researchers again subjectively concluded that the factor underlying statements 4 and 5 was "comfort" and that statement 6 was related to "safety".

Fit between Data and Factor

The researcher has to find how well all of the identified factors fit the data obtained from all of the respondents on any given statement. Communalities for each statement indicate the proportion of the variance in the responses to the statement which is explained by the three identified factors. For example, the three factors explain 0.89 (89%) of the variance in all of the responses to statement 5, but only 0.54 (54%) of the variance in all of the responses to statement 3. The table shows that the three factors explain 75% or more of the variance associated with statements 1, 2, 4, 5 and 6, but only about half of statement 3's variance. Researchers can use these communalities to make a judgment about fit; for most of the variance associated with each of the six statements in this example, the three factors fit the data quite well.

How well any given factor fits the data from all of the respondents on all of the statements can be determined from its eigenvalue. There is an eigenvalue associated with each of the factors. When a factor's eigenvalue is divided by the number of statements used in the factor analysis, the resulting figure is the proportion of the variance in the entire set of standardized response scores which is explained by that factor; here each eigenvalue is divided by 6, the number of statements. For example, factor F1 explains 0.3226 (32%) of the variance of the standardized response scores from all of the respondents on all six statements. By adding these figures for the three factors, the three factors together explain
0.3226 + 0.3090 + 0.1391 = 0.7707 (77.07%) of the variance in the entire set of response data. This figure can be used as a measure of how well, overall, the identified factors fit the data. In general, a factor analysis that accounts for 60-70% or more of the total variance can be considered a good fit to the data.

Limitations

The utility of this technique depends to a large extent on the judgment of the researcher, who has to make a number of decisions that affect how the factor analysis will come out. Even with a given set of decisions, different results will emerge from different groups of respondents, different mixes of data, and different ways of getting the data. In other words, factor analysis is unable to give a unique solution or result. As with any other method of analysis, a factor analysis will be of little use if the appropriate variables have not been measured, if the measurements are inaccurate, or if the relationships in the data are non-linear.

In view of the foregoing limitations, the exploratory nature of factor analysis becomes clear. As Thurstone notes, factor analysis should not be used where fundamental and fruitful concepts are already well formulated and tested. It may be used especially in those domains where basic and fruitful concepts are essentially lacking and where crucial experiments have been difficult to conceive.

SUMMARY

This chapter has given an overview of factor analysis in detail. Factor analysis is used to find latent variables, or factors, among observed variables. With factor analysis you can reduce a large number of variables to a small number of factors, and the derived factors can also be used in further analysis.

KEY WORDS
· · · · ·
Factor analysis Communalities Factor loading Correlation matrix Eigen value
IMPORTANT QUESTIONS 1. What are the applications of factor analysis? 2. What is the significance of factor loading in factor analysis? 3. What do you mean by Eigen value?
- End of Chapter LESSON – 22 CONJOINT ANALYSIS
STRUCTURE · · · ·
Conjoint analysis Basics of conjoint analysis Steps involved in conjoint analysis Application of conjoint analysis
Conjoint analysis, also called multi attribute compositional models, is a statistical technique that originated in mathematical psychology and was developed by marketing professor Paul Green at the Wharton School of the University of Pennsylvania. Today it is used in many of the social sciences and applied sciences including marketing, product management, and operations research. The objective of conjoint analysis is to determine what combination of a limited number of attributes is most preferred by respondents. It is used frequently in testing customer acceptance of new product designs and assessing the appeal of advertisements. It has been used in product positioning, but there are some problems with this application of the technique. Recently, new alternatives such as Genetic Algorithms have been used in market research. The Basics of Conjoint Analysis The basics of conjoint analysis are easy to understand. It should only take about 20 minutes to introduce this topic so you can appreciate what conjoint analysis has to offer. In order to understand conjoint analysis, let's look at a simple example. Suppose you wanted to book an airline flight and you had a choice of spending Rs.400 or Rs.700 for a ticket. If this were the only consideration then the choice is clear: the lower priced ticket is preferable. What if the only consideration in booking a flight was sitting in a regular or extra-wide seat? If seat size was the only consideration then you would probably prefer an extra-wide seat. Finally, suppose you can take either a direct flight which takes three hours or a flight that stops once and takes five hours. Virtually everyone would prefer the direct flight. Conjoint analysis attempts to determine the relative importance consumers attach to salient attributes and the utilities they attach to the levels of attributes. This information is derived from consumers' evaluations of brands, or brand profiles composed of these attributes and their levels. The respondents are presented with stimuli that consist of combinations of attribute levels. They are asked to evaluate these stimuli in terms of
their desirability. Conjoint procedures attempt to assign values to the levels of each attribute, so that the resulting values or utilities attached to the stimuli match, as closely as possible the input evaluations provided by the respondents. The underlying assumption is that any set of stimuli, such as products, brands, or stores, is evaluated as a bundle of attributes. Conjoint Analysis is a technique that attempts to determine the relative importance consumers attach to salient attributes and the utilities they attach to the levels of attributes. In a real purchase situation, however, consumers do not make choices based on a single attribute like comfort. Consumers examine a range of features or attributes and then make judgments or trade-offs to determine their final purchase choice. Conjoint analysis examines these trade-offs to determine the combination of attributes that will be most satisfying to the consumer, in other words, by using conjoint analysis a company can determine the optimal features for their product or service. In addition, conjoint analysis will identify the best advertising message by identifying the features that are most important in product choice. Like multidimensional scaling, conjoint analysis relies on respondents' subjective evaluations. However, in MDS the stimuli are products or brands. In conjoint analysis, the stimuli are combinations of attribute levels determined by the researcher. The goal in MDS is to develop a spatial map depicting the stimuli in a multidimensional perceptual or preference space. Conjoint analysis, on the other hand, seeks to develop the part-worth or utility functions describing the utility consumers attach to the levels of each attribute. The two techniques are complementary. In sum, the value of conjoint analysis is that it predicts what products or services people will choose and assesses the weight people give to various factors that underlie their decisions. As such, it is one of the most powerful, versatile and strategically important research techniques available. Statistics and Terms Associated with Conjoint Analysis The important statistics and terms associated with conjoint analysis include: · · · · · ·
· Part-worth functions. Also called utility functions, these describe the utility consumers attach to the levels of each attribute.
· Relative importance weights. Indicate which attributes are important in influencing consumer choice. These weights are estimated.
· Attribute levels. Denote the values assumed by the attributes.
· Full profiles. Full or complete profiles of brands are constructed in terms of all the attributes by using the attribute levels specified by the design.
· Pair-wise tables. The respondents evaluate two attributes at a time until all the required pairs of attributes have been evaluated.
· Cyclical designs. Designs employed to reduce the number of paired comparisons.
· Fractional factorial designs. Designs employed to reduce the number of stimulus profiles to be evaluated in the full-profile approach.
· Orthogonal arrays. A special class of fractional designs that enable the efficient estimation of all main effects.
· Internal validity. This involves correlations of the predicted evaluations for the holdout or validation stimuli with those obtained from the respondents.
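To make the idea of full profiles concrete, here is a minimal sketch (in Python, chosen here purely for illustration) that enumerates every full profile for the three two-level flight attributes used as a running example in this chapter; a fractional factorial or orthogonal design would retain only a carefully chosen subset of these rows.

from itertools import product

# Attribute levels for the airline example used in this chapter
attributes = {
    "price": ["Rs.400", "Rs.700"],
    "seat": ["regular", "extra-wide"],
    "duration": ["3 hours", "5 hours"],
}

# Full profiles: every combination of attribute levels (2 x 2 x 2 = 8 stimuli)
full_profiles = [dict(zip(attributes, combo)) for combo in product(*attributes.values())]

for number, profile in enumerate(full_profiles, start=1):
    print(number, profile)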
Conducting Conjoint Analysis The following chart lists the steps in conjoint analysis. Formulating the problem involves identifying the salient attributes and their levels. These attributes and levels are used for constructing the stimuli to be used in a conjoint evaluation task. Formulate the Problem ↓ Construct the Stimuli ↓ Decide on the Form of Input Data ↓ Select a Conjoint Analysis Procedure ↓ Interpret the Results ↓ Assess Reliability and Validity
Steps involved in conjoint analysis
The basic steps are:
· Select features to be tested
· Show product feature combinations to potential customers
· Respondents rank, rate, or choose between the combinations
· Input the data from a representative sample of potential customers into a statistical software program and choose the conjoint analysis procedure. The software will produce utility functions for each of the features (a small estimation sketch follows this list).
· Incorporate the most preferred features into a new product or advertisement.
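As an illustration of the estimation step, the sketch below fits part-worth utilities by dummy-coding the eight flight profiles and running an ordinary least-squares fit in Python. The ratings, and the use of a plain regression rather than any particular commercial conjoint procedure, are assumptions made purely for this example.

import numpy as np

# Columns: intercept, price = Rs.400 (vs Rs.700), seat = extra-wide (vs regular),
# duration = 3 hours (vs 5 hours) -- one dummy variable per two-level attribute
X = np.array([
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 0],
])

# Hypothetical ratings (1-10) given by one respondent to the eight profiles
y = np.array([9, 7, 8, 6, 5, 3, 4, 1])

# Least-squares estimates: intercept followed by the part-worth contrast of each attribute
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
names = ["intercept", "price_Rs400", "seat_extra_wide", "duration_3h"]
print({name: round(float(c), 2) for name, c in zip(names, coefs)})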
Any number of algorithms may be used to estimate utility functions. The original methods were monotonic analysis of variance or linear programming techniques, but these are largely obsolete in contemporary marketing research practice. Far more popular are the Hierarchical Bayesian procedures that operate on choice data. These utility functions indicate the perceived value of the feature and how sensitive consumer perceptions and preferences are to changes in product features. A Practical Example of Conjoint Analysis Conjoint analysis presents choice alternatives between products/services defined by sets of attributes. This is illustrated by the following choice: would you prefer a flight with regular seats, that costs Rs.400 and takes 5 hours, or a flight which costs Rs.700, has extra-wide seats and takes 3 hours? Extending this, we see that if seat comfort, price and duration are the only relevant attributes, there are potentially eight flight choices.
Given the above alternatives, product 4 is very likely the most preferred choice, while product 5 is probably the least preferred product. The preference for the other choices is determined by what is important to that individual. Conjoint analysis can be used to determine the relative importance of each attribute, attribute level, and combinations of attributes. If the most preferable product is not feasible for some reason (perhaps the airline simply cannot provide extra-wide seats and a 3 hour arrival time at a price of Rs400) then the conjoint analysis will identify the next most preferred alternative. If you have other information on travelers, such as background demographics, you might be able to identify market segments for which
distinct products may be appealing. For example, the business traveller and the vacation traveller may have very different preferences which could be met by distinct flight offerings. You can now see the value of conjoint analysis. Conjoint analysis allows the researcher to examine the trade-offs that people make in purchasing a product. This allows the researcher to design products/services that will be most appealing to a specific market. In addition, because conjoint analysis identifies important attributes, it can be used to create advertising messages that will be most persuasive. In evaluating products, consumers will always make trade-offs. A traveller may like the comfort and arrival time of a particular flight, but reject purchase due to the cost. In this case, cost has a high utility value. Utility can be defined as a number which represents the value that consumers place on an attribute. In other words, it represents the relative "worth" of the attribute. A low utility indicates less value; a high utility indicates more value. The following figure presents a list of hypothetical utilities for an individual consumer:
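[Hypothetical utilities for an individual consumer, as discussed below; the split of the two comfort utilities between the extra-wide and regular seat is indicative]

Attribute        Level           Utility
Duration         3 hours         42
Duration         5 hours         22
Seat comfort     Extra-wide      15
Seat comfort     Regular         12
Price            Rs.400          61
Price            Rs.700           5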
Based on these utilities, we can make the following conclusions:
· This consumer places a greater value on a 3 hour flight (the utility is 42) than on a 5 hour flight (utility is 22).
· This consumer does not differ much in the value that he or she places on comfort. That is, the utilities are quite close (12 vs. 15).
· This consumer places a much higher value on a price of Rs.400 than on a price of Rs.700.
The preceding example depicts an individual's utilities. Average utilities can be calculated for all consumers or for specific subgroups of consumers.
These utilities also tell us the extent to which each of these attributes drives the decision to choose a particular flight. The importance of an attribute can be calculated by examining the range of utilities (that is, the difference between the lowest and highest utilities) across all levels of the attribute. That range represents the maximum impact that the attribute can contribute to a product. Using the hypothetical utilities presented earlier, we can calculate the relative importance of each of the three attributes. The range for each attribute is given below:
· Duration: Range = 20 (42 - 22)
· Comfort: Range = 3 (15 - 12)
· Cost: Range = 56 (61 - 5)
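The arithmetic behind these ranges can be sketched in a few lines of Python, using the hypothetical utilities above; the printed importance weights are simply each range expressed as a percentage of the total range.

# Part-worth utilities from the hypothetical example above
part_worths = {
    "duration": {"3 hours": 42, "5 hours": 22},
    "comfort":  {"extra-wide": 15, "regular": 12},
    "cost":     {"Rs.400": 61, "Rs.700": 5},
}

# Importance of an attribute = range of its utilities (max - min)
ranges = {attr: max(levels.values()) - min(levels.values())
          for attr, levels in part_worths.items()}

total = sum(ranges.values())
importance = {attr: round(100 * r / total, 1) for attr, r in ranges.items()}

print(ranges)       # duration 20, comfort 3, cost 56
print(importance)   # cost ~70.9%, duration ~25.3%, comfort ~3.8%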
These ranges tell us the relative importance of each attribute. Cost is the most important factor in product purchase as it has the highest range of utility values. Cost is followed in importance by the duration of the flight. Based on the range and value of the utilities, we can see that seat comfort is relatively unimportant to this consumer. Therefore, advertising which emphasizes seat comfort would be ineffective. This person will make his or her purchase choice based mainly on cost and then on the duration of the flight. Marketers can use the information from utility values to design products and/or services which come closest to satisfying important consumer segments. Conjoint analysis will identify the relative contributions of each feature to the choice process. This technique, therefore, can be used to identify market opportunities by exploring the potential of product feature combinations that are not currently available. Choice Simulations In addition to providing information on the importance of product features, conjoint analysis provides the opportunity to conduct computer choice simulations. Choice simulations reveal consumer preference for specific products defined by the researcher. In this case, simulations will identify successful and unsuccessful flight packages before they are introduced to the market! For example, let's say that the researcher defined three flights as follows:
The conjoint simulation will indicate the percentage of consumers that prefer each of the three flights. The simulation might show that consumers are willing to travel longer if they can pay less and are provided a meal. Simulations allow the researcher to estimate preference, sales and share for new flights before they come to market.
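A choice simulation of this kind can be sketched in a few lines of Python. The flights and respondent part-worths below are hypothetical, and the simulator uses a simple "first-choice" (maximum-utility) rule; commercial simulators often offer more elaborate share-of-preference rules as well.

# Hypothetical part-worths for three respondents; only the levels listed carry
# a part-worth here, and the complementary level of each attribute is scored 0
respondents = [
    {"3 hours": 42, "extra-wide": 15, "Rs.400": 61},
    {"3 hours": 10, "extra-wide": 30, "Rs.400": 20},
    {"3 hours": 25, "extra-wide": 5,  "Rs.400": 55},
]

# Hypothetical flights defined by the researcher
flights = {
    "Flight A": ["3 hours", "extra-wide"],   # Rs.700 fare implied
    "Flight B": ["3 hours", "Rs.400"],       # regular seat implied
    "Flight C": ["extra-wide", "Rs.400"],    # 5-hour flight implied
}

def utility(person, features):
    # Total utility of a flight = sum of the part-worths of its features
    return sum(person.get(feature, 0) for feature in features)

choices = {name: 0 for name in flights}
for person in respondents:
    best = max(flights, key=lambda name: utility(person, flights[name]))
    choices[best] += 1

shares = {name: round(100 * count / len(respondents), 1) for name, count in choices.items()}
print(shares)   # percentage of respondents preferring each flight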
Simulations can be done interactively on a microcomputer to quickly and easily look at all possible options. The researcher may, for example, want to determine if a price change of Rs.50, Rs.100, or Rs.150 will influence consumer's choice. Also, conjoint will let the researcher look at interactions among attributes. For example, consumers may be willing to pay Rs.50 more for a flight on the condition that they are provided with a hot meal rather than a snack. Data Collection Respondents are shown a set of products, prototypes, mock-ups or pictures. Each example is similar enough that consumers will see them as close substitutes, but dissimilar enough that respondents can clearly determine a preference. Each example is composed of a unique combination of product features. The data may consist of individual ratings, rank-orders, or preferences among alternative combinations. The latter is referred to as "choice based conjoint" or "discrete choice analysis". In order to conduct a conjoint analysis, information must be collected from a sample of consumers. This data can be conveniently collected in locations such as shopping centers or by the Internet. In the previous example, data collection could take place at a booth located in an airport or in the office of a travel agent. A sample size of 400 is generally sufficient to provide reliable data for consumer products or services. Data collection involves showing respondents a series of cards that contain a written description of the product or service. If a consumer product is being tested then a picture of the product can be included along with a written description. A typical card examining the business traveller might look like the following:
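[Illustrative card; the attribute levels shown are drawn from the airline example in this chapter]
FLIGHT DESCRIPTION
Price: Rs.400
Seating: Extra-wide seat
Duration: 3 hours, non-stop
How likely would you be to choose this flight?
(Circle one: 1 = definitely would not choose ... 10 = definitely would choose)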
Readers might be worried at this point about the total number of cards that need to be rated by a single respondent. Fortunately, statistical manipulations can be used to cut down on the number of cards: in a typical conjoint study, respondents only need to rate between 10 and 20 cards. This data would be input to the conjoint analysis. Utilities can then be calculated and simulations can be performed to identify which products will be successful and which should be changed. Price simulations can also be conducted to determine the sensitivity of the consumer to changes in prices.
A wide variety of companies and service organizations have successfully used conjoint analysis. For example, a natural gas utility used conjoint analysis to evaluate which advertising message would be most effective in convincing consumers to switch from other energy sources to natural gas. Previous research had failed to discover customers' specific priorities - it appeared that the trade-offs people made were quite subtle. A conjoint analysis was therefore developed using a number of attributes such as saving on energy bills, efficiency rating of equipment, safety record of energy source, and dependability of energy source. The conjoint analysis identified that cost savings and efficiency were the main reasons for converting appliances to gas; the third most important reason was cleanliness of the energy source. This information was used in marketing campaigns in order to have the greatest effect.
Advantages
· Able to use physical objects
· Measures preferences at the individual level
· Estimates psychological trade-offs that consumers make when evaluating several attributes together
Disadvantages
· Only a limited set of features can be used because the number of combinations increases very quickly as more features are added.
· The information gathering stage is complex.
· Difficult to use for product positioning research because there is no procedure for converting perceptions about actual features to perceptions about a reduced set of underlying features.
· Respondents are unable to articulate attitudes toward new categories.
Applications of conjoint analysis
Conjoint analysis has been used in marketing for a variety of purposes, including:
· Determining the relative importance of attributes in the consumer choice process. A standard output from conjoint analysis consists of derived relative importance weights for all the attributes used to construct the stimuli used in the evaluation task. The relative importance weights indicate which attributes are important in influencing consumer choice.
· Estimating the market share of brands that differ in attribute levels. The utilities derived from conjoint analysis can be used as input into a choice simulator to determine the share of choices, and hence the market share, of different brands.
· Determining the composition of the most-preferred brand. The brand features can be varied in terms of attribute levels and the corresponding utilities determined. The brand features that yield the highest utility indicate the composition of the most-preferred brand.
· Segmenting the market based on similarity of preferences for attribute levels. The part-worth functions derived for the attributes may be used as a basis for clustering respondents to arrive at homogeneous preference segments (see the sketch following this list).
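As a sketch of the segmentation application, respondents can be clustered on their individual part-worths. The example below assumes scikit-learn is available and uses a small hypothetical part-worth matrix.

import numpy as np
from sklearn.cluster import KMeans

# Rows = respondents, columns = part-worths for (3-hour flight, extra-wide seat, Rs.400 fare)
part_worths = np.array([
    [42, 15, 61],
    [40, 12, 58],
    [10, 35, 20],
    [12, 30, 18],
    [25,  5, 55],
    [22,  8, 50],
])

# Group respondents into two benefit segments based on similar preference structures
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(part_worths)
print(kmeans.labels_)           # segment membership for each respondent
print(kmeans.cluster_centers_)  # average part-worths within each segment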
Applications of conjoint analysis have been made in consumer goods, industrial goods, financial and other services. Moreover, these applications have spanned all areas of marketing. A recent survey of conjoint analysis reported applications in the areas of new product/concept identification, competitive analysis, pricing, market segmentation, advertising, and distribution.
SUMMARY
This chapter has given an overview of conjoint analysis. Conjoint analysis, also called the multi-attribute compositional model, attempts to determine the relative importance consumers attach to salient attributes and the utilities they attach to attribute levels. Today it is used in many of the social sciences and applied sciences including marketing, product management, and operations research.
KEY TERMS
· Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy
· Residuals
· Conjoint analysis
IMPORTANT QUESTIONS
1. What are the applications of conjoint analysis?
2. Explain the procedure of performing conjoint analysis with one practical example.
REFERENCE BOOKS
1. Ferber, Robert, Marketing Research, New York: McGraw Hill, Inc., 1948.
2. Child, Dennis, The Essentials of Factor Analysis, New York, 1973.
3. Cooley, William W., and Lohnes, Paul R., Multivariate Data Analysis, New York: John Wiley and Sons, 1971.
- End of Chapter -
LESSON – 23 APPLICATION OF RESEARCH TOOLS
OBJECTIVES
· To identify the areas of management in which research tools can be used
· To understand the application of various research tools in the following domains of management:
o Marketing Management
o Operations Management
o Human Resources Management
STRUCTURE
· Application of marketing research
· Concept of market potential
· Techniques of perceptual mapping
· Limitations of marketing research
· Methods of forecasting
· Statistical methods
INTRODUCTION
Research methodology is becoming a necessary tool in all functional areas of management, such as Marketing, Human Resources and Operations Management. There is an increasing realisation of the importance of research methodology in various quarters, which is reflected in its growing use across the domains of management. A brief description of the typical applications of research methodology is given below.
APPLICATIONS OF MARKETING RESEARCH
Applications of marketing research can be divided into two broad areas:
· Strategic
· Tactical
Among the strategic areas, marketing research applications would be demand forecasting, sales forecasting, segmentation studies, identification of target markets for a given product, and positioning strategies identification. In the second area of tactical applications, we would have applications such as product testing, pricing research, advertising research, promotional research, distribution and logistics related research. In other words, it would include research related to all the 'P's of marketing: how much to price the product, how to distribute it, whether to package it in one way or another, what time to offer a service, consumer satisfaction with respect to the different elements of the marketing mix (product, price, promotion, distribution), and so on. In general, we would find more tactical applications than strategic applications because these areas can be fine-tuned more easily, based on the marketing research findings. Obviously, strategic changes are likely to be fewer than tactical changes. Therefore, the need for information would be in proportion to the frequency of changes. The following list is a snapshot of the kind of studies that have actually been done in India: · · · · · · · · · ·
· A study of consumer buying habits for detergents - frequency, pack size, effect of promotions, brand loyalty and so forth
· To find out the potential demand for ready-to-eat chapattis in Mumbai city
· To determine which of the three proposed ingredients – tulsi, coconut oil, or neem – the consumer would like to have in a toilet soap
· To find out what factors would affect the sales of Flue Gas Desulphurization equipment (industrial pollution control equipment)
· To find out the effectiveness of the advertising campaign for a car brand
· To determine brand awareness and brand loyalty for a branded PC (Personal Computer)
· To determine the appropriate product mix, price level, and target market for a new restaurant
· To find the customer satisfaction level among consumers of an Internet service provider
· To determine the factors which influenced consumers in choosing a brand of cellular phone handset
· To find out the TV viewing preferences of the target audience in specific time slots in early and late evenings
As the list shows, marketing research tackles a wide variety of subjects. The list is only indicative, and the applications of marketing research in reality can be useful for almost
any major decision related to marketing. The next sections discuss some typical application areas. Concept Research During a new product launch, there would be several stages-for example, concept development, concept testing, prototype development and testing, test marketing in a designated city or region, estimation of total market size based on the test marketing, and then a national rollout or withdrawal of the product based on the results. The first stage is the development of a concept and its testing. The concept for a new product may come from several sources-the idea may be from a brain storming session consisting of company employees, a focus group conducted among consumers, or the brainwave of a top executive. Whatever may be its source, it is generally researched further through what is termed as concept testing, before it goes into prototype or product development stages. A concept test takes the form of developing a description of the product, its benefits, how to use it, and so on, in about a paragraph, and then asking potential consumers to rate how much they like the concept, how much they would be willing to pay for the product if introduced, and similar questions. As an example, the concept statement for a fabric softener may read as follows. This fabric softener cum whitener is to be added to the wash cycle in a machine or to the bucket of detergent in which clothes are soaked. Only a few drops of this liquid will be needed per wash to whiten white clothes and also soften them by eliminating static charge. It will be particularly useful for woolens, undergarments and baby’s or children’s clothes. It will have a fresh fragrance, and will be sold in handy 200 ml, bottles to last about a month. It can also replace all existing blues with the added benefit of a softener. This statement can be used to survey existing customers of 'blues' and whiteners, and we could ask customers for their reactions on pack size, pricing, colour of the liquid, ease of use, and whether or not they would buy such a product. More complex concept tests can be done using Conjoint Analysis where specific levels of price or product/service features to be offered are pre-determined and reactions of consumers are in the form of ratings given to each product concept combining various features. This is then used to make predictions about which product concepts would provide the highest utility to the consumer, and to estimate market shares of each concept. The technique of Conjoint Analysis is discussed with an example in Part II of the book. Product Research Apart from product concepts, research helps to identify which alternative packaging is most preferred, or what drives a consumer to buy a brand or product category itself, and specifics of satisfaction or dissatisfaction with elements of a product. These days, service elements are as important as product features, because competition is bringing most products on par with each other.
An example of product research would be to find out the reactions of consumers to manual cameras versus automatic cameras. In addition to specific likes or dislikes for each product category, brand preferences within the category could form a part of the research. The objectives may be to find out what type of camera to launch and how strong the brand salience for the sponsor's brand is. Another example of product research could be to find out from existing users of photocopiers (both commercial and corporate), whether after-sales service is satisfactory, whether spare parts are reasonably priced and easily available, and any other service improvement ideas - for instance, service contracts, leasing options or buybacks and trade-ins. The scope of product research is immense, and includes products or brands at various stages of the product life cycle-introduction, growth, maturity, and decline. One particularly interesting category of research is into the subject of brand positioning. The most commonly used technique for brand positioning studies (though not the only one) is called Multidimensional Scaling. This is covered in more detail with an example and case studies in Part II as a separate chapter. Pricing Research Pricing is an important part of the marketing plan. In the late nineties in India, some interesting changes have been tried by marketers of various goods and services. Newer varieties of discounting practices including buy-backs, exchange -offers, and straight discounts have been offered by many consumer durable manufacturers-notably AKAI and AIWA brands of TVs. Most FMCG (fast moving consumer goods) manufacturers/marketers of toothpaste, toothbrush, toilet soap, talcum powder have offered a variety of price-offs or premium-based offers which affect the effective consumer price of a product. Pricing research can delve into questions such as appropriate pricing levels from the customers' point of view, or the dealer's point of view. It could try to find out how the current price of a product is perceived, whether it is a barrier for purchase, how a brand is perceived with respect to its price and relative to other brands' prices (price positioning). Here, it is worth remembering that price has a functional role as well as a psychological role. For instance, high price may be an indicator of high quality or high esteem value for certain customer segments. Therefore, questions regarding price may need careful framing and careful interpretation during the analysis. Associating price with value is a delicate task, which may require indirect methods of research at times. A bland question such as - "Do you think the price of Brand A of refrigerators is appropriate?" may or may not elicit true responses from customers. It is also not easy for a customer to articulate the price he would be willing to pay for convenience of use, easy product availability, good after-sales service, and other elements of the marketing mix. It may require experience of several pricing-related studies before one begins to appreciate the nuances of consumer behaviour related to price as a functional and psychological measure of the value of a product offering.
An interesting area of research into pricing has been determining price elasticity at various price points for a given brand through experiments or simulations. Price framing, or what the consumer compares (frames) price against, is another area of research. For example, one consumer may compare the price of a car against an expensive two-wheeler (his frame of reference), whereas another may compare it with an investment in the stock market or real estate. Another example might be the interest earned from a fixed deposit, which serves as a benchmark for one person before he decides to invest in a mutual fund, whereas for another, the investment may be a substitute for buying gold, which earns no interest. In many cases, therefore, it is the frame of reference used by the customer which determines 'value' for him of a given product. There are tangible as well as intangible (and sometimes not discernible) aspects to a consumer's evaluation of price. Some of the case studies at the end of Part I include pricing or price-related issues as part of the case. Distribution Research Traditionally, most marketing research focuses on consumers or buyers. Sometimes this extends to potential buyers or those who were buyers but have switched to other brands. But right now, there is a renewed interest in the entire area of logistics, supply chain, and customer service at dealer locations. There is also increasing standardisation from the point of view of brand building, in displays at the retail level, promotions done at the distribution points. Distribution research focuses on various issues related to the distribution of products including service levels provided by current channels, frequency of sales persons’ visits to distribution points, routing/transport related issues for deliveries to and from distribution points throughout the channel, testing of new channels, channel displays, linkages between displays and sales performance, and so on. As an example, a biscuit manufacturer wanted to know how it could increase sales of a particular brand of biscuits in cinema theatres. Should it use existing concessionaires selling assorted goods in theatres, or work out some exclusive arrangements? Similarly, a soft drink manufacturer may want to know where to set up vending machines. Potential sites could include roadside stalls, shopping malls, educational institutions, and cinema theatres. Research would help identify factors that would make a particular location a success. In many service businesses where a customer has to visit the location, it becomes very important to research the location itself. For example, a big hotel or a specialty restaurant may want to know where to locate themselves for better visibility and occupancy rates. Distribution research helps answer many of these questions and thereby make better marketing decisions. Advertising Research The two major categories of research in advertising are: 1. Copy 2. Media
Copy Testing This is a broad term that includes research into all aspects of advertising-brand awareness, brand recall, copy recall (at various time periods such as day after recall, week after recall), recall of different parts of the advertisement such as the headline for print ads, slogan or jingle for TV ads, the star in an endorsement and so on. Other applications include testing alternative ad copies (copy is the name given to text or words used in the advertisement, and the person in the advertising agency responsible for writing the words is known as the copy writer) :or a single ad, alternative layouts (a layout is the way all the elements of, the advertisement are laid out in a print advertisement) with the same copy, testing of concepts or storyboards (a storyboard is a scene-by-scene drawing of a TV commercial which is like a rough version before the ad is actually shot on film) of TV commercials to test for positive/negative reactions, and many others. Some of these applications appear in our discussion of Analysis of Variance (ANOVA) in Part II and some case studies elsewhere in the book. A particular class of advertising research is known as Tracking Studies. When an advertising campaign is running, periodic sample surveys known as tracking studies can be conducted to evaluate the effect of the campaign over a long period of time such as six months or one year, or even longer. This may allow marketers to alter the advertising theme, content, media selection or frequency of airing / releasing advertisements and evaluate the effects. As opposed to a snapshot provided by a one-time survey, tracking studies may provide a continuous or near-continuous monitoring mechanism. But here, one should be careful in assessing the impact of the advertising on sales, because other factors could change along with time. For example, the marketing programmes of the sponsor and the competitors could vary over time. The impact on sales could be due to the combined effect of several factors. Media Research The major activity under this category is research into viewership of specific television programmes on various TV channels. There are specialised agencies like A.C. Nielsen worldwide which offer viewership data on a syndicated basis (i.e., to anyone who wants to buy the data). In India, both ORG-MARG and IMRB offer this service. They provide people meter data with brand names of TAM and INTAM which is used by advertising agencies when they draw up media plans for their clients. Research could also focus on print media and their readership. Here again, readership surveys such as the National Readership Survey (NRS) and the Indian Readership Survey (IRS) provide syndicated readership data. These surveys are now conducted almost on a continuous basis in India and are helpful to find out circulation and readership figures of major print media. ABC (Audit Bureau of Circulations) is an autonomous body which provides audited figures on the paid circulation (number of copies printed and sold) of each newspaper and magazine, which is a member of ABC. Media research can also focus on demographic details of people reached by each medium, and also attempt to correlate consumption habits of these groups with their media preferences. Advertising research is used at all stages of advertising, from
conception to release of ads and thereafter to measure advertising effectiveness based on various parameters. It is a very important area of research for brands that rely a lot on advertising. The top rated programmes in India are usually cricket matches and film based programmes. Sales Analysis by Product Sales analysis by product will enable a company to identify its strong or weak products. It is advisable to undertake an analysis on the basis of a detailed break-up of -products such as product variation by size, colour, etc. This is because if an analysis is based on a broad break-up, it may not reveal important variations. When a company finds that a particular product is doing poorly, two Options are open to it. One is, it may concentrate on that product to ensure improved sales. Or alternatively, it may gradually withdraw the product and eventually drop it altogether. However, it is advisable to decide on the latter course on the basis of additional information such as trends in the market share, contribution margin, and effect of sales volume on product profitability, etc. In case the product in question has complementarity with other items sold by the company, the decision to abandon the product must be made with care and caution. Combining sales analysis by product with that by territory will further help in providing information on which products are doing better in which areas. Sales Analysis by Customers Another way to analyse sales data is by customers. Such an analysis would normally indicate that a relatively small number of customers accounts for a large proportion of sales. To put it differently: a large percentage of customers accounts for a relatively small percentage of aggregate sales. One may compare the data with the proportion of time spent on the customers, i.e. the number of sales calls. An analysis of this type will enable the company to devote relatively more time to those customers who collectively account for proportionately larger sales. Sales analysis by customer can also be combined with analysis both by area and product. Such an analysis will prove to be more revealing. For example, it may indicate that in some areas sales are not increasing with a particular type of customer though they have grown fast in other areas. Information of this type will be extremely useful to the company as it identifies the weak spots where greater effort is called for. Sales Analysis by Size of Order Sales analysis by size of order may show that a large volume of sales is accompanied by low profit and vice versa. In case cost accounting data are available by size of order, this would help in identifying sales where the costs are relatively high and the company is incurring a loss. Sales analysis by size of order can also be combined with that by products, areas and types of customers. Such a perceptive analysis would reveal useful
information to the company and enable it to make a more rational and effective effort in maximizing its return from sales. THE CONCEPT OF MARKET POTENTIAL Market potential has been defined as "the maximum demand response possible for a given group of customers within a well-defined geographic area for a given product or service over a specified period of time under well-defined competitive and environmental conditions". We will elaborate this comprehensive definition. First, market potential is the maximum demand response under certain assumptions. It denotes a meaningful boundary condition on ultimate demand. Another condition on which the concept of market potential depends is a set of relevant consumers of the product or service. It is not merely the present consumer who is to be included but also the potential consumer as maximum possible demand is to be achieved. Market potential will vary depending on which particular group of consumers is of interest. Further, the geographic area for which market potential is to be determined should be well-defined. It should be divided into mutually exclusive subsets of consumers so that the management can assign a sales force and supervise and control the activities in different territories without much difficulty. Another relevant aspect in understanding the concept of market potential is to clearly know the product or service for which market potential is to be estimated. Especially in those cases where the product in question can be substituted by another, it is desirable to have market potential for the product class rather than that particular product. For example, tea is subjected to a high degree of cross-elasticity of demand with coffee. It is necessary to specify the time period for which market potential is to be estimated. The time period should be so chosen that it coincides with planning periods in a firm. Both short and long-time periods can be used depending on the requirements of the firm. Finally, a clear understanding of environmental and competitive conditions relevant in case of a particular product or service is necessary if market potential is to be useful. What is likely to be the external environment? What is likely to be the nature and extent of competition? These are relevant questions in the context of any estimate of market potential since these are the factors over which the firm has no control. It may be emphasised that market potential is not the same thing as sales potential and sales forecast. It is only when "a market is saturated can the industry sales forecast be considered equivalent to market potential". Such a condition is possible in case of well established and mature products. Generally, the industry sales forecast will be less than the market potential. Likewise, a company's sales forecast will be less than its sales
potential. The former is a point estimate of the future sales, while the latter represents a boundary condition which the sales might reach in an ideal situation. "In the latter sense, sales potential is to a firm what market potential is to an industry or product class: both represent maximum demand response and are boundary conditions". Brand Positioning Brand positioning is a relatively new concept in marketing. The concept owes its origin to the idea that each brand occupies a particular space in the consumer's mind, signifying his perception of the brand in question in relation to other brands. While product or brand positioning has been defined by various authors in different ways, the underlying meaning conveyed through these definitions seems to be the same. Instead of giving several definitions, we may give one here. According to Green and Tull, "Brand positioning and market segmentation appear to be the hallmarks of today's marketing research. Brand (or service) positioning deals with measuring the perceptions that buyers hold about alternative offerings". From this definition it is evident that the term 'position' reflects the essence of a brand as perceived by the target consumer in relation to other brands. In view of this the management's ability to position its product or brand appropriately in the market can be a major source of company's profits. This seems to be an important reason for the emergence of product or brand positioning as a major area in marketing research. Components of Positioning Positioning comprises four components. The first component is the product class or the structure of the market in which a company's brand will compete. The second component is consumer segmentation. One cannot think of positioning a brand without considering the segment m which it is to be offered. Positioning and segmentation are inseparable. The third component is the consumers’ perception of the company's brand in relation to those of the competitors. Perceptual mapping is the device by which the company can know this. Finally, the fourth component positioning is the benefit offered by the company's brand. A consumer can allot a position in his mind to a brand only when it is beneficial to him. The benefits may be expressed as attributes or dimensions in a chart where brands are fitted to indicate the consumer's perceptions. As perceptual maps are used to indicate brand positioning, blank spaces in such maps show that a company can position its brand in one or more of such spaces. TECHNIQUES FOR PERCEPTUAL MAPPING There are a number of techniques for measuring product positioning. Some of these which are important are: · ·
· Factor analysis
· Cluster analysis
· Multi-dimensional scaling
We will not go into the detailed mechanism of these techniques. All the same, we will briefly explain them.
Image profile analysis
This technique is the oldest and most frequently used for measuring the consumer's perceptions of competitive brands or services. Normally, a 5 or 7 point numerical scale is used. A number of functional and psychological attributes are selected, and the respondent is asked to show his perception of each brand in respect of each attribute on the 5 or 7 point scale. It will be seen that the figures provide some insight as to which brands are competing with each other and on what attribute(s). This technique has some limitations. First, if the number of brands is large, it may not be possible to plot all the brands in a single figure. Second, there is an implicit assumption in this technique that all attributes are equally important and independent of each other. This is usually not true. However, this limitation can be overcome by using the technique of factor analysis.
Factor analysis
As regards factor analysis, it may be pointed out that its main object is to reduce a large number of variables into a small number of factors or dimensions. In Chapter 17, two examples have been given to illustrate the use of factor analysis. The discussion also brings out some major limitations of the method.
Cluster analysis
Cluster analysis is used to classify consumers or objects into a small number of mutually exclusive and exhaustive groups. With the help of cluster analysis it is possible to separate brands into clusters or groups so that a brand within a cluster is similar to the other brands belonging to the same cluster and is very different from brands included in other clusters. This method has been discussed in an earlier chapter.
Multi-dimensional scaling
Multi-dimensional scaling too has been discussed in Chapter 17, pointing out how perceptual maps can be developed on the basis of responses from consumers. In this connection, two illustrations of perceptual maps were given. The first illustration related to selected Business Schools and was based on hypothetical data. On the basis of two criteria, viz. how prestigious and how quantitative an MBA course is, different Business Schools were shown in the map. It will be seen that the MBA course of Business School 'C' is extremely different from that offered by Business School 'Q'. Points which are close to each other indicate similarity of the MBA courses in the students' perception. The second illustration related to four brands of washing soap based on survey data from Calcutta. This is a non-attribute based example in which a paired comparison of four high- and medium-priced detergents - Surf, Sunlight, Gnat and Key - was undertaken. As mentioned there, Sunlight and Surf are closest and Surf and Key are farthest. In other words, the first two brands are most similar and the remaining two are most dissimilar. How the points in the figures for the four brands have been arrived at has been explained at length in that chapter and so is not repeated here.
Subroto Sengupta has discussed product positioning at length in his book. While explaining different techniques of product positioning, he has shown how the concept of positioning can be used to improve the image of the concerned product or brand. He has given a number of examples covering a wide variety of products such as coffee, soft drinks, washing soaps, toilet soaps, shampoos and magazines. As Sengupta points out, the perceptual maps of a product class also indicate holes or vacant positions in the market. These open spaces can be helpful to the management in suggesting new product opportunities as also possibilities for repositioning of old products. While it is true that the management does get clues on preferred attributes of the product in question, it is unable to know all the relevant features of the new product, such as its form, package and price. This problem can be overcome through the application of conjoint analysis. In addition, Sengupta has discussed some research studies in respect of advertising positioning. We now give a detailed version of a study indicating how a brand which was putting up a poor performance in the market was repositioned. As a result, it improved its image and contributed to increased market share and profits.
When to do Marketing Research?
Marketing research can be done when:
o There is an information gap which can be filled by doing research.
o The cost of filling the gap through marketing research is less than the cost of taking a wrong decision without doing the research.
o The time taken for the research does not delay decision-making beyond reasonable limits.
o A delay can have many undesirable effects, like competitors becoming aware of strategies or tactics being contemplated, consumer opinion changing between the beginning and end of the study, and so forth.
LIMITATIONS OF MARKETING RESEARCH It must be kept in mind that marketing research, though very useful most of the time, is not the only input for decision-making. For example, many small businesses work without doing marketing research, and some of them are quite successful. It is obviously some other model of informal perceptions about consumer behaviour, needs, and expectations that is at work in such cases. Many businessmen and managers base their work on judgment, intuition, and perceptions rather than numerical data. There is a famous example in India, where a company commissioned a marketing research company to find out if there was adequate demand for launching a new camera. This was in pre-liberalised India, of the early 1980s. The finding of the research
study was that there was no demand, and that the camera would not succeed, if launched. The company went ahead and launched it anyway, and it was a huge success. The camera was Hot Shot. It was able to tap into the need of consumers at that time for an easy-to-use camera at an affordable price. Thus marketing research is not always the best or only source of information to be used for making decisions. It works best when combined with judgment, intuition, experience, and passion. For instance, even if marketing research were to show there was demand for a certain type of product, it still depends on the design and implementation of the appropriate marketing plans to make it succeed. Further, competitors could take actions which were not foreseen when marketing research was undertaken. This also leads us to conclude that the time taken for research should be the minimum possible, if we expect the conditions to be dynamic, or fast changing. Differences in Methodology The reader may be familiar with research studies or opinion polls conducted by different agencies showing different results. One of the reasons why results differ is because the methodology followed by each agency is usually different. The sampling method used, the sample size itself, the representativeness of the population, the quality of field staff who conduct interviews, and conceptual skills in design and interpretation all differ from agency to agency. Minor differences are to be expected in sample surveys done by different people, hut major differences should be examined for the cause, which will usually lead us to the different methodologies adopted by them. Based on the credibility of the agency doing the research and the appropriateness of the methodology followed, the user decides which result to rely upon. A judgment of which methodology is more appropriate for the research on hand comes from experience of doing a variety of research. To summarise, it is important to understand the limitations of marketing research, and to use it in such a way that we minimise its limitations. Complementary Inputs for Decision-Making Along with marketing research, marketing managers may need to look into other information while making a decision. For example, our corporate policy may dictate that a premium image must be maintained in all activities of our company. On the other hand, marketing research may tell us that consumers want a value-for-money product. This creates a dilemma for the basic corporate policy, which has to be balanced with consumer perception as measured by marketing research. Other inputs for decision-making could be growth strategies for the brand or product, competitors' strategies, and regulatory moves by the government and others. Some of these are available internally-for example, corporate policy and growth plans may be documented internally. Some other inputs may come from a marketing intelligence cell if the company has one. In any case, marketing decisions would be based on many of these complementary inputs, and not on the marketing research results alone.
Secondary and Primary Research
One of the most basic differentiations is between secondary and primary research. Secondary research is any information we may use, but which has not been specifically collected for the current marketing research. This includes published sources of data, periodicals, newspaper reports, and nowadays, the Internet. It is sometimes possible to do a lot of good secondary research and get useful information. But marketing research typically requires a lot of current data which is not available from secondary sources. For example, the customer satisfaction level for a product or brand may not be reported anywhere. The effectiveness of a particular advertisement may be evident from the sales which follow, but why people liked the advertisement may not be obvious, and can only be ascertained through interviews with consumers. Also, the methodology for the secondary data already collected may be unknown, and therefore we may be unable to judge the reliability and validity of the data.
Primary research is what we will be dealing with throughout this book. It can be defined as research which involves collecting information specifically for the study on hand, from the actual sources such as consumers, dealers or other entities involved in the research. The obvious advantages of primary research are that it is timely, focused, and involves no unnecessary data collection, which could be a wasted effort. The disadvantage could be that it is expensive to collect primary data. But when an information gap exists, the cost could be more than compensated for by the better decisions that are taken with the collected data. Thus, research has become a significant element of a successful marketing system.
The following section describes the application of research in the Operations/Production Management domain.
Applications of Research Methodology in Operations Management
Methods of Estimating Current Demand
There are two types of estimates of current demand which may be helpful to a company. These are total market potential and territory potential. "Total market potential is the maximum amount of sales that might be available to all the firms in an industry during a given period under a given level of industry marketing effort and given environmental conditions". Symbolically, total market potential is Q = n x q x p, where Q = total market potential, n = number of buyers in the specific product/market under the given assumptions,
q = quantity purchased by an average buyer p = price of an average unit Of the three components n, q, and p in the above formula, the most difficult component to estimate is q. One can start with a broad concept of q, gradually reducing it. For example, if we are thinking of readymade shirts for home consumption, we may first take the total male population eliminating that in rural areas. From the total male urban population, we may eliminate the age groups which are not likely to buy readymade shirts. Thus, the number of boys below 20 may be eliminated. Further eliminations on account of low income may be made. In this way we can arrive at the 'prospect pool' of those who are likely to buy shirts. The concept of market potential is helpful to the firm as it provides a benchmark against which actual performance can be measured. In addition, it can be used as a basis for allocation decisions regarding marketing effort. The estimate of total market potential is helpful to the company when it is in a dilemma whether to introduce a new product or drop an existing one. Such an estimate will indicate whether the prospective market is large enough to justify the company's entering it. Since it is impossible for a company to have the global market exclusively to itself, it has to select those territories where it can sell its products well. This means that companies should know the territorial potentials so that they can select markets most suited to them, channelise their marketing effort optimally among these markets and also evaluate their sale performance in such markets. There are two methods for estimating territorial potentials: (i) market-buildup method, and (ii) index-of-buying-power method. In the first method, several steps are involved. First, identify all the potential buyers for the product in each market. Second, estimate potential purchases by each potential buyer. Third, sum up the individual figures in step (ii) above. However, in reality the estimation is not that simple as it is difficult to identify all potential buyers. When the product in question is an industrial product, directories of manufacturers of a particular product or group of products are used. Alternatively, the Standard Industrial Classification of Manufacturers of a particular product or group of products is used. The second method involves the use of a straight forward index. Suppose a textile manufacturing company is interested in knowing the territorial potential for its cloth in a certain territory. Symbolically, Bi = 0.5Yi + 0.2ri + 0.3Pi where Bi = percentage of total national buying power in territory i
Yi = percentage of national disposable personal income originating in territory i
ri = percentage of national retail sales in territory i
Pi = percentage of national population living in territory i
It may be noted that such estimates indicate potential for the industry as a whole rather than for an individual company. In order to arrive at a company potential, the concerned company has to make certain adjustments to the above estimate on the basis of one or more other factors that have not been covered in the estimation of territorial potential. These factors could be the company's brand share, number of salespersons, number and type of competitors, etc.
Forecasting Process
After having described the methods of estimating current demand, we now turn to forecasting. There are five steps involved in the forecasting process. These are mentioned below. First, one has to decide the objective of the forecast. The marketing researcher should know what use will be made of the forecast he is going to make. Second, the time period for which the forecast is to be made should be selected. Is the forecast short-term, medium-term or long-term? Why should a particular period of forecast be selected? Third, the method or technique of forecasting should be selected. One should be clear as to why a particular technique from amongst several techniques should be used. Fourth, the necessary data should be collected. The need for specific data will depend on the forecasting technique to be used. Finally, the forecast is to be made. This will involve the use of computational procedures.
In order to ensure that the forecast is really useful to the company, there should be a good understanding between management and research. The management should clearly spell out the purpose of the forecast and how it is going to help the company. It should also ensure that the researcher has a proper understanding of the operations of the company, its environment, and its past performance in terms of key indicators and their relevance to the future trend. If the researcher is well-informed with respect to these aspects, then he is likely to make a more realistic and more useful forecast for the management.
Methods of Forecasting
The methods of forecasting can be divided into two broad categories, viz. subjective or qualitative methods and objective or quantitative methods. These can be further divided into several methods. Each of these methods is discussed below. I. Subjective / Qualitative Methods There are four subjective methods - field sales force, jury of executives, users' expectations and delphi. These are discussed here briefly. a. Field sales force Some companies ask their salesmen to indicate the most likely sales for a specified period in the future. Usually the salesman is asked to indicate anticipated sales for each account in his territory. These forecasts are checked by district managers who forward them to the company's head office. Different territory-forecasts are then combined into a composite forecast at the head office. This method is more suitable when a short-term forecast is to be made as there would be no major changes in this short period affecting the forecast. Another advantage of this method is that it involves the entire sales force which realises its responsibility to achieve the target it has set for itself. A major limitation of this method is that sales force would not take an overall or broad perspective and hence may overlook some vital factors influencing the sales. Another limitation is that salesmen may give somewhat low figures in their forecasts thinking that it may be easier for them to achieve those targets. However, this can be offset to a certain extent by district managers who are supposed to check the forecasts. b. Jury of executives Some companies prefer to assign the task of sales forecasting to executives instead of a sales force. Given this task each executive makes his forecast for the next period. Since each has his own assessment of the environment and other relevant factors, one forecast is likely to be different from the other. In view of this it becomes necessary to have an average of these varying forecasts. Alternatively, steps should be taken to narrow down the difference in the forecasts. Sometimes this is done by organising a discussion between the executives so that they can arrive at a common forecast. In case this is not possible, the chief executive may have to decide which of these forecasts is acceptable as a representative one. This method is simple. At the same time, it is based on a number of different viewpoints as opinions of different executives are sought. One major limitation of this method is that the executives' opinions are likely to be influenced in one direction on the basis of general business conditions. c. Users' expectations Forecasts can be based on users' expectations or intentions to purchase goods and services. It is difficult to use this method when the number of users is large. Another limitation of this method is that though it indicates users' 'intentions' to buy, the actual
purchases may be far less at a subsequent period. It is most suitable when the number of buyers is small, as in the case of industrial products.

d. The Delphi method

This method too is based on experts' opinions. Here, each expert has access to the same information that is available. A feedback system generally keeps them informed of each other's forecasts, but no majority opinion is disclosed to them. However, the experts are not brought together. This is to ensure that one or more vocal experts do not dominate the others. The experts are given an opportunity to compare their own previous forecasts with those of the others and to revise them. After three or four rounds, the group of experts arrives at a final forecast. The method may involve a large number of experts, and this may delay the forecast considerably; generally it involves a smaller number of participants, ranging from 10 to 40.

It will be seen that both the jury of executive opinion and the Delphi method are based on a group of experts. They differ in that in the former the group of experts meet, discuss the forecasts and try to arrive at a commonly agreed forecast, while in the latter the group of experts never meet. As mentioned earlier, this is to ensure that no one person dominates the discussion and thus influences the forecast. In other words, the Delphi method retains the wisdom of a group and at the same time reduces the effect of group pressure. An approach of this type is more appropriate when long-term forecasts are involved.

In the subjective methods, judgment is an important ingredient. Before attempting a forecast, the basic assumptions regarding environmental conditions as well as competitive behaviour must be provided to the people involved in forecasting. An important advantage of subjective methods is that they are easily understood. Another advantage is that the cost involved in forecasting is quite low. As against these advantages, subjective methods have certain limitations also. One major limitation is the varying perceptions of the people involved in forecasting, as a result of which wide variance is found in the forecasts. Subjective methods are suitable when forecasts are to be made for highly technical products which have a limited number of customers. Generally, such methods are used for industrial products. Also, when the cost of forecasting is to be kept to a minimum, subjective methods may be more suitable.

II. Objective / Quantitative or Statistical Methods

Based on statistical analysis, these methods enable the researcher to make forecasts on a more objective basis. It is difficult to make a wholly accurate forecast, for there is always an element of uncertainty regarding the future. Even so, statistical methods are likely to be more useful as they are more scientific and hence more objective.
a. Time Series

In time-series forecasting, the past sales data are extrapolated as a linear or a curvilinear trend. Even if such data are simply plotted on a graph, one can extrapolate for the desired time period. Extrapolation can also be made with the help of statistical techniques. It may be noted that time-series forecasting is most suitable in stable situations where the future will largely be an extension of the past. Further, the past sales data should show a trend that is clearly distinguishable from the random error component for time-series forecasting to be suitable.

Before using time-series forecasting, one has to decide how far back in the past one can go. It may be desirable to use only the more recent data, as conditions might have been different in the remote past. Another issue pertains to the weighting of time-series data. In other words, should equal weight be given to each time period or should greater weightage be given to more recent data? Finally, should the data be decomposed into different components, viz. trend, cycle, season and error?

We now discuss three methods, viz. moving averages, exponential smoothing and decomposition of time series.

b. Moving average

This method uses the last 'n' data points to compute a series of averages in such a way that each time the latest figure is included and the earliest one dropped. For example, when we have to calculate a five-month moving average, we first calculate the average of January, February, March, April and May by adding the figures of these months and dividing the sum by five. This gives one figure. In the next calculation, the figure for June is included and that for January dropped, thus giving a new average. In this way a series of averages is computed. The method is called a 'moving' average as it takes in a new data point each time and drops the earliest one.

In a short-term forecast, the random fluctuations in the data are of major concern. One method of minimizing the influence of random error is to use an average of several past data points, and this is achieved by the moving average method. It may be noted that in a 12-month moving average the effect of seasonality is removed from the forecast, as data points for every season are included before computing the moving average.

c. Exponential smoothing

A method which has been receiving increasing attention in recent years is known as exponential smoothing. It is a type of moving average that 'smoothens' the time-series. When a large number of forecasts are to be made for a number of items, exponential smoothing is particularly suitable as it combines the advantages of simplicity of computation and flexibility.
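The following is a minimal sketch of how an n-month moving average and simple exponential smoothing might be computed. The monthly sales figures and the smoothing constant alpha = 0.3 are hypothetical; the recursion S_t = alpha*X_t + (1 - alpha)*S_(t-1) is the standard simple-smoothing form, with the smoothed value for the latest period serving as the forecast for the next period.

```python
# Minimal sketch: n-month moving average and simple exponential smoothing.
# The sales figures and the smoothing constant alpha are hypothetical.
sales = [120, 132, 128, 140, 152, 149, 160, 171, 165, 178, 190, 184]

def moving_average_forecast(data, n):
    """Average of the last n observations, used as the forecast for the next period."""
    return sum(data[-n:]) / n

def exponential_smoothing(data, alpha):
    """Simple exponential smoothing: S_t = alpha*X_t + (1 - alpha)*S_(t-1)."""
    smoothed = data[0]                 # initialise with the first observation
    for x in data[1:]:
        smoothed = alpha * x + (1 - alpha) * smoothed
    return smoothed                    # serves as the forecast for the next period

print("5-month moving average forecast:", moving_average_forecast(sales, 5))
print("Exponentially smoothed forecast (alpha = 0.3):",
      round(exponential_smoothing(sales, 0.3), 1))
```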
d. Time-series decomposition

This method consists of measuring the four components of a time-series: (i) trend, (ii) cycle, (iii) season, and (iv) erratic movement.

(i) The trend component indicates long-term effects on sales that are caused by such factors as income, population, industrialisation and technology. The time period of a trend function varies considerably from product to product. However, it is usually taken as any period in excess of the time required for a business cycle (which averages 4-5 years).

(ii) The cyclical component indicates some sort of periodicity in general economic activity. When the data are plotted, they yield a curve with peaks and troughs, indicating rises and falls in the trend series with a certain periodicity. A careful study must be made of the impact of the business cycle on the sales of each product. Cyclical forecasts are likely to be more accurate for the long term than for the short term.

(iii) The seasonal component reflects changes in sales levels due to factors such as weather, festivals, holidays, etc. There is a consistent pattern of sales for periods within a year.

(iv) Finally, the erratic movements in the data arise on account of events such as strikes, lockouts, price wars, etc.

The decomposition of a time-series enables the error component to be separated from the trend, cyclical and seasonal components, which are systematic.
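As a rough illustration of the decomposition just described, the sketch below estimates an additive trend-seasonal-error split of two years of hypothetical monthly sales. The centred 12-month moving average used for the trend and the month-by-month averaging of detrended values used for the seasonal component are common textbook choices, not the only possible ones.

```python
# Minimal sketch of an additive decomposition of monthly sales into trend,
# seasonal and error components. All data values are hypothetical.
sales = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118,
         115, 126, 141, 135, 125, 149, 170, 170, 158, 133, 114, 140]

def centred_moving_average(data, window=12):
    """Centred 12-point moving average (mean of two adjacent 12-month averages)."""
    trend = [None] * len(data)
    for i in range(window // 2, len(data) - window // 2):
        first = sum(data[i - window // 2 : i + window // 2]) / window
        second = sum(data[i - window // 2 + 1 : i + window // 2 + 1]) / window
        trend[i] = (first + second) / 2
    return trend

trend = centred_moving_average(sales)

# Seasonal component: average detrended value for each calendar month.
detrended = [(s - t) if t is not None else None for s, t in zip(sales, trend)]
seasonal = []
for month in range(12):
    values = [d for i, d in enumerate(detrended) if d is not None and i % 12 == month]
    seasonal.append(sum(values) / len(values) if values else 0.0)

# Error (erratic) component = actual - trend - seasonal, where the trend is defined.
error = [s - t - seasonal[i % 12] if t is not None else None
         for i, (s, t) in enumerate(zip(sales, trend))]

print("Seasonal indices:", [round(x, 1) for x in seasonal])
```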
Causal or Explanatory Methods

Causal or explanatory methods are regarded as the most sophisticated methods of forecasting sales. These methods yield realistic forecasts provided relevant data are available on the major variables influencing changes in sales. There are three distinct advantages of causal methods. First, turning points in sales can be predicted more accurately by these methods than by time-series methods. Second, the use of these methods reduces the magnitude of the random component far more than may be possible with time-series methods. Third, the use of such methods provides greater insight into causal relationships. This facilitates marketing decision making by management; isolated sales forecasts made on the basis of time-series methods would not be helpful in this regard. Causal methods can be based on either (i) leading indicators or (ii) regression models. These are briefly discussed here.

(i) Leading indicators

Sometimes one finds that changes in the sales of a particular product or service are preceded by changes in one or more leading indicators. In such cases, it is necessary to identify the leading indicators and to closely observe changes in them. One example of a leading indicator is the demand for various household appliances, which follows the construction of new houses. Likewise, the demand for many durables is preceded by an increase in disposable income. Yet another example is the number of births: the demand for baby food and other goods needed by infants can be ascertained from the number of births in a territory. It may be possible to include leading indicators in regression models.

(ii) Regression models

Linear regression analysis is perhaps the most frequently used and the most powerful method among causal methods. As we have discussed regression analysis in detail in the preceding chapters on Bivariate Analysis and Multivariate Analysis, we shall only dwell on a few relevant points.

First, regression models indicate linear relationships within the range of observations and at the times when they were made. For example, if a regression analysis of sales is attempted on the basis of independent variables of population sizes of 15 million to 30 million and per capita income of Rs 1,000 to Rs 2,500, the regression model shows the relationships that existed between these extremes of the two independent variables. If the sales forecast is to be made on the basis of values of the independent variables falling outside these ranges, then the relationships expressed by the regression model may not hold good.

Second, sometimes there may be a lagged relationship between the dependent and independent variables. In such cases, the values of the dependent variable are to be related to those of the independent variables for the preceding month or year, as the case may be. The search for factors with a lead-lag relationship to the sales of a particular product is rather difficult. One should try out several indicators before selecting the one which is most satisfactory.

Third, it may happen that the data required to establish the ideal relationship do not exist, are inaccessible or, if available, are not useful. Therefore, the researcher has to be careful in using the data. He should be quite familiar with the varied sources and types of data that can be used in forecasting, and should also know about their strengths and limitations.

Finally, a regression model reflects the association among variables; the causal interpretation is made by the researcher on the basis of his understanding of that association. As such, he should be extremely careful in choosing the variables so that a real causative relationship can be established among the variables chosen.
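A minimal regression-forecasting sketch is given below. It fits sales to population and per capita income by ordinary least squares using NumPy; all figures are hypothetical, and, as cautioned above, the fitted model should only be used for values of the independent variables lying within the observed ranges.

```python
# Minimal sketch of a two-variable regression forecast using ordinary least
# squares. All figures are hypothetical.
import numpy as np

population = np.array([15, 18, 20, 22, 25, 28, 30], dtype=float)          # millions
income     = np.array([1000, 1200, 1400, 1700, 1900, 2200, 2500], dtype=float)  # per capita, Rs
sales      = np.array([52, 60, 66, 75, 81, 92, 100], dtype=float)         # thousand units

X = np.column_stack([np.ones_like(population), population, income])
coef, *_ = np.linalg.lstsq(X, sales, rcond=None)    # [intercept, b_population, b_income]

def forecast(pop, inc):
    """Predicted sales (thousand units) for a given population and income."""
    return coef[0] + coef[1] * pop + coef[2] * inc

print("Forecast for population 26m, income Rs 2,000:", round(forecast(26, 2000), 1))
```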
Input-Output Analysis

Another method that is widely used for forecasting is input-output analysis. Here, the researcher takes into consideration a large number of factors which affect the outputs he is trying to forecast. For this purpose, an input-output table is prepared, where the inputs are shown horizontally as the column headings and the outputs vertically as the stubs. It may be mentioned that by themselves input-output flows are of little direct use to the researcher. It is the application of an assumption as to how the output of an industry is related to its use of various inputs that makes input-output analysis a good method of forecasting. The assumption states that as the level of an industry's output changes, the use of inputs will change proportionately, implying that there is no substitution in production among the various inputs. This may or may not hold good.

The use of input-output analysis in sales forecasting is appropriate for products sold to governmental, institutional and industrial markets, as they have distinct patterns of usage. It is seldom used for consumer products and services. It is most appropriate when the levels and kinds of inputs required to achieve certain levels of outputs need to be known. A major constraint in the use of this method is that it needs extensive data for a large number of items, which may not be easily available. Large business organisations may be in a position to collect such data on a continuing basis so that they can use input-output analysis for forecasting. However, this is not possible in the case of small industrial organisations on account of the excessive costs involved in the collection of comprehensive data. It is for this reason that input-output analysis is less widely used than most analysts initially expected. A detailed discussion of input-output analysis is beyond the scope of this book.

Econometric Model

Econometrics is concerned with the use of statistical and mathematical techniques to verify hypotheses emerging from economic theory. An econometric model incorporates functional relationships estimated by statistical techniques into an internally consistent and logically self-contained framework. Econometric models use both exogenous and endogenous variables. Exogenous variables are used as inputs into the model, but they themselves are determined outside it; these include policy variables and other externally determined events. In contrast, endogenous variables are those which are determined within the system. The use of econometric models is generally found at the macro level, such as forecasting national income and its components. Such models show how the economy, or any of its specific segments, operates. As compared with an ordinary regression equation, they bring out the causalities involved more distinctly, and this merit enables them to predict turning points more accurately. However, their use at the micro level for forecasting has so far been extremely limited.

Applications of research methodology in human resources management

Research methodology is widely used in the domain of human resources management, where its application is commonly referred to as HR metrics (human resources metrics). To move to the center of the organization, HR must be able to talk in quantitative, objective terms. Organizations are managed by data. Unquestionably, at times, managers make decisions based on emotion rather than fact. Nevertheless, day-to-day operations are discussed, planned and evaluated in hard data terms.
Perhaps the most crucial advantage of a sound HR metrics programme is that it enables HR to converse with senior management in the language of business. Operational decisions taken by HR are then based on cold, hard facts rather than gut feeling, the figures being used to back up business cases and requests for resources. The HR function is transformed from a bastion of 'soft' intangibles into something more 'scientific', better able to punch its weight in the organisation. In addition, the value added by HR becomes more visible. This will become increasingly important as more and more functions attempt to justify their status as strategic business partners rather than merely cost centres.

The five key practices of the Human Capital Index are as follows:

1. Recruiting excellence
2. Clear rewards and accountability
3. Prudent use of resources
4. Communications integrity
5. Collegial, flexible workplace

· These practices require the capture of metrics for their very definition.
· Metrics help quantify and demonstrate the value of HR.
· Metrics can help guide workforce strategies and maximize the return on HR investments.
· Metrics provide measurement standards.
· Metrics help show what HR contributes to overall business results.
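To illustrate what HR metrics look like in practice, here is a minimal sketch computing a few commonly quoted measures. The formulas are standard textbook definitions and every input figure is hypothetical; a real metrics programme would define each measure precisely for its own context.

```python
# Minimal sketch of a few commonly quoted HR metrics. All input figures
# below are hypothetical.
headcount_start  = 480
headcount_end    = 520
separations      = 45            # employees who left during the year
hires            = 85
recruitment_cost = 2_550_000     # total recruiting spend for the year (Rs)
revenue          = 950_000_000   # annual revenue (Rs)
employment_cost  = 310_000_000   # total pay and benefits for the year (Rs)

average_headcount        = (headcount_start + headcount_end) / 2
turnover_rate            = separations / average_headcount * 100
cost_per_hire            = recruitment_cost / hires
revenue_per_employee     = revenue / headcount_end
revenue_per_rupee_of_pay = revenue / employment_cost   # crude productivity ratio

print(f"Annual turnover rate     : {turnover_rate:.1f}%")
print(f"Cost per hire            : Rs {cost_per_hire:,.0f}")
print(f"Revenue per employee     : Rs {revenue_per_employee:,.0f}")
print(f"Revenue per rupee of pay : Rs {revenue_per_rupee_of_pay:.2f}")
```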
SUMMARY

This lesson has given a brief introduction to the applications of various research techniques in management and has identified the appropriate tools for different domains of management.

KEY TERMS

· Marketing research
· Brand positioning
· Image profile analysis
· Market potential
· Demand measurement
· Delphi method
· Time series analysis
· Moving average
· HR metrics
IMPORTANT QUESTIONS

1. What are the various domains in which research tools can be used?
2. Explain the application of image profile analysis with an example.
3. Differentiate between primary and secondary research.
4. What are the limitations of marketing research?
5. Describe the method for finding out the market potential.
6. Explain the various methods used to estimate demand.
7. What do you mean by HR metrics?
8. Note down the five key practices of the Human Capital Index.
- End of Chapter -

LESSON - 24 REPORT PREPARATIONS
OBJECTIVES

· To learn the structure of a professional research report
· To understand the application of the following diagrams:
  o Area chart
  o Line graph
  o Bar chart
  o Pie chart
  o Radar diagram
  o Surface diagram
  o Scatter diagram
STRUCTURE

· Research report
· Report format
· Data presentation
· Pareto chart
RESEARCH REPORT
The final step in the research process is the preparation and presentation of the research report. A research report can be defined as the presentation of the research findings directed to a specific audience to accomplish a specific purpose.

Importance of the report

The research report is important for the following reasons:

1. The results of the research can be effectively communicated to management.
2. The report is often the only aspect of the study to which executives are exposed, and their consequent evaluation of the project rests on the effectiveness of the written and oral presentation.
3. The report presentation is typically the responsibility of the project researcher, so the communication effectiveness and usefulness of the information provided play a crucial role in determining whether the project will be continued in the future.

Steps in report preparation

Preparing a research report involves three steps:

1. Understanding the research
2. Organizing the information
3. Writing with effectiveness

Guidelines

The general guidelines that should be followed for any report are as follows:

1. Consider the audience: The information resulting from research is ultimately important to the management, who will use the results to make decisions. Decision makers are interested in a clear, concise, accurate and interesting report which directly focuses on their information needs with a minimum of technical jargon. Thus, the report has to be understood by them; it should not be too technical and should not use too much jargon. This is a particular difficulty when reporting the results of statistical analysis, where there is a high probability that few of the target audience have a grasp of statistical concepts. Hence, for example, there is a need to translate such terms as standard deviation, significance level and confidence interval into everyday language.

2. Be concise, but precise: The real skill of the researcher is tested in fulfilling this requirement. The report must be concise and must focus on the crucial elements of the project, leaving out unimportant issues. The researcher should know how much emphasis has to be given to each area.
3. Be objective, yet effective: The research report must be an objective presentation of the research findings. The researcher violates the standard of objectivity if the findings are presented in a distorted or slanted manner. The findings can be presented in a manner which is objective, yet effective. The writing style of the report should be interesting, with the sentence structure short and to the point.

4. Understand the results and draw conclusions: The managers who read the report expect to see interpretive conclusions in it. The researcher should understand the results and be able to interpret them effectively for management. Simply reiterating the facts will not do; implications have to be drawn by asking "so what" questions of the results.

REPORT FORMAT

Every person has a different style of writing, and there is not really one right style; however, the following outline is generally accepted as the basic format for most research projects.

1. Title page
2. Table of contents
3. Executive summary
4. Introduction
5. Problem statement
6. Research objectives
7. Background
8. Methodology
9. Sampling design
10. Research design
11. Data collection
12. Data analysis
13. Limitations
14. Findings
15. Conclusions
16. Summary and conclusions
17. Recommendations
18. Appendices
19. Bibliography

1. Title page

The title page should contain a title which conveys the essence of the study, the date, the name of the organization submitting the report, and the organization for whom the report is prepared. If the research report is confidential, the names of the individuals who are to receive the report should be specified on the title page.

2. Table of contents
As a rough guide, any report of several sections that totals more than 6 to 10 pages should have a table of contents. The table of contents lists the topics covered in the report, along with page references. Its purpose is to aid readers in finding a particular section in the report. If there are many tables, charts or other exhibits, they should also be listed after the table of contents in a separate table of illustrations.

3. Executive summary

An executive summary can serve two purposes. It may be a report in miniature, covering all the aspects in the body of the report but in abbreviated form, or it may be a concise summary of the major findings and conclusions, including recommendations. Two pages are generally sufficient for an executive summary. Write this section after the report is finished. It must not introduce new information, but it may require graphics to present a particular conclusion. Expect the summary to contain a high density of significant terms, since it repeats the highlights of the report. A good summary should help the decision maker and is designed to be action oriented.

4. Introduction

The introduction prepares the reader for the report by describing the parts of the project: the problem statement, research objectives and background material. The introduction must clearly explain the nature of the decision problem. It should also review the previous research done on the problem.

5. Problem statement

The problem statement contains the need for the research project. The problem is usually represented by a management question. It is followed by a more detailed set of objectives.

6. Research objectives

The research objectives address the purpose of the project. These may be the research question(s) and associated investigative questions.

7. Background

The background material may be of two types. It may be the preliminary results of exploration from an experience survey, focus group, or another source. Alternatively, it could be secondary data from the literature review. Background material may be placed before the problem statement or after the research objectives. It contains information pertinent to the management problem or the situation that led to the study.
8. Methodology

The purpose of the methodology section is to describe the nature of the research design, the sampling plan, and the data collection and analysis procedures. Enough detail must be conveyed so that the reader can appreciate the nature of the methodology used, yet the presentation must not be boring or overpowering. The use of technical jargon must be avoided.

9. Research design

The coverage of the design must be adapted to the purpose. The type of research adopted and the reason for adopting that particular type should be explained.

10. Sampling design

The researcher explicitly defines the target population being studied and the sampling methods used. He has to explain the sampling frame, the sampling method adopted and the sample size. The explanation of the sampling method, the uniqueness of the chosen parameters or other relevant points that need explanation should be covered with brevity. The calculation of the sample size can be placed either in this part or in an appendix.

11. Data collection

This part of the report describes the specifics of gathering the data. Its content depends on the selected design. The data collection instruments (questionnaire or interview schedule) and field instructions can be placed in the appendix.

12. Data analysis

This section summarizes the methods used to analyze the data. It describes data handling, preliminary analysis, statistical tests, computer programs and other technical information. The rationale for the choice of analysis approaches should be clear, and a brief commentary on the assumptions and the appropriateness of their use should be presented.

13. Limitations

Every project has weaknesses, which need to be communicated in a clear and concise manner. In this process, the researcher should avoid belaboring minor study weaknesses. The purpose of this section is not to disparage the quality of the research project, but rather to enable the reader to judge the validity of the study results. Generally, limitations occur in sampling, non-response inadequacies and methodological weaknesses. It is the researcher's professional responsibility to clearly inform the reader of these limitations.

14. Findings
The objective of this part is to explain the data rather than to draw conclusions. When quantitative data can be presented, this should be done as simply as possible with charts, graphics and tables. The findings can be presented in a small table or chart on the same page. While this arrangement adds to the bulk of the report, it is convenient for the reader.

15. Conclusions

This can be further divided into two parts: summary and recommendations.

16. Summary and conclusions

The summary is a brief statement of the essential findings. The conclusions should clearly link the research findings with the information needs, and based on this linkage recommendations for action can be formulated. In some research works the conclusions are presented in tabular form for easy reading and reference. The research questions/objectives are answered sharply in this part.

17. Recommendations

The researcher's recommendations should be weighted heavily in favor of the research findings and should offer a few ideas about corrective actions. The recommendations are given for managerial action rather than research action. The researcher may offer several alternatives with justifications.

18. Appendices

The purpose of the appendix is to provide a place for material which is not absolutely essential to the body of the report. This material is typically more specialized and complex than the material presented in the main report, and it is designed to serve the needs of the technically oriented reader. The appendix will frequently contain copies of the data collection forms, details of the sampling plan, estimates of statistical error, interviewer instructions and detailed statistical tables associated with the data analysis process. The reader who wishes to learn the technical aspects of the study and to look at statistical breakdowns will want a complete appendix.

19. Bibliography

The use of secondary data requires a bibliography. Proper citation, style and format are unique to the purpose of the report; the instructor, program, institution or client often specifies the style requirements. Citations may be given in footnote or endnote format. The author's name, title, publication, year and page number are the important elements of a bibliographic entry.

DATA PRESENTATION

The research data can be presented in tabular and graphic form.
Tables

The tabular form consists of the numerical presentation of the data. Tables should contain the following elements:

1. Table number: this permits easy location in the report.
2. Title: the title should clearly indicate the contents of the table or figure.
3. Box head and sub-head: the box head contains the captions or labels for the columns in a table, while the sub-head contains the labels for the rows.
4. Footnote: the footnote explains a particular section or item in the table or figure.
Graphics

The graphical form involves the presentation of data in terms of visually interpreted sizes. Graphs should contain the following elements:

1. Graph or figure number
2. Title
3. Footnote
4. Sub-heads on the axes

Bar chart

A bar chart depicts the magnitudes of the data by the length of various bars which have been laid out with reference to a horizontal or vertical scale. The bar chart is easy to construct and can be readily interpreted.
Column chart
These graphs compare the sizes and amounts of categories, usually at the same point in time. The categories are usually placed on the X-axis and the values on the Y-axis.
Pie chart

The pie chart is a circle divided into sections such that the size of each section corresponds to a portion of the total. It shows the relationship of the parts to the whole, with each wedge representing the raw value of a category. It is one form of area chart. This type is often used with business data.
Line Graph

Line graphs are used chiefly for time series and frequency distributions. There are several guidelines for designing a line graph:

· Put the independent variable on the horizontal axis
· When showing more than one line, use different line types
· Try not to put more than four lines on one chart
· Use a solid line for the primary data
I. Radar diagram

In this diagram the radiating lines are the categories, and values are plotted as distances from the center. It can be applied where multiple variables are used.

II. Area (surface) diagram

An area chart is also used for a time series. Like line charts, it compares changing values, but it emphasises the relative value of each series.

III. Scatter diagram

This shows whether the relationship between two variables follows a pattern; it may also be used with one variable measured at different times.
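The sketch below shows how the bar, pie and line charts described above might be produced with the matplotlib plotting library. The regional and monthly sales figures are hypothetical.

```python
# Minimal sketch: drawing a bar chart, a pie chart and a line graph with
# matplotlib. The sales figures are hypothetical.
import matplotlib.pyplot as plt

regions = ["North", "South", "East", "West"]
sales = [420, 380, 290, 310]                 # e.g. sales in lakh Rs
monthly = [112, 118, 132, 129, 121, 135]     # first six months of the year

fig, axes = plt.subplots(1, 3, figsize=(12, 4))

axes[0].bar(regions, sales)                              # bar chart: magnitude by category
axes[0].set_title("Sales by region (bar chart)")

axes[1].pie(sales, labels=regions, autopct="%1.0f%%")    # pie chart: parts of a whole
axes[1].set_title("Share of total sales (pie chart)")

axes[2].plot(range(1, 7), monthly, marker="o")           # line graph: time series
axes[2].set_title("Monthly sales (line graph)")
axes[2].set_xlabel("Month")

plt.tight_layout()
plt.show()
```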
Purpose of a Histogram

A histogram is used to graphically summarize and display the distribution of a process data set. The purpose of a histogram is to graphically summarize the distribution of a univariate data set. The histogram graphically shows the following:

1. center (i.e. the location) of the data;
2. spread (i.e. the scale) of the data;
3. skewness of the data;
4. presence of outliers; and
5. presence of multiple modes in the data.

These features provide strong indications of the proper distributional model for the data. A probability plot or a goodness-of-fit test can be used to verify the distributional model.

Sample Bar Chart Depiction
How to construct a Histogram

A histogram can be constructed by segmenting the range of the data into equal-sized bins (also called segments, groups or classes). For example, if your data range from 1.1 to 1.8, you could have equal bins of width 0.1 consisting of 1.1 to 1.2, 1.2 to 1.3, 1.3 to 1.4, and so on. The vertical axis of the histogram is labeled Frequency (the number of counts for each bin), and the horizontal axis is labeled with the range of the response variable.

The most common form of the histogram is obtained by splitting the range of the data into equal-sized bins (called classes). Then, for each bin, the number of points from the data set that fall into it is counted. That is:

- Vertical axis: Frequency (i.e., counts for each bin)
- Horizontal axis: Response variable

The classes can either be defined arbitrarily by the user or via some systematic rule. A number of theoretically derived rules have been proposed by Scott (Scott 1992).

The cumulative histogram is a variation of the histogram in which the vertical axis gives not just the counts for a single bin, but rather the counts for that bin plus all bins for smaller values of the response variable. Both the histogram and the cumulative histogram have an additional variant whereby the counts are replaced by normalized counts. The names for these variants are the relative histogram and the relative cumulative histogram.
There are two common ways to normalize the counts.

1. The normalized count is the count in a class divided by the total number of observations. In this case the relative counts are normalized to sum to one (or 100 if a percentage scale is used). This is the intuitive case where the height of the histogram bar represents the proportion of the data in each class.

2. The normalized count is the count in the class divided by the number of observations times the class width. For this normalization, the area (or integral) under the histogram is equal to one. From a probabilistic point of view, this normalization results in a relative histogram that is most akin to the probability density function and a relative cumulative histogram that is most akin to the cumulative distribution function. If you want to overlay a probability density or cumulative distribution function on top of the histogram, use this normalization. Although this normalization is less intuitive (relative frequencies greater than 1 are quite permissible), it is the appropriate normalization if you are using the histogram to model a probability density function.

What questions does the Histogram answer?

· What is the most common system response?
· What distribution (center, variation and shape) does the data have?
· Does the data look symmetric or is it skewed to the left or right?
· Does the data contain outliers?
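The following sketch, using NumPy, builds a frequency histogram with equal-sized bins for a small hypothetical data set and applies the two normalizations described above (proportions summing to one, and a density whose area is one).

```python
# Minimal sketch of binning data into equal-sized classes and computing
# the relative and density normalizations. The data values are hypothetical.
import numpy as np

data = np.array([1.12, 1.18, 1.23, 1.25, 1.31, 1.33, 1.38, 1.41, 1.44,
                 1.52, 1.55, 1.61, 1.64, 1.70, 1.78])
bins = np.linspace(1.1, 1.8, 8)           # 7 equal bins of width 0.1 from 1.1 to 1.8

counts, edges = np.histogram(data, bins=bins)
relative = counts / counts.sum()                              # normalization 1: sums to 1
density, _ = np.histogram(data, bins=bins, density=True)      # normalization 2: area = 1

for lo, hi, c, r, d in zip(edges[:-1], edges[1:], counts, relative, density):
    print(f"{lo:.1f}-{hi:.1f}: count={c}, relative={r:.2f}, density={d:.2f}")
```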
PARETO CHART

Purpose of a Pareto Chart

A Pareto chart is used to graphically summarize and display the relative importance of the differences between groups of data. It is a bar graph used to arrange information in such a way that priorities for process improvement can be established.

Sample Pareto Chart Depiction
Purposes

- To display the relative importance of data.
- To direct efforts to the biggest improvement opportunity by highlighting the vital few in contrast to the useful many.

Pareto diagrams are named after Vilfredo Pareto, an Italian sociologist and economist, who invented this method of information presentation toward the end of the 19th century. The chart is similar to the histogram or bar chart, except that the bars are arranged in decreasing order from left to right along the abscissa. The fundamental idea behind the use of Pareto diagrams for quality improvement is that the first few (as presented on the diagram) contributing causes to a problem usually account for the majority of the result. Thus, targeting these "major causes" for elimination results in the most cost-effective improvement scheme.

How to Construct

· Determine the categories and the units for comparison of the data, such as frequency, cost, or time.
· Total the raw data in each category, then determine the grand total by adding the totals of each category.
· Re-order the categories from largest to smallest.
· Determine the cumulative percent of each category (i.e., the sum of each category plus all categories that precede it in the rank order, divided by the grand total and multiplied by 100).
· Draw and label the left-hand vertical axis with the unit of comparison, such as frequency, cost or time.
· Draw and label the horizontal axis with the categories, listed from left to right in rank order.
· Draw and label the right-hand vertical axis from 0 to 100 percent. The 100 percent should line up with the grand total on the left-hand vertical axis.
· Beginning with the largest category, draw in bars for each category representing the total for that category.
· Draw a line graph beginning at the right-hand corner of the first bar to represent the cumulative percent for each category as measured on the right-hand axis.
· Analyze the chart. Usually the top 20% of the categories will comprise roughly 80% of the cumulative total. (A small computational sketch of these steps follows.)
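Here is a minimal sketch of the tabular part of these steps: totalling each category, ranking the categories and computing cumulative percentages. The defect categories and counts are hypothetical.

```python
# Minimal sketch of the Pareto calculation: rank the categories by size and
# compute cumulative percentages. Categories and counts are hypothetical.
defects = {
    "Billing error": 53,
    "Late delivery": 27,
    "Damaged goods": 11,
    "Wrong item": 6,
    "Other": 3,
}

grand_total = sum(defects.values())
ranked = sorted(defects.items(), key=lambda kv: kv[1], reverse=True)

cumulative = 0
for category, count in ranked:
    cumulative += count
    cum_percent = cumulative / grand_total * 100
    print(f"{category:15s} count={count:3d}  cumulative={cum_percent:5.1f}%")
```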
Guidelines for Effective Application of Pareto Analysis

· Create before-and-after comparisons of Pareto charts to show the impact of improvement efforts.
· Construct Pareto charts using different measurement scales: frequency, cost or time.
· Pareto charts are useful displays of data for presentations.
· Use objective data to perform Pareto analysis rather than team members' opinions.
· If there is no clear distinction between the categories - if all bars are roughly the same height or half of the categories are required to account for 60 percent of the effect - consider organizing the data in a different manner and repeating the Pareto analysis.
What questions does the Pareto chart answer?

· What are the largest issues facing our team or business?
· What 20% of sources are causing 80% of the problems (80/20 Rule)?
· Where should we focus our efforts to achieve the greatest improvements?
Example of constructing a Pareto Chart

(The original text illustrates this with a table showing the reasons for failure of patients in a hospital and the corresponding Pareto chart; the table and chart are not reproduced here.)
When to Use a Pareto Chart

Pareto charts are typically used to prioritize competing or conflicting "problems," so that resources are allocated to the most significant areas. In general, though, they can be used to determine which of several classifications have the most "count" or cost associated with them, for instance, the number of people using the various ATMs versus each of the indoor teller locations, or the profit generated from each of twenty product lines. The important limitation is that the data must be in terms of either counts or costs; the data cannot be in terms that cannot be added, such as percent yields or error rates.

PICTOGRAPH

A pictograph is used to present statistics in a popular yet less statistical way to those who are not familiar with charts that contain numerical scales. This type of chart presents data in the form of pictures drawn to represent comparative sizes, scales or areas. Again, as with every chart, the pictograph needs a title to describe what is being presented and how the data are classified, as well as the time period and the source of the data.
A pictograph uses picture symbols to convey the meaning of statistical information. Pictographs should be used carefully because the graphs may, either accidentally or deliberately, misrepresent the data. This is why such a graph should be visually accurate; if not drawn carefully, pictographs can be misleading.

Stem Plots

In statistics, a Stemplot (or stem-and-leaf plot) is a graphical display of quantitative data that is similar to a histogram and is useful in visualizing the shape of a distribution. Stemplots are generally associated with the Exploratory Data Analysis (EDA) ideas of John Tukey and the course Statistics in Society (NDST242) of the Open University, although in fact Arthur Bowley did something very similar in the early 1900s. Unlike histograms, Stemplots:

- retain the original data (at least the most important digits)
- put the data in order, thereby easing the move to order-based inference and nonparametric statistics.

A basic Stemplot contains two columns separated by a vertical line. The left column contains the stems and the right column contains the leaves. The ease with which
histograms can now be generated on computers has meant that Stemplots are less used today than in the 1980's, when they first became widely used. To construct a Stemplot, the observations must first be sorted in ascending order. Here is the sorted set of data values that will be used in the example: 54 56 57 59 63 64 66 68 68 72 72 75 76 81 84 88 106 Next, it must be determined what the stems will represent and what the leaves will represent. Typically, the leaf contains the last digit of the number and the stem contains all of the other digits. In the case of very large or very small numbers, the data values may be rounded to a particular place value (such as the hundreds place) that will be used for the leaves. The remaining digits to the left of the rounded place value are used as the stems. In this example, the leaf represents the ones place and the stem will represent the rest of the number (tens place and higher). The Stemplot is drawn with two columns separated by a vertical line. The stems are listed to the left of the vertical line. It is important that each stem is listed only once and that no numbers are skipped, even if it means that some stems have no leaves. The leaves are listed in increasing order in a row to the right of each stem.
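As a small illustration, the sketch below builds the stem-and-leaf display for the sorted data values listed above, taking the ones digit as the leaf and the remaining leading digits as the stem. The expected display is shown in the trailing comments.

```python
# Minimal sketch of building the stem-and-leaf display for the sorted data
# listed above (leaf = ones digit, stem = the remaining leading digits).
data = [54, 56, 57, 59, 63, 64, 66, 68, 68, 72, 72, 75, 76, 81, 84, 88, 106]

stems = {}
for value in sorted(data):
    stem, leaf = divmod(value, 10)       # e.g. 54 -> stem 5, leaf 4
    stems.setdefault(stem, []).append(leaf)

# List every stem between the smallest and largest, even if it has no leaves.
for stem in range(min(stems), max(stems) + 1):
    leaves = "".join(str(l) for l in stems.get(stem, []))
    print(f"{stem:3d} | {leaves}")

# Expected display for this data set:
#   5 | 4679
#   6 | 34688
#   7 | 2256
#   8 | 148
#   9 |
#  10 | 6
```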
Double Stem Plots (Stem-and-leaf Plot)

Splitting stems and the back-to-back stem plot are two distinct types of double stem plots, which are variations of the basic stem plot.

Splitting Stems

Depending on the data set, splitting each of the stems into two or five stems may better illustrate the shape of the distribution. When splitting stems, it is important to split all stems and to split them equally. When splitting each stem into two stems, one stem contains leaves from 0-4 and the leaves from 5-9 are contained in the other stem. When splitting each stem into five stems, one stem contains leaves 0-1, the next 2-3, the next 4-5, the next 6-7, and the last 8-9. Below is an example of a split stem plot (using the same data set as in the example above) in which each stem is split into two:
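The split display below is reconstructed from the same sorted data values (54, 56, ..., 106); each stem now appears twice, the first row holding leaves 0-4 and the second row leaves 5-9.

```
  5 | 4
  5 | 679
  6 | 34
  6 | 688
  7 | 22
  7 | 56
  8 | 14
  8 | 8
  9 |
  9 |
 10 |
 10 | 6
```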
SUMMARY

A research report can be defined as the presentation of the research findings directed to a specific audience to accomplish a specific purpose. The general guidelines to be followed in writing the report are:

1. Consider the audience
2. Be concise, but precise
3. Be objective, yet effective
4. Understand the results and draw conclusions
The main elements of a report are the title page, table of contents, executive summary, introduction, methodology, findings, conclusions, appendices and bibliography. Tables and graphs are used for the presentation of data. Different types of graphs are available, such as the bar chart, pie chart, line chart, area diagram, radar diagram and scatter diagram. According to the nature of the data and the requirement, the appropriate type of graph can be selected and used effectively.

KEY TERMS

· Executive summary
· Sampling
· Bibliography
· Appendix
· Interview schedule
· Area chart
· Line graph
· Bar chart
· Pie chart
· Scatter diagram
· Radar diagram
· Surface diagram
· Pareto chart
· Pictograph
· Stem graph
IMPORTANT QUESTIONS

1. What do you mean by a research report?
2. Why is the research report important?
3. Explain the general guidelines that exist for writing a report.
4. What preparations are required for writing the report?
5. What components are typically included in a research report?
6. What are the alternative means of displaying data graphically?
7. Explain the application of the Pareto chart with an example.
8. What are the applications of a pictograph?
9. What are the procedures to draw a stem graph or stem plot?
REFERENCE BOOKS

1. Ramanuj Majumdar, Marketing Research, Wiley Eastern Ltd., New Delhi, 1991.
2. Harper W. Boyd, Jr. et al., Marketing Research, Richard D. Irwin Inc., USA, 1990.
3. Paul E. Green et al., Research for Marketing Decisions, Prentice Hall of India Pvt. Ltd., New Delhi, 2004.
- End of Chapter -