You have probably made or heard the popular comment “Is it just me, or [fill in the blank here]?” This question is really a survey that asks others to indicate their level of agreement with some viewpoint, for example, “Is it just me, or is it hot in here?” or “Is it just me, or was this exam difficult?” We largely ask such questions to gauge the opinions of others. Many examples likely occur every day, from completing a customer satisfaction survey to asking your friends what they plan to order at a restaurant to get a better idea of what you might want to order. Really, we could survey people to measure all sorts of constructs, including love, attachment, personality, motivation, cognition, and many other constructs studied by behavioral scientists. We can also identify how constructs such as love, attachment, personality, motivation, and cognition are related to other factors or behaviors such as the likelihood of depression, emotional well-being, and physical health. In everyday situations, you may notice relationships between temperature and aggression (e.g., the hotter it is outside, the more often you see people fighting at a sports stadium) or between class participation and grades (e.g., students with higher grades tend to also participate more in class). Hence, there is a natural tendency for us to engage the world under the assumption that behavior does not occur in isolation. Instead, behavior is related to or influenced by other factors in the environment. It is therefore not uncommon at all for humans to observe the world by asking people to answer questions about themselves or by observing how human behavior is related to other factors such as health and well-being. The same is true in science. In this chapter, we describe how we can use the scientific method to evaluate or survey participant responses and identify relationships between factors.
SURVEY DESIGNS

Many research designs can be used to test the same hypotheses. This chapter is separated into two major sections; each section describes a nonexperimental research design: survey designs and correlational designs. To introduce how each design can be used to test the same hypothesis, we begin each major section by developing a new research design to test the same hypothesis. Suppose we hypothesize that texting while driving is more prevalent among younger age groups, as has been tested in the published literature (Hayashi, Rivera, Modico, Foreman, & Wirth, 2017; Quisenberry, 2015; Srinivas, White, & Omar, 2014). We could use a survey research design by asking a sample of young college students who drive to indicate in a questionnaire how often they use text messaging while driving (per month). If the hypothesis is correct and we set up this study correctly, we should find that a high percentage of young drivers use text messaging while driving. We will return to this hypothesis with a new way to answer it when we introduce correlational designs. We begin this chapter with an introduction to the research design that was illustrated here: the survey research design.

8.1 AN OVERVIEW OF SURVEY DESIGNS

A nonexperimental research design used to describe an individual or a group by having participants complete a survey or questionnaire is called the survey research design. A survey, which is a common measurement tool in the behavioral sciences, is a series of questions or statements to which participants indicate responses. A survey can also be called a questionnaire or self-report because many surveys specifically include questions in which participants report about themselves: their attitudes, opinions, beliefs, activities, emotions, and so on.

The survey research design is the use of a survey, administered either in written form or orally, to quantify, describe, or characterize an individual or a group.
A survey can be administered in printed form, or it can be distributed orally in an interview. While a survey can be used as a measurement tool in many research designs, the survey research design specifically refers to the use of surveys to quantify, describe, or characterize an individual or a group. In this chapter, we introduce the types and writing of questions included in surveys, describe how to administer surveys, and discuss some limitations associated with using surveys in the behavioral sciences.

A survey is a series of questions or statements, called items, used in a questionnaire or an interview to measure the self-reports or responses of respondents.

8.2 TYPES OF SURVEY ITEMS

A survey consists of many questions or statements to which participants respond. A survey is sometimes called a scale, and the questions or statements in the survey are often called items. As an example of a scale with many items, the Estimated Daily Intake Scale for Sugar (EDIS-S) was developed as an 11-item scale (Privitera & Wallace, 2011). Hence, the scale or survey has 11 items or statements to which participants respond on a 7-point scale from 1 (completely disagree) to 7 (completely agree). Notice that each item, listed in Table 8.1, is a statement about how much sugar participants consume in their diets. There are three types of questions or statements that can be included in a survey: open-ended items, partially open-ended items, and restricted items. Each type of item is described here.

Open-Ended Items

When researchers want participants to respond in their own words to a survey item, they include an open-ended item in the survey. An open-ended item is a question or statement that is left completely “open” for response. It allows participants to give any response they feel is appropriate with no limitations.
An open-ended item is a question or statement in a survey that allows the respondent to give any response in their own words, without restriction.

Table 8.1 ⦁ The 11 Items for the EDIS-S
Item  Statement
1. I tend to eat cereals that have sugar in them.
2. I tend to put a lot of syrup on my pancakes or waffles.
3. I often eat candy to snack on when I am hungry.
4. I tend to crave foods that are high in sugar.
5. I tend to snack on healthier food options.
6. I tend to consume a low-sugar diet.
7. I often snack on sugary foods when I am hungry.
8. When I crave a snack, I typically seek out sweet-tasting foods.
9. I tend to eat foods that are most convenient, even if they contain a lot of sugar.
10. I like consuming sweet-tasting foods and drinks each day.
11. I tend to avoid consuming a high-sugar diet.

For example, the following four items are open-ended questions that were asked in a focus group, which is largely a guided discussion among a targeted group of participants to explore a topic. In this study, the aim was to understand the reasons or nature of binge-watching television shows (Flayelle, Maurage, & Billieux, 2017, p. 471):

1. What are your watching habits and practices regarding TV series?
2. Why do you indulge in TV series watching?
3. How do you feel right after watching an episode?
4. Do you sometimes consider yourself as a “TV series addict”?

Open-ended items can also be given as a statement and not a question. For example, the researchers could have asked participants in the focus group to respond to the following survey item: “Describe an experience you had while binge-watching a TV series and describe the emotions you felt during that experience.” In this example, the open-ended item is phrased as a statement and not a question; however, the response will still be open ended. Open-ended items are most often used with the qualitative research design because the responses in the survey are purely descriptive.
For example, one descriptive result from the study by Flayelle et al. (2017) was that while most participants acknowledged that binge-watching TV shows can be addicting, they did not recognize themselves as being “addicted” in terms of a diagnosis. The focus group study by Flayelle et al. (2017) was purely descriptive; it was a qualitative research study.

For quantitative research designs, the challenge with using open-ended questions or items is in coding the open-ended responses of participants. It is difficult to anticipate how participants will respond to an open-ended item, so the researcher must develop methods to code patterns or similarities in participant responses. Coding the responses to open-ended items, however, requires researchers to do both of the following:

1. Tediously anticipate and list all possible examples of potential responses in terms of how participants might write or express their responses.
2. Use multiple raters and additional statistical analyses to make sure the coding is accurate.

For the reasons listed here, open-ended survey items are not often used in quantitative research, with partially open-ended or restricted items being favored among quantitative researchers. That said, when the response to an open-ended item is numeric, these questions can be readily used with quantitative research. For example, a survey item can ask for demographic information (age in years, income in dollars, height in inches or centimeters, weight in pounds or kilograms) or behavioral patterns. A survey used to assess social media and website behavior could ask the following: “Not counting e-mail, about how many minutes or hours per week do you use the web?” (Stern, Bilgen, McClain, & Hunscher, 2017, p. 718). As long as the unit of measurement is defined for the respondent (e.g., years, dollars, minutes), such answers will be readily quantified.
Partially Open-Ended Items

Researchers can include items, called partially open-ended items, which give participants a few restricted answer options and then a last one that allows participants to respond in their own words in case the few restricted options do not fit with the answer they want to give. The open-ended option is typically stated as “other” with a blank space provided for the participant’s open-ended response. For example, another focus group study evaluated what educators and students thought about using patients as teachers in medical education (Jha, Quinton, Bekker, & Roberts, 2009). Their study included the following partially open-ended item:

In what capacity do you (students or faculty or other) view the role of patients as teachers?
1. Teaching
2. Assessment
3. Curriculum development
4. Other ____
(Jha et al., 2009, p. 455)

A partially open-ended item is a question or statement in a survey that includes a few restricted answer options and then a last one that allows participants to respond in their own words in case the few restricted options do not fit with the answer they want to give.

In this item, participants either chose an option provided (teaching, assessment, or curriculum development) or provided their own open-ended response (other ___). For the researchers, it is easier to manage the participant responses, or data, when an open-ended item includes a few restricted options. To enter participant responses, researchers can code each answer option as a number. The last open-ended option could be coded further, or just analyzed without further coding. For example, we could report only the percentage of participants choosing the last open-ended option, without analyzing the specific open-ended responses given. In this way, coding and analyzing partially open-ended items can be less tedious than for open-ended items.

Open-ended items are largely used in qualitative research; restricted items are largely used in quantitative research.

Restricted Items

The most commonly used survey item in quantitative research, called a restricted item, includes a restricted number of answer options. A restricted item does not give participants an option to respond in their own words; instead, the item is restricted to the finite number of options provided by the researcher. Restricted items are often given with a rating scale, which is often referred to as a Likert scale when the scale varies between 5 and 7 points. A Likert scale, named after Rensis Likert (1932), who was the first to use such a scale, is a finite number of points on which a participant can respond to an item in a survey, typically between 5 and 7 points.

A restricted item, also called a closed-ended item, is a question or statement in a survey that includes a restricted number of answer options to which participants must respond.

A Likert scale is a numeric response scale used to indicate a participant’s rating or level of agreement with a question or statement.

Two common applications of rating scales are to have participants use the scale to describe themselves or to indicate their level of agreement. For example, Becker, Helseth, Frank, Escobar, and Weeks (2018) used the following 5-point rating scale to select participants for their study based on their level of concern for their child’s substance use (each number value is a “point” on the scale):

How concerned are you about your child’s substance use? (Circle one)
1 2 3 4 5
Not concerned at all (1) to Extremely concerned (5)

As another example of using a rating scale to indicate a level of agreement, Nagels, Kircher, Steines, Grosvald, and Straube (2015) used the following 5-point rating scale for an item included in an assessment of individual differences in gesture perception and production:

I like talking to people who gesture a lot when they talk.
(Circle one)
1 2 3 4 5
Not agree (1) to Fully agree (5)

The rating scale itself does not necessarily have to be numeric. For example, many young children cannot count, let alone use numeric scales to describe themselves. In these cases, pictorial scales, such as the “self-assessment manikin” (SAM) shown in Figure 8.1 (Bradley & Lang, 1994), can be used. This type of scale can be used with children, such as in a study in which children used the SAM scale to indicate their mood or how they feel, or how much they like foods or drinks they consume (Privitera, 2016). The scale can also be used with adults. For example, Kunze, Arntz, and Kindt (2015) used the SAM scale to evaluate participant responses to a variety of stimuli used in a study on fear conditioning. In each study, the researchers coded responses using a numeric scale from 1 (saddest) to 5 (happiest) so that responses with the images could be recorded as numeric values and then analyzed.

Figure 8.1 ⦁ A Pictorial Rating Scale Used to Measure Emotion and Liking in Children
Source: Adapted from Bradley and Lang (1994).

The main advantage of using restricted items is that survey responses can be easily entered or coded for the purposes of statistical analysis. The main limitation of using restricted items is that the analysis is restricted to the finite number of options provided to participants. However, when the options available to participants are exhaustive of all options they could choose, this limitation is minimal.

Learning Check 1 ✓

1. State the type of survey item for each of the following items:
A. How do you feel about the effectiveness of your professor’s teaching style?
B. On a scale from 1 (very ineffective) to 7 (very effective), how would you rate your professor?
C. Is your professor’s greatest strength their (a) timeliness to class, (b) knowledge of the material, (c) concern for students, or (d) other ___ (please explain)?
2. A __________ is a numeric response scale used to indicate a participant’s rating or level of agreement with a question or statement.

Answers: 1. A. Open-ended item, B. Restricted item, C. Partially open-ended item; 2. rating scale or Likert scale.

8.3 RULES FOR WRITING SURVEY ITEMS

Writing survey items is a thoughtful endeavor. The items that you write must be valid and reliable. In other words, the items must actually measure what you are trying to measure (valid), and the responses in the survey should be consistently observed across participants and over time (reliable). When an item or a measure is not valid and reliable, it is often due to a measurement error, or variability in responding due to poorly written survey items. In this section, we describe the following 10 rules used to write valid and reliable survey items that can minimize the likelihood of measurement error:

1. Keep it simple.
2. Avoid miscategorizing response items.
3. Avoid double-barreled items.
4. Use neutral or unbiased language.
5. Minimize the use of negative wording.
6. Avoid the response set pitfall.
7. Use rating scales consistently.
8. Limit the points on a rating scale.
9. Label or anchor the rating scale points.
10. Minimize item and survey length.

Keep It Simple

Everyone who takes a survey should be able to understand it. The best strategy is to use less than a high school–level vocabulary in writing the survey items. We use this strategy to make sure that participants’ responses reflect their actual responses and are not given because they are confused about what the question is asking. For example, we could have participants rate how full they feel by asking, “How satiated do you feel?” However, some participants may not know that satiated means having a fully satisfied appetite, so it would be better to plainly ask, “How full do you feel?” In sum, keep the language simple.

Use simple words or language in a survey.
Avoid Miscategorizing Response Items

Miscategorizing response items can occur both in terms of (1) creating too many or too few categories and (2) using categories when you should avoid using them at all. Let us look at each way in which miscategorization can be avoided.

The first way in which miscategorization can occur is the creation of too many or too few categories for a response. As an example of too few categories, suppose you ask participants to identify their political affiliation, and you allow them only two options or categories: Republican and Democrat. This is too few categories because other affiliations exist (e.g., Moderate, Independent) that should also be added as options. Likewise, you can have too many categories. As a general rule, you should try to limit responses to no more than nine categories, unless more are necessary. For example, education level can certainly be captured well within nine categories, unless it is important to categorize beyond this (e.g., a survey evaluating each grade level from K−12).

The second way miscategorization can occur is the use of categories that should have been avoided altogether. Ultimately, you need to be aware of the analysis options for a survey. Interval and ratio scale data (“scaled” data) are more informative and allow for greater flexibility in terms of analysis than do nominal data (“categorical” data). For this reason, it is often best to avoid using categories when a response otherwise could be numeric.
In these cases, instead of listing categorical response options, it is often best to simply state the item as an open-ended item (e.g., age in years, income in dollars, height in inches or centimeters, weight in pounds or kilograms, or time in minutes) or give it as a Likert-type scale (e.g., political affiliation on a 7-point scale from 1 being extremely liberal to 7 being extremely conservative; for an example, see American National Election Studies, 2015). The critical step is to make sure to define the unit of measurement for the response. If you categorize response items first, then you never actually collect the data points within those ranges. For example, you cannot know the actual income of someone who selected a range within which their actual income falls. However, if you record the actual data for responses that do not need to be categorized (e.g., record the actual income of each respondent), then you can always go back and group or categorize the data later. But by having the original data, you have more flexibility in terms of options later for analyzing the results of your survey. Avoid listing too few or too many response categories and avoid using categories if responses could otherwise be scaled. Avoid Double-Barreled Items Ask only one question or give only one statement for each item. Double-barreled items are survey items that ask participants for one response to two different questions or statements. For example, to study relationship satisfaction, we could ask participants to indicate their level of agreement with the following statement: Double-barreled items are survey items that ask participants for one response to two different questions or statements. I enjoy the time we spend together and dislike the time we are apart. 1 2 3 4 5 Strongly disagree Strongly agree This item for relationship satisfaction is double-barreled. 
It is not necessarily true that people who enjoy the time they spend with their partner also dislike the time they are apart. Anytime a sentence uses a conjunction, such as and, it is likely that the item is double-barreled. The solution is to split the question into separate items. For example, we can change the double-barreled item into two separate items, each with a separate rating scale. We could write the first item as “I enjoy the time we spend together” and the second item as “I dislike the time we are apart” to allow participants to give a separate response to each individual item. Do not include predictable response patterns in a survey. Use Neutral or Unbiased Language Do not use loaded terms, or words that produce an emotional reaction, such as language that is offensive or could potentially be considered offensive by a respondent. Offensive language is not only inappropriate, but it can also lead people to respond in reaction to the language used. In other words, responses may be caused by the choice of wording in a survey item and may not reflect the honest response of the participant. To avoid the use of potentially offensive language, the American Psychological Association (APA; 2020) provides guidelines for using appropriate language. Some suggestions include explicitly reporting information about gender identities of participants, if known, rather than assuming cisgender identities, and capitalizing Black and White to identify those racial and ethnic groups. More guidelines are provided in the publication manual (APA, 2020) and in Table 15.2 in Chapter 15. Likewise, do not use leading terms or leading questions, or words or questions that indicate how people should respond to an item. For example, a leading question would be, “How bad are your problems with your boss?” In this example, it is implied that you have a problem, which may or may not be true. So, the use of the word bad is a leading term in this sentence and it should be removed. 
A better way to phrase this question would be, “What is the nature of your relationship with your boss?” In this case, you are not implying what the nature of that relationship is, and the respondent is not being led toward one response or another. Thus, the solution for fixing survey items with loaded terms or leading questions is often to simply rephrase or rewrite the item to avoid this pitfall. Use appropriate and unbiased language in surveys. Minimize the Use of Negative Wording The use of negative wording can trick participants into misunderstanding a survey item. Negative wording is the use of words in a sentence or an item that negates or indicates the opposite of what was otherwise described. The rule is to avoid asking participants in a survey item what they would not do, which can require rephrasing a sentence or survey item. For example, the survey item “How much do you not like working?” can be rephrased to “How much do you dislike working?” It may seem like a small change, but it can effectively reduce confusion. Avoid using negative wording in a survey item. Avoid the Response Set Pitfall When respondents notice an obvious pattern in the responses they provide, they will often use that same pattern to respond to future items in that survey. For example, suppose we ask participants to indicate their level of agreement with the following items on a 5-point scale to measure relationship satisfaction: I enjoy the time I spend with my partner. The time I spend with my partner makes me happy. I look forward to the time I spend with my partner. On the 5-point scale, suppose 1 indicates low satisfaction and 5 indicates high satisfaction. For each item, high satisfaction would be a rating of 5. If there were 20 questions like this, then participants would start to see a pattern, such as high ratings always indicate greater satisfaction. 
If participants are generally satisfied in their relationship, they may begin marking 5 for each item without reading many of the items because they know what the scale represents. However, the ratings participants give would reflect the fact that they saw a pattern and may not necessarily reflect their true ratings for each item. To avoid this problem, called a response set, mix up the items in a survey so that ratings are not all on the same end of the scale for a given measure. A response set is the tendency for participants to respond the same way to all items in a survey when the direction of ratings is the same for all items in the survey. To illustrate how to avoid the response set pitfall, the following are four items from the EDIS-S (Privitera & Wallace, 2011), which is used to estimate how much sugar participants consume in their diet. Participants rate each item on a 7-point scale from 1 (completely disagree) to 7 (completely agree). I tend to eat cereals that have sugar in them. I often eat candy to snack on when I am hungry. I tend to crave foods that are high in sugar. I tend to consume a low-sugar diet. Notice that the last item is flipped—if participants eat a lot of sugar, then they would rate on the low end of the scale for this last listed item only. Because a few items in the EDIS-S are flipped like this, responses on this scale are unlikely to result from a response set. However, for the scale to make sense, higher overall ratings must indicate greater daily intake of sugar. The first three items are stated such that higher scores do indicate greater daily intake of sugar. Suppose, for example, a participant rates the first three items a 7. The participant’s total score so far, then, is 7 × 3 items = 21. The fourth item is a reverse coded item, meaning that we need to code responses for the item in reverse order. The participant rates their response on the 1 to 7 scale, but when we score it, we will reverse it to a 7 to 1 scale. 
Hence, a 1 is scored as a 7, a 2 as a 6, a 3 as a 5, a 4 remains a 4, a 5 is scored as a 3, a 6 as a 2, and a 7 as a 1. By doing so, a 7 for the reverse coded item again indicates the highest intake of sugar, and a 1 indicates the lowest intake of sugar—consistent with the scale for the other items in the survey. Returning to our example, suppose that our participant rates the fourth item a 2. We reverse code this item and score it as a 6, then calculate the total score, which is 21 (first three items) + 6 (fourth item) = 27 (total score). Because the fourth item was reverse coded, the survey can now be scored such that higher scores indicate greater daily intake of sugar, and the survey can also be written so as to avoid a response set pitfall. A reverse coded item is an item that is phrased in the semantically opposite direction of most other items in a survey and is scored by coding or entering responses for the item in reverse order from how they are listed. Do not use double-barreled items in a survey. Use Rating Scales Consistently Another rule is to use only one rating scale at a time. In the simplest scenario, use only one scale if possible. The EDIS-S, for example, uses a level of agreement scale for all items in the survey from 1 (completely disagree) to 7 (completely agree). Having only one response scale makes it clear how respondents must respond to all items in a survey. If a survey must use two or more different scales, then the items in the survey should be grouped from one type of scale to the next. Begin with all items for one scale (e.g., items with a scale rated from very dissatisfied to very satisfied); then give directions to clearly indicate a change in the scale for the next group of items (e.g., items with a scale rated from not at all to all the time). Consistent use of rating scales in a survey ensures that participants’ responses reflect their true ratings for each item and not some confusion about the meaning of the scale used. 
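The reverse-coding arithmetic from the EDIS-S example earlier in this section can be sketched in a few lines of Python. This is only an illustrative sketch, not the scoring procedure published with the scale: the function names `reverse_code` and `total_score` are hypothetical, and only the low-sugar item (item 6 in Table 8.1) is treated as reverse coded because that is the one item the example identifies. On a scale running from `low` to `high`, a reversed score is simply `low + high - rating`, so on a 1 to 7 scale a rating of 2 is scored as 6.

```python
def reverse_code(rating, low=1, high=7):
    """Reverse a rating on a low-to-high scale (on 1-7, a 2 becomes a 6)."""
    if not low <= rating <= high:
        raise ValueError(f"rating {rating} is outside the {low}-{high} scale")
    return low + high - rating


def total_score(ratings, reverse_items, low=1, high=7):
    """Sum item ratings, reverse coding the items listed in reverse_items.

    ratings: dict mapping item number -> rating given by the participant
    reverse_items: set of item numbers phrased in the opposite direction
    """
    total = 0
    for item, rating in ratings.items():
        if item in reverse_items:
            total += reverse_code(rating, low, high)
        else:
            total += rating
    return total


# The four EDIS-S items from the example (items 1, 3, 4, and 6 in Table 8.1).
# The participant rates the first three a 7 and the low-sugar item a 2.
ratings = {1: 7, 3: 7, 4: 7, 6: 2}
reverse_items = {6}  # assumption: only the item named in the example is reversed

print(total_score(ratings, reverse_items))  # 21 + 6 = 27
```

Keeping the set of reverse coded items separate from the ratings means the raw responses are stored exactly as participants gave them, and the reversal is applied only at scoring time.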
Be clear about the rating scale(s) used in surveys.

Limit the Points on a Rating Scale

To construct a response scale, keep the scale between 3 and 10 points (Komorita & Graham, 1965; Matell & Jacoby, 1971). Experts in psychometrics, a field involved in the construction of measurement scales, suggest that response scales should have a midpoint or intermediate response level. Having fewer than 3 points on a response scale violates this suggestion, and response scales with more than 10 points can be too confusing. As a general rule, use 3 to 10 points on the rating scale for each item in a survey.

There are two exceptions to the rule of limiting a rating scale to 3 to 10 points. One exception is that a 2-point scale is appropriate for dichotomous scales in which only two responses are possible. For example, dichotomous scales with true/false, yes/no, or agree/disagree as the response options are acceptable. A second exception is that bipolar scales, those that have points above and below a zero point, can be 3 to 10 points above and below the zero point. Hence, a bipolar scale, such as the one shown below with 11 points, can have up to 21 points, or 10 points above and 10 points below zero.

Bipolar scales are response scales that have points above (positive values) and below (negative values) a zero point.

How do you feel about your ability to find a job that will make you happy?
−5 −4 −3 −2 −1 0 1 2 3 4 5
Extremely pessimistic (−5), No opinion (0), Extremely optimistic (5)

Label or Anchor the Rating Scale Points

To clearly indicate what a rating scale means, we can use anchors or adjectives given at the end points of a rating scale. Anchors are often listed below the end points on a rating scale, such as those given for the bipolar scale for the previous rule. Notice also in the bipolar scale that the midpoint is labeled. Indeed, we can include anchors for the end points and label every other point on a scale if we choose.

Anchor or label the end points of a rating scale.
Anchors are adjectives that are given to describe the end points of a rating scale to give the scale greater meaning.

Minimize Item and Survey Length

As a general rule, you want to make sure that each item in a survey is as brief as possible. Being concise is important to ensure that respondents read the full item before responding. Likewise, a survey itself can be too long, although it is difficult to determine or define what constitutes “too long.” The best advice is to write the survey to be as short and concise as possible, yet still able to convey or measure what it is intended to measure.

Keep in mind that participants will fatigue or simply get tired of answering survey items. If this occurs, then a participant may start to “browse” survey items or even make up responses just to “get the survey over with.” This is not to say that all participants will do this, but some will, and we want to avoid this problem of fatigue. The obvious solution is to make the survey and the items in the survey as brief or concise as possible. A survey that takes no longer than 10 to 15 minutes is typically preferred to one that takes an hour to complete. The time to complete a survey tends to be more important than the number of items in the survey. For example, a survey with a few open-ended items may take 15 to 20 minutes to complete, whereas a survey that has 30 restricted items may take only a minute or two to complete. To minimize survey length, then, the key goal is to minimize how long it takes a person to complete all items in a survey, not necessarily to minimize the number of items in the survey per se.

Minimize item and survey length.

Learning Check 2 ✓

Which rule or rules for writing survey items does each of the following items violate? Note: Assume that each item is rated on a 5-point scale from 1 (completely false) to 5 (completely true): I am a likable person and enjoy the company of others.
On a scale from 1 (very unlikely) to 13 (very likely), what are the chances you will win money in a casino? Reverse discrimination against White Americans is a big problem in America. Misogynistic men do not make good boyfriends. True or false: How an item is worded can affect the reliability and validity of responses given for that item. Answers: 1. A. Avoid double-barreled items, B. Limit the points on a rating scale, C. Use neutral or unbiased language, D. Keep it simple and minimize the use of negative wording; 2. True. 8.4 ADMINISTERING SURVEYS Once a survey is constructed, it is administered to participants who will respond to the survey. A survey can be written (in print or electronically) or spoken (such as in an interview). A written survey can be administered in person, by mail, or using the Internet. An interview survey can be administered face to face, by telephone, or in focus groups. Each method of administering a survey is described in this section. Written Surveys In-person surveys. A method that can effectively get participants to respond to a survey is to be physically present while participants complete the survey. The reason that more participants are willing to complete a survey administered in person is that you, the researcher, can be there to explain the survey, observe participants take the survey, and answer any questions they may have while they complete the survey. This method is more time-consuming, however, because it requires the researcher to be present while each and every participant completes the survey. Mail surveys. An alternative that can require less of the researcher’s time is to submit surveys in the mail. However, mail surveys are associated with higher rates of potential respondents choosing not to complete and return them to the researcher. 
Mail surveys can also be costly in terms of both the time it takes to prepare the surveys (e.g., printing and addressing surveys) and the money spent to send them out to potential respondents (e.g., postage stamps and envelopes). The following are four strategies that can increase how many people complete and return a mail survey: Include a return envelope with the return postage already paid. Let potential respondents know in advance that the survey is being sent. Include a cover letter detailing the importance of completing the survey. Include a gift for the potential respondent to keep, such as a pen or gift card. Internet surveys. A popular and cost-effective survey option is to administer surveys online. This option is inexpensive in that a survey can be administered to a large group of potential respondents with little more than a click of a button. Online surveys can be administered via links provided in an e-mail or using online survey construction sites, such as Qualtrics or SurveyMonkey.com (see Dillman, Smyth, & Christian, 2014). The main concern for using online surveys is that the results of these surveys may be limited to individuals who have access to computers with online capabilities, and to individuals who know enough about using computers that they can complete and submit the survey correctly. A written survey can be administered in person, by mail, or using the Internet. Interview Surveys Face-to-face interviews. A researcher could administer a survey orally to one participant at a time or to a small group. The advantage of a face-to-face interview is that the researcher can control how long it takes to complete the survey inasmuch as it is the researcher asking the questions. The drawback of face-to-face interviews is that they require the interviewer to be present for each survey and can be prone to interviewer bias, meaning that the interviewer’s demeanor, words, or expressions in an interview may influence the responses of a participant. 
For this reason, face-to-face interviews, while used in quantitative research, tend to be more commonly applied in qualitative research for which interviewing is a primary method used to describe an individual or a group. Interviewer bias is the tendency for the demeanor, words, or expressions of a researcher to influence the responses of a participant when the researcher and the participant are in direct contact. Telephone interviews. An interview can also be administered via the telephone. Phone interviews can be interpersonal (e.g., the researcher asks the questions) or automated (e.g., computer-assisted technology asks the questions). One advantage of automated telephone interviews is that they can save time and reduce the likelihood of interviewer bias. Another advantage is that telephone surveys can be administered at random by generating telephone numbers at random from within the area or region using random digit dialing. The key disadvantage of telephone interviews is that they often result in few people willingly agreeing to complete the survey. Also, the passage of new laws restricting telephone surveying has made this method of administering surveys less common in the behavioral sciences. Random digit dialing is a strategy for selecting participants in telephone interviews by generating telephone numbers to dial or call at random. Focus-group interviews. Sometimes researchers use surveys that are aimed at getting people to share ideas or opinions on a certain topic or issue. A survey that is structured to get participants to interact is called a focus group, which is a small group of about three to eight people. Questions or survey items in a focus group are mostly open ended, and the researcher plays more of a moderator role than an interviewer role. The goal of a focus group is to get participants talking to each other to get them to share their ideas and experiences on a predetermined topic. The conversations are typically recorded and then analyzed. 
While focus groups can reveal new directions and ideas for a given research topic, they are associated with the same problems mentioned for face-to-face interviews. Each survey administration method described here can vary substantially on how effectively researchers obtain representative samples. Obtaining representative samples is important because surveys are often used for the purpose of learning about characteristics in a population of interest. For example, we sample a few potential voters to identify the candidate who is likely to obtain the most votes in the population, not just among those sampled. Therefore, it is important that the sample we select to complete a survey is representative of the population. Administering a survey in person or face to face can make it more likely that we can obtain a representative sample. Administering the survey by mail, telephone, or Internet, on the other hand, can limit the representativeness of our sample because often only a small proportion of those who receive the survey will respond and actually complete the survey. Issues related to this problem of response rate are discussed in Section 8.5. An interview survey can be administered face-to-face, by telephone, or in focus groups. 8.5 SURVEYS, SAMPLING, AND NONRESPONSE BIAS When administering a survey, it is important to obtain a high survey response rate, which is the portion of participants who agree to complete a survey among all those who were asked to complete the survey. When the response rate is low, the concern is that any results from the survey will be limited to only those people who were actually willing to complete the survey. When the response rate is high, we can be more confident that the sample of those who completed the survey is representative of the larger population of interest. Response rate is the portion of participants who agree to complete a survey among all individuals who were asked to complete the survey. 
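As a quick illustration of this definition, the response rate is the number of people who completed the survey divided by the number who were asked to complete it. The short sketch below uses hypothetical counts (180 completed surveys out of 400 requests), and the function name response_rate is introduced only for this example.

```python
def response_rate(completed: int, asked: int) -> float:
    """Return the proportion of people asked who completed the survey."""
    if asked <= 0:
        raise ValueError("asked must be a positive count")
    if not 0 <= completed <= asked:
        raise ValueError("completed must be between 0 and asked")
    return completed / asked


# Hypothetical counts, chosen only for illustration.
rate = response_rate(completed=180, asked=400)
print(f"Response rate: {rate:.0%}")  # prints "Response rate: 45%"
```

A rate computed this way describes only how many people responded; it says nothing about how respondents differ from those who did not respond.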
Issues related to response rates center on the possibility of a nonresponse bias (see Chapter 5), which occurs when participants choose not to complete a survey or choose not to respond to specific items in a survey. Although at least a 75% response rate should be obtained to minimize bias, the typical response rate to surveys in published peer-reviewed research is less than 50% (Baruch, 1999; Baruch & Holtom, 2008; Shih & Fan, 2008; Stoop, 2015). The problem of low response rates is that people who respond to surveys are probably different from those who do not respond. Because we cannot collect data from people who fail to respond, it is difficult to know the exact characteristics of this group of nonresponders. For this reason, we cannot know for sure whether survey results of those who do respond are representative of the larger population of interest, which includes those who do not respond to surveys. While the low response rates in published research can be problematic, there is good reason for journals to publish these results. Although low response rates can limit the population validity (a subtype of external validity; see Chapter 6) of results from a survey, researchers are not always interested in generalizing results to a population. To establish some external validity, researchers often use survey results to instead generalize to a theory, called theoretical generalization, or generalize to other observations, called empirical generalization. Each type of generalization is illustrated in Figure 8.2, with an example given for each type. As long as survey results are rooted in existing theories and data, researchers “can afford to be lenient [to some extent] about sample quality in academic research” (Blair & Zinkhan, 2006, p. 6). Theoretical generalization is the extent to which results in a survey or another research study are consistent with predictions made by an existing theory. 
Empirical generalization is the extent to which results in a survey or another research study are consistent with data obtained in previous research studies. Figure 8.2 ⦁ Two Types of Generalization for the Results in Survey Research 8.6 Ethics in Focus: Handling and Administering Surveys To show respect for persons, which is a key principle in the Belmont Report and the APA (2020) code of conduct, the researcher has certain ethical responsibilities regarding how to handle and administer surveys in a research study. The following are four responsible and appropriate ways to handle and administer surveys: The survey itself should not be offensive or stressful to the respondents. The respondents should, under reasonable circumstances, be satisfied or comfortable with their survey experience such that they would not feel distress if asked to complete the survey again. If they would feel distress, then the survey may pose potential psychological risks to the respondents. Do not coerce respondents into answering questions or completing a survey. All respondents should be informed prior to completing the survey (typically in an informed consent form) that they can skip or choose not to answer any survey items, or the entire survey, without penalty or negative consequence. Do not harass respondents in any way for recruitment purposes. Because of high nonresponse rates, researchers often actively recruit potential respondents through e-mail or phone call reminders. The potential respondents must not view these recruitment efforts as harassing or intrusive. Protect the confidentiality or anonymity of respondents. Personally identifiable information of respondents should be protected at all times. If the researcher requires respondents to provide personally identifiable information, then such information should be safeguarded while in the possession of the researcher. Learning Check 3 ✓ State three ways that a written survey can be administered. 
State three ways that an interview survey can be administered. Is the typical response rate for survey research that is published in the behavioral sciences less than or greater than 50%? True or false: A survey should not be offensive or stressful to the respondent. Answers: 1. A written survey can be administered in person, by mail, and over the Internet. An interview survey can be administered face to face, by telephone, and in focus groups; 2. Less than 50%; 3. True. CORRELATIONAL DESIGNS In the chapter opening, we stated the following hypothesis: Texting while driving is more prevalent among younger age groups. To answer the hypothesis, we used a survey design to begin this chapter. However, we could use a correlational research design to test this hypothesis as well. To use the correlational design, for example, we could ask a sample of participants who drive to indicate in a questionnaire their age (in years) and how often they use text messaging while driving (per month). If the hypothesis is correct and we set up this study correctly, then we should expect to find that increased texting is associated with younger drivers. Notice that we used survey data (based on responses in a questionnaire) to record the data. Surveys are often used with a correlational research design. However, keep in mind that anytime we use data to determine whether two or more factors are related/correlated, we are using the correlational design, even if we used a survey or questionnaire to record the data. In this final section, we introduce the research design that was illustrated here: the correlational research design. 8.7 THE STRUCTURE OF CORRELATIONAL DESIGNS It is often difficult to determine that one factor causes changes in another factor. 
For example, in the texting-while-driving study used to introduce each major section in this chapter, we cannot reasonably determine that being younger causes drivers to text more while driving because we cannot control for other possible factors that can cause a change in texting behavior. Other possible factors include how often people drive, how busy their daily lives are, how many friends they have, how good of a driver they think they are, or whether they believe texting while driving is dangerous. In these situations, when it is difficult to control for other possible factors that could be causing changes in behavior, we use the correlational research design to determine the extent to which two factors are related, not the extent to which one factor causes changes in another factor. A correlational research design is the measurement of two or more factors to determine or estimate the extent to which the values for the factors are related or change in an identifiable pattern. To set up a correlational research design, we make two or more measurements for each individual observed. For the purposes of introducing the correlational research design, we will introduce situations in which only two measurements are made. Each measurement is for a different variable that we believe is related. For example, one economic factor, income, is related to obesity (Lovasi, Hutson, Guerra, & Neckerman, 2013; S. Newton, Braithwaite, & Akinyemiju, 2017; Su, Esqueda, Li, & Pagán, 2012) in that individuals with lower income tend to be more obese. The correlation establishes the extent to which two factors are related, such that values for one variable (income level) may predict changes in the values of a second variable (severity of obesity). A correlation can be established in any setting. In a naturalistic setting, for example, we could measure the correlation between customer satisfaction in a restaurant and timeliness to serve patrons. 
In a laboratory setting, for example, we could expose participants to a fearful stimulus and record how fearful they rate the stimulus and their corresponding physiological stress response. Using existing data records, we could use legal documents to identify the correlation between duration of marriage (in years) and race, age, socioeconomic status, and any number of other demographic characteristics. In each example, we make two measurements for each individual (or document when using existing records), one measurement for each of the two variables being examined. Once we measure two variables, we then compute a statistical measure called the correlation coefficient to identify the extent to which the values of the two variables or factors are related or change in an identifiable pattern. The correlation coefficient ranges from −1.0 (the values for two factors change in opposite directions) to +1.0 (the values for two factors change in the same direction), and it is used to identify a pattern in terms of the direction and strength of a relationship between two factors; each way of describing the relationship between two factors is introduced in this section. The correlation coefficient is a statistic used to measure the strength and direction of the linear relationship, or correlation, between two factors. The value of r can range from −1.0 to +1.0. In behavioral research, we mostly describe the linear (or straight-line) relationship between two factors. For this reason, we will limit this introduction to the direction and strength of a linear relationship between two factors. 8.8 DESCRIBING THE RELATIONSHIP BETWEEN VARIABLES The direction of a relationship between two factors is described as being positive or negative. The strength of a relationship between two factors is described by the value of the correlation coefficient, r, with values closer to r = ±1.0 indicating a stronger relationship between two factors. 
The direction and strength of correlation can be readily identified in a graph called a scatter plot. To construct a scatter plot (also called a scatter diagram or scatter gram), we plot each pair of values, called data points, along the x-axis and y-axis of a graph to see whether a pattern emerges. A scatter plot, also called a scatter diagram or scatter gram, is a graphical display of discrete data points (x, y) used to summarize the relationship between two factors. Data points are the x- and y-coordinates for each plot in a scatter plot. The extent to which two factors are related is determined by how far data points fall from a regression line when the data points are plotted in a graph. The regression line is the best-fitting or closest-fitting straight line to a set of data points. The best-fitting straight line is the one that minimizes the distance of all data points that fall from it. We will use the regression line to illustrate the direction and strength of the relationship between two factors using the correlational research design. The regression line is the best-fitting straight line to a set of data points. A best-fitting line is the line that minimizes the distance that all data points fall from it. 
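To make the idea of a best-fitting line concrete, the sketch below finds the slope and intercept of a regression line for a small set of data points. It assumes that "distance" means the squared vertical distance between each data point and the line (the ordinary least-squares criterion, a common choice that this section does not spell out). The data values and the function name fit_line are hypothetical and serve only as an illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points.

    Assumption: the best fit minimizes the sum of squared vertical
    distances between the data points and the line.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (how X and Y vary together).
    ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sum of squared deviations for X.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    slope = ss_xy / ss_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data points (x, y), chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = fit_line(xs, ys)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
# prints "slope = 0.60, intercept = 2.20"
```

Plotting the line y = 0.60x + 2.20 over these five data points in a scatter plot would show the points scattered around, but not exactly on, the line.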
The Direction of a Relationship In a scatter plot, a positive correlation means that as values of one factor increase, values of a second factor also increase; as values of one factor decrease, values of a second factor also decrease. If two factors have values that change in the same direction, we can graph the correlation using a straight line. In Figure 8.3, values on the y-axis increase as values on the x-axis increase. A positive correlation is a positive value of r that indicates that the values of two factors change in the same direction: As the values of one factor increase, values of the second factor also increase; as the values of one factor decrease, values of the second factor also decrease. Figure 8.3a shows a perfect positive correlation, which occurs when each data point falls exactly on a straight line, although this is rare. More commonly, as shown in Figure 8.3b, a positive correlation is greater than 0 but less than 1.0, where the values of two factors change in the same direction but not all data points fall exactly on the regression line. Figure 8.3 ⦁ A Perfect Positive (a) and a Positive (b) Linear Correlation Both the table and the scatter plot show the same data for (a) and (b). A negative correlation means that as values of one factor increase, values of the second factor decrease. If two factors have values that change in the opposite direction, we can graph the correlation using a straight line. In Figure 8.4, values on the y-axis decrease as values on the x-axis increase. A negative correlation is a negative value of r that indicates that the values of two factors change in different directions, meaning that as the values of one factor increase, values of the second factor decrease. Figure 8.4a shows a perfect negative correlation, which occurs when each data point falls exactly on a straight line, although this is also rare. 
More commonly, as shown in Figure 8.4b, a negative correlation is greater than −1.0 but less than 0, where the values of two factors change in the opposite direction but not all data points fall exactly on the regression line. The Strength of a Relationship A zero correlation (r = 0) means that there is no linear pattern or relationship between two factors. This outcome is rare because usually by mere chance at least some values of one factor, X, will show some pattern or relationship with values of a second factor, Y. The closer a correlation coefficient is to r = 0, the weaker the correlation and the less likely that two factors are related; the closer a correlation coefficient is to r = ±1.0, the stronger the correlation and the more likely that two factors are related. The pattern of a set of data points can indicate the extent to which two factors are related. A positive correlation is given with a plus (+) sign; a negative correlation is given with a minus (−) sign. Figure 8.4 ⦁ A Perfect Negative (a) and a Negative (b) Linear Correlation Both the table and the scatter plot show the same data for (a) and (b). The closer a set of data points falls to a regression line, the stronger the correlation. The strength of a correlation reflects how consistently values for each factor change. When plotted in a graph, a stronger correlation means that the values for each factor change in a related pattern: the data points fall closer to a regression line, or the straight line that best fits a set of data points. Figure 8.5 shows two positive correlations between exercise (Factor X) and body image satisfaction (Factor Y), and Figure 8.6 shows two negative correlations between absences in class (Factor X) and quiz grades (Factor Y). In both figures, the closer a set of data points falls to the regression line, the stronger the correlation; hence, the closer a correlation coefficient is to r = ±1.0. 
The Correlation Coefficient The most commonly used formula for computing r is the Pearson correlation coefficient, which is used to determine the strength and direction of the relationship between two factors on an interval or a ratio scale of measurement. Alternative formulas for computing a correlation with many scales of measurement exist, as identified in Table 8.2 (also given in Chapter 14, Table 14.2); however, each of these alternative formulas was derived from the formula for the Pearson correlation coefficient, so only the Pearson formula will be described in this section. The formula for the Pearson correlation coefficient is a measure of the variance of data points from a regression line that is shared by the values of two factors (X and Y), divided by the total variance measured:

$$ r = \frac{\text{Variance shared by } X \text{ and } Y}{\text{Total variance measured}} $$

The Pearson correlation coefficient is used to measure the direction and strength of the linear relationship of two factors in which the data for both factors are on an interval or a ratio scale of measurement. Figure 8.6 ⦁ The Consistency of Scores for a Negative Correlation Both figures show approximately the same regression line, but the data points in (b) are more consistent because they fall closer to the regression line than in (a). The correlation coefficient, r, measures the variance of X and the variance of Y, which constitutes the total variance that can be measured. The total variance is placed in the denominator of the formula for r. The variance in the numerator, called covariance, is the amount or proportion of the total variance that is shared by X and Y. The larger the covariance, the closer data points will fall to the regression line. 
When all data points for X and Y fall exactly on a regression line, the covariance equals the total variance, making the formula for r equal to +1.0 or −1.0, depending on the direction of the relationship between two factors. The farther that data points fall from the regression line, the smaller the covariance will be compared with the total variance in the denominator, resulting in a value of r closer to 0. Covariance is the extent to which the values of two factors (X and Y) vary together. The closer data points fall to the regression line, the more the values of two factors vary together.
Table 8.2 ⦁ The Scales of Measurement for Factors Tested Using Correlation Coefficients
Correlation Coefficient: Scale of Measurement for Correlated Variables
Pearson: Both factors are interval or ratio data.
Spearman: Both factors are ranked or ordinal data.
Point-Biserial: One factor is dichotomous (nominal data), and a second factor is continuous (interval or ratio data).
Phi: Both factors are dichotomous (nominal data).
Figure 8.7 ⦁ Covariance Between X and Y Each circle represents the variance of a factor. The variances of two factors covary inasmuch as the two circles overlap. The more overlap or shared variance of two factors, the more the two factors are related. If we conceptualize covariance as circles, as illustrated in Figure 8.7, then the variance of each factor (X and Y) will be contained within each circle. The two circles, then, contain the total measured variance. The covariance of X and Y reflects the extent to which the total variance or the two circles overlap. In terms of computing r, the overlap or covariance is placed in the numerator; the total variance contained within each circle is placed in the denominator. The more the two circles overlap, the more the covariance (in the numerator) will equal the independent variances contained within each circle (in the denominator), and the closer r will be to ±1.0. 
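The conceptual formula for r can be made concrete with one common computational form, in which the numerator is the sum of products of deviations for X and Y (the shared variability) and the denominator is the square root of the product of the sums of squared deviations for X and for Y (the total variability). This computational form is standard for the Pearson correlation coefficient, but it is not given in this section, so treat it as an assumption of the sketch below. The data values and the function name pearson_r are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: shared variability over total variability."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: how much X and Y vary together (covariance term).
    ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: variability of X and of Y measured separately.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return ss_xy / math.sqrt(ss_x * ss_y)


# Hypothetical data: points scattered around a line.
xs = [1, 2, 3, 4, 5]
ys_scattered = [2, 4, 5, 4, 5]
# Hypothetical data: every point falls exactly on a line.
ys_perfect = [2, 4, 6, 8, 10]

r_scattered = pearson_r(xs, ys_scattered)
r_perfect = pearson_r(xs, ys_perfect)
print(f"scattered: r = {r_scattered:+.3f}")  # prints "scattered: r = +0.775"
print(f"perfect:   r = {r_perfect:+.3f}")    # prints "perfect:   r = +1.000"
```

In the perfect case the shared variability equals the total variability, so r = +1.0; in the scattered case the numerator is smaller than the denominator, so r falls between 0 and +1.0.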
Learning Check 4 ✓ 1. The value of the __________________ provides an estimate of the strength and direction of the relationship between two factors. 2. A professor measures a negative correlation between time spent partying and grades. Interpret this result. 3. A researcher records a correlation of r = +.02. A. Identify the direction of this correlation. B. Identify the strength of this correlation. 4. How will the data points appear in a graph for two factors with values that change consistently? Answers: 1. correlation coefficient (r); 2. As time spent partying increases, grades decrease; 3. A. The direction of the correlation is positive, B. The strength of the correlation is weak because .02 is close to 0; 4. The data points will fall close to the regression line. 8.9 LIMITATIONS IN INTERPRETATION Fundamental limitations using the correlational method require that a significant correlation be interpreted with caution. Among the many considerations for interpreting a significant correlation, in this section we consider causality, outliers, and restriction of range. Causality Using a correlational design, we do not manipulate an independent variable, and we certainly make little effort to control for other possible factors that may also vary with the two variables we measured. For this reason, a significant correlation does not show that one factor causes changes in a second factor (i.e., causality). To illustrate, suppose we measure a significant negative correlation between the self-rated mood of participants and the amount of food they eat daily (in calories per day). We will look at four possible interpretations for this correlation: Decreases in how people feel (mood) can cause an increase in the amount they eat (eating). This possibility cannot be ruled out. Increases in the amount people eat (eating) can cause a decrease in how people feel (mood). So the direction of causality can be in the opposite direction. 
Hence, instead of changes in mood causing changes in eating, maybe changes in eating cause changes in mood. This possibility, called reverse causality, cannot be ruled out either. The two factors could be systematic, meaning that they work together to cause a change. If two factors are systematic, then Conclusions 1 and 2 could both be correct. The worse people feel, the more they eat, and the more people eat, the worse they feel. This possibility, that each factor causes the other, cannot be ruled out either. Changes in both factors may be caused by a third unanticipated confound or confound variable. Perhaps biological factors, such as increased parasympathetic activity, make people feel worse and increase how much they want to eat. So, it is increased parasympathetic activity that could be causing changes in both mood and eating. This confound variable and any number of additional confound variables could be causing changes in mood and eating and cannot be ruled out either. Reverse causality is a problem that arises when the direction of causality between two factors can be in either direction. Correlation does not demonstrate cause. Reverse causality occurs when the direction of causality for two factors, A and B, cannot be determined. Hence, changes in Factor A could cause changes in Factor B, or changes in Factor B could cause changes in Factor A. A confound or confounded variable is an often unanticipated variable not accounted for in a research study that could be causing or associated with observed changes in one or more measured variables. Figure 8.8 summarizes each possible explanation for an observed correlation between mood and eating. The correlational design cannot distinguish between these four possible explanations. Instead, a significant correlation shows that two factors are related. It does not provide an explanation for how or why they are related. 
Figure 8.8 ⦁ Four Potential Explanations for a Significant Correlation Because factors are measured, but not manipulated using the correlational method, any one of these possibilities could explain a significant correlation. Figure 8.9 ⦁ The Effects of an Outlier (a) The graph displays a typical correlation between income and education, with more education being associated with higher income. (b) The graph shows the same data with an additional outlier of a child movie star who earns $1 million. The inclusion of this outlier changed the direction and the strength of the correlation. Outliers Another limitation that can obscure the correlation or relationship between two factors is when an outlier is in the data. An outlier is a score that falls substantially above or below most other scores in a data set and can alter the direction and the strength of an observed correlation. Figure 8.9a shows data for the relationship between income and education without an outlier in the data. Figure 8.9b shows how an outlier, such as the income earned by a child movie star, changes the relationship between two factors. Notice in Figure 8.9 that the outlier changed both the direction and the strength of the correlation. An outlier is a score that falls substantially above or below most other scores in a data set. Outliers can change the strength and the direction of a correlation or relationship between two factors. Restriction of Range When interpreting a correlation, it is also important to avoid making conclusions about relationships that fall beyond the range of data measured. The restriction of range problem occurs when the range of data measured in a sample is restricted or smaller than the range of data in the general population. 
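Both of these cautions, outliers and restriction of range, can be checked numerically. The sketch below is only an illustration: all data values are hypothetical (they are not the data behind the figures in this chapter), and the function pearson_r uses a standard computational form of the Pearson correlation coefficient that is assumed here rather than taken from this section.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation computed from sums of squared deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return ss_xy / math.sqrt(ss_x * ss_y)


# Outlier: hypothetical years of education and income (in $1,000s).
education = [10, 12, 14, 16, 18]
income = [28, 41, 52, 58, 71]
r_typical = pearson_r(education, income)

# Add one hypothetical child movie star: 6 years of education, $1 million.
r_with_outlier = pearson_r(education + [6], income + [1000])

# Restriction of range: hypothetical population with a positive trend overall.
population_x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
population_y = [1, 2, 3, 5, 4, 3, 7, 8, 9]
r_population = pearson_r(population_x, population_y)

# A sample that covers only the middle of the range (x from 4 to 6).
r_restricted = pearson_r(population_x[3:6], population_y[3:6])

print(f"without outlier:  r = {r_typical:+.2f}")
print(f"with outlier:     r = {r_with_outlier:+.2f}")
print(f"full range:       r = {r_population:+.2f}")
print(f"restricted range: r = {r_restricted:+.2f}")
```

With these made-up values, a single extreme score is enough to turn a strong positive correlation into a negative one, and a sample drawn from a narrow slice of the range shows a negative correlation even though the full set of data is positively correlated.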
Restriction of range is a problem that arises when the range of data for one or both correlated factors in a sample is limited or restricted, compared with the range of data in the population from which the sample was selected.

Figure 8.10 shows how the range of data measured in a sample can lead to erroneous conclusions about the relationship between two factors in a given population. This figure shows the positive correlation for a hypothetical population (top graph) and the correlations in three possible samples we could select from this population (smaller bottom graphs). Notice that, depending on the range of data measured, we could identify a positive correlation, a negative correlation, or zero correlation from the same population, although the data in the population are actually positively correlated. To avoid the problem of restriction of range, the direction and the strength of a correlation should only be generalized to a population within the limited range of measurements observed in the sample.

Figure 8.10 ⦁ The Effects of Restriction of Range
In this population, shown in the top graph, there is a positive correlation between two factors (r = +.855). Also depicted are three possible samples we could select from this population. Range PC shows a positive correlation (r = +.891), Range ZC shows a zero correlation (r = 0), and Range NC shows a negative correlation (r = −.598), all within the same population. Because different ranges of data within the same population can show very different patterns, correlations should never be interpreted beyond the range of data measured in a sample.

8.10 CORRELATION, REGRESSION, AND PREDICTION

The correlation coefficient, r, is used to measure the extent to which two factors (X and Y) are related. The value of r indicates the direction and strength of a correlation.
When r is negative, the values of two factors change in opposite directions; when r is positive, the values of two factors change in the same direction.
The closer r is to ±1.0, the stronger the correlation, and the more closely two factors are related. We can use the information provided by r to predict values of one factor, given known values of a second factor. Recall that the strength of a correlation reflects how closely a set of data points fits to a regression line (the straight line that most closely fits a set of data points). We can use the value of r to compute the equation of a regression line and then use this equation to predict values of one factor, given known values of a second factor in a population. This procedure is called linear regression (also called regression).

Linear regression, also called regression, is a statistical procedure used to determine the equation of a regression line to a set of data points and to determine the extent to which the regression equation can be used to predict values of one factor, given known values of a second factor in a population.

To use linear regression, we identify two types of variables: the predictor variable and the criterion variable. The predictor variable (X) is the variable with values that are known and can be used to predict values of the criterion variable; the predictor variable is plotted on the x-axis of a graph. The criterion variable (Y) is the to-be-predicted variable with unknown values that we are trying to predict or estimate, given known values of the predictor variable; the criterion variable is plotted on the y-axis of a graph.

If we know the equation of the regression line, we can predict values of the criterion variable, Y, so long as we know values of the predictor variable, X. To make use of this equation, we identify the following equation of a straight line:
Y = bX + a

In this equation, Y is a value we plot for the criterion variable, X is a value we plot for the predictor variable, b is the slope of a straight line, and a is the y-intercept (where the line crosses the y-axis). Given a set of data, researchers can find the values of a and b and then use the equation they found to predict outcomes of Y.

To illustrate the use of the regression line to predict outcomes, consider a study conducted by Chen, Dai, and Dong (2008). In this study, participants completed a revised version of the Aitken Procrastination Inventory (API), and their level of procrastination was recorded. The researchers found that the following regression equation could be used to predict procrastination (Y) based on known scores on the API (X):

Ŷ = 0.146X − 2.922

In this equation, Ŷ is the predicted value of Y given known scores on the API, a = −2.922, and b = 0.146. Using this information, we could have a student complete the API, plug their API score into the equation for X, and solve for Ŷ to find the procrastination level we predict for that student.

The advantage of using linear regression is that we can use the equation of the regression line to predict how people will behave or perform. A caution of using this procedure, however, is that weaker correlations, or those closer to r = 0, will produce less accurate predictions using the equation of the regression line because the data points will fall far from it. Likewise, the stronger the correlation, or the closer to r = ±1.0, the more accurate the predictions made using the equation of the regression line because the data points will fall closer to it.

A correlation cannot describe data beyond the range of data observed in a sample.

The equation of the regression line can be used to predict outcomes of a criterion variable.

Learning Check 5 ✓
A correlational design does not demonstrate cause. Why?
True or false: An outlier can influence both the direction and the strength of an observed correlation.
__________ occurs when the range of data for one or both correlated factors in a sample is limited or restricted, compared with the range of data in the population from which the sample was selected.
What procedure is used to predict outcomes of one factor given known values of a second factor?
Answers: 1. Because we do not manipulate an independent variable, and we make little effort to control for other possible factors that may also vary with the two variables we measured; 2. True; 3. Restriction of range; 4. Linear regression.

8.11 SPSS in Focus: Correlation and Linear Regression

The correlational design will likely require the use of a correlation coefficient or linear regression to statistically analyze measured data. In this section, we describe how to compute each type of statistic using SPSS.

Pearson Correlation Coefficient

To compute a Pearson correlation coefficient using SPSS, suppose we test the hypothesis that greater mobile phone use is associated with increased stress, as has been tested in published research studies (see Lee, 2016; Murdock, Gorman, & Robbins, 2015). To measure mobile phone use, we can use the 27-item Mobile Phone Problem Use Scale (MPPUS; Bianchi & Phillips, 2005). To measure stress, we can use the 10-item Perceived Stress Scale (PSS; S. Cohen & Williamson, 1988). Using these measures, we will enter the data shown in Figure 8.11 to compute a Pearson correlation coefficient using SPSS.

Click on the Variable View tab and enter MPPUS in the Name column; enter PSS in the Name column below it. Reduce the value to 0 in the Decimals column for both rows. By default, SPSS should identify the variables, MPPUS and PSS, as being scaled, which they are, so there is no need to change this in the Measure column. Click on the Data View tab.
Enter the data for MPPUS in the first column; enter the data for PSS in the second column. Go to the menu bar and click Analyze, then Correlate, and then Bivariate to display a dialog box. Using the arrows, move both variables into the Variables box. Select OK or select Paste and click the Run command.

Figure 8.11 ⦁ A Table and Scatter Plot Showing the Relationship Between Mobile Phone Use and Perceived Stress
The regression line is given in the scatter plot. Both the table and the scatter plot show the same data.

The SPSS output table, shown in Table 8.3, is set up in a matrix with MPPUS and PSS listed in the rows and columns. Each cell in the matrix gives the direction and strength of the correlation (r = .540 for mobile phone use and stress; this value is shown with an asterisk for significant correlations), the significance (p = .014; how to interpret a p value is described in Chapters 10–12 and 14), and the sample size (N = 20). To make a decision, if a correlation is significant, then the decision is that the correlation observed in the sample will also be observed in the larger population from which the sample was selected.

We can report a correlation in a research journal using the guidelines provided in the Publication Manual of the American Psychological Association (APA, 2020). Using these guidelines, we state the value of r, the p value, and the sample size as shown:

A Pearson correlation indicates that greater mobile phone use is associated with greater perceived stress, r = .540, p = .014, N = 20.

Linear Regression

For situations in which we want to know whether values for one factor predict values for a second factor, we use linear regression. As an example, suppose we conduct a test similar to that computed by Privitera and Wallace (2011). In that study, the researchers tested whether scores on the 11-item EDIS-S predicted how much people like the taste of sugar.
To measure liking for sugar, participants drank sugar water and rated how much they liked the taste on a 100-millimeter line scale in which higher ratings indicated greater liking for the sugar water. Using data similar to those observed in the study by Privitera and Wallace (2011), we will enter the data shown in Figure 8.12 to compute linear regression using SPSS.

Click on the Variable View tab and enter EDISS in the Name column; enter liking in the Name column below it. Reduce the value to 0 in the Decimals column for both rows. By default, SPSS should identify the variables, EDISS and liking, as being scaled, which they are, so there is no need to change this in the Measure column. Click on the Data View tab. Enter the data for the EDIS-S in the first column; enter the data for liking in the second column. Go to the menu bar and click Analyze, then Regression, then Linear . . . to display a dialog box. Using the arrows, move the criterion variable, liking, into the Dependent box; move the predictor variable, EDISS, into the Independent(s) box. Select OK or select Paste and click the Run command.

Figure 8.12 ⦁ A Table and Scatter Plot Showing the Relationship Between EDIS-S Scores and Liking for Sugar Water
The regression line is given in the scatter plot. Both the table and the scatter plot show the same data.

Table 8.4 ⦁ SPSS Output for Linear Regression

The SPSS output table, shown in Table 8.4, displays three ways to analyze the data. The top table shows the proportion of variance, R² = .797, which is an estimate for how well EDIS-S scores predict ratings of liking for the sugar water. Values closer to 1.0 indicate better predictions. Results for the regression analysis are given in the middle table. Based on the results in that table, we conclude that EDIS-S scores (the predictor variable) do significantly predict a liking for sugar water (the criterion variable), as indicated by the p value in the Sig.
column (how to interpret a p value is described further in Chapters 10–12 and 14). To make a decision, if a regression analysis is significant, then the decision is that the predictive relationship observed in the sample will also be observed in the larger population from which the sample was selected.

To determine the direction of the relationship between EDIS-S scores and liking for the sugar water, we look at the standardized beta coefficient given in the bottom table. In this example, the beta (β) coefficient is positive, β = +.893, indicating that higher scores on the EDIS-S predict higher ratings for the sugar water.

Based on guidelines in the Publication Manual of the American Psychological Association (APA, 2020), we report the results of a regression analysis with one predictor variable by including the value of R² (top table), the value of β (bottom table), and the results of the regression analysis (middle table; reported as an F value) as shown:

A regression analysis showed that EDIS-S scores significantly predicted ratings of liking for sugar water, β = +.893, F(1, 13) = 50.978, p < .001 (R² = .797).

Testing the Assumptions of Parametric Testing

Note that a Pearson correlation test is a parametric statistic. This means that it can only be used when the data being tested are on an interval or ratio scale. Additionally, this test has the following assumptions, all of which must be met:

Normality. We assume that the data points are normally distributed, such that (a) the population of X and Y scores are normal; (b) for each X score, the distribution of Y scores is normal; and (c) for each Y score, the distribution of X scores is normal.

Linearity. We assume that the best way to describe a pattern of data is using a straight line, as opposed to other shapes such as curvilinear shapes.

Homoscedasticity. We assume that there is an equal (“homo”) variance or scatter (“scedasticity”) of data points dispersed along the regression line.
When the variance of data points along the regression line is not equal, the Pearson correlation coefficient (r) tends to underestimate the strength of a correlation.

Likewise, a regression analysis test is a parametric statistic in that it can only be used when the data being tested are on an interval or ratio scale. Additionally, this test has the following assumptions, all of which must be met:

Normality of errors. We assume that the errors for each individual value of X (the predictor variable) are normally distributed.

Independence of errors. The residuals (or errors) between the observed value and the predicted value should be independent (i.e., not significantly correlated) for each value of the predictor variable.

Linearity. We assume that the best way to predict the value of Y (the variable we are trying to predict) given known values of X (the predictor) is using the equation for a straight line, as opposed to equations for other shapes such as curvilinear shapes.

Homoscedasticity. We again assume that there is an equal (“homo”) variance or scatter (“scedasticity”) of data points dispersed along the regression line.

Keep in mind that satisfying the assumptions for the correlation and regression analysis is critically important. For each example using SPSS in this chapter, the data are given such that the assumptions for conducting the tests were met. That said, always take care to test the assumptions when using parametric tests. Ensuring that the assumptions for a given parametric test are met is a critical step to optimize the accuracy of the conclusions you draw from such testing. See also Table 14.2 in Chapter 14 (p. 400) for a summary of alternative tests to the Pearson correlation when the data being tested are not on an interval or ratio scale.

Chapter Summary

LO 1 Identify and construct open-ended, partially open-ended, and restricted survey items.
An open-ended item is a question or statement in a survey that allows the respondent to give any response in their own words, without restriction. This type of question is most often used in qualitative research.

A partially open-ended item is a question or statement in a survey that includes a few restricted answer options and then a last option that allows participants to respond in their own words in case the few restricted options do not fit with the answer they want to give.

A restricted item is a question or statement in a survey that includes a restricted number of answer options to which participants must respond. This type of question is most often used in quantitative research.

LO 2 Identify 10 rules for writing valid and reliable survey items.

Ten rules for writing valid and reliable survey items are as follows:
1. Keep it simple.
2. Avoid miscategorizing response items.
3. Avoid double-barreled items.
4. Use neutral or unbiased language.
5. Minimize the use of negative wording.
6. Avoid the response set pitfall.
7. Use rating scales consistently.
8. Limit the points on a rating scale.
9. Label or anchor the rating scale points.
10. Minimize item and survey length.

LO 3 Describe methods of administering written surveys and interview surveys.

A survey can be written (in print or electronically) or spoken (such as in an interview). A written survey can be administered in person, by mail, or using the Internet. An interview survey can be administered face to face, by telephone, or in focus groups. In-person and face-to-face surveys have the best response rates. In addition, written surveys are preferred to interview surveys in quantitative research partly because interviews are prone to a possible interviewer bias.

LO 4 Explain how response rates to surveys can limit the interpretation of survey results.

The problem of low response rates is that people who respond to surveys are probably different from those who do not respond.
Because we cannot collect data from people who fail to respond, it is difficult to know the exact characteristics of this group of nonresponders. For this reason, we cannot know for sure whether survey results of those who do respond are representative of the larger population of interest, which includes those who do not respond to surveys.

LO 5 Identify how to appropriately handle and administer surveys.

To appropriately handle and administer surveys, make sure that the survey itself is not offensive or stressful to the respondent; do not coerce respondents into answering questions or completing a survey; do not harass respondents in any way for recruitment purposes; and protect the confidentiality or anonymity of respondents.

LO 6 Identify and describe the direction and strength of a correlation.

The correlation coefficient, r, is used to measure the extent to which two factors (X and Y) are related. The value of r indicates the direction and strength of a correlation. When r is negative, the values for two factors change in opposite directions; when r is positive, the values for two factors change in the same direction. The closer r is to ±1.0, the stronger the correlation, and the more closely two factors are related.

When plotted in a graph, the strength of a correlation is reflected by the distance that data points fall from the regression line. The closer that data points fall to a regression line, or the straight line that best fits a set of data points, the stronger the correlation or relationship between two factors.

LO 7 Explain how causality, outliers, and restriction of range can limit the interpretation of a correlation coefficient.
Three considerations that must be made to accurately interpret a correlation coefficient are as follows: (1) correlations do not demonstrate causality, (2) outliers can change the direction and the strength of a correlation, and (3) never generalize the direction and the strength of a correlation beyond the range of data measured in a sample (restriction of range).

LO 8 Explain how linear regression can be used to predict outcomes.

We can use the information provided by r to predict values of one factor, given known values of a second factor using a procedure called linear regression. Specifically, we can use the value of r to compute the equation of a regression line and then use this equation to predict values of one factor, given known values of a second factor in a population. Using the following equation of the regression line, Y = bX + a, we can predict values of the criterion variable, Y, so long as we know values of the predictor variable, X.

LO 9 Compute the Pearson correlation coefficient and linear regression using SPSS.

SPSS can be used to compute the Pearson correlation coefficient using the Analyze, Correlate, and Bivariate options in the menu bar. These actions will display a dialog box that allows you to identify the variables and to run the correlation (for more details, see Section 8.11).

SPSS can be used to compute linear regression using the Analyze, Regression, and Linear . . . options in the menu bar. These actions will display a dialog box that allows you to identify the variables and to run the linear regression (for more details, see Section 8.11).
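For readers who want to see the arithmetic behind the correlation coefficient and the regression line, the sketch below computes both by hand in Python. It is not the SPSS procedure described in Section 8.11, and the five pairs of scores are hypothetical rather than data from the chapter. The function names (sums_of_squares, pearson_r, regression_line, predict) are assumptions introduced for illustration; the slope and intercept use the standard least-squares formulas for a line of the form Ŷ = bX + a.

```python
from math import sqrt


def sums_of_squares(xs, ys):
    """Return (SP, SS_x, SS_y): the sum of products and sums of squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp, ss_x, ss_y


def pearson_r(xs, ys):
    """Direction and strength of the linear relationship (-1.0 to +1.0)."""
    sp, ss_x, ss_y = sums_of_squares(xs, ys)
    return sp / sqrt(ss_x * ss_y)


def regression_line(xs, ys):
    """Return (b, a), the slope and y-intercept of Y-hat = bX + a."""
    sp, ss_x, _ = sums_of_squares(xs, ys)
    b = sp / ss_x
    a = sum(ys) / len(ys) - b * (sum(xs) / len(xs))
    return b, a


def predict(x, b, a):
    """Predicted criterion value (Y-hat) for a known predictor value x."""
    return b * x + a


# Hypothetical scores, not data from the chapter.
predictor = [1, 2, 3, 4, 5]   # X: known values
criterion = [2, 4, 5, 4, 5]   # Y: values we want to predict

r = pearson_r(predictor, criterion)
b, a = regression_line(predictor, criterion)

print(f"r = {r:.3f}")
print(f"Y-hat = {b:.3f}X + {a:.3f}")

# Predict only within the range of X observed in the sample (1 to 5),
# in keeping with the restriction of range caution.
print(f"Predicted Y for X = 3.5: {predict(3.5, b, a):.3f}")
```

The prediction is made for a value of X inside the range of scores in the sample because, as noted under LO 7, the relationship should not be generalized beyond the range of data measured.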