I am sure you have all seen or answered a survey question with the following (or at least similar) response options:
- [1] Strongly Disagree
- [2] Disagree
- [3] Neither Agree Nor Disagree
- [4] Agree
- [5] Strongly Agree
To answer the question, respondents select the response option that best reflects their level of agreement (or disagreement) with a statement. This is a well-known example of a 5-point Likert scale [1], named after its inventor, Rensis Likert. Likert scales are widely used by survey researchers in many disciplines (e.g., psychology, education, public health, business, and marketing) to learn about people's attitudes toward phenomena of interest, such as customer satisfaction (see SurveyMonkey for sample survey questions).
Despite the widespread use of Likert scales in survey research, researchers are often unaware of the pitfalls and caveats that can significantly affect how survey data should be analyzed and interpreted. In this post, I will summarize important points to consider when analyzing survey data collected via Likert scales.
Ordinal or Interval?
Response options in a Likert scale question are ordered according to the phenomenon being measured (e.g., from least to most agreement), and thus they are ordinal.
Now let’s assume that we are responding to a question in a hotel satisfaction survey: "My room was clean". If the survey is using a 4-point Likert scale, the response options would be:
(1) Strongly Disagree – (2) Disagree – (3) Agree – (4) Strongly Agree
If we select Strongly Agree, then our response would indicate that our level of agreement is higher than other respondents who might select Agree, Disagree, or Strongly Disagree for "My room was clean".
We might also be tempted to interpret the level of our agreement based on the numerical values assigned to each response option. However, these numerical values are often arbitrarily determined by survey researchers as a sequence of integers (e.g., 1 to 4). These values simply indicate the "order" of response options, not the strength of agreement. Therefore, the distances between response options cannot be presumed equal based on the values assigned to the response options.
For example, respondents may perceive (4) Strongly Agree and (3) Agree very similarly, so the difference between these two options might be much smaller than the difference between (3) Agree and (2) Disagree, even though the assigned values are equally spaced.
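To make this concrete, here is a small Python illustration with invented responses: two codings that preserve the same order but space the options differently can lead to opposite conclusions about which group "agrees more" on average.

```python
# Hypothetical responses: Group A always selects "Agree"; Group B splits
# between "Strongly Agree" and "Disagree".
groups = {
    "Group A": ["Agree"] * 4,
    "Group B": ["Strongly Agree", "Strongly Agree", "Disagree", "Disagree"],
}

# Two codings with the same order but different spacing between options.
coding_small_top_gap = {"Strongly Disagree": 1, "Disagree": 2, "Agree": 6, "Strongly Agree": 7}
coding_large_top_gap = {"Strongly Disagree": 1, "Disagree": 2, "Agree": 3, "Strongly Agree": 8}

for name, responses in groups.items():
    m1 = sum(coding_small_top_gap[r] for r in responses) / len(responses)
    m2 = sum(coding_large_top_gap[r] for r in responses) / len(responses)
    print(f"{name}: mean = {m1:.2f} (coding 1) vs. {m2:.2f} (coding 2)")

# Group A "wins" under coding 1 (6.00 vs. 4.50) but "loses" under coding 2
# (3.00 vs. 5.00), even though the underlying responses are identical.
```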

The primary implication of accepting the ordinal nature of Likert scales is that the numerical values assigned to the response options cannot be treated as interval data. Consequently, statistics that assume interval-level measurement (e.g., mean, standard deviation) and methods built on them (e.g., summing individual questions into a total survey score, running a linear regression on survey scores) would not necessarily yield valid results. Options better suited to ordinal data from Likert scales include summary statistics such as the median and mode, statistical methods such as ordinal regression, the chi-square test of independence, and item response theory (IRT) modeling, and graphical tools such as bar charts and correlation matrix plots.
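As a minimal sketch of what such an ordinal-friendly analysis could look like, here is a Python example using pandas and SciPy. The data, the room_clean item, and the repeat_guest variable are all hypothetical, invented for illustration.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical responses to "My room was clean" on a 4-point scale, stored as
# an ordered categorical so the codes convey order only.
levels = ["Strongly Disagree", "Disagree", "Agree", "Strongly Agree"]
df = pd.DataFrame({
    "room_clean": pd.Categorical(
        ["Agree", "Strongly Agree", "Disagree", "Agree",
         "Agree", "Strongly Disagree", "Strongly Agree", "Agree"],
        categories=levels, ordered=True),
    "repeat_guest": ["yes", "yes", "no", "no", "yes", "no", "yes", "no"],
})

# Ordinal-appropriate summaries: mode and median.
print("Mode:", df["room_clean"].mode()[0])
print("Median:", levels[int(df["room_clean"].cat.codes.median())])

# Chi-square test of independence: is the response distribution related to
# whether the respondent is a repeat guest?
table = pd.crosstab(df["room_clean"], df["repeat_guest"])
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi-square = {chi2:.2f}, p = {p:.3f}")

# For ordinal (proportional-odds) regression, see
# statsmodels.miscmodels.ordinal_model.OrderedModel.
```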
If you are interested in analyzing and visualizing survey data, you can check out my recent course, Analyzing Survey Data with R, on Pluralsight.
Neutral Option
Sometimes Likert scales include a "middle" response category that offers a neutral option, typically labeled with phrases such as "Neutral", "Neither Agree nor Disagree", or "No Opinion". Including a neutral option may increase the accuracy of survey data because respondents who do not have a strong preference can select it, instead of randomly selecting a response option or skipping the question.
However, including a neutral option comes at a price. Likert scales assume that the target phenomenon is measured on a linear continuum, typically going from negative (e.g., strongly disagree) to positive (e.g., strongly agree) [2]. Therefore, the position of the neutral option on this continuum leads to a scaling problem.
Research shows that respondents often see the visual midpoint of a scale as representing the middle response option [3]. Therefore, the neutral option is often placed in the middle of the response options, such as:
- Strongly Disagree
- Disagree
- Neither Agree Nor Disagree
- Agree
- Strongly Agree
Alternatively, the neutral option can be presented separately at the end of the scale (e.g., after "Strongly Agree") to distinguish it from the other response options. To avoid the scaling problem altogether, survey researchers should either use a Likert scale with an even number of response options and no neutral option, or choose survey questions for which respondents are unlikely to default to the neutral option.
Number of Response Options
Survey researchers often assume that the more response options a Likert scale includes, the more precise the resulting measurements will be. But what is the optimal number of response options for a Likert scale?
Although 5-point Likert scales are widely used in survey research, some researchers prefer to use 6 or more response options in their surveys. For example, a 7-point agreement scale would be as follows (see Vagias [2006] for more examples of Likert scales):
- (1) Strongly disagree
- (2) Disagree
- (3) Somewhat disagree
- (4) Neither agree nor disagree
- (5) Somewhat agree
- (6) Agree
- (7) Strongly agree
There is a limit to how precisely respondents can understand and distinguish the response options on a Likert scale, so survey researchers must consider whether all of the response options can be clearly understood by respondents. Research suggests that measurement precision improves as the number of response options increases from 2 up to around 6, but that there is no clear advantage to using more than 6 response options [4].
Negative or Positive Wording?
Survey questions can be phrased either negatively or positively. Negative questions differ in direction from most other questions in the survey. Negative wording is typically accomplished by using negative words or negating a question with no/not. For example, here are two survey questions with positive and negative wording:
Positive: My room was clean.
Negative: My room was dirty. (or, "My room was not clean.")
Respondents who thought their room was clean would be expected to select either "Agree" or "Strongly Agree" for the positive question and either "Disagree" or "Strongly Disagree" for the negative question.
Sometimes survey researchers use both positively and negatively worded questions within the same survey to prevent response bias. However, using both positively and negatively worded questions together has some pitfalls. First, positively and negatively worded questions are not necessarily mirror images of each other [5]. Therefore, when analyzing survey data, reverse-coding the Likert scale for negatively worded questions (e.g., 1-Strongly agree; 2-Agree; 3-Disagree; 4-Strongly disagree) may not necessarily put these questions in the same direction as positively worded questions. Second, research shows that negative wording may confuse respondents, leading to less accurate responses to the survey questions [6, 7]. That is, instead of preventing response bias, it may contaminate the survey data.
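As a quick illustration, here is a minimal Python sketch of reverse coding on a 4-point scale. The column names and data are hypothetical, and, per the caveat above, a reverse-coded item should be checked against its positive counterpart rather than assumed to mirror it.

```python
import pandas as pd

SCALE_MAX = 4  # 4-point scale: 1 = Strongly Disagree ... 4 = Strongly Agree

# Hypothetical responses to a positively and a negatively worded item.
df = pd.DataFrame({
    "room_clean": [4, 3, 1, 2],   # "My room was clean"
    "room_dirty": [1, 2, 4, 3],   # "My room was dirty" (negatively worded)
})

# Reverse-code the negative item so that higher values point in the same
# direction as the positive item: new = (max + 1) - old, i.e., 1<->4, 2<->3.
df["room_dirty_rev"] = (SCALE_MAX + 1) - df["room_dirty"]
print(df)
```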
Lastly, previous studies indicate that respondents are more likely to disagree with negatively worded questions than to agree with positive ones [8]. For example, a respondent who would select Agree for "My room was clean" might prefer to select Strongly Disagree for "My room was dirty". Therefore, survey researchers are advised to keep the number of negatively worded questions to a minimum and to take their impact on responses into account when analyzing the data.
Conclusion
In this brief post, I wanted to summarize some common pitfalls and caveats when using Likert scales for surveys. In addition to the points highlighted in this post, there are other concerns that survey researchers need to consider, such as labeling response options with numbers and/or text vs. images (e.g., emojis), using a sliding scale instead of a Likert scale, and the potential mismatch between questions and response options [9].
References
[1] 5-Point Likert Scale. In V. R. Preedy & R. R. Watson (Eds.), Handbook of Disease Burdens and Quality of Life Measures. Springer, New York, NY. https://doi.org/10.1007/978-0-387-78665-0_6363
[2] Likert Scale. Simply Psychology. https://www.simplypsychology.org/likert-scale.html
[3] Tourangeau, R., Couper, M. P., & Conrad, F. (2004). Spacing, position, and order: Interpretive heuristics for visual features of survey questions. Public Opinion Quarterly, 68(3), 368–393. https://doi.org/10.1093/poq/nfh035
[4] Simms, L. J., Zelazny, K., Williams, T. F., & Bernstein, L. (2019). Does the number of response options matter? Psychometric perspectives using personality questionnaire data. Psychological Assessment, 31(4), 557–566. https://doi.org/10.1037/pas0000648
[5] Spector, P. E., Van Katwyk, P. T., Brannick, M. T., & Chen, P. Y. (1997). When two factors don't reflect two constructs: How item characteristics can produce artifactual factors. Journal of Management, 23(5), 659–677. https://doi.org/10.1016/S0149-2063(97)90020-9
[6] Colosi, R. (2005). Negatively worded questions cause respondent confusion. Proceedings of the Survey Research Methods Section, American Statistical Association, 2896–2903. https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.556.243&rep=rep1&type=pdf
[7] van Sonderen, E., Sanderman, R., & Coyne, J. C. (2013). Ineffectiveness of reverse wording of questionnaire items: Let's learn from cows in the rain. PLoS ONE, 8(7), e68967. https://doi.org/10.1371/journal.pone.0068967
[8] Kamoen, N., Holleman, B., Mak, P., Sanders, T., & van den Bergh, H. (2017). Why are negative questions difficult to answer? On the processing of linguistic contrasts in surveys. Public Opinion Quarterly, 81(3), 613–635. https://doi.org/10.1093/poq/nfx010
[9] Smyth, J. D., & Olson, K. (2019). The effects of mismatches between survey question stems and response options on data quality and responses. Journal of Survey Statistics and Methodology, 7(1), 34–65. https://doi.org/10.1093/jssam/smy005