Are there too many questions on your survey? Are you worried that your participants may get tired of responding partway through? In this article, I describe how to shorten surveys using Ant Colony Optimization (ACO) in R.
In the social and behavioral sciences, researchers often use online surveys and questionnaires to collect data from a sample of participants. Such instruments provide an efficient and effective way to collect information about a large group of individuals.
Surveys are used to collect information from or about people to describe, compare, or explain their knowledge, feelings, values, and behaviors. [1]
Sometimes participants get tired of answering questions during the survey-taking process, especially if the survey is very long. This is known as survey-taking fatigue. When participants get tired, they may skip questions, provide inaccurate responses due to insufficient effort responding, or even abandon the survey completely. To alleviate this issue, we need to reduce survey length, either manually by removing some questions or automatically using an optimization algorithm.
Ant colony optimization (ACO) is an optimization algorithm from computer science. It was inspired by the collective behavior of Argentine ants (Iridomyrmex humilis) [2]. While searching for food, these ants drop pheromone on the ground and follow pheromone previously dropped by other ants. Because ants complete a round trip on a shorter path sooner, that path receives fresh pheromone more often and loses less of it to evaporation, so more ants follow it and the colony finds promising food sources more quickly (see Figure 1).

Engineers used the way Argentine ant colonies function as an analogy to solve the shortest path problem and created the ACO algorithm [3]. Later, other researchers began to apply the same algorithm to different selection problems, such as selecting the "best" questions in a survey. Research shows that ACO outperforms traditional methods of question selection, such as selecting questions with high inter-correlations [4].
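To make the pheromone mechanism more concrete, here is a minimal toy simulation in base R. It is only a sketch of the general idea (two paths, probabilistic choice, evaporation, and deposits that favor the shorter path); the path lengths, number of ants, and update rule are illustrative assumptions and not the implementation used in any package.

```r
# Toy two-path pheromone model (illustrative only; all values are assumptions)
set.seed(123)

path_length <- c(short = 1, long = 2)   # hypothetical path lengths
pheromone   <- c(short = 1, long = 1)   # both paths start out equally attractive
evaporation <- 0.9                      # share of pheromone kept after each iteration
n_ants      <- 20
n_iter      <- 50

for (iter in seq_len(n_iter)) {
  # Each ant picks a path with probability proportional to its pheromone level
  choices <- sample(names(pheromone), size = n_ants, replace = TRUE,
                    prob = pheromone / sum(pheromone))
  visits <- table(factor(choices, levels = names(pheromone)))

  # Old pheromone partly evaporates; each ant deposits more on a shorter path
  pheromone <- evaporation * pheromone + as.numeric(visits) / path_length
}

# Share of total pheromone on each path after the last iteration
pheromone_share <- pheromone / sum(pheromone)
round(pheromone_share, 2)
```

In survey shortening, the "paths" are candidate sets of questions, and the deposit depends on how well the model fits with those questions rather than on distance.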
Example
In this example, we will use the Experiences in Close Relationships (ECR) survey. The survey consists of 36 questions measuring two attachment dimensions for adults: avoidance and anxiety (see Figure 2). The questions are based on a 5-point Likert scale (i.e., 1 = strongly disagree to 5 = strongly agree). The odd-numbered questions (i.e., questions 1, 3, 5, etc.) belong to the Avoidance subscale (e.g., Q1. I prefer not to show a partner how I feel deep down.), while the remaining questions belong to the Anxiety subscale (e.g., Q2. I worry about being abandoned.). For each subscale, higher survey scores indicate higher levels of avoidance (or anxiety). Individuals who score high on either or both of these dimensions are assumed to have an insecure adult attachment orientation [5].

The original data set for the ECR survey is available on the Open-Source Psychometrics Project website. For demonstration purposes, we will use a subset of the original data set based on the following rules:
- Respondents must participate in the survey from the United States,
- Respondents must be between 18 and 30 years of age, and
- Respondents must answer all of the questions.
The final data set is available here.
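The three rules above could be applied with a few lines of base R. The sketch below uses a tiny made-up data frame; the column names `country` and `age`, the country code "US", and the use of NA for unanswered questions are assumptions for illustration and may not match the raw file.

```r
# Toy stand-in for the raw file; column names and codes are assumptions
raw <- data.frame(
  Q1      = c(3, 5, NA, 2, 4),
  Q2      = c(1, 4, 2, 5, 3),
  age     = c(22, 45, 19, 30, 17),
  country = c("US", "US", "US", "CA", "US"),
  stringsAsFactors = FALSE
)

# Columns that hold survey questions (Q1, Q2, ...)
question_cols <- grep("^Q[0-9]+$", names(raw), value = TRUE)

# Keep respondents who meet all three rules
keep <- raw$country == "US" &
  raw$age >= 18 & raw$age <= 30 &
  complete.cases(raw[, question_cols])

subset_data <- raw[keep, question_cols, drop = FALSE]
nrow(subset_data)
```

Only the first toy respondent satisfies all three rules, so the resulting data frame has a single row.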
Using the ACO algorithm, we want to reduce the length of the survey to 12 questions (i.e., 6 questions per subscale). First, we will import the data set into R:
ecr <- read.csv("ecr_data.csv", header = TRUE)

Second, we will load the ShortForm package, which includes the antcolony.lavaan function for shortening surveys using the ACO algorithm.
library("ShortForm")
Third, we will define the factorial structure underlying the ECR survey. That is, there are two dimensions (Avoidance and Anxiety) and each dimension is associated with 18 questions.
model <- '
Avoidance =~ Q1+Q3+Q5+Q7+Q9+Q11+Q13+Q15+Q17+Q19+Q21+Q23+Q25+Q27+Q29+Q31+Q33+Q35
Anxiety =~ Q2+Q4+Q6+Q8+Q10+Q12+Q14+Q16+Q18+Q20+Q22+Q24+Q26+Q28+Q30+Q32+Q34+Q36
'
Next, we will define which questions can be used for each dimension during the selection process. In this example, we want all of the questions associated with each subscale to be considered as "candidate" questions.
items <- list(paste0("Q", seq(1, 35, by = 2)),  # Avoidance: odd-numbered questions
              paste0("Q", seq(2, 36, by = 2)))  # Anxiety: even-numbered questions
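Before running the search, it can be worth checking that the candidate lists agree with the model definition. The following is a small optional sanity check, not a required step: it rebuilds the model syntax from the item lists and confirms that each dimension has 18 unique candidates. The names `factor_names` and `model_check` are introduced here only for illustration.

```r
# Candidate questions per dimension, as defined above
items <- list(paste0("Q", seq(1, 35, by = 2)),
              paste0("Q", seq(2, 36, by = 2)))
factor_names <- c("Avoidance", "Anxiety")

# Rebuild the lavaan-style syntax from the item lists
model_check <- paste0(factor_names, " =~ ",
                      vapply(items, paste, character(1), collapse = "+"),
                      collapse = "\n")
cat(model_check, "\n")

# Each dimension should have 18 candidates, with no question listed twice
stopifnot(lengths(items) == 18, !anyDuplicated(unlist(items)))
```

If the printed syntax matches the model string defined earlier, the two objects are consistent.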
In the final step, we will put everything together to implement the ACO algorithm. When preparing the antcolony.lavaan function, we will use the default values. However, some parameters, such as ants, evaporation, and steps, could be modified to find an optimal result (or reduce the computation time).
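A call along the following lines would produce the ecr_ACO object used below. Treat it as a sketch: the argument names and default values are written as I understand them from the ShortForm documentation, so please verify them with ?antcolony.lavaan for your installed version. The helper objects `n_per_factor` and `fit_rule` are names introduced here for readability, and the guard simply skips the call when the package or the objects defined earlier are not available.

```r
# Six questions per subscale, 12 in total
n_per_factor <- c(Avoidance = 6, Anxiety = 6)

# Fit criteria a candidate short form must meet (assumed to match the defaults)
fit_rule <- "(cfi > 0.95)&(tli > 0.95)&(rmsea < 0.06)"

# Run only if ShortForm is installed and ecr, model, and items already exist
inputs_ready <- all(vapply(c("ecr", "model", "items"), exists, logical(1)))

if (requireNamespace("ShortForm", quietly = TRUE) && inputs_ready) {
  ecr_ACO <- ShortForm::antcolony.lavaan(
    data = ecr,
    ants = 20,                      # number of ants per iteration
    evaporation = 0.9,              # pheromone evaporation setting
    antModel = model,               # full factor model defined earlier
    list.items = items,             # candidate questions per dimension
    full = 36,                      # number of questions in the full survey
    i.per.f = unname(n_per_factor), # questions to keep per dimension
    factors = names(n_per_factor),
    steps = 50,                     # iterations without improvement before stopping
    fit.indices = c("cfi", "tli", "rmsea"),
    fit.statistics.test = fit_rule,
    max.run = 1000,
    parallel = TRUE
  )
}
```

Because the search is stochastic, repeated runs may select somewhat different questions.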
Now, we can go ahead and review the results. First, we will see which questions have been selected by ACO. The following output shows the selected questions ("1") and the eliminated questions ("0").
ecr_ACO[[1]]
# Returned output below:
cfi tli rmsea mean_gamma Q1 Q3 Q5 Q7 Q9 Q11 Q13 Q15 Q17
[1,] 0.9814 0.9768 0.05723 0.755 0 1 1 1 0 0 1 0 1
Q19 Q21 Q23 Q25 Q27 Q29 Q31 Q33 Q35 Q2 Q4 Q6 Q8 Q10 Q12 Q14 Q16
[1,] 0 0 1 0 0 0 0 0 0 1 1 1 1 1 0 0 0
Q18 Q20 Q22 Q24 Q26 Q28 Q30 Q32 Q34 Q36
[1,] 0 0 1 0 0 0 0 0 0 0
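Reading a long row of zeros and ones by eye is error-prone, so it may help to pull out the selected question names programmatically. The sketch below uses a short hypothetical indicator vector with six questions rather than the actual output; with the real result, the fit columns (cfi, tli, rmsea, mean_gamma) would need to be dropped first.

```r
# Hypothetical indicator row with six questions (1 = selected, 0 = eliminated)
selection <- c(Q1 = 0, Q2 = 1, Q3 = 1, Q4 = 0, Q5 = 1, Q6 = 1)

# Names of the selected questions and their numbers
selected    <- names(selection)[selection == 1]
item_number <- as.integer(sub("^Q", "", selected))

# Odd-numbered questions belong to Avoidance, even-numbered ones to Anxiety
subscale    <- ifelse(item_number %% 2 == 1, "Avoidance", "Anxiety")
by_subscale <- split(selected, subscale)
by_subscale
```

Grouping by subscale makes it easy to confirm that the requested number of questions was retained for each dimension.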
Next, we will check which questions have been selected for each dimension.
cat(ecr_ACO$best.syntax)
# Returned output below:
Avoidance =~ Q7 + Q17 + Q13 + Q9 + Q5 + Q3
Anxiety =~ Q2 + Q10 + Q6 + Q22 + Q8 + Q4
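The selected syntax can also be turned into a vector of column names, which is handy for creating the shortened data set. This is a minimal sketch in base R: `parse_items` is a hypothetical helper written for this example, and the syntax string is copied from the output above.

```r
# Syntax of the shortened model, copied from the output above
best_syntax <- paste("Avoidance =~ Q7 + Q17 + Q13 + Q9 + Q5 + Q3",
                     "Anxiety =~ Q2 + Q10 + Q6 + Q22 + Q8 + Q4",
                     sep = "\n")

# Hypothetical helper: split "factor =~ item + item" lines into a named list
parse_items <- function(syntax) {
  lines <- trimws(strsplit(syntax, "\n", fixed = TRUE)[[1]])
  lines <- lines[nzchar(lines)]
  parts <- strsplit(lines, "=~", fixed = TRUE)
  factor_names <- trimws(vapply(parts, `[`, character(1), 1))
  item_lists <- lapply(parts, function(p) {
    trimws(strsplit(p[2], "+", fixed = TRUE)[[1]])
  })
  setNames(item_lists, factor_names)
}

short_items <- parse_items(best_syntax)
short_cols  <- unlist(short_items, use.names = FALSE)
short_cols

# With the data loaded, the short form would then be:
# ecr_short <- ecr[, short_cols]
```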
We can also visualize the results returned from antcolony.lavaan. For example, we can see changes in the amount of variance explained in the model across each iteration of the algorithm.
plot(ecr_ACO, type = "variance")
# Other alternative plots
# plot(ecr_ACO, type = "gamma")
# plot(ecr_ACO, type = "pheromone")

Conclusion
In the example above, ACO was able to produce a shorter version of the ECR survey that includes only 12 questions. In this case, the antcolony.lavaan function found a well-fitting solution quickly, although ACO is a heuristic search and does not guarantee the best possible set of questions. The speed of the search process depends on the number of ants, evaporation rate, model fit indices, and the cut-off values established for these indices. For example, the cut-off values can be relaxed (e.g., CFI > .90, TLI > .90) to speed up the search process. I hope this example helps you create shorter and more effective surveys.
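To illustrate how cut-off values act as a pass/fail rule, the sketch below checks the fit values reported earlier for the selected solution against a strict and a more lenient rule. The rule strings mimic the style of a fit-criteria expression, the RMSEA < .08 threshold in the lenient rule is my own assumption, and `meets_rule` is a hypothetical helper; this is not necessarily how the package evaluates its criteria internally.

```r
# Fit values reported earlier for the selected 12-question solution
fit <- list(cfi = 0.9814, tli = 0.9768, rmsea = 0.05723)

# A strict rule and a more lenient alternative (RMSEA cut-off of .08 is an assumption)
strict_rule  <- "(cfi > 0.95)&(tli > 0.95)&(rmsea < 0.06)"
lenient_rule <- "(cfi > 0.90)&(tli > 0.90)&(rmsea < 0.08)"

# Hypothetical helper: evaluate a rule string against a list of fit values
meets_rule <- function(rule, fit) {
  isTRUE(eval(parse(text = rule), envir = fit))
}

strict_ok  <- meets_rule(strict_rule, fit)
lenient_ok <- meets_rule(lenient_rule, fit)
c(strict = strict_ok, lenient = lenient_ok)
```

A more lenient rule accepts more candidate question sets, which is why relaxing the cut-offs tends to shorten the search.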
Note: An expanded version of this article is available on my personal blog.
References
[1] Fink, A. (2015). How to Conduct Surveys: A Step-by-Step Guide. Sage Publications.
[2] Goss, S., Aron, S., Deneubourg, J.-L., & Pasteels, J. M. (1989). Self-Organized Shortcuts in the Argentine Ant. Naturwissenschaften 76 (12): 579–81.
[3] Dorigo, M., Maniezzo, V., & Colorni, A. (1996). Ant System: Optimization by a Colony of Cooperating Agents. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 26 (1): 29–41.
[4] Leite, W. L., Huang, I.-C., & Marcoulides, G. A. (2008). Item Selection for the Development of Short Forms of Scales Using an Ant Colony Optimization Algorithm. Multivariate Behavioral Research 43 (3): 411–31.
[5] Wei, M., Russell, D. W., Mallinckrodt, B., & Vogel, D. L. (2007). The Experiences in Close Relationship Scale (ECR)-Short Form: Reliability, Validity, and Factor Structure. Journal of Personality Assessment 88 (2): 187–204.