
Can Recommendations from LLMs Be Manipulated to Enhance a Product’s Visibility?

Impact of Large Language Models on E-Commerce

Responsible AI

Image by Author

I recently read a tweet where someone shared a tip: add "before:2023" to Google searches to filter out much of the AI-generated SEO content. Honestly, I haven’t tried it, but you get the gist, right? The internet is now so swamped with AI-generated content that it has become nearly impossible to sift the actual signal from the noise. The situation is problematic enough that Google has moved to weed out AI-generated SEO content crafted to manipulate its search algorithms and artificially inflate rankings. Don’t get me wrong, I’ve got nothing against AI-generated content, but it becomes an issue when such content starts to influence what you see in your search results. And in this era of Generative AI, when generating content has become so easy, things only get trickier.

Large language models (LLMs) are already being used on e-commerce platforms to improve search and recommendations. But what happens if the very LLM powering those recommendations is manipulated? Manipulation in the e-commerce marketplace is nothing new. According to a 2021 Reuters investigation based on internal Amazon documents from 2016, Amazon employed a technique known as "search seeding" to ensure that products under its AmazonBasics and Solimo brands appeared among the top search results shortly after launch. The documents specifically note, "We used search seeding for newly launched ASINs to ensure that they feature in the first two or three ASINs in search results." With LLMs, things can get even worse because of their scale and speed.

A new study titled Manipulating Large Language Models to Increase Product Visibility by Aounon Kumar and Himabindu Lakkaraju examines this scenario in detail. It shows that incorporating specially designed messages, termed strategic text sequences (STS), into a product’s information significantly increases the product’s likelihood of being ranked as a top recommendation, giving certain vendors an unfair advantage over their competitors. As for consumers, practices like these can affect their purchasing decisions and erode their trust in online marketplaces, and trust is a key component of online business.

In this article, let’s look at how the authors crafted these strategic text sequences and walk through the results reported in the paper. The authors have made the associated code available on GitHub.

GitHub – aounon/llm-rank-optimizer


How LLM-driven search works

Conventional search engines are quite effective at finding relevant pages, but far less so at presenting the information coherently. LLMs, conversely, can take the search results and turn them into a coherent answer. Upon receiving a user’s query, the search engine pulls relevant information from knowledge bases like the internet or a product catalog. It then concatenates this retrieved context with the user’s prompt before feeding it to the LLM, allowing the model to generate a tailored, up-to-date response that directly addresses the user’s specific needs. The figure below (from the aforementioned paper) shows the entire process in detail.

Figure 1: LLM-Driven Search Interface as mentioned in the paper, with slight modification by the author. | Source: https://arxiv.org/pdf/2404.07981
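To make this flow concrete, here is a minimal sketch in Python of the retrieve-then-generate loop described above. Everything in it is illustrative: the `retrieve` and `build_prompt` helpers and the toy catalog are my own assumptions, not the paper’s implementation, and the assembled prompt would be sent to whatever LLM backs the search interface.

```python
# A minimal, illustrative sketch of an LLM-driven search pipeline.
# The helpers and catalog below are hypothetical, not the paper's code.

def retrieve(query: str, catalog: list[dict]) -> list[dict]:
    """Naive keyword retrieval: keep catalog entries that mention a query term."""
    terms = query.lower().split()
    return [p for p in catalog if any(t in str(p).lower() for t in terms)]

def build_prompt(system: str, products: list[dict], query: str) -> str:
    """Concatenate the system prompt, retrieved product info, and user query."""
    context = "\n".join(str(p) for p in products)
    return f"{system}\n\nProducts:\n{context}\n\nUser: {query}"

catalog = [
    {"Name": "QuickBrew Express", "Price": "$89",
     "Description": "A fast, affordable coffee machine."},
    {"Name": "ColdBrew Master", "Price": "$199",
     "Description": "A premium cold-brew coffee machine."},
]

prompt = build_prompt(
    "You are a shopping assistant. Recommend products as a numbered list.",
    retrieve("affordable coffee machine", catalog),
    "I am looking for an affordable coffee machine.",
)
print(prompt)  # In a real system, this prompt is what the LLM receives.
```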

Can LLM-generated recommendations be gamed?

The paper presents compelling examples showing that one can, indeed, game LLM-generated recommendations to favor a specific product. For instance, look at the figure below (we’ll get into how this graph was created later). It shows a clear contrast in a product’s ranking before and after adding a strategic text sequence (STS). Before the STS, the product consistently ranked at the bottom of the recommendations, near rank 10. After the STS, it shoots to the top of the recommendations, close to rank 1.

Figure 2: The target product went from not being recommended (blue) to the top recommendation (orange) after the addition of the strategic text sequence | Source: https://arxiv.org/pdf/2404.07981

As already discussed, the advantage of LLM-enabled search lies in its ability to pull information from the internet or a product catalog. It is at this point that vendors have an opportunity to steer the process. How? By embedding these carefully crafted texts, aka STS, into their product’s information page or catalog entry so that it becomes part of the LLM’s input.

Figure 3: LLM-driven search after embedding strategic text sequences | Source: https://arxiv.org/pdf/2404.07981

The STS is optimized using adversarial attack algorithms such as Greedy Coordinate Gradient (GCG), introduced in the paper Universal and Transferable Adversarial Attacks on Aligned Language Models. These attacks are typically used to bypass an LLM’s safety constraints and elicit harmful outputs. In this study, however, the authors repurpose the algorithm for the "more benign" objective of increasing product visibility.


Querying the LLM search interface for coffee machine recommendations

The authors present a scenario in which a user wants to buy an affordable coffee machine (note the emphasis on the word affordable). This means the product’s price is of the essence, and the user doesn’t want any expensive options. Let’s begin with the input prompt to the LLM, which consists of three parts, as shown below.

Figure 4: LLM prompt | Image by the Author
  • A system prompt, which sets the context,
  • The product information, pulled from a database of JSON lines describing ten fictitious coffee machine models (this is where a vendor can embed the STS; see the sketch after this list), and
  • A user query, seeking affordable options.
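To picture that middle part, here is a sketch of what those JSON-lines product entries might look like once the vendor appends an (initially unoptimized) STS to the target product. The field names and the placeholder length are my assumptions, not the paper’s exact schema.

```python
import json

# Illustrative JSON-lines product entries; field names are assumptions.
# The STS starts as dummy '*' tokens and is later optimized with GCG.
STS_PLACEHOLDER = " ".join(["*"] * 20)

products = [
    {"Name": "QuickBrew Express", "Price": "$89",
     "Description": "A fast, budget-friendly coffee machine."},
    {"Name": "ColdBrew Master", "Price": "$199",
     "Description": "A premium cold-brew machine. " + STS_PLACEHOLDER},
]

# Serialized as JSON lines, this becomes the product-information block
# of the LLM prompt sketched earlier.
product_info = "\n".join(json.dumps(p) for p in products)
print(product_info)
```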

Here is an example prompt as described in the paper. Notice how the STS is inserted (in red) in the ‘target product’ field for the ColdBrew Master coffee machine.

Figure 5: Example prompt as described in the paper | Source: https://arxiv.org/pdf/2404.07981

Crafting the Strategic Text Sequences

Here is an excerpt from the paper that explains how these text sequences are generated.

We optimize the STS with the objective of minimizing the LLM output’s cross-entropy loss with respect to the string ‘1. [Target Product Name]’. We initialize the STS with a sequence of dummy tokens ‘*’ and iteratively optimize it using the GCG algorithm. At each iteration, this algorithm randomly selects an STS token and replaces it with one of the top k tokens with the highest gradient. The STS can also be made robust to variations in product order by randomly permuting the product list in each iteration.

For example, if we want to boost the ranking of the ColdBrew Master in the product list, we would add the STS to its entry. The STS starts as a sequence of placeholder tokens, represented by ‘*’, as shown below, which are then iteratively optimized using the GCG algorithm.

Figure 6: Example of initializing the STS as described in the code associated with the paper | Source: https://arxiv.org/pdf/2404.07981

Additionally, to ensure that the STS performs well regardless of the order in which products are listed, the product list can be shuffled randomly at each optimization iteration.
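To see the mechanics of this loop, here is a self-contained toy sketch of GCG-style optimization. A tiny random linear scorer stands in for the LLM so the snippet runs anywhere; the real attack instead differentiates through Llama-2’s logits and minimizes the cross-entropy of the target string, and the authors’ actual implementation lives in their GitHub repo.

```python
# Toy sketch of Greedy Coordinate Gradient (GCG). A random linear scorer
# stands in for the LLM's loss; the structure of the loop
# (gradient -> top-k candidate swaps -> exact re-evaluation) mirrors GCG.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
VOCAB, DIM, STS_LEN, TOP_K = 100, 16, 8, 8
embed = torch.randn(VOCAB, DIM)   # stand-in token embeddings
w = torch.randn(DIM)              # stand-in loss direction

def loss_fn(one_hot: torch.Tensor) -> torch.Tensor:
    """Toy stand-in for the cross-entropy on '1. [Target Product Name]'."""
    return ((one_hot @ embed) @ w).sum()

sts = torch.zeros(STS_LEN, dtype=torch.long)  # init: dummy '*' tokens

for _ in range(50):
    # (In the paper, the product list can also be re-shuffled here so the
    # STS stays effective regardless of product order.)
    one_hot = F.one_hot(sts, VOCAB).float().requires_grad_(True)
    loss_fn(one_hot).backward()
    pos = torch.randint(STS_LEN, (1,)).item()       # random STS coordinate
    # Most negative gradient entries = most promising token swaps.
    candidates = (-one_hot.grad[pos]).topk(TOP_K).indices
    best_tok = sts[pos].item()
    best_loss = loss_fn(F.one_hot(sts, VOCAB).float()).item()
    for c in candidates:                            # exact re-evaluation
        trial = sts.clone()
        trial[pos] = c
        trial_loss = loss_fn(F.one_hot(trial, VOCAB).float()).item()
        if trial_loss < best_loss:
            best_tok, best_loss = c.item(), trial_loss
    sts[pos] = best_tok

print("optimized toy STS token ids:", sts.tolist())
```

In the real setting, each iteration’s loss is the cross-entropy of the model’s output against the string ‘1. [Target Product Name]’, so the optimization steadily drives the LLM toward listing the target product first.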

P.S. For their study, the authors used the open-source Llama-2-7b-chat-hf model, noting that their method could also be applied to more opaque models such as GPT-4.

The results show that despite its high price of $199, which would normally keep it out of view, ColdBrew Master was pushed to the top of the recommendations by integrating the STS into its description. And guess what? It took only about 100 iterations of the GCG algorithm to elevate its rank from unlisted to the top.

Figure 7: Response of the LLM to the user’s query | Source: https://arxiv.org/pdf/2404.07981

Comparison of Strategic Text Sequence Optimization on Two Products: ColdBrew Master and QuickBrew Express

Now that we have an idea of the impact of STS on product rankings, let’s compare how it affects two different products:

☕️ ColdBrew Master, a high-priced coffee machine at $199, and

☕️ QuickBrew Express, a more affordable option at $89.

Here is a table I’ve created to compare the results.

Figure 8: Comparison of Strategic Text Sequence Optimization on Two Products: ColdBrew Master and QuickBrew Express | Image by author with content from the paper: https://arxiv.org/pdf/2404.07981

As the table shows, ColdBrew Master, which initially wasn’t recommended at all because of its $199 price tag, was pushed to the top of the recommendations once the STS was added to its description.

Figure 9: ColdBrew Master goes from not being recommended to the top recommendation in 100 iterations of the GCG algorithm, while QuickBrew Express becomes the top recommendation in 1000 iterations | Source: https://arxiv.org/pdf/2404.07981

On the other hand, the ranking of QuickBrew Express, the more budget-friendly coffee maker that usually lands in second place, also improved significantly, often reaching the top position after the STS was added.

Figure 10: Rank distribution before and after adding the STS for 200 independent evaluations of the LLM (1 dot ≈ 5%). | Source: https://arxiv.org/pdf/2404.07981
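A distribution like the one in Figure 10 can be gathered by simply querying the model many times and recording where the target product lands in each sampled response. Here is a hedged sketch of such an evaluation loop; `query_llm` is a hypothetical callable returning one numbered recommendation list, and the parsing is my own simplification, not the paper’s evaluation code.

```python
import re
from collections import Counter
from typing import Callable, Optional

def parse_rank(response: str, product: str) -> Optional[int]:
    """Return the target product's position in a numbered list, or None."""
    for line in response.splitlines():
        m = re.match(r"\s*(\d+)\.\s", line)
        if m and product in line:
            return int(m.group(1))
    return None  # the product was not recommended at all

def rank_distribution(query_llm: Callable[[str], str], prompt: str,
                      product: str, n: int = 200) -> Counter:
    """Tally the product's rank over n independent (sampled) evaluations."""
    return Counter(parse_rank(query_llm(prompt), product) for _ in range(n))

# Example with a stubbed model: after the STS, most mass should sit at rank 1.
fake_llm = lambda p: "1. ColdBrew Master\n2. QuickBrew Express"
print(rank_distribution(fake_llm, "prompt...", "ColdBrew Master", n=10))
```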


Concluding thoughts: Is Generative Search Optimization (GSO) the new SEO?

The situation presented in the paper is not far from reality. The authors draw an apt comparison between Generative Search Optimization (GSO) and traditional SEO:

Just as search engine optimization (SEO) revolutionized how webpages are customized to rank higher in search engine results, influencing LLM recommendations could profoundly impact content optimization for AI-driven search services.

As I mentioned earlier, the success of online businesses is closely tied to the trust and reputation they establish with their customers. Intentionally manipulating product recommendations raises ethical questions, particularly concerning fairness and consumer deception. The presence of fake product reviews is already an ongoing problem. We certainly do not want manipulated recommendations to complicate this situation further.


Visit my GitHub repository to access all my blogs and their accompanying code in one convenient location.

GitHub – parulnith/Data-Science-Articles: A collection of my blogs on Data Science and Machine…

