Analysis of an Emerging Chinese Social Media Platform: The Little Red Book

Martin Liu
Towards Data Science
7 min read · Aug 7, 2018


The Little Red Book (https://www.xiaohongshu.com), referred to below as Redbook, has become one of the fastest-growing social media platforms in China. Unlike the popular WeChat and Weibo, Redbook focuses on the beauty and fashion segment, although its content has diversified into the general lifestyle category since its early days. Little public data is available from the company, as it has not gone public yet. To understand the platform, I built a crawler to collect profile information and analyzed the crawled data.

How to get the data?
Redbook serves its content through a web client. Although the homepage exposes only limited content to crawl, it contains links to the profile pages of its users. From there, we can follow links to commenters’ profiles and to more posts to continue crawling. I will cover the technical details in a separate article.
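
The traversal described above can be sketched as a breadth-first crawl. This is only an illustration of the idea, not the crawler actually used for this article: the `FAKE_LINKS` graph and the `get_linked_profiles` helper are hypothetical stand-ins for fetching and parsing real pages, and no network access is involved.

```python
from collections import deque

# Hypothetical in-memory stand-in for the site: each page id maps to the
# profile ids it links to (commenters, post authors). A real crawler would
# fetch and parse pages instead.
FAKE_LINKS = {
    "home": ["user_a", "user_b"],
    "user_a": ["user_b", "user_c"],
    "user_b": ["user_d"],
    "user_c": [],
    "user_d": ["user_a"],
}


def get_linked_profiles(page_id):
    """Hypothetical helper: return the profile ids linked from a page."""
    return FAKE_LINKS.get(page_id, [])


def crawl(start, max_profiles=1000):
    """Breadth-first traversal that records each profile at most once."""
    seen = {start}
    queue = deque([start])
    profiles = []
    while queue and len(profiles) < max_profiles:
        page = queue.popleft()
        if page != start:  # the start page is not a profile itself
            profiles.append(page)
        for linked in get_linked_profiles(page):
            if linked not in seen:
                seen.add(linked)
                queue.append(linked)
    return profiles


print(crawl("home"))
```

With the toy graph above, the crawl reaches all four profiles starting from the homepage, and the `seen` set keeps it from revisiting `user_a` when `user_d` links back to it.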

Data size
Redbook has a very strict anti-crawling mechanism, so IP rotation is a must for acquiring data at scale. Over a 4-day period, I was able to crawl 899,519 profiles from the platform. Although this represents only a small fraction of Redbook’s large user base, all of these users have generated at least some activity on the platform (by writing a post, creating a board, or commenting on someone else’s post). As a result, this data set can be used to identify the characteristics of Redbook’s most active user group.

Metrics explained
Redbook offers three ways for a user to engage with a piece of content: liking it, “collecting” it, or commenting on it. A “like” works much like Facebook’s thumbs-up button, while “collect” saves the content under the app’s bookmark system for later reference. Generally speaking, people like a post when it interests them and “collect” it when they expect to come back to it, for example useful tips to revisit later or products they might eventually want to purchase. The third metric is “comments”, which reflects how much interaction a post generates. Unfortunately, Redbook does not offer an easy way to track aggregate comment counts, so this field is not included in this article.

The Jupyter Notebook can be found at https://github.com/Gravellent/redbook_analysis

Who are the most influential users on Redbook?

The top 10 users (excluding the official accounts) are:
范冰冰
林允Jelly
张韶涵
Ritatawang
时髦小姐姐
凌听雨
江疏影
Irene林恩如
欧阳娜娜Nana
美七是我

Although celebrities have been keen to join this emerging platform, the list shows a good mix of internet KOLs (key opinion leaders) and true celebrities. Specifically, Ritatawang and 美七是我 did not have much influence before they joined Redbook.

Male users on Redbook?

One of the major traits of Redbook is that its users are predominantly women. Our data shows that merely 2% of its total users are male. Over 60% of users did not identify themselves as male or female, but after some manual inspection, the unidentified users appear to follow similar patterns to those who did pick a gender when they registered.

After taking out the unidentified users, women make up 95% of the user base. This is in line with what we expected, as most of the content generated on the platform is catered towards women, featuring various beauty products and fashion-related content.

However, it is too soon to say that women are the only players on the social e-commerce playground. Although male users make up only 5% of the identified user population, they exhibit stronger influence than their female counterparts. After calculating the total likes generated by each gender, we see that male users account for 8% of the total likes. For collected posts, men account for 5.9%, slightly lower than their share of total likes but still higher than their share of the population. One explanation for this gap is that content by men on Redbook is often deemed “interesting”, but readers don’t necessarily want to revisit it later. Female content creators, on the other hand, tend to produce useful information that readers save for further reference.
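
The comparison between population share and likes share can be sketched as follows. This is a minimal illustration under stated assumptions: the records and the field names `gender` and `likes` are invented for the example and are not taken from the crawled data set or the notebook.

```python
from collections import defaultdict

# Toy records, invented for illustration only; field names are assumptions.
users = [
    {"gender": "female", "likes": 120},
    {"gender": "female", "likes": 80},
    {"gender": "female", "likes": 100},
    {"gender": "male", "likes": 200},
    {"gender": None, "likes": 50},  # gender not provided, excluded below
]


def shares_by_gender(records):
    """Return each gender's share of identified users and of total likes."""
    identified = [r for r in records if r["gender"] in ("male", "female")]
    if not identified:
        return {}
    counts = defaultdict(int)
    likes = defaultdict(int)
    for r in identified:
        counts[r["gender"]] += 1
        likes[r["gender"]] += r["likes"]
    total_users = len(identified)
    total_likes = sum(likes.values())
    return {
        gender: {
            "user_share": counts[gender] / total_users,
            "likes_share": likes[gender] / total_likes if total_likes else 0.0,
        }
        for gender in counts
    }


print(shares_by_gender(users))
```

A group whose `likes_share` exceeds its `user_share` is receiving more likes than its headcount alone would suggest, which is the pattern described above for male users.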

In terms of reach, men have an even higher influence. On average, male users have far more fans on the platform: the average number of fans for men is over 2,400, well above women’s 842.

Interestingly enough, the median and quartile analysis shows the opposite result. The median number of fans is 5 for male users and 11 for women. The 75th percentile also indicates that most men have less reach than women. So why do men have a much higher average?

After checking the gender distribution of users with 10,000+ and 100,000+ fans, the reason becomes clear. There is a group of top male KOLs who often possess over 100,000 fans on the platform. These few accounts have a strong influence that distorts the average. The remaining male users, who are not KOLs, do not interact with the community as strongly.
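
The effect of a few very large accounts on the average can be reproduced with a small example. The fan counts below are invented purely to illustrate the statistics; they are not drawn from the crawled data.

```python
import statistics

# Invented fan counts: mostly small accounts, plus one very large account
# in each group. The male group contains a single KOL-sized outlier.
male_fans = [0, 2, 5, 5, 8, 250_000]
female_fans = [3, 8, 11, 11, 20, 4_000]


def summarize(fans):
    """Return the mean and median of a list of fan counts."""
    return {
        "mean": statistics.mean(fans),
        "median": statistics.median(fans),
    }


for label, fans in (("male", male_fans), ("female", female_fans)):
    print(label, summarize(fans))
```

In this toy data the male group has the higher mean but the lower median, because a single account with 250,000 fans dominates the sum. The median is unaffected by how large that one account is, which is why it better describes a typical user.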

Where are the Redbook users?

Redbook is known for its high conversion rate to sales. The users on the platform tend to have a strong interest in purchasing high-end beauty and fashion products. Many people refer to the platform as a “种草平台” (literally, a “grass-planting platform”), which basically means people search for products that interest them and “plant” the desire to eventually own them. Many of the products discussed on the platform are from major international brands, so the viewers usually possess high purchasing power. So where are these users? It’s common knowledge that the majority of the higher-income group is located in Beijing, Shanghai, Shenzhen, and Guangdong. Would the user group fit this demographic?

Again, we start by analyzing all the data on user location. Since Redbook defaults a user’s location to “Others”, only 40% of its users provide meaningful location information. On top of that, some users give no detail beyond “China”. For the purpose of this analysis, we will only look at those who identify a province or city.

The top 5 provinces by number of users are Guangdong, Shanghai, Beijing, Zhejiang, and Jiangsu, together accounting for over 30% of the entire user base. Redbook’s headquarters is located in Shanghai, which helps explain why the Shanghai metropolitan area is its main user base. (Note: Shanghai, Zhejiang, and Jiangsu can be considered one large metropolitan area with several high-profile cities.) Guangdong has the highest GDP of any province, and with two of the largest Chinese cities, Guangzhou and Shenzhen, it is no surprise that it is the province with the most users.

Although the distribution by user count shows that Guangdong is the largest province, it is also important to take the quality of users into consideration when analyzing the user base geographically. Namely, we need to look at how much engagement the users generate. In terms of likes and collects generated by users, Shanghai leads by a large margin. Since the company was founded in the city, it is understandable that many of its seed users are from the region. Another interesting trend is that there is a group of Redbook KOLs living abroad who generate considerable influence. Australia, the United States, and the UK all rank among the top 10. Considering that the number of KOLs from these countries is smaller, users living outside China possess, on average, a higher influence than those living domestically.
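
The difference between ranking locations by user count and by engagement can be sketched as below. This is an illustrative example only: the records, the field names `location`, `likes`, and `collects`, and the choice to drop exactly the values “Others” and “China” are assumptions made for the sketch, not details taken from the notebook.

```python
from collections import defaultdict

# Toy records, invented for illustration; field names are assumptions.
records = [
    {"location": "Guangdong", "likes": 10, "collects": 5},
    {"location": "Guangdong", "likes": 20, "collects": 10},
    {"location": "Guangdong", "likes": 15, "collects": 0},
    {"location": "Shanghai", "likes": 100, "collects": 50},
    {"location": "Shanghai", "likes": 40, "collects": 10},
    {"location": "Australia", "likes": 90, "collects": 30},
    {"location": "Others", "likes": 7, "collects": 1},  # default value
    {"location": "China", "likes": 3, "collects": 2},   # too coarse
]

# Values treated as carrying no usable location information (assumption).
UNINFORMATIVE = {"Others", "China"}


def engagement_by_location(rows):
    """Aggregate user count, total and per-user engagement per location."""
    users = defaultdict(int)
    total = defaultdict(int)
    for row in rows:
        if row["location"] in UNINFORMATIVE:
            continue
        users[row["location"]] += 1
        total[row["location"]] += row["likes"] + row["collects"]
    return {
        loc: {
            "users": users[loc],
            "total": total[loc],
            "per_user": total[loc] / users[loc],
        }
        for loc in users
    }


stats = engagement_by_location(records)
for loc, s in sorted(stats.items(), key=lambda kv: kv[1]["total"], reverse=True):
    print(loc, s)
```

In this toy data Guangdong has the most users, Shanghai has the highest total engagement, and the single overseas account has the highest engagement per user, mirroring the three different rankings discussed above.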

Limitations
Since the data does not cover the entire user database, the distributions reported here may not hold for all users. Also, for the gender and location analysis, since over half of the users do not provide this information, it is hard to estimate how well the conclusions generalize to the whole user group. Moreover, since users enter the information themselves, some of them may register with inaccurate information (for example, claiming they live abroad when they don’t).

Conclusion
• Users on Redbook are predominantly female (about 95%), but male users tend to have a higher influence on average.
• The most influential users on the platform consist of both celebrities and Redbook’s own KOLs.
• Guangdong has the highest number of users, but users in Shanghai generate the most influence.
• There are a good number of users living in the U.S., the U.K., and Australia, and they exhibit stronger influence than other user groups.
