Trending YouTube Video Statistics

The R programming language is powerful for analysing and quickly visualising data, and RStudio makes this easy. In this post I explore daily statistics for trending YouTube videos; you can find the data set at [this link](https://www.kaggle.com/efebuyuk/trending-youtube-video-statistics). You can also find my Kaggle notebook for this work from this link.
The questions I am going to answer in this analysis are the following:
- What are the correlations among these attributes: category id, views, likes, dislikes and comment count?
- What are the video clip appearances by country?
- What are the percentages of likes based on time of day?
- How many times have video clips appeared during the different time intervals across the countries?
- What are the most frequent tags in the Great Britain data set?

As we can see from the above graph, the most strongly correlated attributes are likes and comment count, with a correlation of 0.86: the more likes a video has, the more comments it tends to have. The second-highest correlation is 0.81, between views and likes.
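For readers who want to reproduce this kind of correlation table, here is a minimal sketch in base R. The data frame below is made-up toy data, and the column names (`category_id`, `views`, `likes`, `dislikes`, `comment_count`) are assumptions about how the data set is laid out; it is not necessarily how the notebook computes the figures above.

```r
# Toy data standing in for the real data set (values are invented;
# the column names are assumed, not taken from the notebook).
videos <- data.frame(
  category_id   = c(10, 24, 10, 22, 24, 1),
  views         = c(1000, 5000, 20000, 300, 80000, 12000),
  likes         = c(50, 400, 1500, 10, 6000, 700),
  dislikes      = c(5, 20, 90, 2, 300, 40),
  comment_count = c(8, 60, 210, 1, 900, 95)
)

# Pearson correlation between every pair of numeric attributes.
attributes <- c("category_id", "views", "likes", "dislikes", "comment_count")
cor_matrix <- cor(videos[, attributes])

print(round(cor_matrix, 2))
```

Each cell of `cor_matrix` holds the correlation between the row attribute and the column attribute, so the matrix is symmetric with ones on the diagonal.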

When we check the number of appearances by country, it is clear that Russia has the highest number of video appearances, followed by Mexico and India.
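Counting appearances by country can be sketched as follows, assuming each country comes as its own table and gets a `country` label before the tables are stacked. The three small data frames and the country codes are hypothetical placeholders, not the real data.

```r
# Hypothetical per-country samples; in practice each one would be
# loaded from that country's file.
gb <- data.frame(video_id = c("a", "b", "c"), stringsAsFactors = FALSE)
ru <- data.frame(video_id = c("d", "e", "f", "g"), stringsAsFactors = FALSE)
mx <- data.frame(video_id = c("h", "i"), stringsAsFactors = FALSE)

# Label each table with its country, then stack them into one data frame.
gb$country <- "GB"
ru$country <- "RU"
mx$country <- "MX"
all_videos <- rbind(gb, ru, mx)

# One row is one trending appearance, so counting rows per country
# gives the number of appearances.
appearances <- sort(table(all_videos$country), decreasing = TRUE)
print(appearances)
```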

Time of day is very important in broadcasting, and so it is in video streaming nowadays. That is why it is crucial to check when a video has appeared. The above graph represents the percentages of likes based on time of day. According to it, day time (10:00–16:00) has the highest percentage, at 32.3%.
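One way to compute such percentages is sketched below. Only the day time interval (10:00–16:00) comes from the text; the other three intervals, the `publish_time` column name, its timestamp layout and the toy values are all assumptions made for illustration.

```r
# Toy data; timestamps are assumed to look like "2017-11-13T17:13:01.000Z".
videos <- data.frame(
  publish_time = c("2017-11-13T17:13:01.000Z", "2017-11-13T11:05:00.000Z",
                   "2017-11-14T02:30:00.000Z", "2017-11-14T12:45:00.000Z",
                   "2017-11-14T07:00:00.000Z"),
  likes = c(200, 300, 100, 250, 150),
  stringsAsFactors = FALSE
)

# Characters 12-13 of the timestamp hold the hour.
hour <- as.integer(substr(videos$publish_time, 12, 13))

# Map an hour to a six-hour interval. Only "day time" (10:00-16:00) is
# taken from the article; the other boundaries are assumed.
time_of_day <- function(h) {
  ifelse(h >= 4 & h < 10, "morning",
  ifelse(h >= 10 & h < 16, "day time",
  ifelse(h >= 16 & h < 22, "evening", "night")))
}
videos$interval <- time_of_day(hour)

# Sum likes per interval and express each sum as a share of all likes.
likes_by_interval <- tapply(videos$likes, videos$interval, sum)
pct_likes <- round(100 * likes_by_interval / sum(likes_by_interval), 1)
print(pct_likes)
```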

In the comparison across time intervals, Mexico, South Korea and Japan have a slightly lower number of appearances than the other countries.
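A comparison like this boils down to a cross-tabulation of country against time interval. The sketch below uses invented rows and assumes each appearance already carries a `country` and an `interval` label; the interval names are placeholders.

```r
# Toy data: one row per trending appearance (values are invented).
appearances <- data.frame(
  country  = c("GB", "GB", "MX", "RU", "RU", "RU", "MX"),
  interval = c("day time", "evening", "day time", "day time",
               "night", "day time", "evening"),
  stringsAsFactors = FALSE
)

# Rows are countries, columns are time intervals, cells are counts.
counts <- table(appearances$country, appearances$interval)
print(counts)
```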

The most used tags are video and music, appearing 32 and 19 times respectively. In this graph, I only checked Great Britain’s data set, so only English tags have been processed and visualised. It is important to know that if you are dealing with other languages, you need to be careful about encoding; otherwise, you cannot process the data.
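A minimal sketch of counting tags in base R follows. It assumes the tags of each video sit in a single string separated by `|`, with some tags wrapped in double quotes; the sample strings and the file name in the comment are hypothetical, and the notebook may tokenise the tags differently.

```r
# For non-English data, declaring the encoding when reading the file is
# one way to avoid garbled text, e.g. (hypothetical file name):
# videos <- read.csv("GBvideos.csv", fileEncoding = "UTF-8",
#                    stringsAsFactors = FALSE)

# Toy tag strings; the "|" separator is an assumption about the data set.
tags <- c(
  'music|"official video"|"pop"',
  'video|"music"|"live"',
  'comedy|"video"|"sketch"'
)

# Split on the literal "|" and flatten into one character vector.
tag_list <- unlist(strsplit(tags, "|", fixed = TRUE))

# Strip the surrounding quotes and normalise case before counting.
tag_list <- tolower(gsub('"', "", tag_list, fixed = TRUE))

# Frequency of each tag, most frequent first.
tag_counts <- sort(table(tag_list), decreasing = TRUE)
print(head(tag_counts, 5))
```

Passing `fixed = TRUE` matters here: without it, `|` would be read as the regular-expression "or" operator rather than as a literal separator.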

If you want to check my R code, you can see my Kaggle notebook for this work. Here is the link.