until the end of the snowball-sampled data. Figure 21 shows the same thing but ‘zoomed in’ to a restricted range of dates. We note that the number of users oscillates between weekdays and weekends, and the weekly total gradually increases and peaks on 15 October 2014. Then the number of users falls off quite rapidly. The shape of figure 20 is largely due to the fact that only the last 200 tweets (from the time of the API request) per user were collected. Thus, those users who tweet frequently do not show up in the first half of the chart as their earlier tweets have not been collected.

rsos.royalsocietypublishing.org R. Soc. open sci. 3:…

Figure 20. Number of active users per day. Dates before 22 April 2014 have fewer than 53 000 active users. [Plot of no. active users in our dataset per day against date.]

Figure 21. Number of active users per day for a restricted range of dates. [Plot of no. active users in our dataset per day against date.]

We used the Twitter dataset to create an evolving network, where the set of vertices is fixed and the edges between them can change in each time step (a day in our case). In order to choose a fixed set of vertices, we have chosen the week from 9 October to 15 October 2014 inclusive, which is the week with the highest activity measured by the number of tweets.
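As an illustration only, the construction of an evolving network on a fixed vertex set can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure used for the dataset: the function names `busiest_week` and `daily_edge_sets` are hypothetical, an edge is assumed to be a mention of one user by another on a given day, and the record layout `(day, source, target)` is an assumption rather than something specified in the text.

```python
from collections import Counter, defaultdict
from datetime import date, timedelta


def busiest_week(tweet_dates):
    """Return the first day of the 7-day window containing the most tweets.

    tweet_dates: iterable of datetime.date, one entry per tweet.
    Ties are broken in favour of the earliest window.
    """
    per_day = Counter(tweet_dates)
    if not per_day:
        return None
    first, last = min(per_day), max(per_day)
    best_start, best_total = None, -1
    start = first
    while start <= last:
        # Sum the tweet counts over the 7 days beginning at `start`.
        total = sum(per_day.get(start + timedelta(days=k), 0) for k in range(7))
        if total > best_total:
            best_start, best_total = start, total
        start += timedelta(days=1)
    return best_start


def daily_edge_sets(mentions, vertices):
    """Build one edge set per day, restricted to a fixed vertex set.

    mentions: iterable of (day, source, target) records (assumed layout).
    vertices: the fixed set of users kept for the whole time span.
    Returns a dict mapping each day to a set of (source, target) pairs.
    """
    snapshots = defaultdict(set)
    for day, source, target in mentions:
        # Keep an edge only if both endpoints belong to the fixed vertex set.
        if source in vertices and target in vertices:
            snapshots[day].add((source, target))
    return dict(snapshots)


# Toy usage with made-up data: two tweets per day from 9 to 15 October 2014,
# plus one isolated tweet before and one after that week.
toy_dates = [date(2014, 10, 1), date(2014, 10, 20)]
for offset in range(7):
    toy_dates += [date(2014, 10, 9) + timedelta(days=offset)] * 2
week_start = busiest_week(toy_dates)

toy_mentions = [
    (date(2014, 10, 9), "a", "b"),
    (date(2014, 10, 9), "a", "z"),   # dropped: "z" is not in the vertex set
    (date(2014, 10, 10), "b", "a"),
    (date(2014, 10, 9), "a", "b"),   # duplicate mention, same edge
]
toy_snapshots = daily_edge_sets(toy_mentions, vertices={"a", "b"})
```

Keeping the vertex set fixed and storing only a per-day edge set means that successive snapshots are directly comparable, at the cost of discarding any interaction that involves a user outside the chosen set.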
We then filtered the data using several criteria in order to focus on ‘regular’ human users. A number of classes of user with unusual behaviour were filtered out, as they would skew the results of our analyses (and threaten to make the network structure become degenerate, as we described in ?.2):

— Users with a very high tweeting frequency. If a user tweets hundreds of tweets in a few hours then these messages might have been automatically generated. This practice is followed by many companies and organizations for advertising purposes, but the messages are not genuinely representative of human behaviour. Figure 22 shows the number of days (rounded up to the next value) between the first tweet and the last tweet in our snowball-sampled data, just for those users who had posted at least 200 tweets since account creation. When setting a threshold on tweeting frequency to exclude users, we should of course filter out only a small minority of users. We observe in figure 22 a natural ‘gap’ in the data at the value 1, with only 48 users appearing in this bin. We select a day difference of 1 (equivalently, a tweeting frequency of 200 tweets per day) as our threshold, excluding 1153 users with a higher tweeting frequency than this.

Figure 22. Histogram of the number of days between the first tweet and the last tweet in our snowball-sampled data, for those users who had posted at least 200 tweets since account creation. [Horizontal axis: day difference between the first and the last tweet.]

— Users who mention themselves very frequently, who may also be bots. We used a threshold of 0.5 for the ratio of the number of self-mentions to the number of all mentions made by a user, chosen because the number of users with a s.