Introducing the ReTweetability Index

If we want to be able to create contagious Tweets, we have to know what contagious Tweets look like. And I’ve created a new site that allows you to do just that:

The site has a list of the most ReTweetable users, as well as a search feature that allows you to find the most contagious users Tweeting about various topics.

There are 3 major areas where Twitter users can affect the number of ReTweets they get:

  • Followers
  • Tweeting Volume
  • Contagiousness of Content

We know what “more followers” and “more Tweets” look like, providing well-defined targets in those areas, but, until now, there has been no standard measurement of contagiousness.

I’ve looked into the effect that a user’s follower count and the content of their Tweets have on how often they are ReTweeted. Predictably, having more followers gets you more ReTweets, but the correlation isn’t as strong as expected. Certain patterns of common words and phrases do emerge.

Lists that rank users by the raw number of times they are ReTweeted do not show the users with the most ReTweetable content. A user who has a large number of followers, or who posts a huge amount of content, will naturally get more ReTweets, but that says nothing about how contagious his or her Tweets are.

What I’m trying to do with the ReTweetability metric is begin to develop a simple formula by which the infectiousness of a user’s content can be measured. This formula would eliminate the effect of the user’s follower count and Tweeting rate.



The ReTweetability metric I’m using for the index right now is based on the natural logarithm of both the followers and Tweets per day numbers. This is done to compress the range of variation in both numbers, while accounting for the power-law shape of the distributions of the two variables.
Prior to using the logarithm, the formula over-penalized users with higher than average followers (around 100) and Tweets per day (around 5), which turns out to be most users.
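
To make the log damping concrete, here is a minimal Python sketch. The post does not spell out the exact formula, so the form below is an assumption that follows the reading suggested in the comments (ReTweets divided by the product of the two logarithms). The function name `retweetability` and its parameters are hypothetical, not the code behind the index.

```python
import math


def retweetability(retweets_per_day, followers, tweets_per_day):
    """Assumed form: ReTweets damped by ln(followers) * ln(tweets per day).

    This is an illustrative guess at the metric, not the index's actual code.
    """
    # ln(x) is zero or negative for x <= 1, so this assumed form is only
    # meaningful for users above that threshold on both inputs.
    if followers <= 1 or tweets_per_day <= 1:
        raise ValueError("followers and tweets_per_day must both exceed 1")
    return retweets_per_day / (math.log(followers) * math.log(tweets_per_day))


# Two hypothetical users with the same ReTweets per day and Tweeting rate:
# the one with far more followers scores lower, but only by a log factor.
small_account = retweetability(2, followers=100, tweets_per_day=5)
large_account = retweetability(2, followers=10_000, tweets_per_day=5)
```

Under this assumed form, going from 100 to 10,000 followers only doubles the follower term in the denominator, which is the compression described above.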

I’ve also explored the possibility of using the square root of the 2 values; this produces a range smaller than without using the natural logarithm, but larger than with it. I would love to have a discussion about the correct method for this, and I expect some variation in the formula here.
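
The difference between the three options can be seen by comparing how much each one compresses the gap between a small and a large follower count. This is a rough sketch: the count of 100 is the average mentioned above, while 1,000,000 is a hypothetical large account chosen for illustration, and `spread` is a helper name invented here.

```python
import math


def spread(low, high, damp):
    """Ratio of the damped high value to the damped low value."""
    return damp(high) / damp(low)


low, high = 100, 1_000_000  # 1,000,000 is an illustrative value

raw_spread = spread(low, high, lambda x: x)  # no damping at all
sqrt_spread = spread(low, high, math.sqrt)   # square-root damping
log_spread = spread(low, high, math.log)     # natural-log damping

print(raw_spread, sqrt_spread, log_spread)
```

For these two counts the raw spread is 10,000 to 1, the square root brings it down to roughly 100 to 1, and the natural logarithm to roughly 3 to 1, which matches the ordering described above.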

Due to the extremely small result of the formula, I’ve had to multiply it by 10,000,000 to enhance its readability — I would also love feedback on this.



@muunkky March 3, 2009 at 12:07 pm

Hi Dan,
I love that you’re taking this problem on and I agree that there is a third variable to retweetability. In order to find the correct relationship you have to define the metric a little more clearly. Right now it appears that you have defined it as follows:
Number of ReTweets = (Retweetability)(lnFollowers)(lnTweets).

If you’re having a problem with correlating the data it’s probably due to the structure of this equation and not simply the operators of the second two terms. You have three independent variables, so you may need to try some different types of regression.

francamenteWeb March 3, 2009 at 1:13 pm

GOOD! The idea is excellent.
But in my searches I didn’t find any Italian users on the list. Is that normal?

Dan Zarrella March 3, 2009 at 2:57 pm

@muunkky do you have any suggestions as to a better formula?

scorpfromhell March 5, 2009 at 7:28 am

Did you try adding 2nd degree followers to the no. of direct followers?

What variations does it show then?

jon March 12, 2009 at 2:04 pm

Dan, your collection of posts on retweeting is great work — very interesting!

In terms of the retweetability index, I think of it in terms of the contagiousness of a post (relative to the tweeter’s network) and the retweetability-potential of a network. Combining these into a single metric blurs those distinctions.

Also, most people don’t intend all their posts for retweeting. Looking at my own behavior, there are a lot of updates to friends, comments in Twitter chats, or conversations with individuals and small groups, and nobody else is interested in these. So dividing by the total number of updates the way you do seems to overstate the influence of the Mashables of the world, all of whose posts are intended for broad influence.

> Prior to using the logarithm, the formula over-penalized users with higher than average followers (around 100)

Why do you see this as “over-penalizing” as opposed to accurately reflecting?

> Due to the extremely small result of the formula, I’ve had to multiply it by 10,000,000 to enhance its readability — I would also love feedback on this.

Personally I’d find it more readable if results were typically in the 1-100 (or maybe 1-1000) range.


@muunkky March 19, 2009 at 2:23 pm

@danzarrella I don’t have anything in mind as it’s hard to see the trends without the data. Firstly I think it is important to have a strong definition of what it means to be retweetable. I’d also second what jon said about filtering out @replies. If you haven’t done this already I think it might level things out quite a bit.

I think you have the best variables already… I can’t think of anything you could add. I’m imagining that due to the very nature of retweeting it must be almost impossible to see accurately past the first generation of tweets. So I would say an important variable would be total instances of terms like “RT @muunkky”.

You are currently measuring the fraction of a tweeter’s flock that retweets a percentage of their tweets. It would be great to try playing with a few indices like “Total number of non-reply tweets that get retweeted”… but again, there’s no reliable way of measuring which tweets are being retweeted.

I think that in order to improve the formula you need a stronger definition of what it is to be retweetable. With that knowledge then maybe different approaches like probability considerations may be more appropriate.

Jim April 20, 2009 at 2:28 am

Another factor to consider is how many people a “follower” is following. The higher that number, the lower the chance of a retweet, because more tweets cross that “follower’s” screen. There is a higher chance that a possible candidate for a retweet will be missed.
~ Jim [@SEO_Web_Design]

Phil Harper May 5, 2009 at 9:00 am

Dan, loving your work on the retweetability metric. This needs to be integrated into Twitter profiles so that Twitter moves away from being a popularity contest to being a retweetability contest. At least that way people will try to add value rather than keep us up to date on how their shower went in the morning.

David Fox December 25, 2009 at 4:46 pm

Is your index still working? I see all 0's in the retweet column, and most of the people I try to check so far don't seem to have twitter accounts. Is your tool broken, or is it twitter?

Amitha Amarasinghe March 18, 2010 at 8:41 am

This isn't working! I get no results for any of the keywords I search
