The Punctuation of ReTweets

140 characters doesn’t leave much room for extraneous letters, numbers or symbols, so you might think that punctuation would be sparse in Tweets. But I compared a random sample of over 1 million “normal” Tweets to a sample of over 10 million ReTweets and found that 85.86% of Tweets contain some form of punctuation, and an overwhelming 97.55% of ReTweets do as well.

Of course, the prevailing ReTweet format includes a colon to better display the original Tweet, but even when ignoring this form of punctuation, ReTweets still contain more punctuation than non-ReTweets (93.42% to 83.78%).
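For readers who want to try this comparison on their own data, here is a minimal sketch of the calculation, not the code behind the numbers above. It assumes that "punctuation" means ASCII punctuation with "@" and "#" left out (the post does not give its exact list), and the names `has_punctuation`, `punctuation_share` and `toy_retweets` are hypothetical.

```python
import string

# Assumption: "punctuation" means ASCII punctuation, minus "@" and "#",
# which act as Twitter syntax. The post does not give its exact list.
PUNCTUATION = frozenset(string.punctuation) - {"@", "#"}


def has_punctuation(text, ignore=()):
    """Return True if text contains at least one mark that is not ignored."""
    marks = PUNCTUATION - set(ignore)
    return any(ch in marks for ch in text)


def punctuation_share(tweets, ignore=()):
    """Percentage of tweets that contain at least one counted mark."""
    if not tweets:
        return 0.0
    hits = sum(has_punctuation(t, ignore) for t in tweets)
    return 100.0 * hits / len(tweets)


# Hypothetical toy sample, not the data behind the post.
toy_retweets = [
    "RT @example: new post on punctuation",
    "RT @example: charts, charts, charts",
    "RT @example great read",
]

# Share with any punctuation, then the same share with colons ignored.
print(punctuation_share(toy_retweets))
print(punctuation_share(toy_retweets, ignore=":"))
```

Passing `ignore=":"` is what separates the two comparisons: a ReTweet whose only mark is the colon after the username stops counting as punctuated.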

I then analyzed the frequency of specific types of punctuation and found that hyphens, periods and colons are the most ReTweetable punctuation, occurring far more commonly in ReTweets than in regular Tweets, while the rarest mark, the semicolon, is the only unReTweetable punctuation mark.
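The per-mark comparison can be sketched the same way. This is an illustration under stated assumptions, not the original analysis: it treats a mark as more ReTweetable when the share of ReTweets containing it is higher than the share of regular Tweets containing it, and the function names and toy samples are hypothetical.

```python
def mark_share(tweets, mark):
    """Percentage of tweets that contain the given mark at least once."""
    if not tweets:
        return 0.0
    return 100.0 * sum(mark in t for t in tweets) / len(tweets)


def compare_marks(tweets, retweets, marks):
    """Return {mark: (share in regular Tweets, share in ReTweets)}."""
    return {m: (mark_share(tweets, m), mark_share(retweets, m)) for m in marks}


def more_common_in_retweets(comparison):
    """Marks whose share is strictly higher in ReTweets than in Tweets."""
    return sorted(m for m, (t, rt) in comparison.items() if rt > t)


# Hypothetical toy samples, not the data behind the post.
toy_tweets = ["lunch; then nap", "good morning", "so tired."]
toy_rts = ["RT @example: big news - read this.", "RT @example: wow"]

comparison = compare_marks(toy_tweets, toy_rts, [";", ":", ".", "-"])
print(more_common_in_retweets(comparison))
```

Counting Tweets that contain a mark, rather than counting every occurrence of the mark, keeps one heavily punctuated Tweet from dominating the result.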



cheeky_geeky August 24, 2009 at 7:55 am

That's because punctuation is not just “extraneous characters” but rather critical ways of inflecting language. So basically the premise is a strawman. In fact, it could have been predicted that RT's have more punctuation than non-RT's, simply because people have to abbreviate more and things like colons help with that.

danzarrella August 24, 2009 at 7:58 am

@cheeky_geeky perhaps my lead contained a bit of a straw man, but the data is valuable without that premise.

Tom O'Brien August 24, 2009 at 9:17 am

Punctuation can of course change the meaning of a message in subtle ways. In a multicast environment like Twitter, we'll never see an exchange as succinct as the telegrams between Victor Hugo and his publisher:
Hugo's single-character telegram: ? (How goes the Les Miserables opening?)
Publisher's single-character response: ! (Fantastic)

But it does beg an interesting question when it comes to describing or commenting on the #SaveRetweets campaign.

So Dan:

Jeff Heuer August 24, 2009 at 11:29 am

Listen, I hate to pick on Dan Zarrella. He seems to be supporting the application of science to social media, something I whole-heartedly support. SCIENCE IS AWESOME, and I whole-heartedly encourage injections of science into the otherwise lame analyses of social media phenomena. Unfortunately, I'm not sure where the “science”, or the “understanding” is in this post, and that absence does a disservice to science, making it LESS AWESOME, and SAD. :(

Let's unpack the message of this post. I hypothesize that what readers here are most interested in is “how do I maximize the odds of my tweet being retweeted?”. (If I am wrong, please let me know.) The large “punctuation occurence” (misspelling and all) graph seems to be suggesting that a tweet is more likely to be retweeted if it contains a colon. But let's start by looking at the two bars representing “tweets”. Slightly over 85% are “with colons”, and slightly under 85% are “without colons”. How could this be? Either a tweet has a colon, or does not, so the percentage of each should add up to 100%, not roughly 170%. Yes, Excel 2007 makes some nice-looking charts, but what is the point of this chart?

Let's say I would like my tweets to be more retweeted. Does this post help me craft tweets that are, scientifically, more likely to be retweeted? Not that I see. Where is the insight?

danzarrella August 24, 2009 at 11:34 am

The “with colons” and “without colons” bars both show the percentage of Tweets that contain punctuation: the first counts colons as punctuation, and the second ignores them. I did this because the standard ReTweet format includes a colon by default. I should have been clearer about this so that you didn't misunderstand.

crucial August 24, 2009 at 11:43 am

Could someone please explain the “standard ReTweet format” to me, and perhaps provide an example using the colon? The format I use does not have a colon. Thanks!

danzarrella August 24, 2009 at 11:47 am

TweetDeck does “RT @username: tweet”

crucial August 24, 2009 at 11:53 am

Thanks. Did not know.

mat1982 August 25, 2009 at 1:15 am

Punctuation is still important regardless of the number of characters available. It gives an indication of the persons ability to write. I could not in all honesty read something that i know would annoy me due to its “wall of text” characteristics.

Panayiotis Pete Karabetis August 26, 2009 at 8:58 pm

I like hyphens, my good man. Grammar is the devil when you're limited to 140 chars, but what is a guy to do?

Pete | The Tango Notebook

Guest September 10, 2010 at 11:17 pm

It’s interesting that your own comment is missing some important punctuation–in more than one place, I might add.
