Psychological Profiling Via Twitter

Posted on Jun 15th, 2009

This weekend I was playing with a bunch of different linguistic analysis methods to better understand ReTweets, and while I uncovered a ton of cool new data which I’ll be sharing a little later this week, I also came upon an idea I think is pretty awesome, probably groundbreaking, and definitely worth Twittering about.

Communication is a window into a person’s mind, and the way a person talks can tell you a lot about how they think. Linguists have developed two methods for decoding the written word into a meaningful profile of a person’s cognitive processes.

One method is called the Regressive Imagery Dictionary (RID). This coding scheme is designed to measure the amount and type of three categories of content: primordial (the unconscious way you think, like in dreams), conceptual (logical and rational thought) and emotional.

Significantly more primordial content has been found in the poetry of poets who exhibit signs of psychopathology than in that of poets who exhibit no such signs (Martindale, 1975). There is also more primordial content in the fantasy stories of creative as opposed to uncreative subjects (Martindale & Dailey, 1996), in psychoanalytic sessions marked by therapeutic “work” as opposed to those marked by resistance and defensiveness (Reynes, Martindale & Dahl, 1984), and in sentences containing verbal tics as opposed to asymptomatic sentences (Martindale, 1977). A cross-cultural study of folktales from forty-five preliterate societies revealed, as predicted from the “primitive mentality” hypothesis of Lévy-Bruhl (1910) and Werner (1948), that amount of primary process content in folktales is negatively related to the degree of sociocultural complexity of the societies that produced them (Martindale, 1976). Martindale and Fischer (1977) found that psilocybin (a drug that has about the same effect as LSD) increases the amount of primordial content in written stories. Marijuana has a similar effect (West et al., 1983). Research has also revealed more primordial content in verbal productions of younger children as compared with older children (West, Martindale, & Sutton-Smith, 1985) and of schizophrenic subjects as compared with control subjects (West & Martindale, 1988).

The other method is Linguistic Inquiry and Word Count (LIWC). In development for over 15 years, the LIWC measures the cognitive and emotional properties of a person based on the words they use.

In order to provide an efficient and effective method for studying the various emotional, cognitive, and structural components present in individuals’ verbal and written speech samples, we originally developed a text analysis application called Linguistic Inquiry and Word Count, or LIWC.

I’ve combined these two systems with a Porter stemming algorithm and my own Twitter analysis infrastructure to create TweetPsych.com.
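As a rough illustration of how dictionary-based coding of this kind can work, here is a minimal Python sketch. It is not TweetPsych's actual code: the category names and word lists in `TOY_DICTIONARY` are made up for the example (the real RID and LIWC dictionaries are far larger), and simple trailing-wildcard prefix matching stands in for a Porter stemmer so the block needs nothing beyond the standard library.

```python
import re
from collections import Counter

# Hypothetical toy dictionary: category -> word patterns.
# These entries are illustrative only, not taken from RID or LIWC.
# A trailing "*" matches any word that starts with that prefix.
TOY_DICTIONARY = {
    "posemo": ["happ*", "love*", "great"],
    "negemo": ["sad*", "hate*", "awful"],
    "cognitive": ["think*", "because", "know*"],
}

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def matches(pattern, word):
    """Return True if the word matches an exact or wildcard pattern."""
    if pattern.endswith("*"):
        return word.startswith(pattern[:-1])
    return word == pattern

def category_counts(tweets, dictionary=TOY_DICTIONARY):
    """Count how many words fall into each category across all tweets.

    Returns (counts, total_words) so callers can turn counts into rates.
    """
    counts = Counter()
    total_words = 0
    for tweet in tweets:
        for word in tokenize(tweet):
            total_words += 1
            for category, patterns in dictionary.items():
                if any(matches(p, word) for p in patterns):
                    counts[category] += 1
    return counts, total_words
```

Calling `category_counts(["I love this, so happy!"])` would tally two words under the hypothetical `posemo` category out of five words in total. A real pipeline would also need to decide how to treat @mentions, URLs and hashtags, which this sketch simply tokenizes like any other text.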

TweetPsych uses the LIWC and RID to build a psychological profile of a person based on the content of their Tweets. It compares the content of a user’s Tweets to a baseline reading I’ve built by analyzing an ever-expanding group of over 1.5 million random Tweets, then highlights areas where the user stands out.
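One plausible way to picture the baseline comparison is sketched below in Python. This is an assumption about the general approach, not a description of how TweetPsych actually scores accounts: the function names, the per-100-words rate, and the `min_ratio` threshold of 1.5 are all hypothetical choices made for the example.

```python
def category_rates(counts, total_words):
    """Convert raw category counts into rates per 100 words."""
    if total_words == 0:
        return {category: 0.0 for category in counts}
    return {category: 100.0 * n / total_words for category, n in counts.items()}

def standout_categories(user_rates, baseline_rates, min_ratio=1.5):
    """List categories where the user's rate is well above the baseline.

    min_ratio is an arbitrary illustrative threshold: 1.5 means the user
    uses words from that category at least 50% more often than the baseline.
    """
    standouts = []
    for category, base in baseline_rates.items():
        if base <= 0:
            continue  # no baseline signal to compare against
        ratio = user_rates.get(category, 0.0) / base
        if ratio >= min_ratio:
            standouts.append((category, round(ratio, 2)))
    # Strongest standouts first
    return sorted(standouts, key=lambda item: item[1], reverse=True)
```

With a made-up baseline of `{"posemo": 2.0, "negemo": 1.0, "cognitive": 4.0}` and a user whose rates are 6.0, 1.0 and 4.0, only `posemo` would be flagged, at three times the baseline rate. Using a ratio rather than a raw difference keeps rare categories from being drowned out by common ones, though a larger sample per user makes any such ratio more trustworthy.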

The service analyzes your last 1000 Tweets; as such, it works best on users who have posted more than 1000 updates. It is also better suited to accounts that are operated by a single user and use Twitter in a conversational manner, rather than simply as a content distribution platform. It takes a few moments to analyze an account the first time, but subsequent views of a profile will load faster.

I’ve tried to translate the codes that come from the two linguistic systems into more meaningful explanations, but I may have missed a few. I will continue to expand these definitions, while also refining the system and algorithm to better analyze Twitter-specific content.

I think the possibilities of a system like this are enormous, from matching like-minded users to identifying users that exhibit certain useful or desirable traits. I’d love to hear your thoughts on where this could be improved or where I could take this technology next.

  • I think this is very interesting indeed. I've been researching how NLP techniques can be applied to web content and advertising, which i don't believe anyone is looking into. It's the ability to tap into the behaviour of individuals, the behaviour they don't even know they are transmitting. The ability to transfer this understanding to the web has extremely powerful potential.

    Imagine if you could run advertising on via social tools knowing that certain people will interact before they do. Powerful stuff. I look forward to seeing what else you come up with.

    I think it would also be helpful on tweetpsych to have a key so i can better understand my analysis.
  • "Imagine if you could run advertising on via social tools knowing that certain people will interact before they do." Funny, usually when I hear people say stuff like this, it's in fear, despair, or jest. The fact that it excites you seems creepy.
  • That's fascinating right there! I didn't know anyone was working with NLP these days at all. NLP and Twitter... what a mind-blowing concept!
  • Very interesting thing to do! I'm afraid I don't have the background to make much sense of the results though. Is there a reference that describes what these terms mean in this context? Is there any significance to the number of terms in the results? Or in the order in which they appear?
  • Nice - if you can get that very fine tuned, that could be a powerful marketing tool.

    Good work! Can't wait to hear more about it.
  • Krishanu
    I agree with Sarah, it would be really nice if you provided a link to an explanation of the analysis.
  • Interesting, but still rudimentary. I'm wary of automating any kind of behavioral analysis. Useful? Absolutely. But I think this is still applied to the prospect list philosophy. It will undoubtedly help current sales processes. However, I'm interested in seeing how much success in terms of conversions it ends up having.

    Would be better if you had links to the words and a definition of exactly what they indicated. Is this possible?
  • Am I the only one that could see the potential abuse of something such as this tool? What about the potential backlash an applicant for a job could experience from an employer who decides to run a program such as this? Or an employee

    As interesting a tool as something like this could be, Twitter is a TOOL - a social device that truncates people's thoughts and language into 140 characters - and should not be construed as someone's personality in total.
  • Word. I was fascinated when I got an @reply that someone had analyzed me, but this thread is weirding me out. The fact that I didn't initiate my "analysis" makes it even sketchier. I wouldn't take it much more seriously than a mood ring, except that everyone else is.
  • Thanks Dan for trying out new things.

    The results for me weren't useful. Because I didn't know what to do with them or how to use the output.

    One way I see this app going is: it goes through all my followers. And gives me a statistical analysis of their profiles. And based on that - gives me tips on how to communicate better with them.

    But I doubt that the twitter API limit will allow you to build such a tool...

    Without bridging the gap between the output and how to use the output to improve communication, it won't be a very valuable tool. The output needs to make sense.
  • tr
    can't get it to work for some reason?
  • noxhanti
    Love this. I have 2 accounts dealing with completely different subjects. The results for the 2 accounts were very different and gave me some insight into how I am coming across on each. In doing so it clarified for me my networks responses 2 the content of my tweets. Although there is some of me in each account the tool also highlighted for me the fact that it is the subject matter rather than the real me that is eliciting a response
  • Dustin
    Very interesting indeed, and I have to say (like most people commenting), I would love an explanation page on what my results mean.
    Great site!
  • Yes, amazing effort here. Congrats. I agree with the comment that the results are so hard to understand for a normal person... I don't even understand the meaning of the word Cognitive... If you can bring this down in mortal language I think it can be revolutionary... right now after doing my test I have more questions than answers :)
  • I understand the concepts and the results to a degree, but then there are outputs like Posemo, and Negamo. I even googled and couldn't find anything discernible about these terms.
  • e
    Interesting idea, but how can we interpret this without definitions and a comparison to other Twitterers? And which terms are primordial, conceptual, or emotional?
  • Luke
    Definitely need some explanations. I have absolutely no idea what my results mean. For example my top 3 results for the primordial, etc. section are: # Regr knol concreteness, # Abstract tought, # Temporal repere. If this had some form of interpretation attached it could be useful.
  • Pepe Lopez
    Posemo and Negemo are likely to be "positive emotion" and "negative emotion" respectively, some of the others are likewise guessable, but as to what bearing they have on personality... it's an effin' mystery to me.
  • Hi Dan, This is a very interesting (and powerful) concept. I think it can be further enhanced with detailed explanations of the results. Over time it would be interesting to obtain/compare stats of other Twitter users (i.e. who you are most compatible with, who you should avoid like the plague etc).

    Great stuff!
  • I love this idea. I'm continually impressed by the Hubspot team. Thanks Dan.
  • Dan, I'm fascinated with what I'm finding not just with my own map (MeriWalker) but with maps I've run on a few other folks, including the President. I've got a background in linguistics myself so when I saw the story about what you were experimenting with I ran over to see.

    I'm interested in how you created your baselines and what the definitive list of "topics" you're measuring against is. I'm also really curious about what "sexual expression" means because I have been totally unaware of making any tweets that I would label as sexually expressive, much less having a pattern of them develop over the last six months (my life in twitter really began at the end of December, 08).

    Will you be publishing more about this here ... or elsewhere? I'm very interested in what you're doing with this.

    Blogged twice about my thoughts briefly at http://meri.posterous.com this afternoon.

    Standing by...
  • Annie
    Why doesn't mine work. I tried friends user names but mine doesn't work. It seems interesting...
  • Mat
    I'm not too sure i would appreciate someone analysing me in this way, and i think that sometimes we read too much into stuff like this.
  • @Mat

    I encourage you to read the text at the bottom of the TweetPsych page.

    With most descriptive analyses we can very rarely make attributions to individual cases. That does not detract from their ability to provide useful information about the world.
  • not for me, because Crazy works to keep me from going insane, but anyone can input another's twitter handle in and read the results. I would like to suggest you program in a privacy and/or opt-out option(s). Thank You for considering.
  • Hi,
    I tried it, and all I got was this -

    http://tweetpsych.com/?name=EmyAugustus

    It really doesn't tell me much!
  • Is there a way to give it permission to read my Tweets, though my updates are protected/invitation only?
  • Liz M
    It's a neat little app! It's difficult to interpret the data points and line items though. Can you write something more specific about what each of those numbers refers to or the purported impact of those characteristics? Basically a crib sheet for those too busy to go to the original works?

    Anyway - thanks it's fun!

    Liz
  • Few are doing thoughtful work with parsing Twitter for insight beyond business/ROI surface levels, so TweetPsych is an interesting start. I tried it with myself and people I feel I know really well and can see some match to actual experience with this.

    I won't repeat the obvious feature improvements already mentioned above; the underlying theory powering this app is worth talking about before features are blown out. Questions that came up for me were about RID and LIWC: how much are they based on psychoanalysis or other (non-scientific/outdated) frames? Are there any other more science-based/valid ways to do machine interpretation of word use in context? Are there better linguistic/anthropological methods to parse language meaning? If there are outdated aspects to RID and LIWC, can those parts be removed and just the still-valid parts be used? This tool and the possibilities are fascinating, but ultimately the existential framing/theory/model that powers it will determine its actual value, more than any features built around it.
  • I find your work really interesting. I was looking for something of this kind to apply recommenders system to the world of microblogging and I think your idea is actually ground breaking in user profiling!
    Would you release any API or limit TweetPsych to pure consultation?
    Cheers,
    Beppe
  • Wishbone
    Interesting, but ultimately useless without a glossary of terms. Most of us just aren't going to take the time to search this stuff out. I'll check back though to see if it ever gets added. With a glossary this could be very useful and interesting for us laymen ;-)
  • I am sorry to disagree but... there is nothing psychological here. A linguistic method can give us a linguistic profile. This is important, but that doesn't give us a psychological profile. There is a link between the way someone uses language and their personality, for sure, but these links haven't yet been formalised.
  • dude! This rocks.

    I can't wait to see more explanations of terms included in the results but hey, in case anyone hasn't noticed it's in BETA!!!


    ;oP

    btk
  • Hayley
    Not sure about this idea - how are you controlling your variables? Exact definitions of words vary according to personal and cultural experience, can you ever really pin down the personal relevance/definition of a word (or group of words) for an individual?

    Perhaps it would be useful to select volunteers to undergo more tried and trusted personality tests e.g. MMPI and analyse for correlations? If you haven't done this already, of course!
  • looking at some of the word references, maybe 'somebody' should bring it all up to date. The RID (http://www.kovcomp.co.uk/wordstat/RID.html) word relationships are no later than 1990! Give me a BREAK! Wording used nowadays carries different meanings via inflections and abbreviations. (ROFL is a mental picture, that may not describe what I'm actually doing.)
    If you're using the LIWC2007, congrats, and hope you're using a Mac with it.
    Applause for this effort and hope it will continue to evolve, along with current language usage.
    Please keep use of this public, you may enlist help you didn't know was there.
  • Did you get a chance to mess around with the OpenAmplify API? Try seeing the result of a mashup based on the amazing work you have already done.

    Dave Weinberg
    Community Manager
    OpenAmplify

    http://community.openamplify.com/user/CreateUse...
  • Bill
    Well, yesterday I got results that were words, though without any context, and they seemed to make sense.

    Today, I got a long list of constructs, with no definitions, and a number next to each one. I don't have a clue what they mean. Is there a key somewhere that unlocks these results?
  • I started to write up my thoughts and realized I need to just blog them instead to save space. I wrote them up here: http://thetylerhayes.com/2009/06/16/my-concerns... (sorry, I hope you don't mind the link, I'm honestly not trying to linkbait or anything!).

    No matter what, I can't wait to see TweetPsych evolve and innovate. More importantly, I can't wait to see it inspire others to create more tools in human-oriented areas like cognitive function, etc.

    Also, Dan, the comments section on your site in Google Chrome is meeeeeeessed up!
  • This is an interesting experiment. I'd like to see more summaries written in layman's terms and the results in relation to the self.

    If I'm a small business owner who wants to integrate my personal brand into my business brand, I would anticipate reading a summary of results based on actions. Also, having a rating system that allows me to see how my tweets stack up against the chart for a single person business would be fascinating and making suggestions on how to improve certain areas.
  • Not sure what to make of the numbers. If someone were right at the norm of all the measured levels, would they be right at zero? And then would a score of 15 mean 15% above the normal usage of those terms that fit into that category? I see some people score over 100 in some categories so I am guessing that the number doesn't imply the percentage of messages which reveal a specific psychological tendency. Also... I think it would be better if the dictionary was stored in a different type of file that didn't have to be unpacked. Otherwise... it seems like an interesting tool... if not a scary one.
  • Thanks Dan for improving the tool!

    The output is much better - a lot more coherent!

    And the "some people that think like you" is awesome!

    Thanks.
  • WonByHim
    Like many others, I'd like to know what the scores mean. Like, in a certain category, what score would be for an average, mild mannered person? So we could see where we are compared to someone like that. Is there a "total score" somewhere that means something?
  • happyseaurchin
    nice :)
    once it is linked with trust metrics
    you got it made :)
  • I suppose your tool can't deal with any foreign language. I wonder how the iteration between English and foreign tweets influence the result. Nice toy.
  • How about RT? The wording comes from a second, third person. Does your tool account for Retweets if they are tagged as such? :-)

    BTW: Human resources who use this tool without accounting for these inherent flaws should be sacked. :-)) Retweets aren't always tagged.
  • John Milton Wesley
    6/18/09

    Dan:

    This is truly a unique idea. Congratulations! So be prepared for those who will declare it thus, and then claim there isn't enough evidence to back it up.

    John Milton Wesley
  • hi,
    i just simply love it, my twitter id: we4tech :)
    thanks for bringing such a nice tool for us.

    best wishes,
  • Interesting... But could you write an explanation of the scores? Don't know how to interpret those numbers...

    - Amy Gahran
  • Kirti Vashee
    Yes the results are not very clear and it would be useful to have a graphic image to show where one falls on the three core types scale.

    Also it makes sense to recommend people that you would be sympathetic to or attracted to - not those who are just like you.
  • Great idea! Does it work in spanish?
  • Mark
    It's already being done by DARPA... and to a MUCH better degree.
  • Interesting. But speaking from Ann Arbor, I can say that DARPA is missing the boat (sub?) on this. With a charismatic (and shrewdly sociopathic) leader, this could be the seed of a new religion. Twientology, perhaps. ;-)
  • I think a machine can´t crawl a human mind, but humans can crawl the internet for friends all over the world.
  • Very cool!

    You should visualize the results.
    Maybe via: http://code.google.com/apis/visualization/
  • Hi Dan - Congratulations. This is a very neat application and very innovative. You should commercialise this and sell it to recruiters and HR departments. I've blogged about it here: http://bit.ly/1jCZIU
  • fionaboyd
    Hi Dan, I love this work you're doing. My partner and I set up a social networking site in 2003 called www.folklikeme.com which only ever had around 8000 members and we couldn't find the way to grow outwards and keep finding relevant folk to match up with others. I think TweetPsych would have helped enormously. I want to have another shot at FLM later this year, so would be really interested in where you're at with this, then.
  • Interesting, but no frame of reference where the numbers are concerned for us non-educated tweeters.
  • Tom
    Very cool, Dan, but it would help immensely to know what those scores relate to. I got a 150 on abstract thought, but a 9 on metaphor. Does that mean I totally suck on metaphor, despite being a metaphor junkie, and the comment that "Many of your Tweets contain metaphors"? Lots of fun, though, thanks!

    @TomYHowe
  • Just came across this! Very good idea, great feat of engineering! Personally, I'd prefer if it wasn't promoted as being psychological, as people might get the idea that this has to do with academic psychology, which it isn't. I put further thoughts here:

    http://generallythinking.com/blog/index.php/200...

    It would be really cool if you could run some studies, testing your content against established psychological measures. Then you might have something that's compelling to people, and actually gives valid feedback, too. It'd be a shame if this became solely a marketing tool, or a "personality" test on a par with the ones you get in women's magazines, when there's potential there.

    But either way, you've made something pretty special there, well done.
  • Batman
    Do you think you can predict my next tweet? That would be a great application... :)

    In other news, you've now told me how I tweet, and you've given me some numbers, and you haven't told me what they mean. Now that you know how I tweet, maybe you can explain it to me, so, that we'll both know. Would also like more than five users that I'm similar to. Thanks!!!
  • hi guys, we have a same interest. i write social media with psychological approach. maybe we can share this. thnks
  • pixoxo
    Does tweetpsych work for portuguese-speakers too?
  • andywise
    Great stuff, do you have any plans to publish an API ? I would love to use your analysis as part of a project I'm working on at the moment.
    @andrew_wise
  • How exciting. Couldn't wait until someone did this. Thank you, Dan : )
  • Fascinating! Thanks for sharing.
  • This is such an absolutely genius concept, and with further exploration and development I am Sure it will prove itself to be quite useful and even possibly revolutionary. I'll be sure to get the word out in whatever capacities I may and to follow you for any updates. Fantastic work
  • codemyconcept
    This is a great tool for marketing! Makes using twitter for advertisement a lot more useful.