Data Science gets Social: Seattle Twitter developers meeting

Last night I attended the Twitter Developers “Tea Time” conference in Seattle.  It’s a forum for the Twitter platform team to engage with their developer community, and it showcases some of the most interesting examples of new businesses being built on Twitter data.  I had a great time, and enjoyed talking to some great folks at Twitter, as well as folks in the local developer community who are at the forefront of data science in the social analytics space.  Quick notes are below, followed by my musings:


Room scan:

  • 4th floor at Seattle’s Hotel 1000 @Hotel1000, nice catered bar / lounge setting outside, standard meeting room inside.  Good vibe, perfectly sized.
  • Approximately 65 folks in the room total
  • 4 guys from Twitter platform team
  • 6 presenters representing 3 companies (SimplyMeasured, Formulists, Bing)
  • 55 devs

Twitter team:


Twitter State of the Union:

  • 750k registered developers
  • 15b API calls per day
  • 1.1m registered apps (tokens)
  • 250m tweets per day
  • Datasift and Gnip growing as firehose providers
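
To put those daily totals in perspective, here is a quick back-of-the-envelope conversion to per-second averages. This is only a sketch: the inputs are the rounded figures quoted above, the variable names are mine, and real traffic is bursty, so peaks will run well above these averages.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Rounded figures as quoted in the talk.
tweets_per_day = 250_000_000
api_calls_per_day = 15_000_000_000
registered_apps = 1_100_000

# Daily totals converted to average rates.
tweets_per_second = tweets_per_day / SECONDS_PER_DAY
api_calls_per_second = api_calls_per_day / SECONDS_PER_DAY
calls_per_app_per_day = api_calls_per_day / registered_apps

print(f"{tweets_per_second:,.0f} tweets/sec on average")
print(f"{api_calls_per_second:,.0f} API calls/sec on average")
print(f"{calls_per_app_per_day:,.0f} API calls per registered app per day")
```

That works out to roughly 2,900 tweets and 174,000 API calls per second, averaged over the day.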

Developer Presentations:

SimplyMeasured grew out of a steady stream of simpler, unmonetized startups; the team decided to make a business out of it.

TweetStats.Com -> Tweetsum -> Tweepsearch -> SimplyMeasured

  • Mongodb as primary store
  • Gnip as tweet source
  • uses redis/resque
  • ingress, then queue to backend for link tracking via other sites like Klout, etc.
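
The "ingress, then queue to backend" shape is worth a quick illustration. The sketch below is not their code: they use redis/resque, while this stand-in uses Python's standard-library `queue` and `threading` so it runs anywhere. The tweet format, `lookup_links`, and `run_pipeline` are all hypothetical names I made up for the example.

```python
import queue
import threading


def lookup_links(tweet):
    """Hypothetical backend job: pull out URLs for later link tracking."""
    return [word for word in tweet["text"].split() if word.startswith("http")]


def worker(jobs, results):
    """Backend worker: take tweets off the queue until a sentinel arrives."""
    while True:
        tweet = jobs.get()
        if tweet is None:  # sentinel means no more work
            break
        results.put((tweet["id"], lookup_links(tweet)))


def run_pipeline(tweets, num_workers=2):
    jobs = queue.Queue()
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(jobs, results))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()

    # Ingress: enqueue each tweet and move on without waiting for the backend.
    for tweet in tweets:
        jobs.put(tweet)

    # One sentinel per worker so every thread shuts down cleanly.
    for _ in threads:
        jobs.put(None)
    for t in threads:
        t.join()

    collected = []
    while not results.empty():
        collected.append(results.get())
    return sorted(collected)


sample = [
    {"id": 1, "text": "new post http://example.com/a"},
    {"id": 2, "text": "no links here"},
]
tracked = run_pipeline(sample)
print(tracked)
```

The point of the pattern is that ingress never blocks on slow backend work (like calling out to a third-party site); the queue absorbs the difference in speed.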

Formulists

  • Very slick “freemium” model
  • running on the .Net framework (got a golf clap from the crowd)
  • they undersold it, but it feels like they’re building a goldmine of list data…


Bing Social Analytics

  • It was funny that they invited someone from Microsoft Bing, mostly because the event was so close to the Microsoft Redmond campus and the Twitter folks probably assumed it’d be a no-brainer since Seattle is our home turf.  But the guy who came is from Microsoft’s Mountain View, CA campus (he probably flew up to Seattle on the same flight as the Twitter team!)
  • Bing Social Analytics team is 4 people on the Mountain View campus – a startup within Bing
  • Great showcase for the Windows Azure platform.
  • No SQL Server anywhere in the conversation.
  • NTLB initiative: “No Tweet Left Behind” – parallel caching system to queue requests.  At peak load the ingress consumption lag is ~ 5-10 minutes, at low load, 1-3.  Impressive firehose chugging.
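
For a rough sense of what those lag numbers mean in tweets, backlog is just arrival rate times lag. The sketch below assumes the feed arrives at the daily-average rate implied by 250m tweets per day; that is my assumption, not a figure from the talk, and actual peak rates are presumably higher. The function name `backlog_for_lag` is made up for the example.

```python
SECONDS_PER_DAY = 86_400


def backlog_for_lag(lag_minutes, tweets_per_day=250_000_000):
    """Tweets waiting in the queue if the consumer runs lag_minutes behind.

    Assumes tweets arrive at the daily-average rate (an assumption).
    """
    tweets_per_second = tweets_per_day / SECONDS_PER_DAY
    return tweets_per_second * lag_minutes * 60


for minutes in (1, 3, 5, 10):
    print(f"{minutes:>2} min lag ~ {backlog_for_lag(minutes):,.0f} tweets queued")
```

Under that assumption, a 10-minute lag means something on the order of 1.7 million tweets sitting in the queue.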


This was the most ironic (painful?) presentation for me to watch for several reasons – talk to me sometime about how much fun my team had getting access to Bing data for our project earlier this year.  (It wasn’t Vishal’s fault, these policies are hashed out at pay grades higher than he or I operate at.)  Vishal pointed out several times in his presentation that the data they had was being asked for by plenty of other teams in Microsoft but that he wasn’t able to share it with all of them, only projects that directly supported Bing.  But he found it encouraging that there were so many other people in the company interested in this space.  (My head asplode.)  Time and time again, I’ve seen this same kind of internal friction between Microsoft teams getting in the way of what should be some easy wins.  The “big company tax” is painful to deal with, especially when there are new and rapid innovations taking place outside the company.  I take some small relief in knowing I’m not the only one who sees this:


Q&A sessions:


A guy from UW asked about how research teams can vs. can’t use or share Twitter data.  The policy is unclear and there have been a few high profile published studies that Twitter issued takedowns for after the fact.  (Ouch!) The panel squirmed a bit.  No definitive answer, these were platform guys not policy folks.  Referred questions to: api-research@Twitter.Com

Awkward Question of the Night – An ex-MSFT guy doing his own business asked, “How safe is my app if Twitter decides to enter that space?”  (Think Seesmic, which had a thriving client app that Twitter shut out; also see TweetDeck, which got acquired by Twitter.)  Again, a bit of soft-shoeing / no definitive answer.  Twitter is still figuring out how best to guide and nurture devs, while at the same time they’re figuring out their own business.  Such are the risks of sailing your ship into the blue ocean.



Misc. thoughts / musings for the fools who scrolled down this far:


  • The Seattle Data Science community is embracing Social.  This was easily the most focused and vibrant platform dev outreach event I’ve attended since my time with the early days of Xbox Live.  Quality crowd – 90% of the people in the room “get it”.  Small room of people, but they’re all chomping at the bit doing data science on one of the most rapidly growing sets of big data out there right now: Twitter.  The Seattle startup scene is the healthiest I’ve seen it since 1998 – it’s still small by Valley standards, but to have this many people interested in this specific a platform in Seattle?  Wow!
  • Twitter is still in the awkward teen stage, and building a business off their platform is exciting but uncertain.  They’re growing like crazy but the business model is still in flux.  They’re sitting on a valuable, growing corpus of big data and they know it, but they haven’t yet charted out exactly how they want to go forward (at least, not publicly).  Data Science startups like SimplyMeasured and Formulists are pioneering here, but personally I wonder how long it’ll be before Twitter “pulls a Twitter” and gets into the space themselves, either acquiring them or competing against them directly.
  • Data Analytics / Data Science on social data is taking off like mad.  Just this morning as I peck out these notes I see the following 2 press releases in my RSS reader:



That’s $38 million flung at the problem in one day.  Data Scientists are jumping on Twitter and partying – it’s a growing industry and there are plenty of opportunities for small players to make big inroads while the larger enterprise players slumber.  The common thread among the folks I talked to last night was that they’re all mostly still getting their feet wet and taking on smaller clients.  But when I floated the idea of engaging with existing enterprise clients with existing BI systems they admitted that was potentially the pot of gold at the end of the rainbow.


  • Microsoft is not even in the conversation yet, and time is ticking.  There were several former and current MSFT employees in the room, but it was clear we were not the cool kids.  Tellingly, when the Formulists founder showed his architecture deck showing Windows Azure and .Net, it was met with a sarcastic golf clap from the back of the room, and sympathetic (curious?) expressions from the crowd.  But when folks asked about perf and storage and got satisfactory answers, they seemed happy with the responses – these are folks who don’t even seem to realize Azure is a viable option.  If Microsoft (and other large companies) wants to be a place the Twitter dev community folks turn to once their Data Science and Social Analytics businesses grow to the point where they want to go after the Enterprise client, it’s time to wake up and start engaging with the pioneers, as well as getting ready for the wave of settlers coming after them.  It’s still early days in the wild wild west of Data Science.

Big thanks to the gang from Twitter for putting on the event, and it was great meeting some fellow devs who are playing with Twitter data!




Posted by Brian
