Monitoring a term like ‘Facebook’ with the Twitter Search API can return hundreds of tweets every minute. If you continuously issue requests for the next set of data, you’ll eventually get a message that reads, “You have been rate limited. Enhance your calm.” In short, you can’t refresh the Search API results fast enough, and you will lose a significant portion of the data. The Twitter Stream API gets around this by serving up a firehose of tweets on nearly any topic as they happen. Twitter also recently announced a new Site Stream API, which is designed for third-party sites; that product may suit your needs better.
The process is fairly simple: open a connection and then suck it all in. While this example is specific to Twitter, the same approach could be adapted to any other service that provides an open socket and continually pumps data through it. The code has three stages, and you’ll expand on the third to process the data for your own use.
- Connecting to the API
- Creating the request
- Handling/Processing the data
In the first part, we establish a connection. We want to keep this connection open, so we use PHP’s built-in socket connection function fsockopen(). For REST-based services it often makes sense to use curl because of its simplicity and wide range of browser-like support. Alas, curl also likes requests to end, so it’s a non-starter here. This code will run continuously, so you may also need to remove any command-line execution time limits in your PHP configuration or simply include a call to set_time_limit(0).
Next we’ll create the initial request. I used an array to hold my query data. This is handy because other methods support additional parameters, which I won’t go into here. The track parameter takes a comma-delimited list of values. The following examples come from the Twitter documentation on methods and parameters.
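As a minimal sketch of this connection step, the helper below wraps fsockopen() and lifts the execution time limit. The function name open_stream() and the exception-based error handling are my own assumptions for illustration; only fsockopen() and set_time_limit(0) come from the description above.

```php
<?php
// Hypothetical helper: the name and the exception are illustrative assumptions.
function open_stream(string $host, int $port = 80, float $timeout = 30.0)
{
    // The consumer runs indefinitely, so remove PHP's execution time limit.
    set_time_limit(0);

    // fsockopen() returns a stream resource that stays open until fclose().
    // The @ suppresses the warning so the failure is reported once, below.
    $fp = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($fp === false) {
        throw new RuntimeException("Connection failed: $errstr ($errno)");
    }

    return $fp;
}
```

You would call it as open_stream('stream.twitter.com') and later pass the returned resource to fwrite(), fgets() and fclose().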
Track examples:
- The track value Twitter will return statuses which contain: TWITTER, twitter, “Twitter”, twitter., #twitter and @twitter.
- The track value Twitter will not return statuses which contain: TwitterTracker and http://www.twitter.com.
- The track value helm’s-alee will return statuses which contain: helm’s-alee
- The track value helm’s-alee will not return statuses which contain: #helm’s-alee
- The track value twitter api,twitter streaming will return statuses such as: “The Twitter API is awesome” and “The twitter streaming deal is fast”.
- The track value twitter api,twitter streaming will not return statuses such as: “I’m new to Twitter”.
- The track value chirp search,chirp streaming will return statuses such as: “Listening to the @chirp talk on search”, “I’m at Chirp talking about search!”, and “loving this search talk #chirp”.
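To show how a comma-delimited track list ends up in the query string, here is a small sketch using http_build_query(). The helper name build_track_query() and the array-of-phrases input are assumptions for illustration; they are just one convenient way to assemble the list.

```php
<?php
// Hypothetical helper: joins track phrases with commas and URL-encodes them.
function build_track_query(array $phrases): string
{
    // The track parameter is a single comma-delimited string.
    $query_data = array('track' => implode(',', $phrases));

    // http_build_query() encodes spaces as "+" and commas as "%2C".
    return http_build_query($query_data);
}

echo build_track_query(array('twitter api', 'twitter streaming')), "\n";
// track=twitter+api%2Ctwitter+streaming
```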
The parameters and login information are then formatted into our GET request. You can use Basic Authentication on this API, which removes the hassle of setting up OAuth. After we fwrite() these values to the server, Twitter will immediately begin sending our stream.
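The request formatting can be sketched as a small function, which makes it easy to inspect the exact bytes before they are written to the socket. The function name build_stream_request() is hypothetical; the path and host are the ones used in this post's script.

```php
<?php
// Hypothetical helper: formats the raw HTTP request as a string.
function build_stream_request(array $query_data, string $user, string $pass): string
{
    $request  = "GET /1/statuses/filter.json?" . http_build_query($query_data) . " HTTP/1.1\r\n";
    $request .= "Host: stream.twitter.com\r\n";
    // Basic Authentication is the base64 encoding of "user:password".
    $request .= "Authorization: Basic " . base64_encode($user . ':' . $pass) . "\r\n";
    // A blank line terminates the header block.
    $request .= "\r\n";

    return $request;
}
```

Keeping the request in a string means you can print it while debugging and only fwrite() it once it looks right.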
The final part is to process the data. The loop code will run for each and every tweet received. Beware, not everything that comes over the wire will be usable data! NOOP and other noise will come as well, so be sure to check that you can parse what you receive. Twitter also doesn’t guarantee the order of the Tweets, although they’re roughly chronological.
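One way to guard against unusable lines is to funnel every read through a small filter. The sketch below is an assumption about what "noise" looks like (blank keep-alive lines, HTTP headers, and bare numbers such as chunk sizes if the server uses chunked transfer encoding); the function name decode_stream_line() is hypothetical.

```php
<?php
// Hypothetical helper: returns the decoded tweet array, or null for unusable lines.
function decode_stream_line($line)
{
    if (!is_string($line)) {
        return null; // fgets() returned false (read error or end of stream)
    }

    $line = trim($line);
    if ($line === '') {
        return null; // blank keep-alive line
    }

    $data = json_decode($line, true);

    // Header lines decode to null and bare numbers decode to scalars;
    // only JSON objects or arrays decode to a PHP array.
    return is_array($data) ? $data : null;
}
```

Inside the read loop you would call decode_stream_line(fgets($fp)) and skip the iteration whenever it returns null.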
Lastly, a PHP gotcha: on 32-bit platforms, json_decode() and other functions will munge big integers, leaving a useless Tweet Id in the object you get back. Twitter’s new snowflake ids are 64-bit. Use the string representation of the Tweet Id, stored in id_str, instead.
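Here is a minimal sketch of reading the id safely. The helper name tweet_id() is hypothetical, and the sample id is made up; it is deliberately larger than a signed 64-bit integer so the numeric id field is converted to a float even on 64-bit PHP, while id_str is untouched.

```php
<?php
// Hypothetical helper: returns the tweet id as a string, or null if unavailable.
function tweet_id(array $tweet)
{
    // Prefer id_str: it is a JSON string, so json_decode() never turns it into a number.
    if (isset($tweet['id_str'])) {
        return (string) $tweet['id_str'];
    }

    return null; // no safe id available
}

// Made-up sample data for illustration only.
$json  = '{"id":10765432100123456789,"id_str":"10765432100123456789","text":"hi"}';
$tweet = json_decode($json, true);

echo tweet_id($tweet), "\n"; // 10765432100123456789
```

On PHP 5.4 and later, passing the JSON_BIGINT_AS_STRING option to json_decode() is another way to keep oversized integers intact, though id_str works regardless of PHP version.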
    set_time_limit(0); // this script is meant to run indefinitely

    $query_data = array('track' => 'facebook');
    $user = 'username'; // replace with your account
    $pass = 'password'; // replace with your account

    $fp = fsockopen("stream.twitter.com", 80, $errno, $errstr, 30);
    if (!$fp) {
        print "$errstr ($errno)\n";
    } else {
        // Build the GET request with Basic Authentication.
        $request  = "GET /1/statuses/filter.json?" . http_build_query($query_data) . " HTTP/1.1\r\n";
        $request .= "Host: stream.twitter.com\r\n";
        $request .= "Authorization: Basic " . base64_encode($user . ':' . $pass) . "\r\n\r\n";
        fwrite($fp, $request);

        // Read one line at a time for as long as the connection stays open.
        while (!feof($fp)) {
            $json = fgets($fp);
            $data = json_decode($json, true);
            // Headers, blank lines and bare numbers do not decode to an array.
            if (is_array($data)) {
                //
                // Do something with the data!
                //
            }
        }
        fclose($fp);
    }