Gardenhose is a Streaming API feed that continuously sends a sample of all tweets (roughly 15%, according to Ryan Sarver at the 140tc in September 2009) to feed recipients. This is some code for dumping the tweets to files named by date and hour. It is in PHP, which is not my favorite language, but it works nonetheless. I received a few requests to post it, so here it is.
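<?php
// gardenhosedump.php
// Appends each tweet from the sample stream to a file named YYYYMMDDHH.txt.
$username = '';
$password = '';

$out = null;         // handle of the current hourly output file
$currentHour = null; // hour stamp of the file behind $out

while (true) {
    $stream = @fopen("http://{$username}:{$password}@stream.twitter.com/1/statuses/sample.json", "r");
    if ($stream === false) {
        sleep(2); // connection failed; wait before retrying
        continue;
    }
    while (($data = fgets($stream)) !== false) {
        $hour = @date("YmdH");
        if ($hour !== $currentHour) {
            // The hour changed: switch to a new output file.
            if (is_resource($out)) {
                fclose($out);
            }
            $out = fopen("{$hour}.txt", "a");
            $currentHour = $hour;
        }
        fputs($out, $data);
    }
    // The stream ended or dropped; close it and reconnect.
    fclose($stream);
    sleep(2);
}
?>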
Some shell script for those who don’t php:
while [ 1 ]; do curl http://stream.twitter.com/1/statuses/sample.json -s -u: -Y 0 --retry 9999 --retry-max-time 0 >> /tmp/tweets; sleep 2; done
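The loop above appends everything to a single file. If you want files named by date and hour like the PHP version, something along these lines might work. This is only a sketch: `dump_by_hour` and its directory argument are names made up for illustration, and it starts one `date` process per line, which may be too slow for a very busy stream.

```shell
# Hypothetical helper: read lines from stdin and append each one to
# <dir>/YYYYMMDDHH.txt, using the hour at which the line was read.
dump_by_hour() {
    outdir="${1:-.}"   # target directory; defaults to the current one
    while IFS= read -r line; do
        printf '%s\n' "$line" >> "$outdir/$(date +%Y%m%d%H).txt"
    done
}
```

To use it, pipe the curl output into the function instead of redirecting it, for example `curl -s -u: http://stream.twitter.com/1/statuses/sample.json | dump_by_hour /tmp`, inside the same retry loop.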
Thank you! Nice code, and doesn’t require any special software other than a shell.