Facebook Spider
I’ve written a Perl program to spider Facebook. I was looking for a way to quickly generate statistics about the University of California, Berkeley student population, and I figured that since almost everybody had a Facebook account, I could dump all of Facebook’s information into a database and generate reports from that information. Since this program has proven useful, I’ve decided to release it to the general public.
How It Works
If you’re unfamiliar with the term spider, I recommend that you read the Wikipedia page on web spiders for a thorough discussion of how a spider works. In a nutshell, my program goes to a Facebook user’s profile, scans their friends list for other profiles, then visits each of those profiles and scans its friends list in turn, and so on. Along the way, my program also scans each user’s profile for information, parses it, and inserts it into a SQL database.
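The traversal described above can be sketched as a breadth-first walk over a friend graph. This is a minimal illustration in Python, not the original Perl: the `FRIENDS` dictionary is an invented stand-in for real profiles, `crawl` and `get_friends` are hypothetical names, and whether the actual script visits profiles breadth-first or depth-first is an assumption.

```python
from collections import deque

# Invented stand-in for a social network: profile ID -> friends list.
FRIENDS = {
    1: [2, 3],
    2: [1, 4],
    3: [1, 4],
    4: [2, 3, 5],
    5: [4],
}


def crawl(start_uid, get_friends):
    """Visit every profile reachable from start_uid, breadth-first."""
    seen = {start_uid}          # profiles already queued, so none is visited twice
    queue = deque([start_uid])
    order = []
    while queue:
        uid = queue.popleft()
        order.append(uid)       # a real spider would parse and store the profile here
        for friend in get_friends(uid):
            if friend not in seen:
                seen.add(friend)
                queue.append(friend)
    return order


visited = crawl(1, lambda uid: FRIENDS.get(uid, []))
print(visited)
```

The `seen` set is what keeps the walk from looping forever, since friendships point in both directions.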
Features
I’m only aware of one other Facebook spider: a Perl script written by Michael Kelly. However, Michael’s script only collects information about a user’s friends. My script captures all the information available in a user’s profile (except for the ‘About Me’ field). Furthermore, my script provides the following enhancements:
- Multi-threaded support. Each user’s profile is processed in its own thread. The total number of threads can be set using a command-line parameter, and the program uses semaphores to enforce the maximum number of threads.
- SQL database storage. My script stores user information in a SQL database ordered by Facebook UID. I’ve used relatively simple queries throughout the script, so any SQL database should be supported (e.g., MySQL and PostgreSQL should work). However, I’ve chosen SQLite3 as the default database. If you wish to use another database type, install the appropriate DBD driver and modify the database handle line to use that driver.
- Easy data processing. Since all data is stored in a SQL database, it should be relatively easy to write programs that query the database for information.
- Sleep between threads. It’s possible to provide a value, in seconds, that my script should wait before spawning a new thread. This should prevent the script from overloading the Facebook servers.
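The thread limit and the sleep between threads can be sketched together. The following is a Python illustration of the general technique (a counting semaphore capping concurrent workers), not the script's actual Perl; `process_profiles`, `handle_profile`, `max_threads`, and `sleep_seconds` are hypothetical names, and `fake_profile` only simulates work.

```python
import threading
import time


def process_profiles(uids, handle_profile, max_threads=2, sleep_seconds=0.0):
    """Run handle_profile(uid) in its own thread, at most max_threads at once."""
    slots = threading.Semaphore(max_threads)
    threads = []

    def worker(uid):
        try:
            handle_profile(uid)
        finally:
            slots.release()      # free the slot even if handle_profile raises

    for uid in uids:
        slots.acquire()          # blocks while max_threads workers are running
        thread = threading.Thread(target=worker, args=(uid,))
        thread.start()
        threads.append(thread)
        time.sleep(sleep_seconds)  # pause before spawning the next thread

    for thread in threads:
        thread.join()


# Demonstration: record how many workers ever ran at the same time.
stats = {"running": 0, "peak": 0, "done": 0}
stats_lock = threading.Lock()


def fake_profile(uid):
    with stats_lock:
        stats["running"] += 1
        stats["peak"] = max(stats["peak"], stats["running"])
    time.sleep(0.01)             # simulated download and parse time
    with stats_lock:
        stats["running"] -= 1
        stats["done"] += 1


process_profiles(range(6), fake_profile, max_threads=2)
print("peak concurrent workers:", stats["peak"])
```

Acquiring the semaphore in the spawning loop rather than inside the worker means no more than `max_threads` threads exist at once, instead of many threads sitting blocked.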
Quick Start
Assuming you have all the necessary Perl modules and sqlite3 installed:
- Create a SQLite3 database:
$ sqlite3 database.db 'CREATE TABLE userdata ( uid integer, name, friends, school, status, sex, concentration, residence, hometown, highschool, screenname, mobile, website, lookingfor, interestedin, relationshipstatus, politicalviews, interests, clubsjobs, favoritemusic, favoritemovies, favoritebooks );'
- Create a facebook.conf:
$ cp facebook.conf.sample facebook.conf
$ vim facebook.conf
- Start the script:
$ ./facebook.pl -t 2 -s 10 -f database.db [SOME FACEBOOK UID]
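Once the database is populated, a report is a single query. Here is a minimal sketch using Python's built-in sqlite3 module against the userdata schema from the steps above. It uses an in-memory database and three invented sample rows so that it runs on its own; in practice you would connect to database.db instead.

```python
import sqlite3

# In-memory database for the sketch; use sqlite3.connect("database.db") for real data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE userdata ( uid integer, name, friends, school, status, sex, "
    "concentration, residence, hometown, highschool, screenname, mobile, website, "
    "lookingfor, interestedin, relationshipstatus, politicalviews, interests, "
    "clubsjobs, favoritemusic, favoritemovies, favoritebooks )"
)

# Invented sample rows, for illustration only.
sample_rows = [
    (1, "Example A", "Computer Science"),
    (2, "Example B", "Computer Science"),
    (3, "Example C", "History"),
]
conn.executemany(
    "INSERT INTO userdata (uid, name, concentration) VALUES (?, ?, ?)",
    sample_rows,
)

# Count users per concentration, most common first.
report = conn.execute(
    "SELECT concentration, COUNT(*) FROM userdata "
    "GROUP BY concentration ORDER BY COUNT(*) DESC, concentration"
).fetchall()

for concentration, count in report:
    print(f"{concentration}: {count}")
```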
I Want It!
The script has been removed at Facebook’s request.
Notes
I haven’t tested the script lately, but it should still work. If it doesn’t, post a comment, and I’ll release an update.
Since my script parses the HTML returned from Facebook, if Facebook makes any changes to their profile layouts, I’ll have to make major modifications to the code.
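To show why layout changes are so disruptive, here is a small Python sketch of this style of HTML scraping. The markup is entirely hypothetical (it is not Facebook's real profile HTML), and `FieldParser` and `extract_fields` are illustrative names rather than anything taken from the script.

```python
from html.parser import HTMLParser


class FieldParser(HTMLParser):
    """Collect label/value pairs from rows shaped like
    <tr><td class="label">Sex:</td><td class="value">Male</td></tr>."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._cell = None    # "label" or "value" while inside such a cell
        self._label = None   # most recent label still waiting for its value

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            css_class = dict(attrs).get("class")
            if css_class in ("label", "value"):
                self._cell = css_class

    def handle_endtag(self, tag):
        if tag == "td":
            self._cell = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._cell == "label":
            self._label = text.rstrip(":")
        elif self._cell == "value" and self._label is not None:
            self.fields[self._label] = text
            self._label = None


def extract_fields(html):
    parser = FieldParser()
    parser.feed(html)
    parser.close()
    return parser.fields


# Hypothetical profile markup, before and after a site redesign.
old_page = '<table><tr><td class="label">Hometown:</td><td class="value">Berkeley, CA</td></tr></table>'
new_page = '<div><span class="field-name">Hometown:</span> <span class="field-data">Berkeley, CA</span></div>'

print(extract_fields(old_page))
print(extract_fields(new_page))
```

The second call finds nothing because the parser is tied to the tag and class names of the old layout, which is exactly the kind of change that forces a rewrite of the parsing code.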
Future
I’m in the process of designing an interface to Facebook that resembles Google Maps. Users will be able to interactively visualize their friend network, and clicking a user’s “node” should bring up their Facebook profile in a new window. More details will be forthcoming.
January 26th, 2006 at 11:46 am
code no work. not myspace compatible!!! whyyyyyyyyyyyyyyyyy im gonna go cut myself now…
i cant get myself to write in alternating caps… thats just too gay
January 26th, 2006 at 4:51 pm
OMG this is so illegal it isn’t even funny. Mr. evilcoder sir, you are in big trouble
February 12th, 2006 at 11:50 am
IRC = illegal!!!
http://security.berkeley.edu/tutorial/SecurityTutorialCertificateVertical.pdf
replace use vim with use notepad and cp with copy+paste in windows gui and i dont think they will understand the command line args
March 4th, 2006 at 7:53 pm
ITS BEEN MORE THAN A MONTH SINCE THE LAST BLOG!!!!
*prods you with a hot poker*
March 4th, 2006 at 9:05 pm
i thought cows where the ones that got prodded
*brands moo*
lawl!!!
March 18th, 2006 at 12:16 am
Do you know where I could get a facebook program to auto add friends?
Thanks
Bryan Barton
March 18th, 2006 at 10:57 am
I’m planning on writing a Facebook Perl module that should make writing such a program very easy.
I don’t know of any pre-existing programs for auto-adding friends.
March 25th, 2006 at 12:46 pm
Hi there. I’m working on creating a 3-dimensional flythrough of the
interwoven connections in facebook for a multimedia thesis project.
However, my expertise is with After Effects and Photoshop. I’m pretty
much illiterate when it comes to coding.
Would you be able to help me gather the data on the USC freshman class?
Thanks so much!
Best,
Ari
April 17th, 2006 at 6:54 am
Hi: I lived at Bowles Hall during my last two years at UC. I have tried to follow the disaster this last year, and have a few questions I would like to ask.
If you have the time, could you send me your e-mail address?
Thanks
October 22nd, 2006 at 6:31 pm
“Future
I’m in the process of designing an interface to Facebook that resembles Google Maps. Users will be able to interactively visualize their friend network, and clicking a user’s “node” should bring up their Facebook profile in a new window. More details will be forthcoming. ”
**** this sounds yummy. i want to eat it when you have it ready.
January 8th, 2007 at 11:10 am
Hey would it be possible for you to send me this script? Thanks.
April 24th, 2007 at 1:11 am
I was hoping you could also send me a copy of your facebook spider, I would really appreciate it.
Thanks in advance,
Dame
May 12th, 2007 at 3:46 am
Is it possible to get a version of the script?
July 10th, 2007 at 6:33 pm
hey, i’m trying to write a script that will basically allow me to download my entire user profile onto my computer (photos, including those taken by others, wall posts, etc.) however, I’m having trouble figuring out how to interface with facebook. i’m trying to do this in java so that i can integrate it into another project of mine. any tips or ideas of where to look for help would be greatly appreciated. thanks in advance!
September 16th, 2007 at 9:34 am
If you still read these, I would also like a copy of the script.
Thanks!
September 16th, 2007 at 6:54 pm
Hey.. would it be possible for you to send me this script? Thanks
October 9th, 2007 at 1:16 pm
Unfortunately there is broken link to facebook.pl script. Please send it to me or publish anywhere. Thx
deveba A gmail
November 8th, 2007 at 4:04 pm
Hey Stephen
I am completely non -techie and from the UK and I was wondering if you could create a spider that would crawl my friends list on facebook and their friends list and copy their email address into an excel spreadsheet?
Id pay you to make such a program
Cheers
Shuster
November 8th, 2007 at 4:59 pm
Steven,
It looks like you had to remove the app. Are you still able to share it? It is something I would really like to use. It is what I have been trying to write but I am running into problems logging into facebook.
I would like to use this tool to run on other social networks.
Thanks
andy
November 10th, 2007 at 3:10 pm
Has anyone been able to get this app? Or anyone like this. I would really like to see a copy.
Andy