An Updated Guide to Backing Up or Exporting Your Facebook

Last update: March 15 at 7:11 p.m. EST

At the beginning of this year, I posted a series of FQL queries that would enable Facebook users to backup most of their account information. While a few services offering Facebook backups exist (see Part 6 below), I noted that none of them were anywhere near as comprehensive as using FQL when it came to messages and metadata. Over time, I added more queries and tricks to provide a more complete archive. Eventually, the only major block of information missing from the method was the e-mail addresses of friends.

I discovered that fellow blogger Tutkiun had found a way to exploit a Microsoft import tool to harvest friends’ addresses, but it was more obtrusive than I desired, since it required sending a Microsoft invitation message to all of those addresses during the import. At the time, though, the Tutkiun’s trick was the only legitimate way of exporting e-mail addresses.

In the last few weeks, though, Yahoo! has added an import tool to their e-mail product, and this now allows you to export your friends’ addresses quite easily and without sending any e-mails. Given this new feature and the slightly messy state of my last post, I decided to gather all of the current tricks for Facebook export and re-post them in this updated guide. Note that this guide can be rather technical, and most of the data provided will be in special formats that you likely won’t be able to browse and edit easily. If you’re looking for a simpler but less exhaustive approach, check out the services I discuss in Part 6 of this post.

I’ll note, though, that while this guide now gives you a very thorough export of their accounts, you may be disappointed to discover you have hardly anywhere to then import the information. Personally, I consider this method better suited for archiving your Facebook information than anything else. I think the best solution for “taking your social graph with you” will involve data portability which integrates with the Facebook API, but that will have to wait for another post.

I’ve divided my guide into seven parts. The first part can be accomplished using only the Facebook API Test Console. The second part involves a little trick for gathering phone numbers. The third section requires some technical know-how, as you need to administer a Facebook application. The fourth part describes using Yahoo! to get e-mail addresses. The fifth part provides option FQL queries for those wishing to archive even more of their information. The sixth section discusses solutions for backing up photos and videos, since the FQL queries only provide metadata and not the actual files. Finally, the seventh part adds some notes, tips, and limitations about the overall process. The second part yields data in JSON format, while the first, third, and fifth parts can either load JSON (my preference due to smaller file size) or XML. With the fourth section, Yahoo! can export to CSV format.

Update 1: Added optional queries 7-11, info on Fotobounce

Part 1: Standard FQL Queries

All of these queries can be executed using the Facebook API Test Console. Set the respone format to XML or JSON (I recommend JSON), select “fql.query” for the method, then fill in the query box. In each query, replace 00000000 with your Facebook ID number. (One way of finding this number is to visit your Facebook profile, right-click the link to “See All” of your friends, and copy the address. In the address, the number that follows “id=” is your Facebook ID.)

  1. Profile information of you and your friends: SELECT uid, first_name, last_name, name, pic_big, affiliations, religion, birthday, birthday_date, sex, hometown_location, political, current_location, activities, interests, music, tv, movies, books, quotes, about_me, hs_info, education_history, work_history, profile_url, profile_blurb, family, username, website FROM user WHERE uid = 00000000 OR uid IN (SELECT uid2 FROM friend WHERE uid1 = 00000000)
  2. Your friend lists: SELECT flid, name FROM friendlist WHERE owner = 00000000
  3. The members of your friend lists: SELECT flid, uid FROM friendlist WHERE flid IN (SELECT flid FROM friendlist WHERE owner = 00000000)
  4. Pages you’re a fan of: SELECT page_id, name, pic_big, website, type FROM page WHERE page_id IN (SELECT page_id FROM page_fan WHERE uid = 00000000)
  5. Links you have posted: SELECT link_id, owner_comment, created_time, title, summary, url, image_urls FROM link WHERE owner = 00000000
  6. Events you have attended: SELECT eid, name, tagline, pic_big, host, description, event_type, event_subtype, start_time, end_time, creator, location, venue FROM event WHERE eid IN (SELECT eid FROM event_member WHERE uid = 00000000 AND rsvp_status = “attending”)
  7. Your notes: SELECT note_id, title, created_time, content FROM note WHERE uid = 00000000
  8. Comments on your notes: SELECT object_id, post_id, fromid, time, text FROM comment WHERE object_id IN (SELECT note_id FROM note WHERE uid = 00000000)
  9. Your photo albums: SELECT aid, cover_pid, name, created, modified, description, location, size, link, visible, modified_major, type, object_id FROM album WHERE owner = 00000000
  10. Comments on your photo albums: SELECT object_id, post_id, fromid, time, text FROM comment WHERE object_id IN (SELECT object_id FROM album WHERE owner = 00000000)
  11. Your photos’ metadata: SELECT pid, aid, src_big, src_big_height, src_big_width, link, caption, created, modified, object_id FROM photo WHERE aid IN (SELECT aid FROM album WHERE owner = 00000000)
  12. Comments on your photos: SELECT object_id, post_id, fromid, time, text FROM comment WHERE object_id IN (SELECT object_id FROM photo WHERE aid IN (SELECT aid FROM album WHERE owner = 00000000))
  13. People tagged in your photos: SELECT pid, subject, text, xcoord, ycoord, created FROM photo_tag WHERE pid IN (SELECT pid FROM photo WHERE aid IN (SELECT aid FROM album WHERE owner = 00000000))
  14. Your videos’ metadata: SELECT vid, title, description, thumbnail_link, embed_html, updated_time, created_time FROM video WHERE owner = 00000000
  15. Comments on your videos: SELECT object_id, post_id, fromid, time, text FROM comment WHERE object_id IN (SELECT vid FROM video WHERE owner = 00000000)
  16. People tagged in your videos: SELECT vid, subject FROM video_tag WHERE vid IN (SELECT vid FROM video WHERE owner = 00000000)
  17. Groups you’re a member of: SELECT gid, name, nid, pic_big, description, group_type, group_subtype, recent_news, creator, update_time, office, website, venue, privacy FROM group WHERE gid IN (SELECT gid FROM group_member WHERE uid = 00000000)

Part 2: Phone Numbers

Visit this URI: http://www.facebook.com/friends/ajax/superfriends_phonebook.php?__a=1 The “payload” parameter lists any phone numbers of Facebook friends you can access, sorted by the Facebook ID number of each friend.

Part 3: Extended FQL Queries

These queries can only be executed by an application with certain extended permissions, and the Test Console does not have any. To perform these queries on your own, you need to have a Facebook application you administer. You can then enable the needed permissions using the following two URIs, replacing “ffffffffffffffffffffffffffffffff” with the application’s API key:

  • http://www.facebook.com/authorize.php?api_key=ffffffffffffffffffffffffffffffff&v=1.0&ext_perm=read_mailbox
  • http://www.facebook.com/authorize.php?api_key=ffffffffffffffffffffffffffffffff&v=1.0&ext_perm=read_stream

Once these permissions are enabled, you can again use the API Test Console by setting the application field accordingly.

  1. Threads in your inbox (requires “read_mailbox” permissions): SELECT thread_id, folder_id, subject, recipients, updated_time, parent_message_id, parent_thread_id, message_count, snippet, snippet_author, object_id FROM thread WHERE folder_id = 0
  2. Threads in your outbox (requires “read_mailbox” permissions): SELECT thread_id, folder_id, subject, recipients, updated_time, parent_message_id, parent_thread_id, message_count, snippet, snippet_author, object_id FROM thread WHERE folder_id = 1
  3. Messages in your inbox (requires “read_mailbox” permissions): SELECT message_id, thread_id, author_id, body, created_time, attachment FROM message WHERE thread_id IN (SELECT thread_id FROM thread WHERE folder_id = 0)
  4. Messages in your outbox (requires “read_mailbox” permissions): SELECT message_id, thread_id, author_id, body, created_time, attachment FROM message WHERE thread_id IN (SELECT thread_id FROM thread WHERE folder_id = 0)
  5. Your wall posts (requires “read_stream” permissions): SELECT post_id, app_id, source_id, updated_time, created_time, attribution, actor_id, target_id, message, app_data, action_links, attachment, comments, likes, privacy, permalink, tagged_ids, is_hidden FROM stream WHERE source_id = 00000000
  6. Comments on your wall posts (requires “read_stream” permissions): SELECT post_id, fromid, time, text FROM comment WHERE post_id IN (SELECT post_id FROM stream WHERE source_id = 00000000)

Part 4: E-mail Addresses

This part requires you to have a Yahoo! Mail account. If you don’t already have one, you can create one for free. In fact, I’d advise creating a new account to avoid your Facebook friends’ e-mail addresses getting mixed up with any others already in your address book.

  1. To add your friends’ e-mail addresses to your Yahoo! Address Book, follow the steps given on this page at the Yahoo! Mail blog. Essentially, you simply click “Import Contacts,” choose “Facebook,” and follow the steps. You will have to authorize a Facebook application built by Yahoo! for this purpose.
  2. To save a local copy of these addresses, you can use the export tools in Yahoo! Address Book. This help page from Yahoo! provides information on ways to transfer contacts from Yahoo! to various other programs. These specific steps are given for saving your entire address book as a CSV file:
  1. Open your Yahoo! Address Book and click “Import/Export” in the upper-right corner.
  2. In the Export section, click “Export Now” next to the phrase “Microsoft Outlook.”
  3. A dialog window opens and gives you the option to save your Yahoo! Address Book to disk as a .csv file. Click “Save to Disk”.
  4. Click “OK”.

Part 5: Optional FQL Queries

These FQL queries can be in the same way as those from Part 1, since they do not require any extended permissions. Personally, I think they go somewhat beyond the scope of what I’d consider an account backup, but people have shown an interest in exporting more information and none of these queries provide content not already accessible to you via the Facebook site. By the way, I’m not aware of a way to access who was tagged in notes and thus save the notes you’re tagged in.

  1. Photos you’re tagged in: SELECT pid, aid, src_big, src_big_height, src_big_width, link, caption, created, modified, object_id FROM photo WHERE pid IN (SELECT pid FROM photo_tag WHERE subject = 00000000)
  2. Comments on photos you’re tagged in: SELECT object_id, post_id, fromid, time, text FROM comment WHERE object_id IN (SELECT object_id FROM photo WHERE pid IN (SELECT pid FROM photo_tag WHERE subject = 00000000))
  3. People tagged in photos you’re tagged in: SELECT pid, subject, text, xcoord, ycoord, created FROM photo_tag WHERE pid IN (SELECT pid FROM photo WHERE pid IN (SELECT pid FROM photo_tag WHERE subject = 00000000))
  4. Videos you’re tagged in: SELECT vid, title, description, thumbnail_link, embed_html, updated_time, created_time FROM video WHERE vid IN (SELECT vid FROM video_tag WHERE subject = 00000000)
  5. Comments on videos you’re tagged in: SELECT object_id, post_id, fromid, time, text FROM comment WHERE object_id IN (SELECT vid FROM video WHERE vid IN (SELECT vid FROM video_tag WHERE subject = 00000000))
  6. People tagged in your videos you’re tagged in: SELECT vid, subject FROM video_tag WHERE vid IN (SELECT vid FROM video_tag WHERE subject = 00000000)
  7. People who liked your notes: SELECT object_id, user_id FROM like WHERE object_id IN (SELECT note_id FROM note WHERE uid = 00000000)
  8. People who liked your photos: SELECT object_id, user_id FROM like WHERE object_id IN (SELECT object_id FROM photo WHERE aid IN (SELECT aid FROM album WHERE owner = 00000000))
  9. People who liked your videos: SELECT object_id, user_id FROM like WHERE object_id IN(SELECT vid FROM video WHERE owner = 00000000)
  10. People who liked photos you’re tagged in: SELECT object_id, user_id FROM like WHERE object_id IN (SELECT object_id FROM photo WHERE pid IN (SELECT pid FROM photo_tag WHERE subject = 00000000))
  11. People who liked videos you’re tagged in: SELECT object_id, user_id FROM like WHERE object_id IN (SELECT vid FROM video WHERE vid IN (SELECT vid FROM video_tag WHERE subject = 00000000))

Part 6: Photos and Videos

The FQL queries in previous sections relating to photos and videos save metadata (e.g. caption, size, date uploaded, etc.) but not the actual files themselves. For photos, one field (src_big) provides a direct URI for downloading the file, and programmers so inclined could build scripts that process the JSON data and download all of the referenced files.

Several other options already exist, though, for downloading photos. Currently I know of these:

  1. Backupify. This service provides online backups of many online sites besides just Facebook. Their basic plan is free and provides backups for one account per site (e.g. one Facebook account, one Twitter account, etc.). If you find my backup method too technical, Backupify’s Facebook archives include an XML file that contains a list of your friends (with birthdays and locations), your notes, your status updates, your links, and events you’ve attended. Backupify also saves a copy of all your photos and all photos you’re tagged in. With the free plan, your photos are stored online and you can download them individually, but you have to pay for a plan that offers ZIP archives for downloading multiple files at once (currently $39.95/year, though free on sale through March 21).
  2. SocialSafe. This is an application you install on your computer which uses the Facebook API to create local backups. If you find my backup method too technical, SocialSafe offers the advantage of an interface for easily browsing your backups and maintaining different versions over time, though at a small fee ($2.99 at time of publication). Currently, SocialSafe only backs up your profile information, a list of your friends, your wall posts, and your photos, but the developers have stated they plan on adding more data, and upgrades are free after your purchase. Note that SocialSafe only saves photos you have uploaded; their site says they removed the ability to download tagged photos for legal reasons. I don’t know the details of the situation, but it seems they wanted to avoid liability in case you are tagged in copyrighted photos.
  3. FaceDown. This is a free program you can download which mimics a web browser logging into Facebook and lets you save all of the photos linked from a given Facebook page. I’ll quickly add that this program violates a fundamental security practice in that it asks for your Facebook username and password directly. Normally I would say you should never provide such credentials to a non-Facebook service or program, but in this particular case, your options are somewhat limited and the program functions as a web browser (i.e. it does not store your credentials or forward them to a third party). Also, I’m not entirely sure if this program remains consistent with the Facebook terms of service, which forbids “automated means” of collecting information from other users. I know Facebook monitors such activity, so while it’s unlikely if you’re careful, using this program could risk your account being disabled by Facebook.
  4. ArchiveFacebook. This is a Firefox add-on (it requires the Mozilla Firefox web browser) which seeks to archive a variety of information from your Facebook account, including photos. However, I’m quite certain this add-on violates the Facebook terms of service, which forbids “automated means” of collecting information from users, and thus I cannot recommend it. Using this add-on will risk your account being disabled.
  5. Fotobounce. This is an application you install on your computer (Mac or Windows) that syncs local photo albums with those on Facebook, and even maintains tags in photos. It appears that if your Facebook were somehow deleted, this program would be the easiest for restoring photo albums. From what I understand, Fotobounce also offers export capabilities similar to other programs listed above.

Obviously, no option is perfect, but of the four, I would personally say that Backupify is the one I find most appealing – except that I’m intrigued by Fotobounce, which I only recently discovered. I should note that I have personally used only Backupify and SocialSafe.

For videos, I know of no simple way to download multiple files. However, I have created a JavaScript bookmarklet which loads the source MP4 file of a given Facebook video. To use it, bookmark this link in your web browser and load the bookmark when viewing a page for a Facebook video. The bookmarklet automatically loads a high quality version if it’s available. The source code for the bookmarklet is as follows:

javascript:(function(){x=document.getElementsByTagName('div');for(y=0;y<x.length;y++){if(x[y].className.indexOf('mvp_player')==0){z='swf_'+x[y].id;}}eval('x='+z+'.getVariable("highqual_src");');if(!x){eval('x='+z+'.getVariable("video_src");');}x=x.replace(/[%]3A/,':');x=x.replace(/[%]2F/g,'/');location=x;})()

Alternatively, I discovered that others have built a Firefox add-on (requires the Mozilla Firefox web browser) called Facebook Video which lets you download Facebook videos. I have not tested this add-on or investigated it much, but it certainly appears to be legitimate.

Part 7: Additional Notes

  • With each FQL query, I have selected what I thought would be the most useful fields and what would yield reasonable response sizes. While I tried to make each request complete enough to represent a backup of the relevant information, you may disagree with my choices.
  • Some of the FQL queries may generate very large responses, so wait a few minutes if the results do not appear right away. In some cases, you may need to limit requests (i.e. adding “LIMIT 1,500″ for the first 500 results, “LIMIT 501,1000″ for the second 500, etc.). I tested the first query in Part 1 against an account with about 700 friends and did eventually receive a response using the Test Console, but it was about 1.2MB of data.
  • In my testing, the responses to these queries were complete, with one apparent exception. The query for comments on wall posts seemed to include only recent comments. I haven’t investigated the issue much yet but will post any updates if I find more information on the issue.
  • Note that the optional queries in Part 5 will generate data that overlaps significantly with information from Part 1. For example, any photo that you are tagged in which you also uploaded will appear in queries from both sections.
  • I have tested all of these techniques and they worked for me as described, but I can make no guarantee that they will work for you or continue to work in the future. I cannot guarantee your use of these methods will not result in any loss of information or exposure of personal information. Also, while I believe all of these methods follow any applicable terms of service at the time of publication (unless otherwise noted), I cannot guarantee that they do nor that they will be permitted in the future. You should not use these methods to download any content that you are not licensed to download. All of these instructions are provided simply as a convenience, and you follow them at your own risk.
  • Please feel free to send questions or feedback to me (my e-mail is theharmonyguy@gmail.com), but realize I may not be able to provide technical support if you’ve never used FQL or worked with JSON data. This is not a general solution for non-technical users by any stretch, but it at least provides an option for backing up Facebook information that’s more complete than methods I’ve seen elsewhere.

Facebook Adds Code for Clickjacking Prevention

Over the last several months, many Facebook users have fallen prey to clickjacking “worms.” Lured by tempting links on a friend’s wall, victims would click through to a page that seemed to promise interesting photos or other info. But the page instead contained an invisible inline frame that loaded Facebook’s share page. When a user clicked for their prize, they instead posted the attack page to their wall as well. In at least one case, the attack page also tried to install malware.

In each case, Facebook responded fairly quickly, and one benefit to the site’s centralized nature is that administrators can purge known links to clickjacking attacks from walls across the system. Still, by the time such problems become known, myriad users may have already been compromised. Posting shared links may not cause much damage, but this blog has outlined before how much is truly possible with clickjacking. It’s long been possible for attackers to use clickjacking for installing applications, thus harvesting user data before Facebook cuts off viral channels. As with many security risks, though, there seems to be a lag time between discovery of potential and actual exploitation. Even the basic clickjacking attacks of late have been possible for quite a long time before they first surfaced.

Given the many threats posed by clickjacking, I’ve been surprised that Facebook has never seemed to show an interest in implementing code aimed at blocking an attack. After similar “worms” appeared on Twitter, the microblogging site added framebusting JavaScript to reduce the risk. Completely avoiding clickjacking is difficult (if not impossible apart from added browser protections), but such measures certainly make it much more difficult.

But quietly, Facebook has fortified their code. I’m not sure how long the new protection has been in place – I’ve not seen it reported anywhere, and only noticed it this week. The only mention I saw of it on Twitter came only a few days previous. In any event, Facebook deserves praise for the change, and I personally find their current solution rather clever.

On high-risk pages (possibly every page, but I’ve only checked high-risk ones, such as for link sharing and application authorization), a block of code checks whether the page is “top” – that is, whether or not it’s inside of a frame. If the page finds itself “framed,” an image is loaded that notifies Facebook, and a div element is loaded on top of the page. The div is set to cover every element in the page, and adds a dark filter if visible. Finally, the div has an onclick event set which loads the Facebook page outside of the frame. Thus if someone clicked a link hiding an invisible Facebook iframe, they would only click the div and see the page reloaded in the full window.

An immediate weakness is that this requires JavaScript to work, but you’ll find it’s rather hard to use Facebook without JavaScript enabled. That may be disappointing from a usability perspective, but it’s certainly a plus in this context. In particular, authorizing an application appears to require JavaScript.

I’m very glad to see Facebook add an innovative way of protecting their users from clickjacking attacks. This change adds a layer of difficulty to several Facebook attacks I’ve described in the past. Granted, there are still many ways that applications can be exploited, but this new code may remove at least one attack vector.

Facebook StumbleUpon Digg Twitter Instapaper FriendFeed Delicious Google Bookmarks Share/Bookmark

Introducing Social Hacking’s New Look—and Myself

I’m happy to make several announcements today. First, I’ve long felt this blog had a rather staid design that needed upgrading. Over the last several weeks, I’ve worked on putting together the new look you now see at theharmonyguy.com. I went ahead and brought the theme live, but I still plan on making further adjustments to the code, so I’d ask for patience as the site developers. Thanks to Elegant Themes for providing the basis of the new design. I have some ideas for further updates to the content of this site to match the theme change, but those will have to wait until later.

Second, I’d like to introduce myself. I’m known to many online as “theharmonyguy,” a screen name that goes back many years for me. Using it as my moniker for writing about security research was a split-second decision when TechCrunch covered my first major “hack” in 2007. Part of my decision came from wanting to keep my hacking endeavors separate from other development projects I had in mind back then. More recently, though, security research has become more than a small hobby, and I think it’s time to shed the anonymity. While I’ll continue to use “theharmonyguy” as an online identity, my real name is Joey Tyson. I graduated from Wake Forest University last year with a masters degree in mathematics, but I’ve spent several years working in IT consulting and web development prior to my career as a hacker.

And that brings me to my third announcement. I’ve officially joined the team at Gemini Security Solutions in Chantilly, Virginia, and look forward to starting work with them in March. A big shout-out to the Liquidmatrix Security Digest for the job posting that led me to Gemini. I’m excited about serving Gemini as they provide quality information security consulting to other companies. Also, I’ve been graciously allowed to continue this blog and my personal Twitter feed with the caveat that they don’t interfere with my work duties. Please note, however, that everything I post here is my own perspective and does not in any way reflect on my employer.

Over the next few weeks I’ll be moving to a new state, adjusting to a new area, and getting settled in a new job, so I may not be posting as frequently during the transition. But I still plan on maintaining (and perhaps expanding) both this blog and my Twitter feed for the near future. Thank you so much to all my readers for your help and support!

Facebook StumbleUpon Digg Twitter Instapaper FriendFeed Delicious Google Bookmarks Share/Bookmark

Facebook SPAM on BlackBerry Devices

I always thought the Facebook Application for BlackBerry was a buggy, slow piece of junk.  Now I have noticed that this application is being abused by spammers to propagate Viagra and Percocet SPAM.  The screen shot to the right is an actual Facebook notification I received on my BlackBerry.

There seems to be an interesting bug in the Facebook Application for BlackBerry in which a spammer can spoof the “facebookmail.com” domain to have SPAM messages show up in your notifications list within the BlackBerry Facebook application.  This only works if you have the Facebook for BlackBerry Application installed AND you have an email account configured on your BlackBerry (yes, this includes a corporate email account as well).  The email account you have configured on your BlackBerry is where you actually receive the SPAM message, not through Facebook.

The Facebook Application for BlackBerry appears to notify on any new email in one of your BlackBerry mailbox’s with “*.facebookmail.com” in the sender or return-path field.  This is a win for the spammer because now you think Facebook is spamming you and with the addition of an email, you’re more tempted to click on the link.  The Facebook Application for BlackBerry is no stranger to controversy and this particular bug has been noticed recently by others as well.  It also appears that this bug only affects the BlackBerry Facebook application.  When testing the iPhone app I couldn’t replicate the issue.

To test this bug I used EXIM4 in Ubuntu as a mail relay with mailtools to send the email.  This allowed me to send a spoofed email as “agent0x0@facebookmail.com” to one of the email accounts I have configured on my BlackBerry.  Here are screen shots of the spoofed email in my inbox and what it looks like in the Facebook Application for BlackBerry:

My opinion is that a mobile Facebook application should never be polling your personal email for these messages…but then again this could be a “feature” of this nicely designed application, right? :-)   Special thanks to Kevin Johnson for helping with some of the research/testing.

Share and Enjoy


FacebookTwitterDeliciousDiggStumbleUponAdd to favoritesEmailRSS


Using Google Buzz Can Expose Your Gmail Address

I’ve discovered another trick that may surprise some, this time relating to Google’s services. I don’t view the issue as a vulnerability, but it likely goes against user privacy expectations. In short, having a public Google profile (which you might have created when checking out Google Buzz) can allow others to figure out your Gmail address.

This really shouldn’t be that surprising, given that your username is generally consistent across Google services, and a public profile is public. But those who currently have numeric profile addresses (e.g. http://www.google.com/profiles/104424237445852766735) might think their profile is not easily tied to their username.

But by using Picasa, Google’s photo sharing service, it’s often quite simple to go from a numeric profile address to an actual username. To protect yourself from this access, visit the Picasa settings page.  Under “Your gallery URL,” add a new username and select the new username for your gallery URL. Also, you may want to edit your nickname.

In my testing thus far, it matters little whether you’ve used Picasa before – if you have a Gmail account, Picasa is also enabled on your account. And while individual Picasa albums have privacy controls, I have not found a way to block simply loading your Picasa home page.

With the introduction of Buzz, Google is encouraging users to take advantage of Google profiles. But in the process, Google is tying together services that many users may have treated quite distinctly in the past. If you want your Gmail address to remain private, you need to manage properly the other Google services you use to avoid one of them exposing your Gmail username.

Update (Feb. 13): It appears Google has adjusted their services to prevent the original URI trick from working. Previously, adding a profile number to picasaweb.google.com (e.g.  http://picasaweb.google.com/104424237445852766735) would either load a page with the username visible, the username embedded in the page’s source code (_user.name in JavaScript), or an error page in a few particular instances. One configuration that would simply produce an error page was if you had Picasa setup under a different username than your Gmail username, hence my advice. It now seems that using a numeric Picasa URI will either load an error page if the user does have Picasa setup or a page indicating the user does not have Picasa galleries but with no username anywhere in the page.

I’ve already done some preliminary testing to see if Google Reader could also be used to discover usernames, but so far that does not seem possible. Still, it’s wise to be cautious when using a tool that interacts with so many other services.

Facebook StumbleUpon Digg Twitter Instapaper FriendFeed Delicious Google Bookmarks Share/Bookmark

Facebook’s Fluid Definition of Publicly Available Information

In yet another example of security through obscurity, Facebook modified their platform last July to prevent applications from accessing public photo albums for users that were not friends of the logged-in user. Facebook had previously said such applications did not violate the site’s privacy policy, since the behavior followed photo album privacy settings – applications could only load albums marked as visible to “Everyone.”

But “Everyone” is the default privacy setting for photo albums, and many users probably don’t mean for everyone to see their photos. As a CNET report noted:

A Facebook spokesperson said the company made the change so the technology more closely matched users’ privacy expectations.

“We made this change in order to ensure that users who have their profiles set to a privacy other than ‘everyone’ are not surprised by photos being exposed through the API,” Facebook engineer Matt Trainer wrote in response to complaints on the developer forum site.

In other words, Facebook introduced inconsistent application of privacy settings (are the albums available to everyone or not?) so that users would continue to believe a false representation of who could access their content.

Fast forward to 2010, as Facebook users grapple with revamped privacy controls, new default settings, and the general introduction of “publicly available information,” or PAI. With the announcement of PAI, Facebook removed users’ ability to control access for certain bits of information. Among the data now included in the PAI category: the list of your Facebook friends.

That particular change riled many critics, and Facebook eventually backpedaled a bit, allowing users to remove friends lists from their profiles. But the company made quite clear that your list of friends was still considered publicly available information. With this behavior, Facebook setup a strange distinction between permission and visibility. Everyone was technically allowed to see your friends list, but had no means to do so if you removed it from your profile.

Of course, it wasn’t long before someone discovered a “means to do so.” In December, I posted a simple trick that would reveal the names and profile pages of any user’s friends, regardless of whether they blocked such a list on their profile. I try to follow principles of responsible disclosure with security vulnerabilities, but in this case, my “hack” in no way violated or worked around Facebook’s stated privacy policy, since friends lists were now public.

But the other day, I tried using my trick once more, and noticed that it no longer worked for users who chose to hide their friends lists. I’ve also found that issuing an FQL query for the friends list of a user beside the currently logged-in user fails – I don’t recall precisely the behavior of such a command back in December.

Oddly enough, Facebook has yet to block my trick for viewing a user’s public photo albums, which avoids last July’s changes as it does not involve the Facebook API.

It seems Facebook wants to have their cake and eat it too – give users the impression they still maintain control over their data, but still classify the data as public if circumstances warrant. Personally, I think it better for the company to treat “public” information consistently so that any user surprises come now and not later when people discover other means of accessing content.

By the way, a simple adaptation of my photos trick lets you discover a user’s full name based on their profile ID (which, by the way, is included in the filename of every photo you post – and that filename may be maintained if you upload the photo to sites such as Twitter), regardless of their profile privacy. (Some users restrict access to their profile, so trying to load it directly or request their name via the Facebook API Test Console would fail.) Is this new trick a violation of user privacy or a demonstration of “publicly available information?”

Facebook StumbleUpon Digg Twitter Instapaper FriendFeed Delicious Google Bookmarks Share/Bookmark

Cross-Site Scripting Pop Quiz

You have ten seconds to spot the problem in the image below. Ready? Go!

Example of ESPN's "Report a Bug" page

I hope you spotted the problem right away, as it’s a classic example of a cross-site scripting hole. The page mentions that the report will reference a particular URI, and that address also appears as a parameter in the page’s URI. As you might guess, the parameter is not being filtered, allowing one to insert any HTML code.

I found it rather ironic that I came across this problem as I was looking for a means to contact ESPN about two other XSS holes. All three issues were reported to ESPN back in late November, then reported again via different means earlier this month. After receiving no response to either report, I decided to go ahead and release this hole publicly.

By the way, I realize some of my posts about XSS issues aren’t directly related to social networking sites and thus diverge from the usual fare on this blog. However, I think they can serve as important lessons for all developers, including those building social networking applications. This sort of vulnerability is exactly the type that leads to FAXX hacks in Facebook applications. And perhaps it will serve as some comfort to smaller developers that even large sites are susceptible to such problems. Anyway, I also think it’s important to record these finds for future reference, and this blog is about the only place I have to do so.

Facebook StumbleUpon Digg Twitter Instapaper FriendFeed Delicious Google Bookmarks Share/Bookmark

In Defense of Walled Gardens

It’s easy to assume that when it comes to data and software development, “open” is always better than “closed.” We’ve seen an explosion of open source software, praised companies for supporting open standards, and breathlessly tracked products with “open” in their name, from OpenID to OpenSocial. “Closed” has become the scarlet letter of the Internet, at times expressed by the censure of being branded a “walled garden.”

Facebook has often faced this criticism, particularly after unveiling the Facebook Platform in 2007. Several bloggers compared Facebook unfavorably to AOL of yesteryear, eschewing Facebook’s “proprietary” (gasp!) FBML and FQL interfaces. Some even portrayed Facebook as a competitor to the Web itself. While the definition of “walled garden” was not always particularly clear, observers were unhappy with so much data flowing into Facebook and so little flowing out.

One would think that now, with the Facebook API able to expose your wall, News Feed, inbox, and just about every bit of profile data (even e-mail addresses to some degree) Facebook would be allowed in the open club. Indeed, some writers have noted changes since 2007 that justify dropping the dreaded horticultural moniker. But others continue to speak worriedly of Facebook’s dominance, even still drawing comparisons to AOL.

I, for one, not only have full confidence in the Web outlasting any supposed competition but also see Facebook as very much a part of that resilient network. In fact, I’d like to propose a bit of Internet heresy by according walled gardens a place among the open fields of the online realm.

First, let’s establish one fact: Facebook has always been part of, not opposed to, the Web. Disregard ridiculous arguments over FBML and FQL, which were no more of a threat to HTML and SQL than Smarty and WordPress template functions. Open any Facebook application, choose to “view source,” and all you’ll see is good ol’ HTML, CSS, and JavaScript. The Facebook Platform allowed developers to build on top of Facebook, just as Movable Type and Joomla allowed developers to write plug-ins using Perl and PHP. (Differences: you could not roll your own Facebook, Facebook essentially installed every plug-in, and you have to host the code.) That Facebook disallowed certain HTML security risks and added a few convenient tags for interfacing with their content in one approach to development (one could always write full-blown HTML using canvas iframes instead of FBML) hardly meant they were reinventing Web standards.

Technical considerations aside, some writers argued that Facebook opposed the Web in spirit – more specifically, the spirit of openness. Even though Facebook applications (and inverting Facebook’s criticized original setup, other web sites via Facebook Connect) have wide access to Facebook data now, average users still face hurdles if they wish to view posts and information from other Facebook users. At minimum, one has to create an account and login to Facebook to see beyond bare basics. Prior to recent privacy changes, access to content from non-friends was highly limited even after logging in. And while some of Facebook’s data should start appearing in search engines this year, very little has been indexed by Google so far. As I said before, users generate much content within the context of Facebook, but that context usually remains locked away from public access.

Before I respond directly to such charges, I’d note that Facebook (or any social networking utility) serves a limited purpose. Did you catch that? Facebook serves a limited purpose. Facebook was never meant to duplicate the Internet. If I need reference material on world history, I might turn to Google or (as a starting point) Wikipedia. If I want to know the latest technology news, I can bring up Techmeme. If I want to catch up on a favorite TV show, I’ll probably load Hulu. None of these tasks have any inherent social component that would cause me to first open Facebook when fulfilling them.

Last summer, however, I wrote a series of articles on particular doctrinal issues that affected certain people in churches and organizations I’ve been a part of. I’m not ashamed of my opinions, but they involve points that would not make sense to someone who did not have the background and context of the limited audience I had in mind when writing. Consequently, I would prefer such musings did not appear in the Google search results of an acquaintance unfamiliar with my topics. I shared my thoughts with certain friends via Facebook Notes.

I have friends who live several states away that wish to keep me and others posted on life in their growing family. They want to share what adventures their children are having with extended family across the country. Friends desire to see how they’ve decorated their new home and exchange tips on managing it. Rather than open themselves to potential hazards of their house and kids being featured in an image search, my friends can use Facebook’s photo albums to control who can observe their daily life.

These are but two use cases out of a hundred or more that (1) inherently involve a user’s social graph and (2) inherently involve content not intended for public consumption. To argue that Web-based services other than Facebook could provide similar functionality in an open context misses the point. Yes, I could have published my articles with Blogger, my friends could post their photos on Flickr. But these particular examples are not simply about sharing ideas and pictures – they involve sharing ideas and pictures with certain people.

Are you dissatisfied with Facebook limiting access to content? That’s where the rest of the Web (again, Facebook is one part of the Web) comes in handy. If you’re a blogger who wants the world to hear your thoughts, forget Facebook and start a blog. If you’re a photographer who wants to advertise your portfolio, forget Facebook and use a more open service. If you’re looking to interact with a small subset of the world, however, a walled garden may be just the thing for you.

Of course, these days Facebook’s leadership may cringe at my last paragraph, as they seem to be taking a new angle on their service’s purpose. But a few years ago, privacy and control (in essence, the very things that made it a “walled garden”) are what distinguished Facebook from competitors. Personally, I have trouble buying Mark Zuckerberg’s story that “if he were to create Facebook again today, user information would by default be public” (ReadWriteWeb), as such a site really wouldn’t have been much on an innovation. Recall that prior to Facebook’s rise, MySpace dominated social networking sites, and many (if not most) MySpace profiles were publicly accessible. Limitations are what made Facebook novel – originally, only college students were even allowed to create profiles on the site, and all profiles followed a strict layout.

In my experience, friends flocked to Facebook because it let them participate in new technologies (sharing digital photos, for instance) but in a controlled environment where they could enjoy a level of privacy. The garden walls were selling points – college students didn’t want just anyone seeing their photos and messages; later on, parents didn’t want their teenagers communicating with just anyone. I think that’s partly why Facebook’s recent moves encouraging users to share more openly generated such controversy. Users felt they had fallen victim to a “bait and switch” scheme: they invited their friends to use Facebook so they could share privately, now suddenly Facebook has forced them to share certain information and is pushing them to share the rest.

It’s also worth noting that in its early days, Facebook offered little functionality that couldn’t be found elsewhere. The notion of a profile, the ability to send private messages, the exchange of ideas centered around certain topics – all of these features were hallmarks of forum sites for years. But while the members of a forum formed a particular social graph centered around a certain niche community, Zuckerberg portrayed a user’s social graph on Facebook as a mirror of their everyday, real-life connections. Facebook (and other sites, such as MySpace) took the features of forums and adapted them for a general purpose audience, where you essentially chose the members of your forum based on who you already communicated with offline.

This brings us back to Facebook’s limited purpose and why some of the hand-wringing over its “proprietary” nature strikes me as overreacting. For instance, back in 2007, Gervase Markham of the Mozilla Foundation expressed concern that the messaging system in Facebook or LinkedIn might turn e-mail into a closed system incompatible with outside domains – in short, a walled garden. But Facebook messages could never replace SMTP e-mail. If I want to interact with close friends, Facebook messages provide a convenient, hassle-free means of doing so. Yet if I need to exchange notes or documents with acquaintances, large groups, or even businesses, Facebook messages are hardly up to the task. Once again, such use cases do not involve my social graph. I think Facebook recognized this as they expanded and started trying to juggle multiple social graphs beyond a user’s closer friends, since messages from fan pages (essentially, communications for a business) are filed away in a folder quite separate from a user’s main inbox of messages.

All this being said, please don’t think I’m opposed to more “open” approaches to handling my social graph, such as distributed social networking – far from it. I think Facebook is still an early player in online social networking, and that we’ll see many more platforms and ideas develop in years to come. But I think we’re still a long way from a time where the open alternatives provide end users with more value than walled gardens in the types of use cases I’ve already outlined. As much as I’d like to see federated social networking platforms thrive, I foresee many hurdles that have yet to be overcome. Distributed networks will have to deal with issues relating to performance (imagine generating a news feed when your friends’ data comes from hundreds of different servers), retention (is data cached, how long, etc.), reliability (what happens when a few of your friends’ servers are down?), privacy (how will access be controlled and monitored), and security (avoiding injection attacks, ensuring all hosts stay up-to-date, etc.), not to mention monetization (a problem that still plagues closed systems). And when it comes to user value, remember that walled gardens have a few inherent advantages – in a security example, if Facebook detects a worm spreading malicious links via messages, they can block all messages with a certain signature or strip out links to a known rogue site.

I suppose my main point is that we need not be concerned if Internet users (even 350 million of them) find use for a service that strikes many technology-minded people as a walled garden. While the Internet was built on open, equal access, that very setup enables some services to provide certain features in a more limited context while still taking advantage of Web technologies. And for many people, these gated communities provide real value that would actually diminish if Google began indexing it all. While certain circles seem to think any notion of online privacy is at best naïve (and granted, some users need to exercise more caution in what they post online, regardless of what service they use), I tend to think that the only people saying privacy is dead are those named in its will. And when privacy does become a factor in sharing online, at times, a garden might need walls.

P.S.: Lest you think I’ve changed my opinion in light of recent privacy controversies, I’d note that I stated very similar thoughts back in 2007 when some of these debates over Facebook first developed.

Facebook StumbleUpon Digg Twitter Instapaper FriendFeed Delicious Google Bookmarks Share/Bookmark

1 6 7 8 9 10 28