by Paul Butler on Monday, December 13, 2010 at 8:54pm
Visualizing data is like photography. Instead of starting with a blank canvas, you manipulate the lens used to present the data from a certain angle.
When the data is the social graph of 500 million people, there are a lot of lenses through which you can view it. One that piqued my curiosity was the locality of friendship. I was interested in seeing how geography and political borders affected where people lived relative to their friends. I wanted a visualization that would show which cities had a lot of friendships between them.
I began by taking a sample of about ten million pairs of friends from Apache Hive, our data warehouse. I combined that data with each user’s current city and summed the number of friends between each pair of cities. Then I merged the data with the longitude and latitude of each city.
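As a rough illustration of the aggregation step described above, here is a minimal Python sketch. It is not the original Hive query: the sample data and the names `friend_pairs`, `user_city`, `city_coords`, `count_city_pairs`, and `attach_coords` are all hypothetical. Skipping friendships within a single city is also an assumption, since the post does not say how those were handled.

```python
from collections import Counter

# Hypothetical sample data: friend pairs, each user's current city,
# and (longitude, latitude) for each city.
friend_pairs = [("u1", "u2"), ("u1", "u3"), ("u2", "u3"), ("u3", "u4"), ("u1", "u4")]
user_city = {"u1": "London", "u2": "Paris", "u3": "Paris", "u4": "London"}
city_coords = {"London": (-0.13, 51.51), "Paris": (2.35, 48.86)}


def count_city_pairs(pairs, user_city):
    """Sum the number of friendships between each unordered pair of cities."""
    counts = Counter()
    for a, b in pairs:
        city_a, city_b = user_city.get(a), user_city.get(b)
        # Assumption: drop users with no known city and same-city friendships.
        if city_a is None or city_b is None or city_a == city_b:
            continue
        # Sort so (Paris, London) and (London, Paris) share one key.
        counts[tuple(sorted((city_a, city_b)))] += 1
    return counts


def attach_coords(counts, coords):
    """Merge each city pair with the longitude and latitude of both cities."""
    rows = []
    for (c1, c2), n in counts.items():
        if c1 in coords and c2 in coords:
            rows.append({"from": c1, "to": c2, "friends": n,
                         "a": coords[c1], "b": coords[c2]})
    return rows


city_pair_counts = count_city_pairs(friend_pairs, user_city)
edges = attach_coords(city_pair_counts, city_coords)
```

Each row in `edges` then holds what the later plotting steps need: two endpoints and a friendship count.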
At that point, I began exploring it in R, an open-source statistics environment. As a sanity check, I plotted points at some of the latitude and longitude coordinates. To my relief, what I saw was roughly an outline of the world. Next I erased the dots and plotted lines between the points. After a few minutes of rendering, a big white blob appeared in the center of the map. Some of the outer edges of the blob vaguely resembled the continents, but it was clear that I had too much data to get interesting results just by drawing lines. I thought that making the lines semi-transparent would do the trick, but I quickly realized that my graphing environment couldn’t handle enough shades of color for it to work the way I wanted.
Instead I found a way to simulate the effect I wanted. I defined weights for each pair of cities as a function of the Euclidean distance between them and the number of friends between them. Then I plotted lines between the pairs by weight, so that pairs of cities with the most friendships between them were drawn on top of the others. I used a color ramp from black to blue to white, with each line’s color depending on its weight. I also transformed some of the lines to wrap around the image, rather than spanning more than halfway around the world.
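The ideas in this paragraph can be sketched in Python as follows. The post does not give the actual weight formula, so the one in `weight` is purely an assumption chosen for illustration, and every function name here is hypothetical. The sketch shows three things: computing a weight from distance and friend count, mapping it onto a black-to-blue-to-white ramp, and ordering the lines so that the heaviest are drawn last (on top). `wrap_longitudes` is one possible reading of the wrap-around step.

```python
import math


def weight(friends, a, b):
    """ASSUMED formula: more friends raise the weight, more distance lowers it."""
    dist = math.hypot(b[0] - a[0], b[1] - a[1])  # Euclidean distance in degrees
    return friends / (1.0 + dist)


def ramp_color(t):
    """Map t in [0, 1] to an RGB tuple on a black -> blue -> white ramp."""
    t = min(1.0, max(0.0, t))
    if t < 0.5:
        return (0, 0, round(255 * (t / 0.5)))  # black to blue
    v = round(255 * ((t - 0.5) / 0.5))         # blue to white
    return (v, v, 255)


def wrap_longitudes(lon1, lon2):
    """Shift lon2 by 360 degrees if the direct line would span over half the map."""
    if lon2 - lon1 > 180:
        lon2 -= 360
    elif lon1 - lon2 > 180:
        lon2 += 360
    return lon1, lon2


def draw_order(edges):
    """Return (weight, color, edge) tuples sorted so the heaviest come last."""
    weights = [weight(e["friends"], e["a"], e["b"]) for e in edges]
    lo, hi = min(weights), max(weights)
    span = (hi - lo) or 1.0  # avoid dividing by zero when all weights match
    styled = [(w, ramp_color((w - lo) / span), e) for w, e in zip(weights, edges)]
    styled.sort(key=lambda item: item[0])
    return styled


sample_edges = [
    {"friends": 50, "a": (-0.13, 51.51), "b": (2.35, 48.86)},
    {"friends": 5, "a": (-0.13, 51.51), "b": (151.21, -33.87)},
    {"friends": 20, "a": (2.35, 48.86), "b": (13.40, 52.52)},
]
ordered = draw_order(sample_edges)
```

Drawing the lines in the order of `ordered` puts the strongest connections on top. A wrapped line has one endpoint beyond the edge of the image, so a renderer would likely need to draw it a second time, shifted by 360 degrees, to show both halves.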
After a few minutes of rendering, the new plot appeared, and I was a bit taken aback by what I saw. The blob had turned into a surprisingly detailed map of the world. Not only were continents visible, certain international borders were apparent as well. What really struck me, though, was knowing that the lines didn’t represent coasts or rivers or political borders, but real human relationships. Each line might represent a friendship made while travelling, a family member abroad, or an old college friend pulled away by the various forces of life.
Later I replaced the lines with great circle arcs, which are the shortest routes between two points on the Earth. Because the Earth is a sphere, these are often not straight lines on the projection.
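To make the great circle idea concrete, here is a small Python sketch that computes intermediate points along the arc between two coordinates using the standard spherical interpolation formula. It assumes a perfectly spherical Earth. The function name `great_circle_points` and its parameters are hypothetical and are not taken from the original R code.

```python
import math


def great_circle_points(lon1, lat1, lon2, lat2, n=50):
    """Return n + 1 (lon, lat) points in degrees along the great circle arc."""
    l1, p1, l2, p2 = map(math.radians, (lon1, lat1, lon2, lat2))
    # Unit vectors for both endpoints on a sphere.
    a = (math.cos(p1) * math.cos(l1), math.cos(p1) * math.sin(l1), math.sin(p1))
    b = (math.cos(p2) * math.cos(l2), math.cos(p2) * math.sin(l2), math.sin(p2))
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    d = math.acos(dot)  # angular distance between the endpoints
    if math.sin(d) < 1e-12:
        # Identical or antipodal points: the arc is degenerate or not unique.
        return [(lon1, lat1), (lon2, lat2)]
    points = []
    for i in range(n + 1):
        f = i / n
        wa = math.sin((1 - f) * d) / math.sin(d)
        wb = math.sin(f * d) / math.sin(d)
        x, y, z = (wa * ca + wb * cb for ca, cb in zip(a, b))
        lat = math.degrees(math.atan2(z, math.hypot(x, y)))
        lon = math.degrees(math.atan2(y, x))
        points.append((lon, lat))
    return points


# Example: an arc between two points at the same latitude.
arc = great_circle_points(-120.0, 45.0, 120.0, 45.0, n=4)
```

Plotting consecutive points of `arc` as short segments gives the curved line. In the example, the midpoint lies at a higher latitude than either endpoint, which is why such arcs appear to bow toward the pole on a flat map.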
When I shared the image with others within Facebook, it resonated with many people. It’s not just a pretty picture; it’s a reaffirmation of the impact we have in connecting people, even across oceans and borders.
Paul is an intern on Facebook’s data infrastructure engineering team.
I am sorry, I received this in an email, so I do not know where the original post came from.
I wanted to share this because it made me think about what Facebook is now. Amazing statistic alert!!! With 500 million members out of a world population of roughly 6.9 billion, about 1 out of EVERY 14 people in the world is now a member of Facebook. Think about that for a moment. One service is being used by about 7 percent of the entire world’s population.
That means that as a species we are connected in a million different ways. I know John who knows George who knows Paul who knows Ringo who knows me. But I have never met George or Paul. Above and beyond that, Facebook makes connections that I never even thought of as feasible. I am regularly amazed at the people who say that they like my blog, which they found on Facebook.
I have two takeaways from that thought:
1. We can connect with so many people who have the same interests, wants and desires as us without us even knowing it. I can write about my interest in personal finance and have friends and coworkers (who I never would have guessed are interested in personal finance) engage me in conversation on a post I wrote.
2. I wish I could see names and faces for every person who reads my blog. I have analytics, but they only tell me how many people read it and where they came from (both which website directed them to me and where they are geographically). I know that might raise some privacy concerns, but it would be great to know that the guy I see on the disc golf course would be very interested in discussing movies with me while we disc.
Another thought that this picture gave me was “Wow, look at what huge portions of the world are missing.” Most of Africa is black. Many parts of South America are gone. Looking at this map, would you even know that China existed? I think it shows how far we have to go to truly make the world connected. Soapbox time: I truly believe that education is the most important thing that separates rich from poor, good from evil and (obviously) ignorance from tolerance. If the entire world were as connected as the U.S. is east of the Mississippi, we would be living in an entirely different world.
What did this picture make you think of? Any comments on my thoughts? Let me know in the comments below.