ANALYZING MY FACEBOOK NETWORK USING GEPHI

INTRODUCTION:

Social networks are popular infrastructures for communication, interaction, and information sharing on the Internet. Popular social networks such as Twitter, Facebook, Google+, and Instagram provide communication, storage, and social applications for hundreds of millions of users. Users join, establish social links to friends, and use those links to share content, organize events, and search for specific users or shared resources. These platforms are among the Internet’s most popular destinations.

Visualizing data is like photography. Instead of starting with a blank canvas, you manipulate the lens used to present the data from a certain angle. When the data is the social graph of 500 million people, there are a lot of lenses through which you can view it. One that piqued my curiosity was the interactions and relationships among my mutual friends on Facebook.

DESCRIPTION OF ANALYSIS WORK:

The graph on the first page is a visual representation of my Facebook friends, rendered using Gephi. This analysis allowed me to better understand how my mutual friends relate to one another. The layout of the network, the sorting of relevant nodes and edges, the sizes of nodes and edges, and the position of each node relative to the others are all very important for the analysis. These are decisions we need to make in order to find the hidden highlights in the data.

To get my network of friends from Facebook, I used an application named Netvizz, which extracts data from different sections of the social network.

Some of the key data extraction modules in Netvizz are listed below:
1. Your personal friend network
2. The network of likes: all the pages that have been liked by mutual friends
3. Friendship connections or interactions within Facebook groups
4. Networks related to Facebook pages

For this particular analysis, the first module (the personal friend network) was chosen in Netvizz, and a “.gdf” file was extracted for further analysis and visualization in Gephi. In my case, the data consisted of 1200+ nodes, representing my entire friend network, with an average of 8.154 edges per node, representing the connections between them.
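To make these figures concrete, here is a minimal Python sketch of reading a GDF file and computing the average degree. It assumes the simplest form of the format (a "nodedef>" header, node rows, an "edgedef>" header, edge rows) with no quoted fields containing commas. The sample data and the function names parse_gdf and average_degree are invented for illustration and are not part of Netvizz or Gephi. It also assumes that the "average of 8.154 edges" figure is the average degree, that is, 2 x edges / nodes for an undirected graph.

```python
def parse_gdf(text):
    """Parse a minimal GDF document into (nodes, edges).

    Only the first column of each node row and the first two columns of
    each edge row are read. Quoted fields containing commas are NOT handled.
    """
    nodes, edges = [], []
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("nodedef>"):
            section = "nodes"
        elif line.startswith("edgedef>"):
            section = "edges"
        elif section == "nodes":
            nodes.append(line.split(",")[0])
        elif section == "edges":
            a, b = line.split(",")[:2]
            edges.append((a, b))
    return nodes, edges


def average_degree(nodes, edges):
    # Each undirected edge adds one to the degree of two nodes.
    return 2 * len(edges) / len(nodes) if nodes else 0.0


# Invented sample data, not taken from the real export.
SAMPLE = """nodedef>name VARCHAR,label VARCHAR
a,Alice
b,Bob
c,Carol
d,Dave
edgedef>node1 VARCHAR,node2 VARCHAR
a,b
a,c
b,c
c,d
"""

nodes, edges = parse_gdf(SAMPLE)
print(len(nodes), len(edges), average_degree(nodes, edges))  # 4 4 2.0
```

A real export would carry more columns per row; a production parser should use a proper CSV reader to cope with quoted labels.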

The .gdf file was imported into Gephi for visualization. The first step toward a clearer view of my network was changing the graph layout. I chose the “Force Atlas” layout because it makes nodes repel one another while edges attract the nodes they connect.
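The repel-and-attract idea can be sketched in a few lines of Python. This is a generic toy force-directed layout, not Gephi's actual Force Atlas implementation: the force formulas, the parameter names (repulsion, attraction, step, max_move) and their default values are all assumptions chosen only to illustrate the principle.

```python
import math
import random


def force_layout(nodes, edges, iterations=200, repulsion=0.05,
                 attraction=0.1, step=0.05, max_move=0.1, seed=0):
    """Toy force-directed layout: all node pairs repel, edges attract."""
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1, 1), rng.uniform(-1, 1)] for n in nodes}
    for _ in range(iterations):
        force = {n: [0.0, 0.0] for n in nodes}
        # Repulsion: every pair pushes apart, more strongly when close.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                dist = math.hypot(dx, dy) or 1e-9
                push = repulsion / dist ** 2
                force[u][0] += push * dx / dist
                force[u][1] += push * dy / dist
                force[v][0] -= push * dx / dist
                force[v][1] -= push * dy / dist
        # Attraction: each edge pulls its endpoints together like a spring.
        for u, v in edges:
            dx = pos[u][0] - pos[v][0]
            dy = pos[u][1] - pos[v][1]
            force[u][0] -= attraction * dx
            force[u][1] -= attraction * dy
            force[v][0] += attraction * dx
            force[v][1] += attraction * dy
        # Move each node along its net force, capping the step for stability.
        for n in nodes:
            mx, my = step * force[n][0], step * force[n][1]
            length = math.hypot(mx, my)
            if length > max_move:
                mx, my = mx * max_move / length, my * max_move / length
            pos[n][0] += mx
            pos[n][1] += my
    return pos


# Two triangles joined by a single edge (invented example data).
nodes = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
for name, (x, y) in force_layout(nodes, edges).items():
    print(f"{name}: ({x:.2f}, {y:.2f})")
```

The pairwise repulsion loop is quadratic in the number of nodes, which is fine for a toy graph but is one reason real layout engines use spatial approximations on networks of a thousand or more nodes.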

The important thing to note here is that the initial grouping began with the formation of unrelated and disconnected groups, and that the biggest group contains many sub-groups. To make those sub-groups more distinguishable, I used the Modularity statistic available in Gephi, which relies on a method for the fast unfolding of communities in large networks. The modularity report showed that the network had 18 communities, but 10 of them had very few members. In practical terms, only 8 communities were relevant to the visualization.
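For readers unfamiliar with the modularity score itself, the following Python sketch computes it for a given partition of an undirected, unweighted graph, using the standard definition Q = sum over communities of (internal edges / m) - (total degree / 2m)^2, where m is the number of edges. It only scores a partition that is handed to it; it does not search for the best partition the way Gephi's statistic does. The function name and the example graph are invented for illustration.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity Q of a partition of an undirected, unweighted graph.

    edges     -- list of (u, v) pairs, each undirected edge listed once
    community -- dict mapping every node to a community label
    """
    m = len(edges)
    internal = defaultdict(int)    # edges with both ends inside community c
    degree_sum = defaultdict(int)  # summed degree of the nodes in community c
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Two triangles joined by one edge (invented example data).
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {n: 0 for n in "abcdef"}

print(modularity(edges, two_groups))  # 5/14, about 0.357
print(modularity(edges, one_group))   # 0.0
```

Splitting the graph along its single bridging edge scores higher than lumping everything together, which is the kind of signal a community-detection method tries to maximize.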

The relevant communities were then color-coded to visualize the clusters. To find other meaningful insights from single-node statistics, each node was sized according to its betweenness centrality. For a given node, this measure equals the number of shortest paths between all pairs of vertices that pass through that node. To obtain it, I ran the Graph Distance statistic. It should be noted that the radius of the analyzed network was 11 and the average path length was approximately 5.4.
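The distance figures mentioned above (radius, average path length) can be illustrated with a short breadth-first search in Python. This is a sketch for small, connected, unweighted graphs, not a description of how Gephi computes them internally; the function names and the five-node path graph are invented for the example.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def distance_stats(adj):
    """Return (radius, diameter, average path length) of a connected graph."""
    eccentricity = {}
    total, pairs = 0, 0
    for s in adj:
        dist = bfs_distances(adj, s)
        eccentricity[s] = max(dist.values())  # farthest node from s
        total += sum(dist.values())
        pairs += len(dist) - 1                # exclude the source itself
    radius = min(eccentricity.values())
    diameter = max(eccentricity.values())
    return radius, diameter, total / pairs


# A simple path a-b-c-d-e (invented example data).
adj = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
print(distance_stats(adj))  # (2, 4, 2.0)
```

The radius is the smallest eccentricity (here, node c reaches everything within 2 hops) and the diameter is the largest, so the radius can never exceed the diameter.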

OBSERVED HIGHLIGHTS: 

There were not many interesting highlights observed in the network, mainly because the dataset is naturally too personal to yield findings of general interest. Nevertheless, the analysis was very illuminating and useful for becoming more familiar with graphs and how to visualize them.

Betweenness centrality is a measure of the whole network, since it is computed from the shortest paths between all pairs of nodes. By contrast, degree (connectivity) can give highly localized results. For instance, the most connected node may show high connectivity within a single community, yet that group could be isolated from the rest of the network, where nodes have low degree.

Several of the communities distinguishable in the analyzed network are highly connected to one another: family, colleagues, and school and college friends all show a high level of connectivity among them. By contrast, academic peer groups and online gaming communities are largely disconnected from those earlier contacts, but have a high degree of internal connectivity.
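The contrast between the two measures shows up clearly on a small invented graph: two tightly knit groups of four, joined only through one go-between node "x". The Python sketch below uses Brandes' algorithm for unweighted graphs to compute betweenness; the function names and the example graph are assumptions made for illustration and are not taken from my Facebook data.

```python
from collections import deque


def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes' algorithm, undirected)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        order = []                       # nodes in order of discovery
        preds = {v: [] for v in adj}     # predecessors on shortest paths
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each undirected pair was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}


def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def clique(names):
    return [(u, v) for i, u in enumerate(names) for v in names[i + 1:]]


group_a = ["a1", "a2", "a3", "a4"]
group_b = ["b1", "b2", "b3", "b4"]
edges = clique(group_a) + clique(group_b) + [("a1", "x"), ("x", "b1")]
adj = build_adjacency(edges)
bc = betweenness(adj)

for node in ("x", "a1", "a2"):
    print(node, "degree =", len(adj[node]), "betweenness =", bc[node])
# x  degree = 2 betweenness = 16.0
# a1 degree = 4 betweenness = 15.0
# a2 degree = 3 betweenness = 0.0
```

Node "x" has the lowest degree in the graph but the highest betweenness, because every path between the two groups must pass through it, while a well-connected member such as "a2" lies on no shortest path between other nodes at all.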


ABOUT GEPHI:
Gephi is an open-source network analysis and visualization package written in Java on the NetBeans platform. It was initially developed by students of the University of Technology of Compiègne (UTC) in France. It uses a 3D render engine to display large networks in real time and to speed up exploration. Its flexible, multi-task architecture makes it possible to work with complex data sets and produce valuable visual results.


