ANALYZING MY FACEBOOK NETWORK USING GEPHI
INTRODUCTION:
Social networks are popular infrastructures for
communication, interaction, and information sharing on the Internet. Popular
social networks such as Twitter, Facebook, Google+, and Instagram provide communication, storage, and social
applications for hundreds of millions of users. Users join, establish social
links to friends, and leverage their social links to share content, organize
events, and search for specific users or shared resources. As a result, these
networks are among the Internet’s most popular destinations.
Visualizing data is like photography. Instead
of starting with a blank canvas, you can manipulate the lens used to present
the data from a certain angle. When the
data is the social graph of 500 million people, there are a lot of lenses
through which you can view it. The one that piqued my curiosity was the
interaction and relationships among mutual friends on Facebook.
DESCRIPTION OF ANALYSIS WORK:
The graph on the first page is a visual
representation of my Facebook friend network, rendered using Gephi. This analysis
helped me better understand how my mutual friends relate to one another. The
layout of the network, the sorting of relevant nodes and edges, the sizes of
the nodes and edges, and the position of each node relative to the others are
all very important for the analysis. All of these are decisions that we need to
make in order to find the hidden highlights in the data.
To get my network of friends from
Facebook, I used an application named Netvizz, which extracts data from
different parts of the social network.
Some of the key data extraction modules in
Netvizz are listed below:
1. Your personal friend network
2. The network of likes: all the pages that have
been liked by mutual friends
3. Friendship connections or interactions within
Facebook groups
4. Networks related to Facebook pages
For this particular analysis, the first
option was chosen in Netvizz and a “.gdf”
file was exported for further analysis and visualization in Gephi. In my case,
the data consisted of 1200+ nodes, representing my entire friend network, with an
average of 8.154 edges per node, representing the connections between them.
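As a rough illustration of what such a file contains, the sketch below reads a tiny GDF-style text and computes the average number of edges per node as 2E/N for an undirected graph. The sample data and the helper names `parse_gdf` and `average_degree` are made up for this example; a real Netvizz export has more columns, and a full GDF reader would also need to handle quoted fields.

```python
def parse_gdf(text):
    """Return (node_ids, edges) from GDF-style text.

    Assumption: the node id is the first column of each node row and the
    two endpoints are the first two columns of each edge row. Quoted
    fields that contain commas are not handled in this sketch.
    """
    nodes, edges = [], []
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("nodedef>"):
            section = "nodes"
        elif line.startswith("edgedef>"):
            section = "edges"
        elif section == "nodes":
            nodes.append(line.split(",")[0])
        elif section == "edges":
            source, target = line.split(",")[:2]
            edges.append((source, target))
    return nodes, edges


def average_degree(nodes, edges):
    # Each undirected edge adds one to the degree of both of its endpoints.
    return 2 * len(edges) / len(nodes)


# Hypothetical sample, not taken from the real export.
SAMPLE_GDF = """nodedef>name VARCHAR,label VARCHAR
1,Alice
2,Bob
3,Carol
4,Dave
edgedef>node1 VARCHAR,node2 VARCHAR
1,2
1,3
2,3
3,4
"""

nodes, edges = parse_gdf(SAMPLE_GDF)
print(len(nodes), len(edges), average_degree(nodes, edges))
```

With 4 nodes and 4 edges, the sample gives an average of 2.0 edges per node.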
The .gdf
file was imported into Gephi for visualization. The first step toward
a clearer view of my network was changing the graph layout. The “Force Atlas” layout was chosen because it
makes nodes repel one another while edges attract the nodes they connect.
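The repel-and-attract idea can be sketched in a few lines. This is a generic force-directed layout written for illustration, not Gephi's Force Atlas implementation. The function name `force_layout`, the force formulas, and all parameter values are assumptions chosen to keep the example small.

```python
import math


def force_layout(nodes, edges, iterations=300, repulsion=1.0,
                 attraction=0.1, step=0.05, max_move=0.5):
    """Toy force-directed layout: all pairs repel, edges pull together."""
    n = len(nodes)
    # Deterministic start: spread the nodes evenly on a unit circle.
    pos = {v: [math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n)]
           for i, v in enumerate(nodes)}
    for _ in range(iterations):
        force = {v: [0.0, 0.0] for v in nodes}
        # Repulsion between every pair of nodes, decaying with distance squared.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = math.hypot(dx, dy) or 1e-9
                push = repulsion / dist ** 2
                force[a][0] += push * dx / dist
                force[a][1] += push * dy / dist
                force[b][0] -= push * dx / dist
                force[b][1] -= push * dy / dist
        # Attraction along each edge, growing linearly with its length.
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            force[a][0] -= attraction * dx
            force[a][1] -= attraction * dy
            force[b][0] += attraction * dx
            force[b][1] += attraction * dy
        # Move each node along its net force, capping the displacement.
        for v in nodes:
            mx, my = step * force[v][0], step * force[v][1]
            length = math.hypot(mx, my)
            if length > max_move:
                mx, my = mx * max_move / length, my * max_move / length
            pos[v][0] += mx
            pos[v][1] += my
    return pos


def distance(pos, a, b):
    return math.hypot(pos[a][0] - pos[b][0], pos[a][1] - pos[b][1])


# "a" and "b" are friends; "c" is connected to nobody.
layout = force_layout(["a", "b", "c"], [("a", "b")])
print(distance(layout, "a", "b"), distance(layout, "a", "c"))
```

In this toy run the connected pair settles at a fixed distance while the unconnected node keeps drifting away, which is the same effect that separates disconnected groups in the real layout.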
The important thing to note here is that
an initial grouping has already emerged, with unrelated and
disconnected groups forming. The biggest group has many
sub-groups associated with it. To make those sub-groups more
distinguishable, the Modularity statistic available in Gephi
was used. It relies on a method for the fast
unfolding of communities in large networks. The modularity report
showed that the network had 18 communities, but 10 of them had very few
members. In practical terms, only 8 communities are relevant to the
data visualization.
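To make the modularity idea more concrete, the sketch below scores a given partition with the standard modularity formula, Q = sum over communities of (internal edges / m) - (community degree / 2m)^2, and then drops communities below a size threshold. It does not perform the community search that Gephi runs. The graph, the partition, the threshold, and the function names are all made up for illustration.

```python
from collections import Counter


def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    internal = Counter()    # edges with both endpoints in the same community
    degree_sum = Counter()  # total degree of the nodes in each community
    for a, b in edges:
        ca, cb = community_of[a], community_of[b]
        degree_sum[ca] += 1
        degree_sum[cb] += 1
        if ca == cb:
            internal[ca] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


def relevant_communities(community_of, min_size):
    """Keep only communities with at least min_size members."""
    sizes = Counter(community_of.values())
    return {c: size for c, size in sizes.items() if size >= min_size}


# Hypothetical graph: two triangles joined by one edge, plus a lone
# node 6 hanging off node 5.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3), (5, 6)]
community_of = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B", 6: "C"}

q = modularity(edges, community_of)
kept = relevant_communities(community_of, min_size=2)
print(q, kept)
```

Putting every node in a single community gives Q = 0, so a clearly positive score suggests the partition captures real group structure. The size filter mirrors the step of ignoring the communities that have very few members.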
The relevant communities were then
color-coded to visualize the clusters. To find other
meaningful insights from single-node statistics, each node was sized
according to its betweenness centrality. This
measure counts the shortest paths between all pairs of vertices that
pass through a given node. To compute it, the graph distance statistics were
run. It should be noted that the radius of the analyzed network was 11 and
the average path length was approximately 5.4.
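The radius and average path length figures come from shortest-path distances. A minimal sketch of how such statistics can be derived with breadth-first search is shown below. It assumes a connected, undirected, unweighted graph, and the function names and the five-node example are made up for illustration, not taken from Gephi.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop counts from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def distance_stats(adj):
    """Radius, diameter and average path length (assumes a connected graph)."""
    eccentricity = {}
    total = pairs = 0
    for s in adj:
        dist = bfs_distances(adj, s)
        # Eccentricity: distance from s to the node farthest from it.
        eccentricity[s] = max(dist.values())
        total += sum(dist.values())
        pairs += len(dist) - 1
    return {
        "radius": min(eccentricity.values()),
        "diameter": max(eccentricity.values()),
        "average_path_length": total / pairs,
    }


# Hypothetical example: a simple chain 0 - 1 - 2 - 3 - 4.
adj = {i: [] for i in range(5)}
for a, b in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    adj[a].append(b)
    adj[b].append(a)

stats = distance_stats(adj)
print(stats)
```

For the chain, the middle node is at most 2 hops from everyone (radius 2), the two ends are 4 hops apart (diameter 4), and the average path length is 2.0.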
OBSERVED HIGHLIGHTS:
There were not many broadly interesting highlights in the network, since the dataset is naturally too personal to yield findings of general interest. Nevertheless, the analysis was very illuminating and useful for becoming more familiar with graphs and how to visualize them.
Betweenness centrality reflects the whole network, since it is computed from the shortest paths between all pairs of nodes. Degree (connectivity), by contrast, can give highly localized results. For instance, the most connected node may show high connectivity inside a single community, but that group could be isolated from the rest of the network, where the nodes have low degree.
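The contrast between degree and betweenness can be seen on a small made-up graph: two tightly knit groups of four friends joined only through a single contact "x". The sketch below computes unnormalized betweenness with Brandes' algorithm for an undirected, unweighted graph. The graph and the function name are assumptions for illustration, and Gephi's own output may be scaled differently.

```python
from collections import deque


def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes' algorithm)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s, counting shortest paths (sigma) and predecessors.
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {v: 0.0 for v in adj}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair was counted from both endpoints.
    return {v: b / 2 for v, b in score.items()}


# Hypothetical graph: two cliques of four, bridged only by node "x".
group_a = ["a1", "a2", "a3", "a4"]
group_b = ["b1", "b2", "b3", "b4"]
adj = {v: set() for v in group_a + group_b + ["x"]}
for group in (group_a, group_b):
    for i, u in enumerate(group):
        for v in group[i + 1:]:
            adj[u].add(v)
            adj[v].add(u)
for u, v in [("x", "a1"), ("x", "b1")]:
    adj[u].add(v)
    adj[v].add(u)

centrality = betweenness(adj)
degree = {v: len(neighbors) for v, neighbors in adj.items()}
print(degree["x"], centrality["x"], degree["a2"], centrality["a2"])
```

Here "x" has the lowest degree in the graph (2) but the highest betweenness (16), because every shortest path between the two groups runs through it. A clique member such as "a2" has degree 3 and betweenness 0.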
Several communities distinguishable in the analyzed network are highly connected with one another. Family, colleagues, and school and college friends show a high level of connectivity among them. In contrast, academic peer groups and online gaming communities are largely disconnected from the other contacts but have a high internal degree of connectivity.
ABOUT GEPHI:
Gephi is
an open-source network analysis and visualization software package written in
Java on the NetBeans platform, initially developed by students of the University
of Technology of Compiègne (UTC) in France. It uses a 3D render engine to
display large networks in real-time and to speed up the exploration. A flexible
and multi-task architecture brings new possibilities to work with complex data
sets and produce valuable visual results.