Gephi Visualization, Attempt 2

This post is all about the power of Twitter and the web. When you need help or are trying something new, people are always willing to help you. Just be sure you write about it online first.

After I published my post on failing to visualize my Twitter followers, the Tony Hirst himself commented on it, suggesting why I hadn’t been able to see the .CSV file in Gephi the first time around.

He suggested I open Gephi and go to the Data Laboratory and import the data that way. We also had a nice heart-to-heart, via Twitter, on what I was expecting to see by looking at my followers list. The phrase “star network” came up. Here’s part of the Twitter conversation.

(I used to archive the conversation.)

Once I realized I needed a list of my followers’ followers, and that I didn’t have one, Tony was really nice and used his own script to send me a .gdf file.

When I originally tried to save the .gdf, I had some problems. I ended up using the Data Laboratory to upload what I had saved, but it was just showing the nodes, with no edges. There wasn’t much I could do with a bunch of nodes. I felt like I was getting nowhere and was becoming frustrated with myself. Why can’t I do this?!

After about 15 very long minutes, I realized I had to copy and paste the link Tony tweeted at me into TextWrangler, making sure the first line began with “nodedef > name.” Finally, I was able to get a graph that had edges and nodes. (It’s amazing how accomplished that made me feel!)
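For anyone who hits the same nodes-but-no-edges problem, here is a rough sketch of how a .gdf file is laid out and how you might check it before opening Gephi. It assumes the usual GDF layout: a "nodedef>" header line, one node per line, then an "edgedef>" header line and one edge per line. The parse_gdf function and the sample data are made up for illustration; this is not Tony's script, and the naive comma split won't cope with quoted labels that contain commas.

```python
def parse_gdf(text):
    """Split GDF text into node names and (source, target) edge pairs.

    Hypothetical helper: uses a naive comma split and ignores extra columns.
    """
    nodes, edges = [], []
    section = None  # becomes "nodes" or "edges" once a header line is seen
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Tolerate spaces around ">" so "nodedef > name" is also recognized.
        header = line.replace(" ", "").lower()
        if header.startswith("nodedef>"):
            section = "nodes"
        elif header.startswith("edgedef>"):
            section = "edges"
        elif section == "nodes":
            nodes.append(line.split(",")[0])
        elif section == "edges":
            source, target = line.split(",")[:2]
            edges.append((source, target))
        # Lines before any header are skipped: without "nodedef>" on the
        # first line, nothing is recognized as a node.
    return nodes, edges


# Made-up sample data, not real Twitter accounts.
sample = """nodedef>name VARCHAR,label VARCHAR
alice,Alice
bob,Bob
carol,Carol
edgedef>node1 VARCHAR,node2 VARCHAR
alice,bob
bob,carol
"""

nodes, edges = parse_gdf(sample)
print(len(nodes), "nodes,", len(edges), "edges")
```

If the edge count comes back as zero, the "edgedef>" section is probably missing or misplaced, which would explain a graph that is all nodes.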

Now, I could go to “Visualising Twitter Friend Connections Using Gephi: An Example Using the @WiredUK Friends Network” and follow the instructions. Woohoo!!

So the first step was to filter the graph using the Giant Component filter, under Filters > Library > Topology on the right-hand side. (This keeps only the largest connected group of nodes and filters out anything not connected to it.) As far as I can see, that didn’t change the graph too much.
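To get a feel for what the Giant Component filter does, here is a minimal sketch in plain Python. It assumes the filter treats edges as undirected and keeps the single largest connected group; the giant_component function and the toy data are illustrative only, not Gephi's actual code.

```python
from collections import defaultdict, deque


def giant_component(nodes, edges):
    """Return the set of nodes in the largest connected component.

    Edges are treated as undirected (an assumption for this sketch).
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen = set()
    best = set()
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from start.
        component = {start}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for nxt in neighbors[current]:
                if nxt not in component:
                    component.add(nxt)
                    queue.append(nxt)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


# Toy graph: a-b-c form one group, d-e another, f is on its own.
nodes = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b"), ("b", "c"), ("d", "e")]
print(sorted(giant_component(nodes, edges)))  # ['a', 'b', 'c']
```

When almost every node is already part of one big group, the filter removes very little, so the picture barely changes.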

Now I will go about coloring the graph. Tony recommends the modularity statistic. He says, “This algorithm attempts to find clusters in the graph by identifying components that are highly interconnected.” Modularity is under Statistics > Network Overview.

I ran Modularity twice and both times, it said I have 85 communities.
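For the curious, here is a small sketch of the modularity score itself, the number that a community-finding algorithm tries to make as large as possible. It only scores a grouping you hand it; it does not search for communities the way Gephi does. The formula is the standard one for an undirected, unweighted graph, and the modularity function and toy graph below are made up for illustration.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Score a partition: Q = sum over communities of L_c/m - (d_c/(2m))**2.

    L_c is the number of edges inside community c, d_c is the sum of the
    degrees of its nodes, and m is the total number of edges.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal = defaultdict(int)    # edges with both ends in the community
    degree_sum = defaultdict(int)  # total degree of the community's nodes
    for a, b in edges:
        ca, cb = community_of[a], community_of[b]
        degree_sum[ca] += 1
        degree_sum[cb] += 1
        if ca == cb:
            internal[ca] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum
    )


# Toy graph: two triangles joined by a single bridge edge (c-d).
edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {name: 0 for name in two_groups}

print(round(modularity(edges, two_groups), 3))  # 0.357
print(round(modularity(edges, one_group), 3))   # 0.0
```

Splitting the toy graph into its two triangles scores higher than lumping everything together, which is the sense in which a cluster is "highly interconnected."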

Now I am going to color the graph by opening the Partition panel in the top left-hand corner of the screen. In order to refresh the partition, you have to click on the two green arrows that look like a recycle icon. Then I chose Modularity Class for the partition. I changed the colors a bit for the first few nodes, but then decided to just leave it with the colors already picked.

Then I went on to Layout. The first one I chose was “Force Atlas.” The Force layouts show how tightly connected a group is. I used the default settings and hit run. My graph changed into something with four, more or less, distinct areas. (Note: I do need to change the colors. They do matter after all!)

To compare, I then ran “Force Atlas 2.” I immediately noticed there were a lot of outliers. Not sure, yet, if that’s significant or not.

Ok, now time to turn on the labels to see what node is what.

My first attempt made it clear that whatever colors I had chosen for nodes didn’t matter. The words covered them all and were unreadable.

I copied Tony’s directions on how to change the size of the nodes and text according to their importance. Unfortunately, my graph was still really dense and hard to read.
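Here is a rough sketch of what sizing by importance boils down to. It assumes "importance" means in-degree (how many edges point at a node) and that sizes are stretched linearly between a minimum and a maximum; both are assumptions for illustration, and the scale_sizes function and its default sizes are made up rather than taken from Gephi.

```python
from collections import Counter


def scale_sizes(edges, min_size=10.0, max_size=50.0):
    """Map each node's in-degree linearly onto [min_size, max_size]."""
    in_degree = Counter(target for _, target in edges)
    for source, _ in edges:
        in_degree.setdefault(source, 0)  # nodes nobody points at still count
    if not in_degree:
        return {}
    lowest = min(in_degree.values())
    highest = max(in_degree.values())
    if highest == lowest:
        # Every node is equally "important", so give them all the same size.
        return {node: min_size for node in in_degree}
    span = highest - lowest
    return {
        node: min_size + (count - lowest) / span * (max_size - min_size)
        for node, count in in_degree.items()
    }


# Toy edges as (source, target) pairs.
edges = [("a", "c"), ("b", "c"), ("c", "d")]
sizes = scale_sizes(edges)
print(sizes["c"], sizes["d"], sizes["a"])  # 50.0 30.0 10.0
```

With hundreds of nodes, most of them land near the small end of the range, which may be part of why a dense graph stays hard to read even after resizing.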

I played around with the text some more and was able to get it slightly more readable. I also changed the orientation around a bit and modified the colors a wee bit.

My Final Product:

The analysis of my Twitter followers shows they are basically grouped into four main categories:

Purple: Journalism. These are people mainly based in NYC who are journalists. This is where you’ll find 10,000 Words, the blog I write for, and other people I respect and follow, who follow me back.

Turquoise: London. Everyone here represents those I’ve come to know since moving to London. The Guardian community editors I know and Paul Bradshaw show up quite prominently.

Yellow: The general social media community. This group is basically made up of a lot of individuals I know who talk about social media. I think a lot of them follow each other but they aren’t associated with one large group in my life, such as journalism folks or London people.

Hot Pink: Boston. These are all tweeps who live in Boston or tweet about Boston. It’s interesting that none of them have large nodes. Most deal with community news and have fewer followers than those in the other three groups.

There are, of course, other groups in this map, such as social media folks I went to undergrad school with. But I was really impressed by how well this graph represented my life. Journalism, London, Boston, and social media make up huge elements of my identity and they’re really well depicted here.

Read A Not-So Successful Attempt To Visualize My Twitter Followers (or Gephi Visualization, Attempt 1)