Gene Dan's Blog

No. 111: Visualizing Erdos Collaborators

20 November, 2014 8:31 PM / Gene Dan

A while back I wrote about visualizing networks. Well, I’m back at it again, and I found a sweet dataset containing the Enron emails. However, that dataset might take a few days’ worth of computing time to process, so in the meantime I’ll be showing a much more manageable dataset containing research partnerships of mathematicians who collaborated with Paul Erdos.

In short, Paul Erdos was a great 20th-century mathematician famous for his eccentric behavior, prolific publishing, and copious consumption of amphetamines and caffeine. Today, mathematicians often refer to something called an Erdos number, which indicates one’s closeness to the late mathematician. For example, those who published papers directly with Erdos himself receive an Erdos number of 1; those who collaborated with mathematicians whose Erdos number is 1, but not with Erdos himself, receive an Erdos number of 2; and so on.
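
Put another way, an Erdos number is the shortest-path distance to Erdos in the collaboration graph, which a breadth-first search can compute. Here is a minimal Python sketch of that idea. The toy graph and its author names are hypothetical placeholders, not data from the actual dataset.

```python
from collections import deque

def erdos_numbers(coauthors, root="Erdos"):
    """Breadth-first search over an undirected coauthorship graph.

    `coauthors` maps each author to the set of people they published with.
    Returns a dict of author -> Erdos number (shortest path length to root).
    """
    numbers = {root: 0}
    queue = deque([root])
    while queue:
        author = queue.popleft()
        for partner in coauthors.get(author, ()):
            if partner not in numbers:  # first visit is the shortest path
                numbers[partner] = numbers[author] + 1
                queue.append(partner)
    return numbers

# Hypothetical toy graph; the names are placeholders, not real collaborations.
toy = {
    "Erdos": {"A", "B"},
    "A": {"Erdos", "C"},
    "B": {"Erdos"},
    "C": {"A", "D"},
    "D": {"C"},
}
numbers = erdos_numbers(toy)
```

In this toy graph, A and B published with Erdos directly, C only with A, and D only with C, so their numbers come out as 1, 1, 2, and 3.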

 

Downloading the Dataset

You can get the dataset here. It is a simple text file, but it needs to be converted into the XML-based .gexf format before Gephi can open it.

Erdos02 (~-Desktop) - gedit_003

The text file is divided into two main sections. The first contains the node labels, which are the names of the mathematicians who collaborated with Erdos. The second contains the edges, which link one mathematician to another.

Erdos02 (~-Desktop) - gedit_004
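
To make the two-section structure concrete, here is a rough Python sketch of a parser. The exact file layout is an assumption on my part: a "*Vertices" header followed by lines like `1 "LABEL"`, then a second header starting with "*" followed by lines that each hold a source ID and the IDs it links to. The function name `parse_network` and the sample text are made up for illustration.

```python
import shlex

def parse_network(text):
    """Split a Pajek-style text file into node labels and edges.

    Assumed layout (not confirmed by the post): a "*Vertices" section with
    lines like `1 "LABEL"`, then a second "*..." section whose lines hold a
    source ID followed by the IDs it links to.
    """
    labels, edges, section = {}, [], None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("*"):
            # Any header other than *Vertices is treated as the edge section
            section = "nodes" if line.lower().startswith("*vertices") else "edges"
            continue
        parts = shlex.split(line)  # keeps a quoted label as a single token
        if section == "nodes":
            labels[int(parts[0])] = parts[1]
        elif section == "edges":
            source = int(parts[0])
            edges.extend((source, int(target)) for target in parts[1:])
    return labels, edges

# Hypothetical sample in the assumed layout
sample = '''*Vertices 3
1 "ALPHA, A."
2 "BETA, B."
3 "GAMMA, C."
*Edgeslist
1 2 3
2 3
'''
labels, edges = parse_network(sample)
```

If the real file uses a different header or puts one edge per line, only the section detection and the inner loop would need to change.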

I needed a script to transform the text file into XML. I was impatient, so I used an Excel VBA macro; for a more important project, I would have chosen a different tool.

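Option Explicit

' Builds an edge list on the "edges" sheet from the rows of the active sheet:
' column A holds the source node ID, and the remaining non-empty cells in
' the row hold the IDs of the nodes it links to.
Sub create_edges()

    Dim x As Long, y As Long, z As Long
    Dim currfrom As Long, currend As Long, curredge As Long

    curredge = 1                ' next edge ID, which is also the next output row
    For x = 1 To 507            ' one row per source node
        currfrom = Range("A1").Offset(x - 1, 0).Value
        ' Number of targets in this row = non-empty cells minus the source ID
        y = WorksheetFunction.CountA(Range(x & ":" & x)) - 1
        For z = 1 To y
            currend = Range("A1").Offset(x - 1, z).Value
            Sheets("edges").Range("A1").Offset(curredge - 1, 0).Value = curredge
            Sheets("edges").Range("A1").Offset(curredge - 1, 1).Value = currfrom
            Sheets("edges").Range("A1").Offset(curredge - 1, 2).Value = currend
            curredge = curredge + 1
        Next z
    Next x

End Sub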

erdos (copy).gexf (~-Desktop) - gedit_006

The above image shows the processed data in XML form.
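
For comparison, the XML step itself can be sketched in Python with the standard library's `xml.etree.ElementTree`, instead of Excel. This is only an illustration under a few assumptions: the function name `build_gexf` and the sample labels are invented, and treating the graph as undirected is my guess at the intent. The `gexf`, `graph`, `nodes`, `node`, `edges`, and `edge` elements follow the GEXF 1.2 layout.

```python
import xml.etree.ElementTree as ET

def build_gexf(labels, edges):
    """Return a GEXF 1.2 document as an ElementTree element.

    `labels` maps node ID -> name; `edges` is a list of (source, target) IDs.
    """
    gexf = ET.Element("gexf", xmlns="http://www.gexf.net/1.2draft", version="1.2")
    # Assumption: collaborations are treated as undirected links
    graph = ET.SubElement(gexf, "graph", mode="static", defaultedgetype="undirected")
    nodes_el = ET.SubElement(graph, "nodes")
    for node_id, label in labels.items():
        ET.SubElement(nodes_el, "node", id=str(node_id), label=label)
    edges_el = ET.SubElement(graph, "edges")
    for edge_id, (source, target) in enumerate(edges):
        ET.SubElement(edges_el, "edge", id=str(edge_id),
                      source=str(source), target=str(target))
    return gexf

# Hypothetical two-node example
root = build_gexf({1: "ALPHA", 2: "BETA"}, [(1, 2)])
xml_text = ET.tostring(root, encoding="unicode")
```

To save the result to disk, something like `ET.ElementTree(root).write("erdos.gexf", encoding="utf-8", xml_declaration=True)` should do, where the file name is just an example.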

Visualizing the Collaborators

The next step is to import the dataset into Gephi. At first, the visual form of the data looks like a meaningless blob:

Gephi 0.8.2 - Project 0_002

To remedy this, I used the Force Atlas algorithm to spread the nodes out to visualize the network structure.

Midway through the Force Atlas Algorithm
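
To give a feel for what a force-directed layout does, here is a toy Python sketch. It is not Gephi's Force Atlas implementation, just the general idea: every pair of nodes repels, linked nodes attract, and each node moves a little on every iteration. The function name `force_layout` and the constants 0.01 and 0.1 are arbitrary choices of mine.

```python
import math
import random

def force_layout(nodes, edges, iterations=200, seed=0):
    """Toy force-directed layout: every pair repels, linked pairs attract.

    `nodes` is a list of node IDs; `edges` is a list of (a, b) pairs.
    Returns a dict of node -> [x, y].
    """
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1, 1), rng.uniform(-1, 1)] for n in nodes}
    for _ in range(iterations):
        shift = {n: [0.0, 0.0] for n in nodes}
        # Repulsion: push every pair apart, more strongly when close
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = max(math.hypot(dx, dy), 1e-3)
                push = 0.01 / dist ** 2
                shift[a][0] += push * dx / dist
                shift[a][1] += push * dy / dist
                shift[b][0] -= push * dx / dist
                shift[b][1] -= push * dy / dist
        # Attraction: pull linked nodes together, proportional to distance
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            shift[a][0] -= 0.1 * dx
            shift[a][1] -= 0.1 * dy
            shift[b][0] += 0.1 * dx
            shift[b][1] += 0.1 * dy
        # Move each node, capping the step so the layout stays stable
        for n in nodes:
            sx, sy = shift[n]
            length = math.hypot(sx, sy)
            scale = min(1.0, 0.1 / length) if length > 0 else 0.0
            pos[n][0] += sx * scale
            pos[n][1] += sy * scale
    return pos

# Hypothetical three-node example: a and b are linked, c is isolated
layout = force_layout(["a", "b", "c"], [("a", "b")])
```

The all-pairs repulsion loop is quadratic in the number of nodes, which is one reason a layout can take a long time on a large graph.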

Here’s what it looks like after running for half an hour. As you can see, the network structure is becoming more apparent:

Gephi 0.8.2 - Project 0_008

After stopping the Force Atlas algorithm, the network is a little easier to interpret, but the nodes are all the same size. To emphasize the most important nodes, I adjusted the size of the nodes by eigenvector centrality. As you can see, Erdos is the most influential member.

Gephi 0.8.2 - Project 0_010
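
Eigenvector centrality scores a node highly when its neighbors also score highly. Below is a small power-iteration sketch of that idea in Python. It is an illustration, not Gephi's implementation, and the toy graph is hypothetical. Each update adds the node's own score to the sum of its neighbors' scores, which is a common trick to keep the iteration from oscillating on star-like graphs without changing the ranking.

```python
def eigenvector_centrality(adjacency, iterations=100):
    """Power iteration on an undirected graph given as node -> set of neighbors.

    Scores are rescaled each round so the highest-scoring node gets 1.0.
    """
    score = {node: 1.0 for node in adjacency}
    for _ in range(iterations):
        # New score = own score + sum of neighbors' scores (iterates A + I)
        new = {node: score[node] + sum(score[nb] for nb in adjacency[node])
               for node in adjacency}
        top = max(new.values())
        score = {node: value / top for node, value in new.items()}
    return score

# Hypothetical toy graph: a hub linked to everyone, plus one extra link A-B
toy = {
    "Erdos": {"A", "B", "C"},
    "A": {"Erdos", "B"},
    "B": {"Erdos", "A"},
    "C": {"Erdos"},
}
scores = eigenvector_centrality(toy)
```

Here the hub ends up with the top score, A and B tie, and C, whose only link is to the hub, scores lowest.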

After running a community detection algorithm, we can see several distinct communities of mathematicians amongst the collaborators, indicated by color:

Gephi 0.8.2 - Project 0_011
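
I did not note which community detection algorithm was used here, but many of them score a candidate grouping by its modularity: the fraction of edges that fall inside communities, minus the fraction expected if edges were placed at random with the same node degrees. Here is a short Python sketch of that score for an undirected graph. The function name `modularity` and the example graph are made up for illustration.

```python
def modularity(edges, community):
    """Modularity Q of a partition of an undirected, unweighted graph.

    `edges` is a list of (a, b) pairs; `community` maps node -> community ID.
    Q = sum over communities of (internal edges / m) - (degree sum / 2m)^2.
    """
    m = len(edges)
    internal = {}    # community -> number of edges with both ends inside it
    degree_sum = {}  # community -> total degree of its nodes
    for a, b in edges:
        ca, cb = community[a], community[b]
        degree_sum[ca] = degree_sum.get(ca, 0) + 1
        degree_sum[cb] = degree_sum.get(cb, 0) + 1
        if ca == cb:
            internal[ca] = internal.get(ca, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())

# Hypothetical example: two triangles joined by a single edge (3, 4)
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
split = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1}
lumped = {n: 0 for n in range(1, 7)}
q_split = modularity(edges, split)
q_lumped = modularity(edges, lumped)
```

Splitting the two triangles into separate communities gives Q = 5/14, while lumping all six nodes together gives Q = 0, so the split counts as the better grouping.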

The final output is below. Click to see it in full resolution (it might take a while to render):

erdos2

 

Posted in: Uncategorized / Tagged: erdos, erdos collaboration graph, erdos network graph, erdos number

© Copyright 2025 - Gene Dan's Blog