No. 101: Visualizing Networks

Hello,

Today I’d like to give a brief update on some of the things I’ve been working on over the past few months, and perhaps briefly cover some projects that I’ve planned for myself for the upcoming year.

Networks

A few months back, an actuary contacted me and asked if I wanted to study network analysis. I had looked into the subject a year ago and even bought some books (Networks by Mark Newman and Networks, Crowds, and Markets by Easley and Kleinberg), but never got around to reading them. Last week, I finally started reading the Easley and Kleinberg text, and right now I’m in the early stages, covering basic terminology.

In short, the study of networks is a combination of graph theory and game theory, and is used to study crowd behavior and things like conformity, political instability, epidemics, and market collapse. These things have interested people for some time as they are social phenomena that have, from time to time, led to social upheaval and destruction.

Visualization

Over the past decade, the amount of data that we have on social networks has grown exponentially, and so has computing power. As a result, the empirical analysis of networks, which was once impractically expensive and time-consuming, is now possible. I’ve initiated a project called hxgrd, which will essentially serve as a platform for simulating discrete, turn-based behavior amongst crowds. So far I’ve only initialized the repository (I’ll talk more about this GitHub project in another post), and I’m looking for some software that I can integrate into the platform to save me some time later on.

I stumbled across a program called Gephi, which is an open-source tool used to visualize and analyze networks. I downloaded the program out of curiosity and went through their tutorial, which involves visualizing the relationships between the characters of Les Misérables (perhaps you have read the book or seen the musical). Here is a chart generated by Gephi:

The graph consists of circles, called nodes, and lines connecting these nodes, called edges. Each circle represents a character that appears in the novel. Each line represents an association between characters. The size of the circles and names of the characters vary proportionally with the number of connections that a character has. As you can see here, Jean Valjean, the main character, has the greatest number of connections.
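To make the sizing rule concrete, here is a minimal sketch of counting each node's connections (its degree) and rescaling those counts into drawing sizes. The edge list is a made-up toy example, not the actual Les Misérables dataset, and the function names and size range are my own assumptions; this is not how Gephi itself is implemented.

```python
from collections import defaultdict

# Hypothetical toy edge list (illustration only, not the real dataset).
edges = [
    ("Valjean", "Javert"),
    ("Valjean", "Cosette"),
    ("Valjean", "Fantine"),
    ("Valjean", "Marius"),
    ("Cosette", "Marius"),
    ("Fantine", "Javert"),
]


def degrees(edge_list):
    """Count the edges touching each node of an undirected graph."""
    counts = defaultdict(int)
    for u, v in edge_list:
        counts[u] += 1
        counts[v] += 1
    return dict(counts)


def node_sizes(degree_by_node, min_size=10.0, max_size=50.0):
    """Linearly rescale degrees into a range of drawing sizes."""
    lo = min(degree_by_node.values())
    hi = max(degree_by_node.values())
    span = (hi - lo) or 1  # avoid dividing by zero when all degrees match
    return {
        node: min_size + (deg - lo) / span * (max_size - min_size)
        for node, deg in degree_by_node.items()
    }


deg = degrees(edges)
sizes = node_sizes(deg)
most_connected = max(deg, key=deg.get)
print(most_connected, deg[most_connected], sizes[most_connected])
```

In this toy graph the node with the most edges gets the largest circle, which mirrors what the chart does for the full cast.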

However, having the most connections doesn’t necessarily make a character the most influential. An alternative measure of a node’s importance is betweenness centrality, which reflects how often a node lies on the shortest paths between other nodes. Below, we can see that Fantine has the highest betweenness centrality:
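As an aside, betweenness centrality can be computed by hand on a small graph. The sketch below uses Brandes' algorithm for an unweighted, undirected graph and returns unnormalized scores. The graph is a hypothetical one (two tightly knit groups joined by a single "bridge" node), chosen to show that a node with few connections can still score highest; I have not checked how Gephi computes or normalizes its own figure.

```python
from collections import deque
from itertools import combinations


def build_adjacency(edge_list):
    """Turn an undirected edge list into a node -> set of neighbours map."""
    adjacency = {}
    for u, v in edge_list:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def betweenness_centrality(adjacency):
    """Brandes' algorithm; scores are unnormalized pair counts."""
    score = {v: 0.0 for v in adjacency}
    for source in adjacency:
        # Breadth-first search from source, counting shortest paths.
        order = []
        predecessors = {v: [] for v in adjacency}
        num_paths = {v: 0 for v in adjacency}
        distance = {v: -1 for v in adjacency}
        num_paths[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:  # first time w is reached
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:  # v is on a shortest path to w
                    num_paths[w] += num_paths[v]
                    predecessors[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        dependency = {v: 0.0 for v in adjacency}
        for w in reversed(order):
            for v in predecessors[w]:
                dependency[v] += num_paths[v] / num_paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: s / 2 for v, s in score.items()}


# Hypothetical graph: two fully connected groups of four, joined via "bridge".
toy_edges = (
    list(combinations(["a", "b", "c", "d"], 2))
    + list(combinations(["e", "f", "g", "h"], 2))
    + [("d", "bridge"), ("bridge", "e")]
)
adjacency = build_adjacency(toy_edges)
scores = betweenness_centrality(adjacency)
top_node = max(scores, key=scores.get)
print(top_node, scores[top_node], len(adjacency[top_node]))
```

Here "bridge" has only two connections while "d" and "e" have four each, yet every shortest path between the two groups has to pass through "bridge", so it ends up with the highest score.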

Gephi can also determine the groups to which characters belong, denoted by color:
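The post doesn't say how the groups are found. One common criterion for judging a grouping is modularity, which compares the share of edges falling inside groups with what you would expect if edges were placed at random. The sketch below only scores a grouping that is already given; it does not search for the best one, and the graph and group labels are hypothetical.

```python
from itertools import combinations


def modularity(edge_list, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph.

    Q = sum over communities of (internal_edges / m) - (degree_sum / 2m)^2,
    where m is the total number of edges.
    """
    m = len(edge_list)
    internal = {}    # edges with both endpoints in the same community
    degree_sum = {}  # total degree of the nodes in each community
    for u, v in edge_list:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )


# Hypothetical graph: two fully connected groups of four, joined via "bridge".
group_edges = (
    list(combinations(["a", "b", "c", "d"], 2))
    + list(combinations(["e", "f", "g", "h"], 2))
    + [("d", "bridge"), ("bridge", "e")]
)
nodes = ["a", "b", "c", "d", "bridge", "e", "f", "g", "h"]

# Two candidate groupings: everything in one group, or a left/right split.
one_group = {n: "all" for n in nodes}
two_groups = {n: "left" for n in ["a", "b", "c", "d", "bridge"]}
two_groups.update({n: "right" for n in ["e", "f", "g", "h"]})

q_one = modularity(group_edges, one_group)
q_two = modularity(group_edges, two_groups)
print(q_one, q_two)
```

Putting every node in a single group scores zero, while the left/right split scores noticeably higher, which matches the intuition that the split reflects real structure. Each group label could then be mapped to a color for drawing.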

The dataset that was used to generate these diagrams is in XML format:

And below you can see what the complete GUI looks like:

Well that’s it for today! I’ll look into the software to understand how it works and to see if I can integrate parts of it into my hxgrd project.

4 Thoughts on “No. 101: Visualizing Networks”

DW on 26 December, 2013 at 2:43 PM said:

Any ideas about insurance applications for this kind of stuff?

I often come across interesting analytic frameworks and tools and sigh with disappointment that there is no way for me to use them at work. And so only a weak justification for learning them.

- Gene Dan on 26 December, 2013 at 7:54 PM said:
  
  I’ve thought about what would happen if the market for catastrophe modeling software were to be monopolized or concentrated – if the vast majority of insurers were to use models that were written by only a handful of companies, what kind of systemic risk would they face? I believe this phenomenon could be explored via network analysis.
  
  I’m not sure how applicable this work would be for an actuarial department, but it would be useful for anyone looking to prevent system-wide collapses.
  
meep on 20 January, 2014 at 5:52 PM said:

I was thinking of doing something like this with reinsurance — I’ve got access to the full row-by-row reinsurance schedules from the annual statement (I’ve only been pulling for life insurance but I could get at least two years of P&C)

Pingback: No. 111: Visualizing Erdos Collaborators | Gene Dan's Blog
