Gene Dan's Blog

No. 111: Visualizing Erdos Collaborators

20 November, 2014 8:31 PM / Gene Dan

So a while back I wrote about visualizing networks – well, I’m back at it again and I found a sweet dataset containing the Enron emails. However, that dataset might take a few days’ worth of computing power to process, so in the meantime I’ll be showing a much more manageable dataset containing research partnerships of mathematicians who collaborated with Paul Erdos.

In short, Paul Erdos was a great 20th-century mathematician who was famous for his eccentric behavior, prolific publishing, and copious consumption of amphetamines and caffeine. Today, mathematicians often refer to something called an Erdos number, which indicates one’s closeness to the late mathematician. For example, those who published papers directly collaborating with Erdos himself receive an Erdos number of 1, those who collaborated with mathematicians whose Erdos number is 1, but not with Erdos himself, are given an Erdos number of 2, and so on.
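Put another way, an Erdos number is the shortest-path distance to Erdos in the collaboration graph. Here is a minimal sketch of that idea using breadth-first search. The function name and the author names A, B, C, and D are made up for illustration and are not taken from the dataset.

```python
from collections import deque

def erdos_numbers(collaborations, root="Erdos"):
    """Return {author: Erdos number} for every author reachable from root.

    collaborations is a list of (author, author) pairs, one per joint paper.
    """
    # Build an undirected adjacency map from the pairs.
    neighbors = {}
    for a, b in collaborations:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # Breadth-first search visits authors in order of distance from root,
    # so the first time we see an author we know their Erdos number.
    numbers = {root: 0}
    queue = deque([root])
    while queue:
        person = queue.popleft()
        for other in neighbors.get(person, ()):
            if other not in numbers:
                numbers[other] = numbers[person] + 1
                queue.append(other)
    return numbers

# Hypothetical example: A and B wrote with Erdos, C only with A, D only with C.
papers = [("Erdos", "A"), ("Erdos", "B"), ("A", "C"), ("C", "D"), ("A", "B")]
print(erdos_numbers(papers))
```

Authors with no chain of collaborations leading back to Erdos simply do not appear in the result.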

 

Downloading the Dataset

You can get the dataset here. It is a simple text file, although it needs to be processed into the .gexf XML format in order to be used with Gephi.

Erdos02 (~-Desktop) - gedit_003

The text file can be divided into two main sections. The first contains the node labels, which are the names of the mathematicians who collaborated with Erdos. The second contains the edges, which link one mathematician to another.

Erdos02 (~-Desktop) - gedit_004
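As a rough sketch of how such a two-section file could be read, here is a small parser. It assumes a layout the post only shows in screenshots: a header line starting with `*` opens each section, node lines look like `1 "LAST, FIRST"`, and each edge line lists a source id followed by its neighbors. The function name and the sample labels are hypothetical.

```python
def parse_two_section_file(text):
    """Split the text into (labels, edges).

    labels maps node id -> name; edges is a list of (source, target) pairs.
    Assumes '*'-prefixed section headers, with the node section first.
    """
    labels = {}
    edges = []
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("*"):
            # The first header opens the node section, any later one the edges.
            section = "nodes" if section is None else "edges"
            continue
        if section == "nodes":
            node_id, label = line.split(None, 1)
            labels[int(node_id)] = label.strip().strip('"')
        elif section == "edges":
            ids = [int(token) for token in line.split()]
            source, targets = ids[0], ids[1:]
            edges.extend((source, target) for target in targets)
    return labels, edges

# Made-up sample in the assumed layout.
sample = """*Vertices 3
1 "ERDOS, PAUL"
2 "COAUTHOR, A"
3 "COAUTHOR, B"
*Edgeslist
1 2 3
2 3
"""
labels, edges = parse_two_section_file(sample)
print(labels)
print(edges)
```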

I had to use a script to transform the text file into XML format. I was impatient, so I used an Excel VBA macro. If I were doing a more important project, I would have chosen otherwise.

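Option Explicit

' Flatten the adjacency list on the active sheet (source id in column A,
' neighbor ids in the columns to its right) into an edge list on the "edges" sheet.
Sub create_edges()

Dim x As Long, y As Long, z As Long
Dim currfrom As Long, currend As Long, curredge As Long

curredge = 0
For x = 1 To 507
    ' Source node id for this row
    currfrom = Range("A1").Offset(x - 1, 0).Value
    ' Number of neighbors = non-empty cells in the row, minus the source id
    y = WorksheetFunction.CountA(Range(x & ":" & x)) - 1
    For z = 1 To y
        currend = Range("A1").Offset(x - 1, z).Value
        ' Advance to the next edge id, which is also the next output row
        curredge = curredge + 1
        Sheets("edges").Range("A1").Offset(curredge - 1, 0).Value = curredge
        Sheets("edges").Range("A1").Offset(curredge - 1, 1).Value = currfrom
        Sheets("edges").Range("A1").Offset(curredge - 1, 2).Value = currend
    Next z
Next x

End Sub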

erdos (copy).gexf (~-Desktop) - gedit_006

The above image shows the processed data in XML form.
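For anyone who would rather skip the spreadsheet, here is a minimal sketch of writing the same kind of file with Python's standard library. It assumes the labels and edges are already in memory. The function name and sample data are hypothetical, and only the basic GEXF 1.2 elements (`nodes`, `node`, `edges`, `edge`) are emitted, so it is not a copy of the file in the screenshot.

```python
import xml.etree.ElementTree as ET

GEXF_NS = "http://www.gexf.net/1.2draft"

def to_gexf(labels, edges):
    """Return a GEXF document (as a string) for an undirected graph.

    labels maps node id -> name; edges is a list of (source, target) pairs.
    """
    gexf = ET.Element("gexf", {"xmlns": GEXF_NS, "version": "1.2"})
    graph = ET.SubElement(gexf, "graph",
                          {"mode": "static", "defaultedgetype": "undirected"})

    nodes_el = ET.SubElement(graph, "nodes")
    for node_id, label in sorted(labels.items()):
        ET.SubElement(nodes_el, "node", {"id": str(node_id), "label": label})

    edges_el = ET.SubElement(graph, "edges")
    for edge_id, (source, target) in enumerate(edges):
        ET.SubElement(edges_el, "edge", {"id": str(edge_id),
                                         "source": str(source),
                                         "target": str(target)})
    return ET.tostring(gexf, encoding="unicode")

# Made-up data, just to show the shape of the output.
print(to_gexf({1: "ERDOS, PAUL", 2: "COAUTHOR, A"}, [(1, 2)]))
```

Building the XML through ElementTree rather than string concatenation means characters such as `&` or quotes in a name get escaped automatically.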

Visualizing the Collaborators

The next step is to import the dataset into Gephi. At first, the visual form of the data looks like a meaningless blob:

Gephi 0.8.2 - Project 0_002

To remedy this, I used the Force Atlas algorithm to spread the nodes out to visualize the network structure.

Midway through the Force Atlas Algorithm
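To give a feel for what a force-directed layout does, here is a small sketch. It is not Gephi's Force Atlas. It is a simplified Fruchterman-Reingold-style spring layout, swapped in because it is short: every pair of nodes repels, linked nodes attract, and a cooling "temperature" caps how far a node may move per step. All names and constants are my own assumptions.

```python
import math
import random

def force_layout(nodes, edges, iterations=200, seed=0):
    """Return {node: [x, y]} after a simple spring-style relaxation."""
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for n in nodes}
    k = 1.0 / math.sqrt(len(nodes))  # preferred distance between linked nodes

    def push(a, b, disp, strength_of):
        # Apply a force along the line a-b; positive strength pushes apart.
        dx = pos[a][0] - pos[b][0]
        dy = pos[a][1] - pos[b][1]
        dist = max(math.hypot(dx, dy), 1e-9)
        force = strength_of(dist)
        disp[a][0] += dx / dist * force
        disp[a][1] += dy / dist * force
        disp[b][0] -= dx / dist * force
        disp[b][1] -= dy / dist * force

    for step in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}
        # Repulsion between every pair of nodes.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                push(a, b, disp, lambda d: k * k / d)
        # Attraction along every edge.
        for a, b in edges:
            push(a, b, disp, lambda d: -d * d / k)
        # Move each node, capped by a temperature that cools to zero.
        temp = 0.1 * (1.0 - step / iterations)
        for n in nodes:
            length = max(math.hypot(disp[n][0], disp[n][1]), 1e-9)
            move = min(length, temp)
            pos[n][0] += disp[n][0] / length * move
            pos[n][1] += disp[n][1] / length * move
    return pos

print(force_layout(["a", "b", "c"], [("a", "b"), ("b", "c")]))
```

The all-pairs repulsion loop is quadratic in the number of nodes, which hints at why a layout on a few thousand nodes can take a long time to settle.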


Here’s what it looks like after running for half an hour. As you can see, the network structure is becoming more apparent:

Gephi 0.8.2 - Project 0_008

After stopping the Force Atlas algorithm, the network is a little easier to interpret, but the nodes are all the same size. To emphasize the most important nodes, I adjusted the size of the nodes by eigenvector centrality. As you can see, Erdos is the most influential member.

Gephi 0.8.2 - Project 0_010
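Eigenvector centrality scores a node highly when its neighbors are themselves highly scored. Here is a minimal power-iteration sketch of the idea, not Gephi's implementation. The function name and the toy star graph are hypothetical, and each step adds the node's own score (iterating on A + I rather than A) so that the iteration also settles on bipartite graphs such as a star.

```python
def eigenvector_centrality(adjacency, iterations=100):
    """adjacency maps node -> iterable of neighbors (undirected graph).

    Returns scores scaled so that the largest is 1.0.
    """
    score = {node: 1.0 for node in adjacency}
    for _ in range(iterations):
        # New score = own score + sum of neighbors' scores (A + I shift).
        updated = {node: score[node] + sum(score[m] for m in adjacency[node])
                   for node in adjacency}
        largest = max(updated.values())
        score = {node: value / largest for node, value in updated.items()}
    return score

# Toy star graph: a hub connected to three authors who never wrote together.
star = {
    "Erdos": ["A", "B", "C"],
    "A": ["Erdos"],
    "B": ["Erdos"],
    "C": ["Erdos"],
}
scores = eigenvector_centrality(star)
print(max(scores, key=scores.get))  # the hub gets the highest score
```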

After running a community detection algorithm, we can see several distinct communities of mathematicians amongst the collaborators, indicated by color:

Gephi 0.8.2 - Project 0_011
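I did not record which community detection algorithm was run, but many of them (the Louvain method, for example) work by searching for a partition with high modularity. Here is a sketch of how that score is computed for a given partition. The function name and the toy graph are hypothetical, and this only scores a partition; it does not search for one.

```python
def modularity(edges, community):
    """Modularity Q of a partition of an undirected, unweighted graph.

    edges is a list of (a, b) pairs; community maps node -> community label.
    Q = sum over communities of (internal edges / m) - (degree sum / 2m)^2.
    """
    m = len(edges)
    internal = {}    # community -> number of edges with both ends inside
    degree_sum = {}  # community -> total degree of its members
    for a, b in edges:
        for node in (a, b):
            label = community[node]
            degree_sum[label] = degree_sum.get(label, 0) + 1
        if community[a] == community[b]:
            internal[community[a]] = internal.get(community[a], 0) + 1
    return sum(internal.get(label, 0) / m - (total / (2 * m)) ** 2
               for label, total in degree_sum.items())

# Two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "left", 1: "left", 2: "left", 3: "right", 4: "right", 5: "right"}
lumped = {node: "all" for node in range(6)}
print(modularity(edges, split) > modularity(edges, lumped))
```

Splitting the two triangles scores higher than lumping everything together, which matches the intuition that each triangle is its own community.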

The final output is below. Click to see it in full resolution (it might take a while to render):

erdos2

 

Posted in: Uncategorized / Tagged: erdos, erdos collaboration graph, erdos network graph, erdos number

No. 110: The Black Market

17 November, 2014 8:10 PM / Gene Dan

2013 was an eventful year for the black market, with Silk Road (and now Silk Road 2.0) having been seized by the federal government, substantially impacting the value of bitcoins relative to the U.S. Dollar (although the vast majority of bitcoins are used in legal transactions). That, in addition to other developments in cyberspace including the government’s involvement in surveilling and controlling it, has gotten me interested in the underground economy: how it works, and how the government and seemingly legitimate private organizations are involved in it. I’ve started to educate myself with Andrew Feinstein’s book on the global arms trade, but before that I would like to state some of my naive assumptions about the way the black market works, and then maybe come back to this in a few months’ time once I’ve learned a little bit more.

Selection_006

The Black Market Exists Because Demand is Left Unsatisfied in Conventional Markets

Simply put, people want goods that aren’t available from legally authorized markets. People want drugs. They want pirated media and counterfeits of name-brand products. Vengeful lovers want murders carried out on their behalf and curious individuals might want to try some products from countries currently under economic sanctions.

The Absence of Property Rights Leads to Organized Crime

The threat of punishment by the authorities, in tandem with government control over conventional routes of transportation, deters everyone but the most die-hard suppliers from reaching their customers. This government-induced scarcity causes prices of some illicit goods or services to be thousands of times higher than they would be in the absence of government restriction. The high price drives those who are the most willing to risk punishment to take extreme measures to make markets.

Unlike suppliers of legal goods, black market suppliers do not receive government protection via property rights. This creates the need for black market businessmen to raise their own private armies to protect their own inventory along with their supply chain affiliates – this leads to the formation of quasi-governmental entities that are commonly known as gangs, or drug cartels. If you think about it, protection money is kind of like taxes – in exchange for a payment, you will be protected by the gang. If you fail to pay up, you’ll get your property sacked and your kneecaps broken. Likewise with taxes, in exchange for a payment you will be protected by an army. If you fail to pay up, you’ll either go to jail, face garnishment of wages, or both. So a gang is, in a way, a de facto government overseeing the market for whatever illegal good or service it sells. The lack of property rights creates a huge incentive for rival gangs to fight over control of the market, as consolidation would allow them to further control prices via monopoly power. Now, you might ask, why then aren’t Dell and Lenovo raising their own private armies and blasting each other to smithereens to gain control over the computer industry? The reason is that it’s too expensive – computer prices simply aren’t high enough to justify the cost of raising an army and going to war, because the production, transport, and sale of computers aren’t restricted by the government. In other words, there’s no artificial scarcity with respect to computers.

Violence as a Symptom

You might have heard that buying drugs and counterfeit goods supports violence. Well, in this steady-state of affairs it does – your money would likely end up in some drug lord’s hands, which could fuel his next turf war – but it’s a poor justification for discouraging someone from buying something on the black market. One should ask why the market for these goods is controlled by violent drug lords in the first place. The high price of restricted goods, along with the lack of property rights, encourages the formation of privately-armed businesses, which in turn increases the demand for weapons, which in turn fuels the illegal arms market – not a pretty picture. Legalization of drugs would lead to a result similar to legalized alcohol: the lower price would decrease the presence of organized crime and street violence, but at the cost of public health and safety, which would also put a strain on our public services. So sometimes the question of legalization is not so easy to answer.

 

 

Posted in: Uncategorized

No. 109: Interlude

27 October, 2014 9:07 PM / Gene Dan

Let \(\mathbf{y}\sim \text{MVN}(\mu,\mathbf{V})\), where \(\mathbf{y}\) has \(n\) elements but the \(Y_i\)’s are not independent so that the number \(k\) of linearly independent rows (or columns) of \(\mathbf{V}\) (that is, the rank of \(\mathbf{V}\)) is less than \(n\) and so \(\mathbf{V}\) is singular and its inverse is not uniquely defined. Let \(\mathbf{V}^{-}\) denote a generalized inverse of \(\mathbf{V}\) (that is a matrix with the property that \(\mathbf{V}\mathbf{V}^{-}\mathbf{V}=\mathbf{V}\)). Then the random variable \(\mathbf{y}^T\mathbf{V}^{-}\mathbf{y}\) has the non-central chi-squared distribution with \(k\) degrees of freedom and non-centrality parameter \(\lambda=\mathbf{\mu}^T\mathbf{V}^{-}\mathbf{\mu}\).

Whenever I read a paragraph like this, a few things pop into my mind: 1) I should have paid more attention in linear algebra class, 2) my current method of study isn’t conducive to long-term memory retention, and 3) how many of my peers really have a firm grasp over statistical theory?

It’s painful. Mathematics texts, especially advanced undergraduate/early graduate texts such as this one, require the reader to have a firm grasp of prerequisite subjects. In the example above, I’ve forgotten most of the terminology, and at best I have a vague recollection of some of the vocabulary from my freshman matrices course. Linear independence, rank, and invertibility will be easy to look up – those are taught early on in a typical linear algebra course. Only then will I be able to confirm that if the rank of V is less than n, then V is singular and its inverse is not uniquely defined. And then there’s the concept of a generalized inverse. That’s not something I ever covered as an undergrad and I’ll need to learn what that means.
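A tiny numerical example helps make the definition concrete. This is a sketch of my own, not from the textbook: V is a 2×2 covariance matrix of rank 1 (two perfectly correlated variables), and two different matrices both satisfy the defining property VV⁻V = V, which is what "not uniquely defined" means here. The helper names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def is_generalized_inverse(v, g, tol=1e-12):
    """Check the defining property V G V = V entry by entry."""
    vgv = matmul(matmul(v, g), v)
    return all(abs(vgv[i][j] - v[i][j]) <= tol
               for i in range(len(v)) for j in range(len(v[0])))

def quadratic_form(y, g):
    """Compute y^T G y for a vector y given as a flat list."""
    row = matmul([y], g)[0]
    return sum(r * yi for r, yi in zip(row, y))

V = [[1.0, 1.0], [1.0, 1.0]]       # rank 1, so singular (determinant is 0)
G1 = [[1.0, 0.0], [0.0, 0.0]]      # one generalized inverse
G2 = [[0.25, 0.25], [0.25, 0.25]]  # another one (the Moore-Penrose inverse)

print(is_generalized_inverse(V, G1), is_generalized_inverse(V, G2))

# For a y in the column space of V, both choices give the same quadratic form.
y = [2.0, 2.0]
print(quadratic_form(y, G1), quadratic_form(y, G2))
```

In this example the quadratic form yᵀV⁻y comes out the same for both choices, which is reassuring given that the quoted result does not say which generalized inverse to use.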

For a paragraph like this, I have a few options. Looking up forgotten terminology is a must, but what about the terms I haven’t seen before? With the case of a generalized inverse, I might have to spend an hour or two practicing before I can move on to the next paragraph. For more obscure concepts, learning them might take several days or more – but then there’s the practical constraint on things I need to accomplish at work, so sometimes I will need to assume that a claim is true, move on to the next paragraph, and look into it later after I’ve learned more math.

Nowadays it’s becoming increasingly common for me to read books that require me to draw on previous knowledge from several subjects. In the case of generalized linear models, I need to know linear algebra, calculus, statistics, and computer programming. This means that as I move on to more advanced subjects, I no longer have the liberty to promptly forget something after I’ve covered it. Therefore, at this point I’ve made the decision to restructure my learning technique to be more conducive to long-term retention.

I’ll need to reduce the pace at which I’m moving through textbooks. Up until now I’ve been reading textbooks at a pace of 10 pages per day, maybe reading 3 textbooks at a time. That’s too fast – it isn’t slow enough for me to be able to transfer material from short-term to long-term memory (see my post on spaced repetition). But if I simply slowed down, my learning would slow to a crawl and I’d die before I really learned anything – so I’ll need to study additional subjects to make up for lost volume. Therefore, my schedule will resemble something closer to reading 10 books at the same time but only 3 pages per day, which keeps the total at about 30 pages per day. After each 3-page session, notecards are compiled, and after all 10 books are read the cards are combined and then reviewed. This technique is called interleaving: learning from several different subjects at once. It is effective because real-life problems rarely present themselves in the ordered, compartmentalized fashion that we are accustomed to in the classroom – essentially, non-interleaved practice fails to recognize that real-life problems are chaotic. Interleaved study makes practice closer to performance.

Selection_003

Current progress on spaced repetition

I haven’t decided what to do with the 70 Days of Linear Algebra project that I postponed after moving. That project doesn’t fit my new model of study, but I’d still like to finish it. Most likely, I’ll try to pick up where I left off while applying the new model to subsequent pursuits.

Posted in: Logs

No. 108: My Path to Enlightenment

12 August, 2014 12:50 AM / Gene Dan

So I want to understand everything. Unfortunately, my limited lifespan, along with the seemingly infinite amount of information out there, makes this goal impossible. I suppose I should try to understand as much of the universe as possible while making my best effort to contribute to mankind’s body of knowledge.

I’ve been playing around with an open source diagramming tool called Dia, which makes it easy to draw all sorts of visual models from EER diagrams to project management flowcharts. One of the challenges that exists when it comes to discovery is being able to successfully communicate your findings to a wider audience. You might discover something profound, but if you cannot get anyone else to understand what you have found, or at least be aware of it, whatever you have found will be lost to humanity after your passing.

Fortunately, in this day and age, we have the internet. That gives society the capacity to share information at previously unimaginable speeds – and Dia is just one tool out of many that allows people to distill complex ideas into simple diagrams, to be sent to a wide audience via the information superhighway. There are tradeoffs, of course. A diagram cannot capture every single detail about a concept and thus can leave out crucial information. However, the need to quickly reach as many people as possible with a basic concept outweighs the need to cram every single detail possible into a single transmission.

Anyway, I have been using Dia in an attempt to further clarify my educational goals by sketching visual models of the interdependencies of the various subjects that I’ve been studying. I have had an increasing interest in studying the physical world – of the physical sciences, I’ve studied biology the most (3 semesters including genetics), but I never really got around to seriously studying chemistry or physics. So the initial goal for me in this realm (I’ve called it “Trinity”) is to get a firm footing in general bio/chem/phys. Using basic college texts, in combination with spaced repetition techniques, I think I’ll be able to understand and retain enough information to tackle the interdisciplinary subjects of physical chemistry, biochemistry, and biophysics.

Trinity

However, I’ve found out that you cannot study subjects in isolation: there will always be times when you’ll need to pull information from other fields to tackle a problem. I encountered this issue when studying genetics in college, where a good grasp of combinatorics is needed. Likewise in general chemistry, solving systems of linear equations is required to balance chemical equations. I majored in math so I have visited quite a few of the subjects below. The diagram is oversimplified, as you cannot realistically expect such a clean linear progression when studying mathematics:

math

And then there’s Philosophy. I might have taken 6 or 7 philosophy courses in college. Unfortunately, most of them involved reading excerpts from famous philosophers (Socrates, Plato, Descartes, etc.) and didn’t cover any general philosophy, so I lack the vocabulary to articulate what I’d like to study here. As I look into this subject more deeply I’ll be able to add more things to the diagram:

ph

I majored in economics, so the only topics I haven’t visited below are advanced macro/microeconomic theory, which are graduate subjects:

econ

The use of computers has greatly amplified mankind’s ability to synthesize and make use of information. For individuals, it has increased their ability to access and organize information for their own purposes. Computers are immensely useful. They allow people to calculate, and to conduct experiments via simulation that would be practically infeasible in society due to various constraints:

cs

So putting everything together…

cyb

In short, I like to study systems. I want to know more about power and control, how economies rise and how they collapse, and how biological and social systems remain stable or evolve over time. The closest thing I could find that’s similar to this idea is cybernetics, but I have to admit that the Wikipedia article is currently over my head, so I could be wrong, and I’d have to update the diagram if that’s the case.

Anyway the diagram isn’t accurate – many of these subjects aren’t concretely defined and there’s a lot of overlap between them. Likewise the order of study and the interdependencies aren’t as neat either, but at the very least, articulating my thoughts is a start and invites feedback. As I proceed, I’ll encounter mistakes and dead ends, and corrections will have to be made, but that’s all part of the learning process.

Posted in: Logs, Mathematics / Tagged: cybernetics, enlightenment, systems

No. 107: My Move to Chicago

4 August, 2014 10:38 PM / Gene Dan

As I said in my previous post, I have a big announcement to make – last month I received two very competitive job offers from insurance firms in Chicago, both of which involved predictive analytics. While the choice between the two firms wasn’t easy, the choice to move to Chicago was – I have lived in Houston for more than two decades of my life and in Texas for almost all of it. As a young adult, I think moving to a city where I know almost no one would be good for my personal development and maturity. My opinion is that the lack of close social ties within this city will encourage me to make decisions on my own without relying on other people – the safety net of familiarity.

It’s a shame that I’m leaving Houston right when the economy is booming in Texas. People are flocking to the state and its cities are experiencing high growth rates. Having lived near the center of Houston for three years, I’ve seen the area rapidly transform as modern high-rise buildings have started to multiply at an astonishingly brisk pace. Areas that were once considered to be run-down, crime-infested slums, such as the Heights, are now considered some of the best places to move for someone fresh out of college.

As Houston metamorphoses into a modern city, I’m sure the infrastructure and mass transit will become more accommodating to the influx of people coming here to find jobs in the energy sector. However, in its current state the city is a long way from matching the transit systems of more developed places such as Manhattan or Chicago, or even those of smaller cities like Washington D.C. or San Francisco. Most people still need a car to accomplish simple tasks like picking up groceries, or even visiting friends within the same neighborhood. At the same time, Houston shows a lot of promise – the lack of zoning laws ought to make it easier for developers to construct high-density residential complexes to absorb the increased demand for housing – hopefully avoiding (or at least mitigating) a property bubble like the current one in the Bay Area. I have hopes that one day high-speed rail will connect the major metropolitan areas of Texas, theoretically allowing a person to live in Austin and work in Houston with a commute of less than an hour. That would enable people to seek the best job opportunities in other cities without having to uproot their families – for example, a family based in Dallas might have a mother who works in Austin and a father who works in Houston. A transit system like this would provide the combination of career flexibility and domestic stability that families need.

However, these dreams are a long way off. Even if such projects were to commence today, it would take an entire generation to realize the benefits from the investment. At my age I can’t wait for something like this to be completed, or even to be approved in the first place. So, I have packed up my bags and moved across the country to experience city life while I still can, in a place where the population density is more than three times higher. So far, I’ve been here for a week – it takes me 15 minutes to walk to work, and there is a grocery store right across from my apartment. The night life is amazing, and trains and cabs can take me anywhere I might need to go. At the moment, Chicago can provide me with a lifestyle that Houston can’t – although I hope that as Houston matures, new residents and the next generation of college graduates can enjoy the features and opportunities that a modern city provides.

10498503_10202506001681171_1371097549280330539_o

Posted in: Logs
