Gene Dan's Blog

Category Archives: Logs

No. 114: Visualizing the Blockchain

24 December, 2014 5:29 PM / Gene Dan

For those of you who don’t know what Bitcoin is, it’s a digital currency that’s been gaining attention over the last few years, mostly due to its obscure user base, popularity on the black market (although most bitcoin transactions are legal), and its exchange rate volatility versus the U.S. dollar.

I’ve been interested in Bitcoin for quite some time because, unlike cash transactions, all bitcoin transactions are recorded on a publicly available ledger called the blockchain. Because the blockchain records every transaction that occurs over the bitcoin network, it can be a valuable source of information, revealing patterns in peer-to-peer monetary transactions that could not be studied under traditional currencies for lack of data.

I stumbled across some CSV files on the internet, available here, that contain parsed blockchain information in a script-friendly format. Using this dataset, I wrote a script to extract the transactions among the first 500 bitcoin users:

# import dataset (no header row, so R names the columns V1, V2, V3, ...)
edges <- read.csv("user_edges.txt", header=FALSE)
head(edges)
 
###subset first n users
lim <- 500
edges.sub <- edges[edges$V2 <= lim & edges$V3 <= lim & (edges$V2 != edges$V3), c("V2","V3")]
head(edges.sub,500)
sub.unique <- edges.sub[!duplicated(edges.sub),]
sub.unique$edgenum <- 1:nrow(sub.unique)
head(sub.unique)
sub.unique$edges <- paste('<edge id="', as.character(sub.unique$edgenum),'" source="', sub.unique$V2, '" target="',sub.unique$V3, '"/>',sep="")
 
###build nodes
nodes <- data.frame(id=sort(unique(c(sub.unique$V2,sub.unique$V3))))
nodes$nodestr <- paste('<node id="', as.character(nodes$id), '" label="',nodes$id, '"/>',sep="")
head(nodes)
 
### build metadata
gexfstr <- '<?xml version="1.0" encoding="UTF-8"?>
<gexf xmlns:viz="http://www.gexf.net/1.1draft/viz" version="1.1" xmlns="http://www.gexf.net/1.1draft">
<meta lastmodifieddate="2010-03-03+23:44">
<creator>Gephi 0.7</creator>
</meta>
<graph defaultedgetype="undirected" idtype="string" type="static">'
 
 
### append nodes
gexfstr <- paste(gexfstr, '\n', '<nodes count="', nrow(nodes), '">\n', sep="")
# collapse all node strings in one call instead of growing the string in a loop
gexfstr <- paste(gexfstr, paste(nodes$nodestr, "\n", sep="", collapse=""), sep="")
gexfstr <- paste(gexfstr, '</nodes>\n', '<edges count="', nrow(sub.unique), '">\n', sep="")
 
### append edges and print to file
gexfstr <- paste(gexfstr, paste(sub.unique$edges, "\n", sep="", collapse=""), sep="")
gexfstr <- paste(gexfstr, '</edges>\n</graph>\n</gexf>', sep="")
fileConn <- file("output.gexf")
writeLines(gexfstr, fileConn)
close(fileConn)
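
To make the subsetting step easier to follow, here is a minimal sketch on a toy edge list. The data are made up, and the meaning of column V1 (a transaction id) is only an assumption for illustration; the script above only relies on V2 and V3 as the two endpoints of each edge.

```r
# Toy edge list in the three-column shape the script assumes.
# V1 is a hypothetical transaction id; V2 and V3 are the two user ids.
edges <- data.frame(
  V1 = 1:6,
  V2 = c(1, 1, 2, 3, 3, 900),
  V3 = c(2, 2, 2, 1, 700, 1)
)

lim <- 3  # keep only users with ids 1 through 3

# Same filter as the script: both endpoints within the limit, no self-loops.
edges.sub <- edges[edges$V2 <= lim & edges$V3 <= lim & edges$V2 != edges$V3,
                   c("V2", "V3")]

# Drop repeated (V2, V3) pairs so each edge is listed once.
sub.unique <- edges.sub[!duplicated(edges.sub), ]

print(sub.unique)
```

Only the pairs (1, 2) and (3, 1) survive: the repeated (1, 2) row is dropped as a duplicate, (2, 2) is a self-loop, and the last two rows involve users beyond the limit. Note that duplicated() treats (3, 1) and (1, 3) as different rows, so a pair that transacts in both directions would still appear twice even though the graph is declared undirected.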

I subsequently imported the output file into Gephi to create a network visualization of the transactions. You can view the process in the video below.

https://www.youtube.com/watch?v=wjw0ksaRSO4&feature=youtu.be

The resulting graph:

Transactions amongst the first 500 users of Bitcoin

Here you can see that the modularity algorithms have identified clusters of tightly-knit users who transact frequently with each other, along with influential users who may be running businesses or may be serving as middlemen between other groups of users.

Posted in: Logs, Mathematics / Tagged: bitcoin, blockchain, graph, network

No. 109: Interlude

27 October, 2014 9:07 PM / Gene Dan

Let \(\mathbf{y}\sim \text{MVN}(\mu,\mathbf{V})\), where \(\mathbf{y}\) has \(n\) elements but the \(Y_i\)’s are not independent so that the number \(k\) of linearly independent rows (or columns) of \(\mathbf{V}\) (that is, the rank of \(\mathbf{V}\)) is less than \(n\) and so \(\mathbf{V}\) is singular and its inverse is not uniquely defined. Let \(\mathbf{V}^{-}\) denote a generalized inverse of \(\mathbf{V}\) (that is a matrix with the property that \(\mathbf{V}\mathbf{V}^{-}\mathbf{V}=\mathbf{V}\)). Then the random variable \(\mathbf{y}^T\mathbf{V}^{-}\mathbf{y}\) has the non-central chi-squared distribution with \(k\) degrees of freedom and non-centrality parameter \(\lambda=\mathbf{\mu}^T\mathbf{V}^{-}\mathbf{\mu}\).

Whenever I read a paragraph like this, a few things pop into my mind: 1) I should have paid more attention in linear algebra class, 2) my current method of study isn’t conducive to long-term memory retention, and 3) how many of my peers really have a firm grasp over statistical theory?

It’s painful. Mathematics texts, especially advanced undergraduate/early graduate texts such as this one, require the reader to have a firm grasp of prerequisite subjects. In the example above, I’ve forgotten most of the terminology, and at best I have a vague recollection of some of the vocabulary from my freshman matrices course. Linear independence, rank, and invertibility will be easy to look up – those are taught early on in a typical linear algebra course. Only then will I be able to confirm that if the rank of \(\mathbf{V}\) is less than \(n\), then \(\mathbf{V}\) is singular and its inverse is not uniquely defined. And then there’s the concept of a generalized inverse. That’s not something I ever covered as an undergrad, and I’ll need to learn what it means.

For a paragraph like this, I have a few options. Looking up forgotten terminology is a must, but what about the terms I haven’t seen before? With the case of a generalized inverse, I might have to spend an hour or two practicing before I can move on to the next paragraph. For more obscure concepts, learning them might take several days or more – but then there’s the practical constraint on things I need to accomplish at work, so sometimes I will need to assume that a claim is true, move on to the next paragraph, and look into it later after I’ve learned more math.
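
As a sketch of what that practice might look like, here is a toy case of my own choosing (it is not from the text quoted above). Assume \(n=2\), a covariance matrix of rank \(k=1\), and equal means \(\mu_1=\mu_2=\mu\), so that \(Y_1=Y_2\) with probability one. Two different matrices both satisfy the defining property \(\mathbf{V}\mathbf{V}^{-}\mathbf{V}=\mathbf{V}\):

\[
\mathbf{V}=\begin{pmatrix}1&1\\1&1\end{pmatrix},\qquad
\mathbf{V}^{-}_{a}=\begin{pmatrix}1&0\\0&0\end{pmatrix},\qquad
\mathbf{V}^{-}_{b}=\tfrac{1}{4}\begin{pmatrix}1&1\\1&1\end{pmatrix}
\]

\[
\mathbf{V}\mathbf{V}^{-}_{a}\mathbf{V}
=\begin{pmatrix}1&0\\1&0\end{pmatrix}\begin{pmatrix}1&1\\1&1\end{pmatrix}
=\begin{pmatrix}1&1\\1&1\end{pmatrix}=\mathbf{V},
\qquad
\mathbf{V}\mathbf{V}^{-}_{b}\mathbf{V}
=\tfrac{1}{4}\mathbf{V}^{3}=\tfrac{1}{4}\,(4\mathbf{V})=\mathbf{V}
\]

\[
\mathbf{y}^T\mathbf{V}^{-}_{a}\mathbf{y}=Y_1^2,
\qquad
\mathbf{y}^T\mathbf{V}^{-}_{b}\mathbf{y}=\tfrac{1}{4}(Y_1+Y_2)^2=Y_1^2
\quad\text{when } Y_1=Y_2
\]

So the generalized inverse is not unique, but in this toy case both choices give the same quadratic form. Since \(Y_1\sim N(\mu,1)\), the quantity \(Y_1^2\) is non-central chi-squared with \(k=1\) degree of freedom, and under the convention used in the quoted paragraph its non-centrality parameter is \(\lambda=\mathbf{\mu}^T\mathbf{V}^{-}\mathbf{\mu}=\mu^2\) for either choice.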

Nowadays it’s becoming increasingly common for me to read books that require me to draw on previous knowledge from several subjects. In the case of generalized linear models, I need to know linear algebra, calculus, statistics, and computer programming. This means that as I move on to more advanced subjects, I no longer have the liberty to promptly forget something after I’ve covered it. Therefore, at this point I’ve made the decision to restructure my learning technique to be more conducive to long-term retention.

I’ll need to reduce the pace at which I’m moving through textbooks. Up until now I’ve been reading textbooks at a pace of 10 pages per day, maybe reading 3 textbooks at a time. That’s too fast for me to transfer material from short-term to long-term memory (see my post on spaced repetition). But if I simply slowed down, my learning would slow to a crawl and I’d die before I really learned anything – so I’ll need to study additional subjects to make up for lost volume. Therefore, my schedule will resemble something closer to reading 10 books at the same time but only 3 pages per day. After each 3-page session, notecards are compiled, and after all 10 books are read the cards are combined and then reviewed. This technique is called interleaving – studying several different subjects at once. It is effective because real-life problems rarely present themselves in the ordered, compartmentalized fashion that we are accustomed to in the classroom – essentially, non-interleaved practice fails to recognize that real-life problems are chaotic. Interleaved study makes practice closer to performance.

Current progress on spaced repetition

I haven’t decided what to do with the 70 Days of Linear Algebra project that I postponed after moving. That project doesn’t fit my new model of study, but I’d still like to finish it. Most likely, I’ll try to pick up where I left off while applying the new model to subsequent pursuits.

Posted in: Logs

No. 108: My Path to Enlightenment

12 August, 2014 12:50 AM / Gene Dan

So I want to understand everything. Unfortunately, my limited lifespan, along with the seemingly infinite amount of information out there, makes this goal impossible. I suppose I should try to understand as much of the universe as possible while making my best effort to contribute to mankind’s body of knowledge.

I’ve been playing around with an open source diagramming tool called Dia, which makes it easy to draw all sorts of visual models, from EER diagrams to project management flowcharts. One of the challenges of discovery is being able to successfully communicate your findings to a wider audience. You might discover something profound, but if you cannot get anyone else to understand what you have found, or at least be aware of it, whatever you have found will be lost to humanity after your passing.

Fortunately, in this day and age, we have the internet. That gives society the capacity to share information at previously unimaginable speeds – and Dia is just one tool out of many that allows people to distill complex ideas into simple diagrams, to be sent to a wide audience via the information superhighway. There are tradeoffs, of course. A diagram cannot capture every single detail about a concept and thus can leave out crucial information. However, the need to quickly reach as many people as possible with a basic concept outweighs the need to cram every single detail possible into a single transmission.

Anyway, I have been using Dia in an attempt to further clarify my educational goals by sketching visual models of the interdependencies of the various subjects that I’ve been studying. I have had an increasing interest in studying the physical world – of the physical sciences, I’ve studied biology the most (3 semesters including genetics), but I never really got around to seriously studying chemistry or physics. So the initial goal for me in this realm (I’ve called it “Trinity”) is to get a firm footing in general bio/chem/phys. Using basic college texts, in combination with spaced repetition techniques, I think I’ll be able to understand and retain enough information to tackle the interdisciplinary subjects of physical chemistry, biochemistry, and biophysics.

[Diagram: Trinity]

However, I’ve found that you cannot study subjects in isolation: there will always be times when you’ll need to pull information from other fields to tackle a problem. I encountered this issue when studying genetics in college, where a good grasp of combinatorics is needed. Likewise, in general chemistry, solving systems of linear equations is required to balance chemical equations. I majored in math, so I have visited quite a few of the subjects below. The diagram is oversimplified, as you cannot realistically expect such a clean linear progression when studying mathematics:

[Diagram: mathematics]

And then there’s philosophy. I might have taken 6 or 7 philosophy courses in college. Unfortunately, most of them involved reading excerpts from famous philosophers (Socrates, Plato, Descartes, etc.) and didn’t cover any general philosophy, so I lack the vocabulary to articulate what I’d like to study here. As I look into this subject more deeply, I’ll be able to add more things to the diagram:

[Diagram: philosophy]

I majored in economics; the only topics below that I haven’t visited are advanced macro/microeconomic theory, which are graduate subjects:

[Diagram: economics]

The use of computers has greatly amplified mankind’s ability to synthesize and make use of information, and has increased individuals’ ability to access and organize information for their own purposes. Computers are immensely useful. They allow people to calculate, and to conduct experiments via simulation that would be practically infeasible in society due to various constraints:

[Diagram: computer science]

So putting everything together…

[Diagram: cybernetics]

In short, I like to study systems. I want to know more about power and control, how economies rise and how they collapse, and how biological and social systems remain stable or evolve over time. The closest thing I could find that’s similar to this idea is cybernetics, but I have to admit that the Wikipedia article is currently over my head, so I could be wrong, and I’d have to update the diagram if that’s the case.

Anyway the diagram isn’t accurate – many of these subjects aren’t concretely defined and there’s a lot of overlap between them. Likewise the order of study and the interdependencies aren’t as neat either, but at the very least, articulating my thoughts is a start and invites feedback. As I proceed, I’ll encounter mistakes and dead ends, and corrections will have to be made, but that’s all part of the learning process.

Posted in: Logs, Mathematics / Tagged: cybernetics, enlightenment, systems

No. 107: My Move to Chicago

4 August, 2014 10:38 PM / Gene Dan

As I said in my previous post, I have a big announcement to make – last month I received two very competitive job offers from insurance firms in Chicago, both of which involved predictive analytics. While the choice between the two firms wasn’t easy, the choice to move to Chicago was – I have lived in Houston for more than two decades of my life and in Texas for almost all of it. As a young adult, I think moving to a city where I know almost no one would be good for my personal development and maturity. My opinion is that the lack of close social ties within this city will encourage me to make decisions on my own without relying on other people – the safety net of familiarity.

It’s a shame that I’m leaving Houston right when the economy is booming in Texas. People are flocking to the state and its cities are experiencing high growth rates. Having lived near the center of Houston for three years, I’ve seen the area rapidly transform as the number of modern, high-rise buildings has multiplied at an astonishingly brisk pace. Areas that were once considered to be run-down, crime-infested slums, such as the Heights, are now considered some of the best places to move for someone fresh out of college.

As Houston metamorphoses into a modern city, I’m sure the infrastructure and mass transit will become more accommodating to the influx of people coming here to find jobs in the energy sector. However, in its current state the city is a long way from matching the transit systems of more developed establishments such as Manhattan or Chicago, or even that of smaller cities like Washington, D.C. or San Francisco. Most people still need a car to accomplish simple tasks like picking up groceries, or even visiting friends within the same neighborhood. At the same time, Houston shows a lot of promise – the lack of zoning laws ought to make it easier for developers to construct high-density residential complexes to absorb the increased demand for housing – hopefully avoiding (or at least mitigating) a property bubble like the current one in the Bay Area. I have hopes that one day high-speed rail will connect the major metropolitan areas of Texas, theoretically allowing a person to live in Austin and work in Houston with a commute of less than an hour. That would enable people to seek the best job opportunities in other cities without having to uproot their families. For example, a family based in Dallas might have a mother who works in Austin and a father who works in Houston – a mass transit system like this would provide the perfect combination of career flexibility and domestic stability that families need.

However, these dreams are a long way off. Even if such projects were to commence today, it would take an entire generation to realize the benefits from the investment. At my age I can’t wait for something like this to be completed, or even to be approved in the first place. So, I have packed up my bags and moved across the country to experience city life while I still can, in a place where the population density is more than three times higher. So far, I’ve been here for a week – it takes me 15 minutes to walk to work, and there is a grocery store right across from my apartment. The night life is amazing, and trains and cabs can take me anywhere I might need to go. At the moment, Chicago can provide me with a lifestyle that Houston can’t – although I hope that as Houston matures, new residents and the next generation of college graduates can enjoy the features and opportunities that a modern city provides.

Posted in: Logs

No. 106: A Quick Update

7 July, 2014 9:11 PM / Gene Dan

So, (as some of you already know) I had to temporarily halt my 70 Days of Linear Algebra because I had to attend to some extremely important matters over the last few weeks involving some big life changes, but all for a good reason. The outcome is mostly good news, and part of it is that I passed my last actuarial exam which means I have just 4 exams left to go. I will announce the rest of the news later.

Anyway, I am happy to pick up right where I left off. I have been reviewing my notes using the spaced-repetition techniques I had outlined earlier, so my memory should be fresh.

Posted in: Logs
