Hi everyone!
It’s been a very long time since I last updated and I have so much to write about – race reports, employment, learning math and computers, and so on and so forth, but I need to keep it short since I have to wake up early tomorrow morning. So, today I’ll be writing about a project that I’ve been working on – brute-force password cracking to help me learn more about permutations and combinations, recursion, efficiency, and possibly multithreading later on. Of course, the most important issue here is cybersecurity and how to prevent attacks.
Let me stress that cybersecurity awareness is an extremely important issue in our day and age. When you have your money, identity, and livelihood entrusted to computers, you need to know how people can access your information and how to reduce the likelihood of attacks or prevent them altogether. So, I’m going to give a basic demonstration of one of the most basic methods of password cracking: brute force.
Brute-force cracking is a method by which the computer attempts to crack a password by trying every possible combination of characters until it finds a match. The time it takes depends on the length of the password and the speed of the computer: the longer the password, the longer it takes to crack, and the more powerful the computer, the faster it can crack it. The following picture shows how quickly a program that I wrote in Microsoft Excel (slow by computing-world standards) can crack a 4-character password drawn from a set of 94 printable ASCII characters:
As you can see, it took an average of 14 seconds for Excel to crack each password, and these aren’t your typical “dumb” passwords either. Combinations like “$HSp,” “DXxg,” and “<9N” extended out to 8 or more characters would be impossible for a human to guess and would be considered “good” by today’s standards. With brute-force calculation, human creativity in password creation doesn’t matter: since the computer checks all the combinations, given enough time it is theoretically certain to find the password.
Luckily, the most powerful tool we humans have against this sort of attack is password length. With a library of 94 printable ASCII characters, each additional character in a password makes the computer work 94 times longer. Taking the 14-second average for length 4 as the baseline, that works out to about 0.15 seconds for a password of length 3, about 22 minutes for length 5, about 34 hours for length 6, and so on.
At that rate, a password of length 12 would take my computer about 14 × 94^8 seconds, or roughly 2.7 billion years, not including the time it takes to simulate keystrokes and button clicks, navigate dialog boxes, and so on. The simple solution? Make your password long! A long password means most people will not bother to use this method given current technological constraints. Of course, this presents a much bigger problem when we’re talking about governments, who control supercomputers that are much, much faster than what we can buy in the store. In this case, simple passwords won’t cut it.
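To make the scaling concrete, here is a rough sketch of the arithmetic in Python (Python rather than the VBA used in this post, only because it is easy to run anywhere). It assumes the 14-second average for a 4-character password is the one measured number and that cracking time grows in proportion to the number of combinations. The names `search_space` and `estimated_seconds` are made up for this illustration.

```python
ALPHABET_SIZE = 94          # printable ASCII codes 33..126 (no space)
MEASURED_LENGTH = 4         # assumption: the baseline run used 4 characters
MEASURED_SECONDS = 14.0     # assumption: average time reported for that run
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def search_space(length, alphabet_size=ALPHABET_SIZE):
    """Number of distinct passwords of exactly `length` characters."""
    return alphabet_size ** length


def estimated_seconds(length):
    """Scale the measured baseline by the ratio of search-space sizes."""
    return MEASURED_SECONDS * search_space(length) / search_space(MEASURED_LENGTH)


if __name__ == "__main__":
    for n in (3, 4, 5, 6, 8, 12):
        secs = estimated_seconds(n)
        print(f"length {n:2d}: {secs:.3g} s ({secs / SECONDS_PER_YEAR:.3g} years)")
```

This is only an extrapolation. A real attack would also pay for every login attempt, which the estimate ignores.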
Now the exciting part! I know you are all dying to see the code, so here it is in all its glory. It’s remarkably simple, and only takes up about 65 lines of code in three modules. Here’s the first one:
[sourcecode]
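Option Explicit

' Recursively builds candidates by appending up to `places` more characters
' to pstring, comparing each candidate against compstring.
Sub test(compstring As String, pstring As String, places As Long, attempts As Double, matchfound As Range)
    If matchfound.Value = True Then Exit Sub    ' a match was already recorded
    Dim x As Integer, breaker As String
    For x = 33 To 126    ' ASCII character codes 33 to 126
        breaker = pstring & Chr(x)
        If places > 1 Then
            Call test(compstring, breaker, places - 1, attempts, matchfound)
            If matchfound.Value = True Then Exit Sub    ' stop once a deeper call succeeds
        End If
        attempts = attempts + 1
        If compstring = breaker Then
            matchfound.Offset(0, -1).Value = attempts    ' record the attempt count
            matchfound.Value = "True"
            Exit Sub
        End If
    Next x
End Sub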
[/sourcecode]
This module is the heart of the program. It takes five arguments. The first is the original, randomly generated password. Of course, in the real world we wouldn’t actually know the password, so this argument, and the code that compares against it, would be replaced by something like pressing the Enter button and seeing if we gain access. The second argument is the comparison string built up so far. The third argument specifies how many more characters may still be appended to the comparison string, the fourth keeps track of the number of attempts, and the fifth is a range that refers to a boolean telling us if we’ve successfully reached our goal.
Now, in the real world we wouldn’t know the length of the password, so the program has to automatically increase the length of the comparison string if a match isn’t found by the end of all the iterations for the current length. At first I tried using loops, but I found that it was much easier to conceptualize a recursive procedure than a looping procedure. I think loops tend to be more efficient, but recursive procedures can be a lot easier to read, and I think in this case recursion gives the reader a better understanding of what is going on. Each call appends one character to the comparison string and, if more characters are allowed, calls itself with the places argument reduced by one. When the procedure runs out of combinations for a given length, the loop in the second module (shown next) increases the length by one and starts again.
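For readers without Excel handy, the same recursive idea can be sketched in Python. This is an illustration, not a line-for-line port of the VBA: it only compares full-length candidates and stops as soon as a match is found, and the names `try_length` and `crack` are invented for the example.

```python
ALPHABET = [chr(code) for code in range(33, 127)]  # same 94 characters, codes 33..126


def try_length(target, prefix, places):
    """Recursively append `places` more characters to `prefix`.

    Returns (found, attempts), where attempts counts the full-length
    candidates compared against `target`.
    """
    attempts = 0
    for ch in ALPHABET:
        candidate = prefix + ch
        if places > 1:
            # More characters to add: recurse one level deeper.
            found, used = try_length(target, candidate, places - 1)
            attempts += used
            if found:
                return True, attempts
        else:
            # Candidate has reached the requested length: compare it.
            attempts += 1
            if candidate == target:
                return True, attempts
    return False, attempts


def crack(target, max_length=4):
    """Try lengths 1, 2, ... in turn, like the outer loop that grows the length."""
    total = 0
    for length in range(1, max_length + 1):
        found, used = try_length(target, "", length)
        total += used
        if found:
            return length, total
    return None, total
```

Calling `crack("ab")` returns the length at which the match was found together with the total number of comparisons made along the way.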
The other modules are higher level modules that control this procedure. They aren’t as exciting, but here’s the second one:
[sourcecode]
' Generates a random password of `digits` characters, cracks it, and writes
' the password, attempt count, match flag, and timings to the anchor's row.
Sub main(anchor As Range, digits As Long)
    Dim password As String, truthrange As Range
    Dim m As Long, bdigits As Long
    Dim exectime As Single
    Dim exectime2 As Date
    exectime = Timer
    exectime2 = Now()
    bdigits = 1
    password = ""
    For m = 1 To digits
        password = password & Chr(WorksheetFunction.RandBetween(33, 126))
    Next m
    anchor.Value = "'" & password    ' leading apostrophe makes Excel store it as text
    Set truthrange = anchor.Offset(0, 2)
    truthrange.Value = "False"
    ' Try lengths 1, 2, 3, ... until test flags a match
    Do Until truthrange.Value = True
        Call test(password, "", bdigits, 0, truthrange)
        bdigits = bdigits + 1
    Loop
    anchor.Offset(0, 3).Value = Format(Now() - exectime2, "hh:mm:ss")
    anchor.Offset(0, 4).Value = Format(Timer - exectime, "00.00000")
End Sub
[/sourcecode]
This module generates a random string which we designate as our password, feeds it into the first module, and keeps track of the execution time. It also writes the results to the worksheet. Here’s the third module:
[sourcecode]
' Entry point: reads the trial count and password length from the named
' ranges TRIALS and Digits, then runs main once per trial, one row each.
Sub macromain()
    Dim i As Integer, trials As Integer, digits As Long
    Application.ScreenUpdating = False
    Application.Calculation = xlCalculationManual
    ' Clear results from any previous run
    Intersect(Sheets("sheet1").UsedRange, Range("A2:E1048576")).ClearContents
    trials = Range("TRIALS")
    digits = Range("Digits")
    For i = 2 To trials + 1
        Call main(Range("A" & i), digits)
    Next i
    Application.ScreenUpdating = True
    Application.Calculation = xlCalculationAutomatic
End Sub
[/sourcecode]
This third module takes user input found on the spreadsheet page indicating the desired number of trials and password length, and feeds this information into the second module. Right now it runs pretty slowly and is only programmed to run on one core. My computer at home has three cores, and if I can get them to calculate combinations starting from different places in the range of ASCII character codes (different places in the range 33 to 126), I think I can get it to run roughly three times as fast. But that will be for the distant future. One of the more immediate goals I can achieve is GUI programming with keystroke and button-click simulation to emulate how a human navigates password dialogs and text boxes.
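As a rough idea of what that split could look like, here is a Python sketch (not VBA) that deals the 94 first characters out to three worker processes, so each one searches a different part of the range. The function names are hypothetical, and whether the speedup actually reaches three times depends on the machine.

```python
from itertools import product
from multiprocessing import Pool

ALPHABET = [chr(code) for code in range(33, 127)]  # ASCII codes 33..126


def split_alphabet(workers):
    """Deal the first characters out round-robin, one list per worker."""
    return [ALPHABET[i::workers] for i in range(workers)]


def search_slice(args):
    """Try every candidate of the given length whose first character is in the slice."""
    target, length, first_chars = args
    for first in first_chars:
        for rest in product(ALPHABET, repeat=length - 1):
            candidate = first + "".join(rest)
            if candidate == target:
                return candidate
    return None


def crack_parallel(target, length, workers=3):
    """Run one search_slice per worker process and return the first hit, if any."""
    jobs = [(target, length, chars) for chars in split_alphabet(workers)]
    with Pool(processes=workers) as pool:
        for result in pool.imap_unordered(search_slice, jobs):
            if result is not None:
                return result  # leaving the with-block terminates the other workers
    return None
```

To try it, call something like `crack_parallel("aB3", 3)` from inside an `if __name__ == "__main__":` block, which `multiprocessing` needs on some platforms.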
And that’s it! What lessons did we learn here?
1) Make your passwords long!
2) For administrators, lock users out if they cannot correctly type in the password after 3-5 attempts. This will prevent the computer from trying the millions of combinations necessary.
3) Check your login data to see if anything looks unusual! If you see a login at a time you did not log in, someone may have taken your data.
4) Always, always use different passwords for different accounts, and don’t reuse your passwords. This will require the thugs to run a new brute force for every account, or find another solution elsewhere.
5) If you have time, spend it on learning how to secure your information. You won’t regret it!
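To illustrate lesson 2, here is a minimal Python sketch of a lockout counter. `LoginGuard` is a hypothetical class made up for this post, not any real system’s API, and a real system would also need to persist the counts and offer a way to unlock accounts.

```python
MAX_FAILURES = 5  # assumption: the upper end of the 3-5 attempts suggested above


class LoginGuard:
    """Hypothetical in-memory lockout counter, for illustration only."""

    def __init__(self, check_password, max_failures=MAX_FAILURES):
        self._check = check_password   # callable(user, password) -> bool
        self._max = max_failures
        self._failures = {}            # user -> consecutive failed attempts

    def is_locked(self, user):
        return self._failures.get(user, 0) >= self._max

    def login(self, user, password):
        """Return True on success; refuse outright once the account is locked."""
        if self.is_locked(user):
            return False
        if self._check(user, password):
            self._failures[user] = 0   # a successful login resets the counter
            return True
        self._failures[user] = self._failures.get(user, 0) + 1
        return False
```

With a limit this small, a brute-force run gets only a handful of guesses per account instead of the millions it needs.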
Thanks for reading!
Just to put a little real life grounding on your theoretical insight….
http://xkcd.com/538/
One lesson that I’m sure you understand, but your reader might not, is to make sure you use the full ASCII character space. If I’m sitting next to you and I catch a glimpse of you inputting your password, and I see you only use numerals and your password is 7 characters long, that’s a HUGE amount of information. That obviously greatly decreases the time spent bruteforcing. Now you might say that’s cheating, but you’re going up against people who want to break you. To an extent, truly (in the theoretical sense) bruteforcing a legitimate target is impractical for the reasons you have demonstrated…
I read about a security firm a while back that uses your information (birthday, name, spouse, streets you’ve lived on, phone numbers) in combination with bruteforcing, dictionary attacks and smart substitution (i.e. @ for a or $ for s) to very effective use. Most humans can’t memorize even 10 truly random ASCII characters. Every bit of information saves huge amounts of time in these kinds of attacks.
But before you decide to try and memorize that 20 character string, do think about who’s actually targeting you. If you really think you’re going to be going up against gov’t supercomputers, a) What data are you protecting? b) Can I take you hostage and hold you for ransom? As a general rule, security is expensive. In this case the cost is mental storage capacity, but it is generally expensive. Practicing memorization is a great mental exercise, but just like I don’t have a $10000 safe to keep my $300 in cash (I’m sure this is a concept you understand quite well, mr. actuary), it’s not necessary for most people to have super super long passwords. Practicality is important too.
Last point on the recent spree of cyber attacks out there…a 100 character password wouldn’t have saved some of these people. Specifically, in the recent Sony attack, Sony kept their users’ data and passwords in unencrypted databases. All that hard work you spent memorizing that password became useless because Sony dropped the ball and let someone dump the database and read it. But that’s also another good reason to use different passwords for different accounts.
VBA for Excel won’t get you any street cred btw 😉 Back in theoretical mode, if this is strictly an academic/educational endeavor, once you get it to within a constant order of the theoretical limit, you’re more or less done. I wouldn’t worry about getting it 3x faster (for the sake of learning at least). Doing that is where computer science ends, and software engineering begins.
Some random notes on perf stuff:
– This code will never find a password with a space in it. Including whitespace will change your numbers a lot.
– Generally speaking, recursion will be a bit slower than iteration through a loop, unless you’re using a functional language that optimizes for it.
– I’m pretty sure the & operator in VB creates a copy of the string. If I had to guess just by looking at the code, I’d guess that that’s where your code is spending most of its time.
– In general, passwords will be hashed by a cryptographic hash function, which is a lot more expensive to compute than a string comparison. So that actually skews in the direction of making your code a couple times faster. 😉
– As far as parallelizing this: password cracking is what’s known as an “embarrassingly parallel” problem, where you can basically split the work up as finely as you want, and run it on as many threads and processors as you have available with very little overhead. So, I’m not sure that you can do multiple threads in VBA, but if you can then you should be able to get pretty close to a perfect linear speedup by doing so.