Tuesday, February 9, 2010

Introduction to Probabilistic Theory

Introduction

During several of the missions on this site, you may be required to crack a password hash that was created using real-world algorithms such as MD5 and SHA1. There are, however, a few missions in which you are required to exploit a proprietary algorithm that was created by the devs of the mission. These algorithms are intentionally built with a weakness so that you can learn to exploit it and later apply those skills to more complex algorithms.

Combinations vs. Permutations

This article focuses on brute-force cracking, though the principles here can be applied to more efficient methods such as birthday and collision attacks. As you probably already know, the basic idea of a brute-force attack is to try every possible password in order to find the desired key. Though the idea is simple enough, the implementation can be much trickier. While you are guaranteed to eventually get the right answer, brute force is the least efficient way to go about it.

For example, if you know that the password you are trying to crack has no more than 15 letters (lowercase only), the 15-letter passwords alone number over 1.67 sextillion (26^15, or about 1.67e21). These passwords make up what is called the keyspace, or the total number of possible passwords. My laptop crunches about 5.5 to 6 million hashes per second according to Cain and Abel. At the upper end of that rate, it would take just under 8.9 million years to try every password in the keyspace mentioned above, and that's just with lowercase letters! If you include capital letters and numbers, the computational time required skyrockets to over 4 trillion years, or roughly 290 times the age of the universe. Clearly, we need to reduce the keyspace, but how?
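The figures above can be reproduced with a few lines of Python. This is only a rough sketch: the 6 million hashes per second rate is the upper end of the laptop figure quoted above, the helper names are made up for illustration, and only passwords of exactly 15 characters are counted, as in the numbers above.

```python
# Rough keyspace and time estimate for an exhaustive search.
# Assumption: a fixed rate of 6 million hashes per second.
HASHES_PER_SECOND = 6_000_000
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def keyspace(alphabet_size: int, length: int) -> int:
    """Number of ordered strings of exactly `length` symbols."""
    return alphabet_size ** length


def years_to_exhaust(candidates: int, rate: float = HASHES_PER_SECOND) -> float:
    """Years needed to try every candidate at `rate` hashes per second."""
    return candidates / rate / SECONDS_PER_YEAR


lowercase_only = keyspace(26, 15)   # a-z
mixed_alnum = keyspace(62, 15)      # a-z, A-Z, 0-9

print(f"lowercase keyspace: {lowercase_only:.3e}")
print(f"lowercase search:   {years_to_exhaust(lowercase_only):.3e} years")
print(f"mixed search:       {years_to_exhaust(mixed_alnum):.3e} years")
```

Changing the rate or the alphabet size shows how quickly the search time grows with each extra character.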

This is where permutations and combinations come in. Though the two terms are often used interchangeably, they actually have very different meanings. A "permutation" is an arrangement of items in which order matters, so "ab" and "ba" count as two different results. Counting permutations is the basis of most brute-force attacks. If you think of a bike lock with 3 dials numbered 0 to 9, you can easily figure out how many different codes there are by raising 10 (the number of choices per slot) to the power of 3 (the number of slots), just as we got our keyspace above (26 letters per slot, 15 slots). This is what you usually assume you must do when brute-forcing a password.

Using an example from Extended Basic 11, a password hash is calculated by assigning each letter, a to z, a prime number. When the user inputs their password, each letter is replaced with the corresponding prime and then all the numbers are multiplied together to form the hash. Since the hash is a product of primes, the only way to reproduce it is to multiply the correct prime factors together. The easiest way to solve this would be to write a program to decompose the hash into its prime factors, but since the point of the article is to discuss brute-force attacks, we will take that route (additionally, nearly every real hashing algorithm is one-way, so working backward from the hash to the original input is not practical).
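A minimal sketch of this kind of hash might look like the following. It assumes the letters map to the first 26 primes in alphabetical order (a = 2, b = 3, ... z = 101), which agrees with the values used in this article but is not necessarily the mission's actual table. The names `first_primes`, `PRIME_FOR` and `prime_hash` are made up for illustration.

```python
from math import prod
from string import ascii_lowercase


def first_primes(count):
    """Return the first `count` primes using simple trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        # candidate is prime if no smaller prime divides it
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


# Assumed table: a -> 2, b -> 3, c -> 5, ..., z -> 101
PRIME_FOR = dict(zip(ascii_lowercase, first_primes(26)))


def prime_hash(password):
    """Multiply together the primes assigned to each letter."""
    return prod(PRIME_FOR[ch] for ch in password)


print(prime_hash("abc"))  # 2 * 3 * 5
```

Because the hash is nothing more than a product, it keeps no record of the order in which the letters were typed.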

(It is important to note that the example that follows is NOT how you solve ExtBasic 11.) For this example, the passphrase we are trying to find is "zzz". From our table, the letter "z" has a prime value of 101. The corresponding hash is:

CODE :
101*101*101 = 1030301 (hey, it's a palindrome!)

Assuming we know that the password has three letters, this gives us a keyspace of 17,576 (26^3). With this in mind, we will begin our brute-force attack, starting with the string "aaa".

CODE :
aaa = 2*2*2 = 8 != 1030301
aab = 2*2*3 = 12 != 1030301
aac = 2*2*5 = 20 != 1030301
.
.
.
aaz = 2*2*101 = 404 != 1030301


Having reached the end of the third column, we roll over the second column and begin again.

CODE :
aba = 2*3*2 = 12 != 1030301

Hey, wait a second... we've already tried the number 12! Looking at the password and the algorithm, it's pretty apparent why this happened. Even though the algorithm uses prime numbers, it also uses multiplication, which is commutative; that is, a*b*c = a*c*b = b*c*a and so on. So since we've already tried "aab", we don't have to try "aba" or "baa", because they will just give us the same hash again. The same applies to all the other passwords we've already tried: "aca" and "caa", all the way to "aza" and "zaa". So in that first pass, for every password we tried (apart from "aaa"), we were actually trying three. Therefore, without any additional computation, we have cut that part of the keyspace by roughly two-thirds!
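The collision is easy to see by hashing every ordering of the same letters. The sketch below assumes the same table as the worked examples (the first 26 primes in alphabetical order); `prime_hash` and `orderings` are illustrative names, not anything taken from the mission.

```python
from itertools import permutations
from math import prod

# Assumed table: the first 26 primes, assigned to a..z in order.
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
          43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]
PRIME_FOR = dict(zip("abcdefghijklmnopqrstuvwxyz", PRIMES))


def prime_hash(password):
    """Multiply together the primes assigned to each letter."""
    return prod(PRIME_FOR[ch] for ch in password)


def orderings(password):
    """All distinct rearrangements of the letters in `password`."""
    return sorted({"".join(p) for p in permutations(password)})


for word in ("aab", "abc"):
    for candidate in orderings(word):
        print(candidate, prime_hash(candidate))
```

Two repeated letters give three orderings with one shared hash, while three different letters give six.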

This is where "combinations" come in. A combination is basically the same as a permutation with one big difference: in a combination, order doesn't matter! While it may seem rather insignificant at first, the difference between a combination and a permutation is vast.

Take the lottery, for example. An oft-quoted statistic lists extremely low odds of winning the lottery, usually on the order of 1 in several billion, sometimes even lower. If the lottery we play has 45 numbered balls and we are required to pick six, many people's first instinct is to assume the total number of sequences is 45^6 = 8.3 billion possible lottery numbers. But this assumption makes two important mistakes.

First, it doesn't take into account that once a number is drawn for our lottery, it cannot be drawn again (you can't have two 45's drawn, for example). In probability theory, this is called a "permutation without replacement." That means for each ball we pull, the next slot has a smaller pool of numbers to draw from; the first slot has 45 balls, the second slot has 44 balls, and so on. This reduces the number of possible sequences to 45*44*43*42*41*40 = about 5.86 billion sequences. Though we've made a dent, it's still pretty long odds.

The second and more profound mistake is that we calculated the lottery numbers as permutations and not combinations. If you've ever played the lottery, you know that order doesn't matter; 1-2-3-4-5-6 is the same as 6-5-4-3-2-1. Knowing that, we can greatly reduce the number of possible lottery numbers. To calculate the number of combinations without replacement, we use the following equation:

CODE :
n!
---------
k!(n-k)!


Where "n" is the number of choices and "k" is the number of slots. Since there are 45 numbers and 6 slots, this comes out to be just over 8.1 million combinations. While the odds are still extremely long, we just improved them by a factor of over 1,000 compared with our first estimate!
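The three lottery counts can be checked with Python's standard `math` module (`math.perm` and `math.comb` need Python 3.8 or later). The variable names are just for illustration.

```python
from math import comb, factorial, perm

n, k = 45, 6  # 45 balls, 6 picks

ordered_with_repeats = n ** k        # first instinct: 45^6
ordered_no_repeats = perm(n, k)      # 45*44*43*42*41*40
unordered_no_repeats = comb(n, k)    # n! / (k! (n-k)!)

# The combination count is the permutation count divided by k!,
# the number of ways to reorder the same six balls.
assert unordered_no_repeats == ordered_no_repeats // factorial(k)

print(f"{ordered_with_repeats:,}")
print(f"{ordered_no_repeats:,}")
print(f"{unordered_no_repeats:,}")
```

Dividing by k! = 720 is the whole difference between the permutation and combination counts: every winning set of six balls can be drawn in 720 different orders.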

Applying this to our password example, we now realize that we don't have to test every permutation, just every combination. Since we can have more than one occurrence of the same letter, we have to figure out the combinations with replacement. The formula is a bit more complex, however:

CODE :
(n+k-1)!
---------
k!(n-1)!


Plugging in our numbers (n = 26, k = 3), we discover there are only 3,276 possible passwords, under 19% of our original keyspace of 17,576! More importantly, this difference increases with the length of the password; to demonstrate, imagine if we had a password 15 letters long:
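Here is a sketch of what a combination-based brute force could look like against the toy prime hash, using `itertools.combinations_with_replacement` from the standard library. The prime table is the assumed one from the worked examples (first 26 primes in alphabetical order), and `crack` is a hypothetical helper rather than the mission's solution.

```python
from itertools import combinations_with_replacement
from math import comb, prod

ALPHABET = "abcdefghijklmnopqrstuvwxyz"
# Assumed table: the first 26 primes, assigned to a..z in order.
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
          43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]
PRIME_FOR = dict(zip(ALPHABET, PRIMES))


def prime_hash(letters):
    """Multiply together the primes assigned to each letter."""
    return prod(PRIME_FOR[ch] for ch in letters)


def crack(target, length):
    """Try each multiset of letters once; return (match, attempts) or None."""
    candidates = combinations_with_replacement(ALPHABET, length)
    for attempts, letters in enumerate(candidates, start=1):
        if prime_hash(letters) == target:
            return "".join(letters), attempts
    return None


print(comb(26 + 3 - 1, 3))   # size of the reduced keyspace
print(crack(1030301, 3))     # worst case: "zzz" is the last candidate
```

The string returned is only one representative of its set of letters; with this hash, any reordering of it produces the same value.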

CODE :
Permutation:
26^15 = 1.67e21 permutations
This would take 8.9 million years to crack.

Combination:
(26 + 15 - 1)!
-------------- = 40.2 billion combinations
15! (26 - 1)!
This would take 1.86 hours to crack.
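To see how the gap widens with length, the two counts can be tabulated side by side. This sketch again assumes a steady 6 million hashes per second, and the function names are invented for the example.

```python
from math import comb

HASHES_PER_SECOND = 6_000_000  # assumed rate, upper end of the laptop figure


def ordered_keyspace(n, k):
    """Permutations with replacement: n choices in each of k slots."""
    return n ** k


def unordered_keyspace(n, k):
    """Combinations with replacement: (n+k-1)! / (k! (n-1)!)."""
    return comb(n + k - 1, k)


def hours_to_exhaust(candidates, rate=HASHES_PER_SECOND):
    """Hours needed to try every candidate at `rate` hashes per second."""
    return candidates / rate / 3600


for length in (3, 8, 15):
    ordered = ordered_keyspace(26, length)
    unordered = unordered_keyspace(26, length)
    print(f"{length:2d} letters: {ordered:.3e} ordered, "
          f"{unordered:,} unordered, "
          f"{hours_to_exhaust(unordered):.2f} h for the unordered search")
```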

Conclusion

As you can see, reducing your keyspace can have a profound impact on your ability to crack hashes. Modern hash algorithms aren't so easily exploited and will generally require you to test permutations instead of just combinations, but the principle behind reducing the keyspace remains the same.

source : http://www.hackthissite.org
