If a protection system is sound, the only way to find a password is to try
every possible variant. This way of cracking is usually called a
'brute force attack'. It takes a great amount of resources to crack even
a fairly short password. A brute force attack will not help you break
a 9-character password, even if all the letters of the password are in
the same case. But why must smart guys use brute force? Most of the
billions and trillions of passwords being searched are combinations like
jkqmzwd, which are totally senseless. I took up this problem
some time ago. My goal was to create a search algorithm that would
only try "reasonable" passwords. I named this algorithm the "smart
force attack", as opposed to "brute force attack" methods.
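To put rough numbers on the cost of brute force, here is a minimal estimate. It assumes a 26-letter single-case alphabet and the search speed of 100,000 passwords per second quoted further on in this article; the function and constant names are only illustrative.

```python
# Rough brute-force cost estimate.
# Assumptions: 26 single-case letters, 100,000 passwords tried per second.
ALPHABET_SIZE = 26
RATE_PER_SECOND = 100_000
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def brute_force_years(length, alphabet_size=ALPHABET_SIZE, rate=RATE_PER_SECOND):
    """Years needed to try every password of exactly `length` characters."""
    candidates = alphabet_size ** length
    return candidates / rate / SECONDS_PER_YEAR


if __name__ == "__main__":
    for length in (8, 9, 10):
        print(length, "characters:", round(brute_force_years(length), 2), "years")
```

Under these assumptions a 9-character password takes roughly 1.7 years on a single machine and a 10-character one roughly 45 years.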
I proceeded from the following assumptions:
The "smart force" engine has to be fast enough, because the time taken
to generate a password is added to the time taken by the verification
itself.
Linguistics has to be considered. The English language has its own rules. If,
while reading a text, you come across an unfamiliar word, you can nevertheless
feel that it's a word and not an abracadabra of random characters. A person
judges the validity of a word intuitively, although some rules can be
written down explicitly. Some of them are evident: for example, a password
starting with "p" is much more probable than a password starting with
"k".
Psychology has to be considered. The following fact is known: when making
up a random number, a person clearly prefers some digits to others (the
preference is mainly defined by the last digit in a number). By the way,
some pretty curious mathematical laws regarding gambling and guessing
numbers could be deduced from this rule, but that is to be discussed separately.
The same preferences work with letters, too. I have analyzed how actual
passwords are distributed by their first letter. The distribution obtained
differs greatly from that of ordinary words. The phrase "differs greatly"
should be understood as "I know the basics of mathematical statistics well,
I know how to compute probabilities and what criteria to use, but I'm too
lazy to actually employ this in practice and I prefer to assess them roughly"
:-). Thus, we should add psychological rules to the linguistic ones, and
the former are even harder to formalize than the latter.
The "extraneous force" has to be considered. By all the considerations
above, a sequence of characters like "qwerty"
should look quite random, yet anyone who has seen a keyboard knows that's
not true. I will venture to suggest that there are a whole lot
of other factors that can hardly all be taken into account and correctly
formulated. All of this leads to a very curious idea:
to tackle the problem one should use artificial intelligence
methods. Neural networks are the best way to solve poorly defined problems.
Neural networks are an interesting field, yet it is completely new to me.
(There is, however, a functioning model that I created. Its practical value
cannot be stated, because its performance is sluggish; moreover,
the passwords it generates often look like outright abracadabra :-).
Let us now discuss a more traditional method. It is evident that some
letters occur more frequently than others. Therefore it is more reasonable
to start the search with the most probable letters. The probability
of encountering a letter depends very much on its position within a word.
For example, the probability of finding the letter "Y" at the
beginning of a word is 0.003. Yet it is found forty times more
frequently in the last position (a probability of 0.120). It is the second
letter of a word with a probability of 0.030, and the next-to-last letter
with a probability of 0.004.
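Position-dependent figures like these can be estimated by simple counting over a word list. The sketch below is only an illustration: the function name and the tiny sample list are made up, and a real estimate would need a large dictionary or password corpus.

```python
from collections import Counter


def positional_frequency(words, letter):
    """Estimate how often `letter` occupies selected positions in `words`."""
    slots = {"first": 0, "second": 1, "next_to_last": -2, "last": -1}
    # Words shorter than two letters have no distinct second or last slot.
    usable = [w.lower() for w in words if len(w) >= 2]
    counts = Counter()
    for word in usable:
        for name, index in slots.items():
            if word[index] == letter:
                counts[name] += 1
    total = len(usable)
    return {name: counts[name] / total for name in slots} if total else {}


# Hypothetical toy sample, far too small for real statistics.
SAMPLE_WORDS = ["yes", "happy", "city", "system", "year", "key", "style", "many"]

if __name__ == "__main__":
    print(positional_frequency(SAMPLE_WORDS, "y"))
```

The sample is hand-picked, so its numbers mean nothing; on real data the same counting gives one probability per letter per position.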
It is evident that the probability of encountering a letter depends not only
on its position within a word but also on the preceding letters. For
example, an "H" is four hundred times more likely to appear after a "C"
than after an "E" (sic!). Most significant is the fact that many combinations
never occur at all and can be excluded from the search altogether. Clearly, the
probability of a double letter also depends greatly on its
position within a word. For example, the "SS" combination
holds fifth place in the list of the most popular word endings,
whereas it cannot begin a word at all. Moreover, there is
a dependence on all of the preceding letters, not just the immediate one.
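The dependence on the immediately preceding letter can be estimated by counting adjacent letter pairs. This is a minimal sketch with a made-up word list and illustrative names; pairs that never occur in the sample get no entry and are treated as excluded from the search.

```python
from collections import Counter, defaultdict


def bigram_probabilities(words):
    """Return {prev: {next: P(next | prev)}} from adjacent letter pairs."""
    pair_counts = defaultdict(Counter)
    for word in words:
        word = word.lower()
        for prev, nxt in zip(word, word[1:]):
            pair_counts[prev][nxt] += 1
    return {
        prev: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for prev, followers in pair_counts.items()
    }


def is_allowed(probs, prev, nxt):
    """A pair that was never observed is pruned from the search."""
    return probs.get(prev, {}).get(nxt, 0.0) > 0.0


# Hypothetical toy sample; real tables need a large corpus.
SAMPLE_WORDS = ["church", "chess", "echo", "glass", "heat", "cheese"]

if __name__ == "__main__":
    probs = bigram_probabilities(SAMPLE_WORDS)
    print("P(h | c) =", probs["c"].get("h", 0.0))
    print("'eh' allowed:", is_allowed(probs, "e", "h"))
```

With a small sample many legitimate pairs would be pruned by mistake, so a practical table would be built from a large corpus and might smooth rare pairs rather than drop them.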
I decided to simplify the task, for starters. By means of statistical
analysis I created a table of values of a function P(a,p,i,n): the
probability of encountering the letter "a" after the letter "p"
at position "i" in a word of "n" letters.
Knowing these values, one can obtain something that could, to an extent, be
called the probability that a given sequence of characters makes sense to
a human being; it is obtained for any string of characters by multiplying
the respective probabilities. Further, one should search character sequences
in order of decreasing probability. This task is not that simple. Unfortunately,
this is about where my desire to publish the results of my research ends
:-). I will only mention that the problem of an optimal password search
is not unlike an iceberg: only its tip is shown here. The results obtained
indicate that searching all variants with a non-zero probability
for a 10-character password will take about 8 days instead of 45 years
(as in the case of a "brute force attack") at a search
speed of 100,000 passwords per second. Notice any difference? If we limit
ourselves to passwords of significant probability, we can cut the waiting
time to about 10 hours. If, furthermore, we improve
the algorithm, even twelve-character passwords can
be cracked in a reasonable time. Of course, the "smart force attack"
won't work with a password like "I'm$smarter!"
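The ordering method itself is not published here. One generic way to produce candidates in decreasing order of the product of probabilities is a best-first search over prefixes with a priority queue: since every factor is at most 1, extending a prefix can never raise its score. The sketch below shows that standard technique, not the withheld algorithm; the callback signature next_probs(prev, i, n), the function names, and the toy table are all assumptions made for illustration.

```python
import heapq


def candidates_by_probability(next_probs, length):
    """Yield (probability, word) for words of `length`, most probable first.

    `next_probs(prev, i, n)` must return {letter: P(letter, prev, i, n)},
    with prev == "" at the first position. Letters missing from the dict
    have zero probability, so those branches are never explored.
    """
    # heapq is a min-heap, so probabilities are stored negated.
    heap = [(-1.0, "")]
    while heap:
        neg_prob, prefix = heapq.heappop(heap)
        if len(prefix) == length:
            yield -neg_prob, prefix
            continue
        prev = prefix[-1] if prefix else ""
        for letter, p in next_probs(prev, len(prefix), length).items():
            if p > 0.0:
                heapq.heappush(heap, (neg_prob * p, prefix + letter))


# Hypothetical toy table. For brevity it ignores the position i and the
# word length n, which a real P(a,p,i,n) table would use.
TOY_TABLE = {
    "": {"p": 0.6, "k": 0.4},
    "p": {"a": 0.7, "s": 0.3},
    "k": {"a": 0.5, "s": 0.5},
    "a": {"s": 0.9, "a": 0.1},
    "s": {"s": 0.8, "a": 0.2},
}


def toy_next_probs(prev, i, n):
    return TOY_TABLE.get(prev, {})


if __name__ == "__main__":
    for prob, word in candidates_by_probability(toy_next_probs, 3):
        print(word, round(prob, 3))
```

Because the generator is lazy, the search can stop as soon as the password is found or the probability drops below a chosen threshold. The queue can grow large for long passwords, which is one reason the real task is "not that simple".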