Rulesfinder, automatically create good password cracking rulesets

Written by Simon Marechal - 14/05/2020 - in Outils - Download
We wrote a new tool that automates the creation of efficient mutation rules for password crackers, such as John the Ripper or hashcat.

This posts describes the high level ideas behind this tool, along with some history. If you just want to use it, check our Github repo!

Password cracking and mutation rules

Passwords are usually stored hashed, using a one way function. In this post, we will use the following terminology:

  • cleartext: this is the password that you type when authenticating (for example, password) ;
  • hash: this is how the password is stored (for example, 5f4dcc3b5aa765d61d8327deb882cf99) ;
  • dictionary: a list of words, or character sequences, that are used to crack the hashes.

Password cracking is the act of finding a cleartext from a hash. This is done by trying to guess what the password could be, and hashing that value. If they match, we found it!

  • candidate: this is a string that we are going to hash, and compare againsts the hashes we want to crack.

Of course, checking if a candidate matches a hash has a cost. In order to crack as many hashes as possible, we can try to reduce that cost (optimize the "check" step, speeding up the password cracking process), or to be clever about candidate selection (increase the probability that a candidate matches).

A key observation for password crackers is that people who do not use password managers choose passwords that they can remember. For this reason, they are much more likely to use a password like sunlight than @#q6qPKw;<"T. That is why the most efficient strategy for cracking password usually is the "dictionary attack", where the candidates are picked from a dictionary of common passwords.

However, people usually do not directly use common words, but twist them a bit. For example, instead of sunlight, someone could use Sunlight123!.

From this observation, Alex Muffett implemented in Crack (the first standalone password cracker) a system of rules that simulates these little twists. That system was reused and extended in John the Ripper and Hashcat, where it is known as password mangling rules, or, simply, rules.

One could observe that in order to go from sunlight to Sunlight123!, the user has capitalized a known word, and suffixed it with 123!. The corresponding rule could be cAz"123!". The rule description language is very terse (yes, it looks like a golfing language). While it might have been better, in hindsight, to optimize for readability, it is not so bad once you are used to it. If you want to soak in all the gory details, please take a look at the John the Ripper or hashcat variants.

Motivation

Creating a set of mutation rules that is efficient has mostly always been about hunches and experience. Here are the exceptions I know about:

  • In 2011, I started working on rulesfinder, the predecessor to this tool ;
  • In 2012, the best64 challenge (forum thread) goal was to find the set of 64 rules that would be the most efficient for cracking a given set of hashes (the phpbb.com leak) with a given dictionary (so called top10k). While not having the best methodology (training and validation sets are identical), the winner (Arex1337) produced a set of 64 rules that is quite good in practice ;
  • ... and that's all I know about. Please do not hesitate to correct me!

This project is a spiritual successor to rulesfinder, and shares its name. It aims at being a lot easier to use, and a lot more efficient. You will need to provide a set of cleartext, a dictionary, and it will work on its own in order to produce a good set of rules.

But how does it actually work?

Intuition

A naïve idea for finding the best set of rules would be to:

  • for each cleartext, determine if it is based on a dictionary word ;
  • determine the mutation rules that have been applied, and store them ;
  • once this is done, find the set of rules that cracks the most passwords.

Unfortunately, none of these steps could be realistically written as an algorithm :

  • for the l1nkedin cleartext, is it based on link, linked or something else? It is pretty obvious to a human that it is based on a well know website, but we could be wrong, and anyway that is something that can't easily be written as an algorithm ;
  • if we determine that l1nked is based on the linked word, how was it mutated? Did the user replace i with 1, the second character with 1, the first vowel ? There will usually be a lot of possible mutations, and a lot of possible base words for a given plaintext.
  • Finally, the last problem is the maximum coverage problem, which is NP-hard (exact solutions get too expensive to compute as the problem gets bigger).

Approach

Fortunately, it is possible to find approximate solutions that are still pretty good!

The algorithm that is implemented in rulesfinder starts with a set of mangling rules, and follows these steps:

  • for each mangling rule, each word in the dictionay is mangled ;
    • for each mangled word, check if it is a substring of one of the cleartext passwords ;
      • if that is the case, store that the given rule, along with the proper suffix and prefix cracks this password ;
  • once this is done, greedily compute a good coverage for the rules.

An example

Hopefully this will make more sense with an example! Let's start with the following data:

  • rules : T0 (toggle case of the first character), r (reverse word) ;
  • wordlist : sunlight, dog, cherry ;
  • cleartexts : Sunlight123!, Dog123!, !*yrrehc*!.

Step 1, mangle all input:

With rule T0:

Sunlight
Dog
Cherry

With rule r:

thgilnus
god
yrrehc

Step 2, find substrings:

  • password Sunlight123! contains Sunlight ;
  • password Dog123! contains Dog ;
  • password !*yrrehc*! contains yrrehc.

Assemble the database:

  • rule T0 with suffix 123! (this is T0Az"123!") cracks two passwords, Sunlight123! and Dog123! ;
  • rule r along with prefix !* and suffix *! (this is rA0"!*"Az"*!") cracks one password, !*yrrehc*!.

Step 3, greedy coverage:

From there, two rules are sufficient to crack all passwords:

T0Az"123!"
rA0"!*"Az"*!"

Conclusion

Hopefully, this shed some light on the process of cracking passwords, and especially on automatically finding good password mangling rules. Do not hesitate to use the tool, and to let us know how it went for you.

Happy cracking!