Writing a (toy) symbolic interpreter, and solving challenges, part 4

Rédigé par Simon Marechal - 30/07/2021 - dans Tools - Téléchargement
This is the last part of series, where we solve the challenge using our symbolic interpreter, and an external SMT solver. Huge success!

This is a series, please check part 1, part 2 and part 3 if you have not done already! Do not hesitate running the code alongside, especially if you want to discover the actual solution to the challenge, that I redacted in this article :)

The situation so far

We wrote a symbolic interpreter that can run the challenge code, traverse all possible paths, and for each path, give us the list of constraints that lead to that path, and the result this path returns. Reading the challenge JavaScript code, it can be deduced that the return code can be either 1 (for a success), or 2 (for an epic success).

For example, when running with an unknown password1 that is one character long, we get:

$ stack run /path/to/keygenme.wasm username '?'
----
constraints
oneif(u32(a10) == 0) != 0 == False
oneif(u32(a10) == 0) | oneif((u32(a10) - u32(u8((3405691582 ^ (u32(117) * 1111638594) ^ (u32(115) * 1111638594) ^ (u32(101) * 1111638594) ^ (u32(114) * 1111638594) ^ (u32(110) * 1111638594) ^ (u32(97) * 1111638594) ^ (u32(109) * 1111638594) ^ (u32(101) * 1111638594) >> u8(4) >> u8(4) >> u8(4) >> u8(4) >> u8(4) >> u8(4) >> u8(4)) & 15 + 48))) != 0) != 0 == False
result
oneif((u32(0) - u32(u8((3405691582 ^ (u32(117) * 1111638594) ^ (u32(115) * 1111638594) ^ (u32(101) * 1111638594) ^ (u32(114) * 1111638594) ^ (u32(110) * 1111638594) ^ (u32(97) * 1111638594) ^ (u32(109) * 1111638594) ^ (u32(101) * 1111638594) >> u8(4) >> u8(4) >> u8(4) >> u8(4) >> u8(4) >> u8(4)) & 15 + 48))) == 0) + (oneif((0 + 1) == 18) & (oneif((3735929054 rotl u8(7)) ^ (u32(a10) * 3) == 1655684815) & (oneif((3405691582 rotl u8(6)) ^ (u32(a10) * 5) == 94170085) & (oneif((322420947 rotl u8(5)) ^ (u32(a10) * 7) == 1350428601) & (oneif((1108555586 rotl u8(4)) ^ (u32(a10) * 11) == 3977135300) & oneif((3203380031 rotl u8(3)) ^ (u32(a10) * 13) == 231166559))))) * 2)
----
constraints
oneif(u32(a10) == 0) != 0 == True
oneif(u32(a10) == 0) | oneif((u32(a10) - u32(u8((3405691582 ^ (u32(117) * 1111638594) ^ (u32(115) * 1111638594) ^ (u32(101) * 1111638594) ^ (u32(114) * 1111638594) ^ (u32(110) * 1111638594) ^ (u32(97) * 1111638594) ^ (u32(109) * 1111638594) ^ (u32(101) * 1111638594) >> u8(4) >> u8(4) >> u8(4) >> u8(4) >> u8(4) >> u8(4) >> u8(4)) & 15 + 48))) != 0) != 0 == False
result
oneif((u32(0) - u32(u8((3405691582 ^ (u32(117) * 1111638594) ^ (u32(115) * 1111638594) ^ (u32(101) * 1111638594) ^ (u32(114) * 1111638594) ^ (u32(110) * 1111638594) ^ (u32(97) * 1111638594) ^ (u32(109) * 1111638594) ^ (u32(101) * 1111638594) >> u8(4) >> u8(4) >> u8(4) >> u8(4) >> u8(4) >> u8(4)) & 15 + 48))) == 0) + (oneif(0 == 18) & (oneif(3203380031 == 1655684815) & (oneif(3735929054 == 94170085) & (oneif(3405691582 == 1350428601) & (oneif(322420947 == 3977135300) & oneif(1108555586 == 231166559))))) * 2)
----
constraints
oneif(u32(a10) == 0) != 0 == False
oneif(u32(a10) == 0) | oneif((u32(a10) - u32(u8((3405691582 ^ (u32(117) * 1111638594) ^ (u32(115) * 1111638594) ^ (u32(101) * 1111638594) ^ (u32(114) * 1111638594) ^ (u32(110) * 1111638594) ^ (u32(97) * 1111638594) ^ (u32(109) * 1111638594) ^ (u32(101) * 1111638594) >> u8(4) >> u8(4) >> u8(4) >> u8(4) >> u8(4) >> u8(4) >> u8(4)) & 15 + 48))) != 0) != 0 == True
result
oneif((u32(a10) - u32(u8((3405691582 ^ (u32(117) * 1111638594) ^ (u32(115) * 1111638594) ^ (u32(101) * 1111638594) ^ (u32(114) * 1111638594) ^ (u32(110) * 1111638594) ^ (u32(97) * 1111638594) ^ (u32(109) * 1111638594) ^ (u32(101) * 1111638594) >> u8(4) >> u8(4) >> u8(4) >> u8(4) >> u8(4) >> u8(4) >> u8(4)) & 15 + 48))) == 0) + (oneif((0 + 1) == 18) & (oneif((3735929054 rotl u8(7)) ^ (u32(a10) * 3) == 1655684815) & (oneif((3405691582 rotl u8(6)) ^ (u32(a10) * 5) == 94170085) & (oneif((322420947 rotl u8(5)) ^ (u32(a10) * 7) == 1350428601) & (oneif((1108555586 rotl u8(4)) ^ (u32(a10) * 11) == 3977135300) & oneif((3203380031 rotl u8(3)) ^ (u32(a10) * 13) == 231166559))))) * 2)
----
constraints
oneif(u32(a10) == 0) != 0 == True
oneif(u32(a10) == 0) | oneif((u32(a10) - u32(u8((3405691582 ^ (u32(117) * 1111638594) ^ (u32(115) * 1111638594) ^ (u32(101) * 1111638594) ^ (u32(114) * 1111638594) ^ (u32(110) * 1111638594) ^ (u32(97) * 1111638594) ^ (u32(109) * 1111638594) ^ (u32(101) * 1111638594) >> u8(4) >> u8(4) >> u8(4) >> u8(4) >> u8(4) >> u8(4) >> u8(4)) & 15 + 48))) != 0) != 0 == True
result
oneif((u32(a10) - u32(u8((3405691582 ^ (u32(117) * 1111638594) ^ (u32(115) * 1111638594) ^ (u32(101) * 1111638594) ^ (u32(114) * 1111638594) ^ (u32(110) * 1111638594) ^ (u32(97) * 1111638594) ^ (u32(109) * 1111638594) ^ (u32(101) * 1111638594) >> u8(4) >> u8(4) >> u8(4) >> u8(4) >> u8(4) >> u8(4) >> u8(4)) & 15 + 48))) == 0) + (oneif(0 == 18) & (oneif(3203380031 == 1655684815) & (oneif(3735929054 == 94170085) & (oneif(3405691582 == 1350428601) & (oneif(322420947 == 3977135300) & oneif(1108555586 == 231166559))))) * 2)


As a reminded, a10 is the variable that corresponds to the first character of the password.

Execution paths are separated by ----, constraints are the constraints that have been gathered while following this path, and the expression under result is the expression that represents the result of the checkSerial function.

However these expressions are huge, and not really helpful.

Thinking in symbols

We can try to evaluate all the parts that are not dependent on a variable:

----
constraints
oneif(u32(a10) == 0) != 0 == False
oneif(u32(a10) == 0) | oneif((u32(a10) - 57) != 0) != 0 == False
result
0 + (0 & (oneif(1457549167 ^ (u32(a10) * 3) == 1655684815) & (oneif(3215896498 ^ (u32(a10) * 5) == 94170085) & (oneif(1727535714 ^ (u32(a10) * 7) == 1350428601) & (oneif(557020196 ^ (u32(a10) * 11) == 3977135300) & oneif(4152203773 ^ (u32(a10) * 13) == 231166559))))) * 2)
----
constraints
oneif(u32(a10) == 0) != 0 == True
oneif(u32(a10) == 0) | oneif((u32(a10) - 57) != 0) != 0 == False
result
0
----
constraints
oneif(u32(a10) == 0) != 0 == False
oneif(u32(a10) == 0) | oneif((u32(a10) - 57) != 0) != 0 == True
result
oneif((u32(a10) - 57) == 0) + (0 & (oneif(1457549167 ^ (u32(a10) * 3) == 1655684815) & (oneif(3215896498 ^ (u32(a10) * 5) == 94170085) & (oneif(1727535714 ^ (u32(a10) * 7) == 1350428601) & (oneif(557020196 ^ (u32(a10) * 11) == 3977135300) & oneif(4152203773 ^ (u32(a10) * 13) == 231166559))))) * 2)
----
constraints
oneif(u32(a10) == 0) != 0 == True
oneif(u32(a10) == 0) | oneif((u32(a10) - 57) != 0) != 0 == True
result
oneif((u32(a10) - 57) == 0) + 0

Then use the following simplifications:

  • 0 & x = 0
  • 0 + x = 0
  • x != y == False -> x == y
  • x == True -> x
  • a - b != c -> a != b + c
  • oneif(c1) | oneif(c2) != 0 -> c1 | c2

We end up with:

----
constraints
oneif(a10 == 0) == 0
(a10 == 0) | (u32(a10) != 57) == False
result
0
----
constraints
oneif(a10 == 0) != 0
(a10 == 0) | (u32(a10) != 57) == False
result
0
----
constraints
oneif(a10 == 0) == 0
(a10 == 0) | (u32(a10) != 57)
result
oneif((u32(a10) - 57) == 0)
----
constraints
oneif(a10 == 0) != 0
(a10 == 0) | (u32(a10) != 57)
result
oneif((u32(a10) - 57) == 0)

The first two cases always return 0, which is a failure. In order to win, we must find an input that returns 1 or more. We can ignore the first two cases. In cases 3 and 4, there is the following constraint:

(a10 == 0) | (u32(a10) != 57)

It is true for any value that a10 can take, except for 57. However, the result is only equal to 1 if a10 is equal to 57! There is no solution where username is username, and the password is exactly one character.

So, let's try this time with an 8 letters password. Well, we end up with 81 distinct paths. This is clearly too tedious to analyze by hand!

Scaling up

First of all, we are going to start simple, and aim for the case where the result is 1.

Doing so, we can take all of our constraints, and add result == 1 to the list of constraints. Then we need to look for a path where we can find input values that verify all the constraints.

By turning the result value into an extra constraint, and with more simplifications, we get constraints systems such as:

u32(a10) == 57
u32(a13) == 0
u32(a12) != 0
u32(a11) != 0
u32(a10) != 0
(u32(a10) == 0) | ((u32(a10) - 57) != 0)

This is also something that is obviously false, because the first and last constraints are contradictory. We can automate this simple analysis by traversing the constraints, and repeatedly simplifying everything. When we get a constraint with the form "variable == value", we can assume the variable holds this value, and try to simplify even further. If we end up with a contradiction, we can drop this path.

Part 1, complete!

This approach, while extremely simple, works very well. We can increase the size of the password until ...

$ stack run -- --mode Symbolic /path/to/keygenme.wasm 'username' '????????????????????????' --expected 1
$ stack run -- --mode Symbolic /path/to/keygenme.wasm 'username' '?????????????????????????' --expected 1
$ stack run -- --mode Symbolic /path/to/keygenme.wasm 'username' '??????????????????????????' --expected 1
# Solution:
username: username
password: 95a1e512-530dd28b-a8c6285f
$ stack run -- --mode Symbolic /path/to/keygenme.wasm 'username1' '??????????????????????????' --expected 1
# Solution:
username: username1
password: 3b0f4bb0-88df1b4f-3822a4d7

We got a keygen without ever looking at the code! Of course, this challenge is particularly well suited to that approach, at least up to now.

Fully solving the challenge

Now, can we use the same approach to solve the second part of the challenge? It turns out that we can discover the length of the expected password, just like with the first part. However, when the correct length is found, we end up with 18 possible paths, that have constraints like:

# Solution:
Constraints:
 * (oneif(a11 == 51) + (1 & (oneif((((((((((((((((((1727535714 ^ (u32(a11) * 7) rotl 6) ^ (u32(a12) * 5) rotl 7) ^ (u32(a13) * 3) rotl 3) ^ (u32(a14) * 13) rotl 4) ^ (u32(a15) * 11) rotl 5) ^ (u32(a16) * 7) rotl 6) ^ (u32(a17) * 5) rotl 7) ^ (u32(a18) * 3) rotl 3) ^ (u32(a19) * 13) rotl 4) ^ (u32(a20) * 11) rotl 5) ^ (u32(a21) * 7) rotl 6) ^ (u32(a22) * 5) rotl 7) ^ (u32(a23) * 3) rotl 3) ^ (u32(a24) * 13) rotl 4) ^ (u32(a25) * 11) rotl 5) ^ (u32(a26) * 7) rotl 6) ^ (u32(a27) * 5) rotl 7) ^ (u32(a28) * 3) == 1655684815) & (oneif((((((((((((((((((557020196 ^ (u32(a11) * 11) rotl 5) ^ (u32(a12) * 7) rotl 6) ^ (u32(a13) * 5) rotl 7) ^ (u32(a14) * 3) rotl 3) ^ (u32(a15) * 13) rotl 4) ^ (u32(a16) * 11) rotl 5) ^ (u32(a17) * 7) rotl 6) ^ (u32(a18) * 5) rotl 7) ^ (u32(a19) * 3) rotl 3) ^ (u32(a20) * 13) rotl 4) ^ (u32(a21) * 11) rotl 5) ^ (u32(a22) * 7) rotl 6) ^ (u32(a23) * 5) rotl 7) ^ (u32(a24) * 3) rotl 3) ^ (u32(a25) * 13) rotl 4) ^ (u32(a26) * 11) rotl 5) ^ (u32(a27) * 7) rotl 6) ^ (u32(a28) * 5) == 94170085) & (oneif((((((((((((((((((4152203773 ^ (u32(a11) * 13) rotl 4) ^ (u32(a12) * 11) rotl 5) ^ (u32(a13) * 7) rotl 6) ^ (u32(a14) * 5) rotl 7) ^ (u32(a15) * 3) rotl 3) ^ (u32(a16) * 13) rotl 4) ^ (u32(a17) * 11) rotl 5) ^ (u32(a18) * 7) rotl 6) ^ (u32(a19) * 5) rotl 7) ^ (u32(a20) * 3) rotl 3) ^ (u32(a21) * 13) rotl 4) ^ (u32(a22) * 11) rotl 5) ^ (u32(a23) * 7) rotl 6) ^ (u32(a24) * 5) rotl 7) ^ (u32(a25) * 3) rotl 3) ^ (u32(a26) * 13) rotl 4) ^ (u32(a27) * 11) rotl 5) ^ (u32(a28) * 7) == 1350428601) & (oneif((((((((((((((((((1457549167 ^ (u32(a11) * 3) rotl 3) ^ (u32(a12) * 13) rotl 4) ^ (u32(a13) * 11) rotl 5) ^ (u32(a14) * 7) rotl 6) ^ (u32(a15) * 5) rotl 7) ^ (u32(a16) * 3) rotl 3) ^ (u32(a17) * 13) rotl 4) ^ (u32(a18) * 11) rotl 5) ^ (u32(a19) * 7) rotl 6) ^ (u32(a20) * 5) rotl 7) ^ (u32(a21) * 3) rotl 3) ^ (u32(a22) * 13) rotl 4) ^ (u32(a23) * 11) rotl 5) ^ (u32(a24) * 7) rotl 6) ^ (u32(a25) * 5) rotl 7) ^ (u32(a26) * 3) rotl 3) ^ (u32(a27) * 13) rotl 4) ^ (u32(a28) * 11) == 3977135300) & oneif((((((((((((((((((3215896498 ^ (u32(a11) * 5) rotl 7) ^ (u32(a12) * 3) rotl 3) ^ (u32(a13) * 13) rotl 4) ^ (u32(a14) * 11) rotl 5) ^ (u32(a15) * 7) rotl 6) ^ (u32(a16) * 5) rotl 7) ^ (u32(a17) * 3) rotl 3) ^ (u32(a18) * 13) rotl 4) ^ (u32(a19) * 11) rotl 5) ^ (u32(a20) * 7) rotl 6) ^ (u32(a21) * 5) rotl 7) ^ (u32(a22) * 3) rotl 3) ^ (u32(a23) * 13) rotl 4) ^ (u32(a24) * 11) rotl 5) ^ (u32(a25) * 7) rotl 6) ^ (u32(a26) * 5) rotl 7) ^ (u32(a27) * 3) rotl 3) ^ (u32(a28) * 13) == 231166559))))) * 2)) == 2
 * a28 != 0
 * a27 != 0
 * a26 != 0
 * a25 != 0
 * a24 != 0
 * a23 != 0
 * a22 != 0
 * a21 != 0
 * a20 != 0
 * a19 != 0
 * a18 != 0
 * a17 != 0
 * a16 != 0
 * a15 != 0
 * a14 != 0
 * a13 != 0
 * a12 != 0
 * a11 != 0
 * (a11 == 0) | ((u32(a11) - 51) != 0) == True

That is one hell of an expression! Solving this by extending the previous approach will be extremely tedious, if not outright impossible.

Using an SMT solver

As described in this series introduction, an SMT solver appears to be the perfect tool for this problem, as we already have a bunch of equations that need to be solved.

For each path, we have a lot of logical propositions. For this path to be the winning path, all propositions must hold true. That means that we can combine them with an and operator to create a (very) large proposition.

Then, any path can succeed, so we can combine all the paths propositions into one (huge) proposition, using the or operator.

However, how can we interface with the solver? There is a very easy way to do so, using the sbv package. This is so easy that the only thing that needed to be done was to write the Symb instance, and the code that actually runs the solver.

We can check that it works by checking again if it can solve the first part of the challenge:

$ stack run -- --mode SMT /path/to/keygenme.wasm 'username' '??????????????????????????' --expected 1
# Solution:
username: username
password: 95a1e512-530dd28b-a8c6285f

And for the second part:

$ time stack run -- --mode SMT /path/to/keygenme.wasm 'username1' '??????????????????' --expected 2 
username: username1
password: ************** # REDACTED, run the program yourself to get the solution :)
stack run -- --mode SMT /path/to/keygenme.wasm    2  25.64s user 0.09s system 100% cpu 25.691 total

And it is a win!

Conclusion

We managed to completely solve the challenge without ever reversing it. Of course, this challenge is particularly well suited for such an approach. Hopefully, if you have followed so far, you now have a good idea of what writing a simple symbolic execution engine entails. This means that you can either try to write your own, or that you should be comfortable using something a little more battle tested. Maybe even in an language that is not Python!2

I also hope that I made obvious the fact that SMT solvers are easy to use and extremely powerful. They can be used in a variety of situations. For example, we use them in our password cracking product, Kraqozorus, to generate optimal cracking strategies.

That is all for now, let's get cracking!

 

  • 1. It turns out that the challenge can be solved with an arbitrary username, so we will just pick "username", and try to find the winning passwords. Of course, it is possible to know that beforehand by manually reversing, but it can also be deduced by looking at the constraints generated from each path.
  • 2. Yes, you should write all your security tools in Haskell. Or at least in a strongly typed languages.