This challenge consists of a precompiled python file generated with permuted opcodes. The description is as follows:
Challenge
Once we download and extract the zip, we get two files:
Archive: py_d5764c66f02cccdb356c532d60d4d079.zip
inflating: crypt.pyc
inflating: encrypted_flag
Looking at the output of the strings command on crypt.pyc, it seems that a decrypt function is implemented in the compiled file. We have to restore the opcodes to their standard values so that a stock interpreter can load and execute the file, which is impossible in its current form.
[kimera@kimera zip]$ strings crypt.pyc
f4oXc
!@#$%^&*t
abcdefghs
<>{}:"i
EOF(
rotort
newrotort
encrypt(
datat
key_at
key_bt
key_ct
secrett
rot(
/Users/hen/Lab/0CTF/py/crypt.pyR
!@#$%^&*R
<>{}:"i
decrypt(
/Users/hen/Lab/0CTF/py/crypt.pyR
/Users/hen/Lab/0CTF/py/crypt.pyt
<module>
[kimera@kimera zip]$ python2
Python 2.7.13 (default, Dec 21 2016, 07:16:46)
[GCC 6.2.1 20160830] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import crypt
XXX lineno: 1, opcode: 153
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/hen/Lab/0CTF/py/crypt.py", line 1, in <module>
SystemError: unknown opcode
>>>
As shown above, loading the module fails with an unknown opcode error on the first line. We will have to go through the precompiled Python opcodes to make it work.
Understanding .pyc files
In Python 2.7, the .pyc structure has basically three fields:
- 4 bytes: Magic number. Two bytes identifying the interpreter version, followed by "\r\n" so that the magic number changes (and loading fails) if the file is corrupted by being processed as text.
- 4 bytes: Timestamp. Unix modification time of the source file that generated the .pyc file.
- N bytes: Marshaled object. The actual data to be loaded and executed.
In the following image, the first mark corresponds to the magic number, the second one is the timestamp, and the rest of the data, unmarked, is the marshaled object.
.pyc overview
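As a minimal sketch of the layout described above, the header can be split with the struct module. The sample bytes are hand-built for illustration: 62211 is the magic word used by Python 2.7, while the timestamp and the one-byte payload are made up, and the helper name parse_pyc_header is hypothetical. Newer Python 3 releases use a longer header, so this only matches the 8-byte format discussed here.

```python
import struct

def parse_pyc_header(blob):
    """Split a Python 2.7-style .pyc blob into its three fields."""
    # 2-byte little-endian magic word followed by the literal "\r\n"
    magic_word, line_end = struct.unpack("<H2s", blob[:4])
    # 4-byte little-endian Unix timestamp of the source file
    (timestamp,) = struct.unpack("<I", blob[4:8])
    # everything else is the marshaled code object
    return magic_word, line_end, timestamp, blob[8:]

# Hand-built sample: 2.7 magic, invented timestamp, invented 1-byte payload
sample = b"\x03\xf3\r\n" + struct.pack("<I", 1489000000) + b"N"
magic_word, line_end, timestamp, payload = parse_pyc_header(sample)
print(magic_word, repr(line_end), timestamp, repr(payload))
```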
Now that we know about the marshaled object, we can try to load it and start mapping its content into actual Python objects.
Working with .pyc files
To work with .pyc files we are going to use two modules:
- marshal: This module contains functions that can read and write Python values in a binary format.
- dis: The dis module supports the analysis of CPython byte-code by disassembling it.
The way to load the .pyc in memory is using the marshal module. First of all, we have to load just the marshaled data and not the whole file, like this:
In [4]: marshaled_data = open("crypt.pyc").read()[8:]
In [5]: m = marshal.loads(marshaled_data)
In the first line, the header metadata is stripped and only the marshal payload is stored in the variable; the second line loads it using the loads function of the marshal module. Now that we have the object, we can access all the mapped fields of the object m.
Marshal initial values
In the above example we can see different attributes:
- co_filename: original name of the source file
- co_names: names declared in the scope
- co_varnames: local variables, normally used in functions
- co_consts: all declared objects, marshaled
- co_code: raw opcodes of the scope, in this case the __main__ scope
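The loading step and the attributes listed above can be tried without the challenge file. The following is only a sketch: it compiles a small made-up module, prepends eight zero bytes as a stand-in for the 8-byte header, and then loads the payload the same way as in the write-up. The source text and the demo.py filename are invented for the example, and it is written for Python 3, where the field names are the same.

```python
import marshal

# Made-up module source, standing in for the real crypt.py
source = "import os\n\ndef decrypt(data):\n    return data\n"
code = compile(source, "demo.py", "exec")

# Fake 8-byte header + marshaled code object, mimicking the 2.7 layout
blob = b"\x00" * 8 + marshal.dumps(code)

# Same idea as in the write-up: skip the header, load the payload
m = marshal.loads(blob[8:])

print(m.co_filename)   # name of the source file
print(m.co_names)      # names used in the module scope
print(m.co_varnames)   # local variables (empty at module level)
print(len(m.co_code))  # size of the raw opcode string
```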
We can also see the same attributes for each function. For example, you may notice two elements of type code in the co_consts attribute. Those elements are functions: each entry is itself a code object containing all the opcodes and meta information, just like our current object.
Let's see an example: we are going to get the decrypt function, which is at the fourth position (index 3) of the co_consts array.
m_decrypt = m.co_consts[3]
As expected, we can now access all the attributes of the decrypt function, and we can see that they differ from those of the previously loaded __main__ object.
Marshal initial values
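Instead of relying on a fixed index such as co_consts[3], the nested code objects can also be looked up by name. This is a small sketch with a hypothetical helper find_code, run against a made-up module rather than the challenge file, and written for Python 3.

```python
import types

def find_code(parent, name):
    """Return the nested code object called `name` from parent.co_consts."""
    for const in parent.co_consts:
        if isinstance(const, types.CodeType) and const.co_name == name:
            return const
    raise KeyError(name)

# Made-up module with the same two function names as the challenge
source = (
    "def encrypt(data):\n"
    "    return data\n"
    "\n"
    "def decrypt(data):\n"
    "    return data\n"
)
module_code = compile(source, "demo.py", "exec")
decrypt_code = find_code(module_code, "decrypt")
print(decrypt_code.co_name, decrypt_code.co_varnames)
```

The lookup only reads co_consts and co_name, so it does not depend on the opcodes being valid.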
Understanding OpCodes
We now have everything we need to properly load a .pyc file and map it into an object. It's time to continue with the challenge. First of all, we are going to take a look at the current disassembled code.
To disassemble the opcodes we are going to use the dis function of the dis module. If the argument is a raw string of bytes, it just disassembles the data as is; if a code object loaded with marshal is passed, it also cross-references names, constants and opcodes, which helps a lot to understand what is going on.
Let's see how the disassembled code looks, starting with the __main__ scope. There is a full list of the opcodes and their meaning on the dis module documentation page:
__main__ marshaled object disassembled
The module fails while rendering the opcodes. This happens because we passed the whole object, and the dis module tries to match many attributes and values consistently. Since the opcodes are permuted, many inconsistencies appear and make the disassembler crash.
Let's try to disassemble just the co_code property, which is the raw data without attributes such as variable names or constants, and see if it works.
__main__ raw opcodes disassembled
This time all opcodes have been interpreted by dis, which may help us get an idea, but the current interpretation is still wrong. To understand what is wrong, the easiest way, IMHO, is to create our own simple script and see what correctly disassembled code looks like. For example, let's create a file hello.py:
#!/usr/bin/python

def hello(name):
    print "Hi!" + name

if __name__ == "__main__":
    hello("foo")
We are going to repeat the same steps as before to load the marshaled object from the .pyc file generated from the above code.
>>> import dis
>>> import marshal
>>> import hello # it generates the '.pyc' file
>>> m = marshal.loads(open("hello.pyc").read()[8:])
>>> dis.dis(m)
3 0 LOAD_CONST 0 (<code object hello at 0x10ddfbb30, file "hello.py", line 3>)
3 MAKE_FUNCTION 0
6 STORE_NAME 0 (hello)
6 9 LOAD_NAME 1 (__name__)
12 LOAD_CONST 1 ('__main__')
15 COMPARE_OP 2 (==)
18 POP_JUMP_IF_FALSE 34
7 21 LOAD_NAME 0 (hello)
24 LOAD_CONST 2 ('foo')
27 CALL_FUNCTION 1
30 POP_TOP
31 JUMP_FORWARD 0 (to 34)
>> 34 LOAD_CONST 3 (None)
37 RETURN_VALUE
>>>
Now everything makes sense: the opcodes are properly interpreted and disassembled, and the cross-referencing done by dis works much better. Let's look at the above output in more detail. The opcodes seem to be separated into three different blocks.
> First Block
It is formed by the instructions at offsets 0 to 6, both included. This block describes the global scope, in this case the declaration of the function hello, which matches the opcodes we have:
- LOAD_CONST(consti)
- Pushes co_consts[consti] onto the stack.
- MAKE_FUNCTION(argc)
- Pushes a new function object on the stack. Top-of-stack (TOS) is the code associated with the function. The function object is defined to have argc default parameters, which are found below TOS when it is called.
- STORE_NAME(namei)
- Implements name = TOS. namei is the index of name in the attribute co_names of the code object. The compiler tries to use STORE_FAST or STORE_GLOBAL if possible.
The overall behavior is as follows. The first opcode pushes a reference to the code of the hello function, which is stored in the co_consts array, onto the stack. Then MAKE_FUNCTION creates a function object from the code reference pushed by the previous opcode and pushes the new object onto the stack. Finally, STORE_NAME assigns the TOS object to a name. Since the previous opcode pushed a function object, STORE_NAME binds that function to the name given by its parameter.
> Second Block
It is formed by the instructions at offsets 9 to 18, both included. This block is just the following comparison:
if __name__ == "__main__":
This block makes it easy to understand how a comparison works and how it is represented by the dis module. We also have some new opcodes:
- LOAD_NAME(namei)
- Pushes the value associated with co_names[namei] onto the stack.
- COMPARE_OP(opname)
- Performs a Boolean operation and put the result on TOS. The operation name can be found in cmp_op[opname].
- POP_JUMP_IF_FALSE(target)
- If TOS is false, sets the byte-code counter to target. TOS is popped.
This second block loads the variable __name__ using LOAD_NAME and the string constant __main__ with the LOAD_CONST opcode. Once both are on the stack, the comparison is executed by the COMPARE_OP opcode, which takes the comparison operator as its argument and the two values at the top of the stack as operands.
Finally, the last opcode, POP_JUMP_IF_FALSE, changes the execution flow depending on the result of the comparison, as its name says. In this case, if the result is false, execution jumps to offset 34, just before the last opcode.
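The mapping from the COMPARE_OP argument to an operator can be checked directly. In Python 2.7 the argument is used as an index into dis.cmp_op; the table still exists in Python 3, although recent versions encode the argument differently, so treat this as a sketch of the lookup only.

```python
import dis

oparg = 2  # argument of COMPARE_OP in the listing above
operator_name = dis.cmp_op[oparg]
print(operator_name)
```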
> Third Block
It is formed by the instructions at offsets 21 to 37, both included. This is the last block of the code, whose function call is executed only if the file is run standalone by the interpreter. Again, let's see the new opcodes:
- CALL_FUNCTION(argc)
- Calls a function. The low byte of argc indicates the number of positional parameters, the high byte the number of keyword parameters. On the stack, the opcode finds the keyword parameters first. For each keyword argument, the value is on top of the key. Below the keyword parameters, the positional parameters are on the stack, with the right-most parameter on top. Below the parameters, the function object to call is on the stack. Pops all function arguments, and the function itself off the stack, and pushes the return value.
- POP_TOP
- Removes the top-of-stack (TOS) item.
- JUMP_FORWARD(delta)
- Increments byte-code counter by delta.
- RETURN_VALUE
- Returns with TOS to the caller of the function.
This block describes what happens when the script is executed standalone. First, the previously defined function is loaded using the LOAD_NAME opcode. Then LOAD_CONST puts the string foo on TOS. Finally, CALL_FUNCTION calls the hello function using foo as argument.
Once the function has executed, its return value is left on the stack. Since we do not assign it to any variable, it is simply discarded, which is why we see the POP_TOP opcode; STORE_NAME would be used instead if the return value had to be assigned to a variable.
The JUMP_FORWARD opcode is added where the conditional block ends. Note that up to this point we are still analyzing opcodes executed only if the script is called standalone.
The final two opcodes are just a return None instruction: the None constant is loaded, and the module code finishes and returns it using RETURN_VALUE. This is also where the flow jumps when the comparison returns false.
All those opcodes, as stated above, only represent the global scope; we still have not seen the hello function itself. Remember that the marshaled object still has the co_consts array, which holds all the constants used and also the defined functions. Here are the opcodes of the function:
>>> m.co_consts
(<code object hello at 0x1004df930, file "hello.py", line 3>, 'foo', None)
>>> dis.dis(m.co_consts[0])
4 0 LOAD_CONST 1 ('Hi!')
3 LOAD_FAST 0 (name)
6 BINARY_ADD
7 PRINT_ITEM
8 PRINT_NEWLINE
9 LOAD_CONST 0 (None)
12 RETURN_VALUE
>>>
In this case, since it's a function, it uses some different opcodes, but based on what we have seen, it's easy to understand how it works once we know the meaning of each opcode:
- LOAD_FAST(var_num)
- Pushes a reference to the local co_varnames[var_num] onto the stack.
- BINARY_ADD
- Implements TOS = TOS1 + TOS.
- PRINT_ITEM
- Prints TOS to the file-like object bound to sys.stdout. There is one such instruction for each item in the print statement.
- PRINT_NEWLINE
- Prints a new line on sys.stdout. This is generated as the last operation of a print statement, unless the statement ends with a comma.
It just concatenates two strings with BINARY_ADD: the constant loaded by LOAD_CONST and the local variable loaded by LOAD_FAST. Finally, it prints the value with PRINT_ITEM, adding a newline character at the end with PRINT_NEWLINE. The function returns None.
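The offsets in the listing above (0, 3, 6, 7, 8, 9, 12) follow from how Python 2.7 encodes instructions: opcodes numbered 90 or higher are followed by a two-byte little-endian argument, the others take a single byte. The sketch below walks a byte string using that rule. The helper name iter_instructions is hypothetical, the byte values are the Python 2.7 opcode numbers for the body of hello() (typed in by hand, not read from a file), and EXTENDED_ARG is ignored for brevity. It is written for Python 3.

```python
HAVE_ARGUMENT = 90  # Python 2.7: opcodes >= 90 carry a 2-byte argument

def iter_instructions(code_bytes):
    """Yield (offset, opcode, arg) for Python 2.7-style bytecode."""
    data = bytearray(code_bytes)
    i = 0
    while i < len(data):
        op = data[i]
        if op >= HAVE_ARGUMENT:
            # argument is stored little-endian in the next two bytes
            yield i, op, data[i + 1] | (data[i + 2] << 8)
            i += 3
        else:
            yield i, op, None
            i += 1

# LOAD_CONST 1, LOAD_FAST 0, BINARY_ADD, PRINT_ITEM, PRINT_NEWLINE,
# LOAD_CONST 0, RETURN_VALUE (Python 2.7 opcode numbers)
hello_body = bytes(bytearray([100, 1, 0, 124, 0, 0, 23, 71, 72, 100, 0, 0, 83]))
instructions = list(iter_instructions(hello_body))
offsets = [offset for offset, _, _ in instructions]
print(offsets)
```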
OpCodes Permutation
Now that we have all the needed tools, we have to find a way to manipulate byte-codes and see the result. To simulate the opcode permutation, we can just modify how dis interprets the data. To do so, we can use the following attribute available in the module:
- dis.opname: Sequence of operation names, indexable using the byte-code.
It seems that dis uses this attribute to render the opcodes. Let's see its content:
>>> dis.opname
['STOP_CODE', 'POP_TOP', 'ROT_TWO', 'ROT_THREE', 'DUP_TOP', 'ROT_FOUR', '<6>', '<7>', '<8>', 'NOP', 'UNARY_POSITIVE', 'UNARY_NEGATIVE', 'UNARY_NOT', 'UNARY_CONVERT', '<14>', 'UNARY_INVERT', '<16>', '<17>', '<18>', 'BINARY_POWER', 'BINARY_MULTIPLY', 'BINARY_DIVIDE', 'BINARY_MODULO', 'BINARY_ADD', 'BINARY_SUBTRACT', 'BINARY_SUBSCR', 'BINARY_FLOOR_DIVIDE', 'BINARY_TRUE_DIVIDE', 'INPLACE_FLOOR_DIVIDE', 'INPLACE_TRUE_DIVIDE', 'SLICE+0', 'SLICE+1', 'SLICE+2', 'SLICE+3', '<34>', '<35>', '<36>', '<37>', '<38>', '<39>', 'STORE_SLICE+0', 'STORE_SLICE+1', 'STORE_SLICE+2', 'STORE_SLICE+3', '<44>', '<45>', '<46>', '<47>', '<48>', '<49>', 'DELETE_SLICE+0', 'DELETE_SLICE', .. ]
opname is just an array holding the name of each opcode at the position of its numeric value. Where no name is assigned to an index, the content is <$IDX$>, where $IDX$ is the position of the element in the array. dis uses this array to print the proper name of each opcode in its output.
If we modify a name, we can see how the name of the opcode changes in the output. For example, let's change the name of opcode 100, which is LOAD_CONST:
>>> dis.opname[100]
'LOAD_CONST'
>>> dis.opname[100] = "TEST"
>>> dis.dis(m)
3 0 TEST 0 (<code object hello at 0x1004df930, file "hello.py", line 3>)
3 MAKE_FUNCTION 0
6 STORE_NAME 0 (hello)
6 9 LOAD_NAME 0 (hello)
12 TEST 1 ('foo')
15 CALL_FUNCTION 1
18 STORE_NAME 1 (ret)
21 TEST 2 (None)
24 RETURN_VALUE
>>>
We now see the TEST string instead of LOAD_CONST, so it works. Although we are only modifying how the opcode is displayed and not the value itself, it is still useful for running some tests before actually saving the modified opcodes. To automate the permutation, we are going to create a function that makes those changes in this array:
def permute(opname1, opname2):
    idx1 = dis.opname.index(opname1)
    idx2 = dis.opname.index(opname2)
    dis.opname[idx1], dis.opname[idx2] = dis.opname[idx2], dis.opname[idx1]
But this only changes the representation. It is useful for checking what we have done before actually saving the permutations into a file, but we still need a way to save them, so we need another function.
The following function does exactly this. Given the opcode string of a code object, the input filename, the changes to apply and the output filename, it replaces the opcodes in a really silly way (it could fail, since every byte with a given value is changed, including argument bytes, but it worked for this challenge):
def store_code(code, ifile, changes, ofile):
    # Swap every byte of the code string that appears in `changes`
    ret = []
    for i in code:
        if changes.get(ord(i), None) is not None:
            ret.append(changes.get(ord(i)))
            continue
        ret.append(ord(i))
    ret_str = "".join(map(chr, ret))
    # Patch the code string inside the original file and save the result
    nbuf = open(ifile, "rb").read()
    nbuf = nbuf.replace(code, ret_str)
    open(ofile, "wb").write(nbuf)
To use the above function properly, we need to create a dict with all the permutations performed. Its format is simple: we have to create two entries for each permutation, for example:
>>> changes = {}
>>> changes[100] = 40
>>> changes[40] = 100
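The caveat mentioned for store_code (argument bytes get swapped too) can be avoided by walking the instructions and touching only the opcode bytes. The following is a sketch of that alternative, not the method used in this write-up. It assumes the Python 2.7 encoding (opcodes numbered 90 or higher take a two-byte argument), ignores EXTENDED_ARG, and uses a contrived four-byte example in which a scrambled LOAD_CONST (153) happens to have the argument 100. The function name swap_opcodes is hypothetical, and the code is written for Python 3.

```python
HAVE_ARGUMENT = 90  # Python 2.7: opcodes >= 90 are followed by a 2-byte argument

def swap_opcodes(code_bytes, changes):
    """Apply `changes` to opcode bytes only, leaving argument bytes untouched."""
    out = bytearray(code_bytes)
    i = 0
    while i < len(out):
        real_op = changes.get(out[i], out[i])  # opcode after undoing the swap
        out[i] = real_op
        # step by the size of the real instruction
        i += 3 if real_op >= HAVE_ARGUMENT else 1
    return bytes(out)

changes = {153: 100, 100: 153}
scrambled = bytes([153, 100, 0, 83])  # <153> 100, then RETURN_VALUE

naive = bytes(changes.get(b, b) for b in scrambled)  # swaps every matching byte
careful = swap_opcodes(scrambled, changes)

print(list(naive))    # [100, 153, 0, 83]: the argument was corrupted
print(list(careful))  # [100, 100, 0, 83]: the argument is preserved
```

The step size is taken from the restored opcode because the scrambled value may fall on the other side of the 90 boundary.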
For usability, we are going to change the permute function a bit so that it automatically populates the changes variable:
def permute(opname1, opname2):
    global changes
    idx1 = dis.opname.index(opname1)
    idx2 = dis.opname.index(opname2)
    dis.opname[idx1], dis.opname[idx2] = dis.opname[idx2], dis.opname[idx1]
    # Add the changes to the dict
    changes[idx1] = idx2
    changes[idx2] = idx1
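A variant that avoids the global state is to pass the name table and the dict explicitly and work on a copy of dis.opname. This is only a sketch of that design choice, written for Python 3; the extra parameters table and changes are not part of the original function.

```python
import dis

def permute(table, changes, opname1, opname2):
    """Swap two names in `table` and record the opcode swap in `changes`."""
    idx1 = table.index(opname1)
    idx2 = table.index(opname2)
    table[idx1], table[idx2] = table[idx2], table[idx1]
    changes[idx1] = idx2
    changes[idx2] = idx1

table = list(dis.opname)  # private copy, dis.opname itself is not modified
changes = {}
a = table.index("LOAD_CONST")
b = table.index("STORE_NAME")
permute(table, changes, "LOAD_CONST", "STORE_NAME")

print(table[a], table[b])
print(changes == {a: b, b: a})
```

Since the copy is separate from the module attribute, nothing has to be restored afterwards, at the cost of dis no longer showing the swapped names by itself.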
It's time to use both functions and see if they do what we expect. Let's make a simple modification, for example swapping the first opcode of the crypt.pyc global scope with the third one, and then store the modified opcodes in a new file.
>>> dis.dis(m)
1 0 <153> 0
3 <153> 1
6 MAKE_CLOSURE 0
9 EXTENDED_ARG 0
# < .. cut cut cut... >
>>> permute("<153>","MAKE_CLOSURE")
>>> changes
{134: 153, 153: 134}
>>> store_code(m.co_code, "crypt.pyc", changes, "crypt2.pyc")
>>> permute("<153>","MAKE_CLOSURE") # To reverse the permutation of the opname attribute.
>>> m2 = marshal.loads(open("crypt2.pyc").read()[8:])
>>> dis.dis(m2)
1 0 MAKE_CLOSURE 0
3 MAKE_CLOSURE 1
6 <153> 0
9 EXTENDED_ARG 0
# < .. cut cut cut... >
>>> dis.dis(m)
1 0 <153> 0
3 <153> 1
6 MAKE_CLOSURE 0
9 EXTENDED_ARG 0
# < .. cut cut cut... >
We can see how the permute function populated the changes variable, which store_code then used to modify the opcodes accordingly and save the result under the desired filename.
Challenge
We have just created the tools to solve the challenge. Let's analyze what information we have and see if we can reproduce parts of the code to understand which opcodes we need to permute. Let's recap our current information: variable names, constants and the initial opcode structure:
Attributes
It seems that the names encrypt and decrypt clearly refer to the functions we have in co_consts, but what about rotor? A quick search on the Internet shows that it is a deprecated Python module which implements an Enigma-like rotor cipher.
As stated above, we are going to start by reconstructing the main scope, trying to imitate the original source code, which seems easier. Then, based on what we find, we will repeat the process with the decrypt function.
Fixing main
So, based on what we have seen, we are going to write similar source code, see how it is disassembled, and compare it with the crypt.pyc one. Our script must include the rotor import and two functions, encrypt and decrypt, because we already saw those elements:
import rotor
def encrypt():
    pass
def decrypt():
    pass
Let's see the disassembled code and compare it with the original
crypt.pyc code:
test1 and crypt disassemble comparison
Just by looking at the output we can see direct matches. Let's look at our correct source code in more detail to understand the structure, disassembling the marshaled object instead of only the opcodes so that the library can do its magic:
test1 disassembly with verbose
The output shows the code divided into four parts: the rotor import, the declaration of encrypt, the declaration of decrypt, and finally the return value. All the opcodes in our script should be exactly the same as in crypt.pyc, because both contain only declarations: we haven't added anything extra, and the number and structure of the opcodes match.
Right now, we can make the following permutations:
- LOAD_CONST with <153>
- IMPORT_NAME with MAKE_CLOSURE
- EXTENDED_ARG with STORE_NAME
- MAKE_FUNCTION with LOAD_DEREF
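The pairs above were found by eye, but the same alignment can be sketched in code: if a scrambled opcode sequence and a reference sequence have the same structure, pairing them position by position yields the mapping. The helper derive_mapping is hypothetical. The numbers are Python 2.7 opcode values, and the scrambled sequence is reconstructed from the listings and swaps in this write-up rather than read from crypt.pyc.

```python
def derive_mapping(scrambled_ops, reference_ops):
    """Pair opcodes position by position and check that the pairing is consistent."""
    if len(scrambled_ops) != len(reference_ops):
        raise ValueError("sequences differ in length")
    mapping = {}
    for scrambled, reference in zip(scrambled_ops, reference_ops):
        # setdefault keeps the first pairing; a later mismatch is an error
        if mapping.setdefault(scrambled, reference) != reference:
            raise ValueError("opcode %d maps to two different opcodes" % scrambled)
    return mapping

# Module scope of crypt.pyc as scrambled opcode numbers (reconstructed)
scrambled = [153, 153, 134, 145, 153, 136, 145, 153, 136, 145, 153, 83]
# Same structure compiled by a stock Python 2.7:
# LOAD_CONST, LOAD_CONST, IMPORT_NAME, STORE_NAME, LOAD_CONST, MAKE_FUNCTION, ...
reference = [100, 100, 108, 90, 100, 132, 90, 100, 132, 90, 100, 83]

mapping = derive_mapping(scrambled, reference)
swaps = {s: r for s, r in mapping.items() if s != r}
print(swaps)
```

RETURN_VALUE (83) maps to itself, so it does not appear among the swaps.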
We are going to make those changes and store the results using our functions:
>>> permute("LOAD_CONST","<153>")
>>> permute("IMPORT_NAME","MAKE_CLOSURE")
>>> permute("EXTENDED_ARG","STORE_NAME")
>>> permute("MAKE_FUNCTION","LOAD_DEREF")
>>> store_code(m_crypt.co_code, "crypt.pyc", changes, "crypt_test1.pyc")
Since we are directly changing an internal attribute of the module, we have to restore it every time we want to disassemble using the default values. Therefore, we need to keep a copy of the original table, taken before any permutation is applied:
>>> import copy
>>> original_map = copy.deepcopy(dis.opname)
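The save-and-restore step can also be wrapped in a context manager so that the table is put back even if the disassembly raises. This is a sketch with a hypothetical helper swapped_opnames, written for Python 3; it only demonstrates the effect on the dis.opname table itself.

```python
import contextlib
import dis

@contextlib.contextmanager
def swapped_opnames(*pairs):
    """Temporarily swap pairs of names in dis.opname, restoring them on exit."""
    original = list(dis.opname)  # snapshot before any change
    try:
        for name1, name2 in pairs:
            idx1 = dis.opname.index(name1)
            idx2 = dis.opname.index(name2)
            dis.opname[idx1], dis.opname[idx2] = dis.opname[idx2], dis.opname[idx1]
        yield
    finally:
        dis.opname[:] = original  # restore in place

idx = dis.opname.index("LOAD_CONST")
with swapped_opnames(("LOAD_CONST", "STORE_NAME")):
    inside = dis.opname[idx]
after = dis.opname[idx]
print(inside, after)
```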
To verify the result, we need to load the new crypt_test1.pyc file with the modified opcodes and disassemble it:
In [37]: m_crypt_test1 = marshal.loads(open("crypt_test1.pyc").read()[8:])
In [38]: dis.dis(m_crypt_test1)
1 0 LOAD_CONST 0 (-1)
3 LOAD_CONST 1 (None)
6 IMPORT_NAME 0 (rotor)
9 STORE_NAME 0 (rotor)
2 12 LOAD_CONST 2 (<code object encrypt at 0x7f2c8188c430, file "/Users/hen/Lab/0CTF/py/crypt.py", line 2>)
15 MAKE_FUNCTION 0
18 STORE_NAME 1 (encrypt)
10 21 LOAD_CONST 3 (<code object decrypt at 0x7f2c8188c830, file "/Users/hen/Lab/0CTF/py/crypt.py", line 10>)
24 MAKE_FUNCTION 0
27 STORE_NAME 2 (decrypt)
30 LOAD_CONST 1 (None)
33 RETURN_VALUE
In [39]:
Finally we are able to see something that actually makes sense ;) We just started.
Fixing decrypt
Let's now repeat the process with the decrypt function, which is much bigger.
>>> decrypt_func = m_crypt_test1.co_consts[3]
>>> dis.dis(decrypt_func)
11 0 <153> 1
3 BUILD_SET 1
12 6 <153> 2
9 BUILD_SET 2
13 12 <153> 3
15 BUILD_SET 3
14 18 STORE_GLOBAL 1 (newrotor)
21 <153> 4
24 PRINT_EXPR
25 <153> 5
28 <39>
29 STORE_GLOBAL 2 (decrypt)
32 STORE_GLOBAL 1 (newrotor)
35 <39>
36 STORE_GLOBAL 3---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
# < .. cut cut cut... >
As we can see, none of the previous permutations has been applied to this function yet: store_code only rewrote the opcode string of the module scope. We have to repeat the same operations on the opcodes of decrypt first.
.. same permutations as before ..
In [34]: store_code(decrypt_func.co_code, "crypt_test1.pyc", changes, "crypt_test2.pyc")
In [35]: mtest2 = marshal.loads(open("crypt_test2.pyc").read()[8:])
In [36]: dis.opname = copy.deepcopy(original_map)
In [37]: dis.dis(mtest2.co_consts[3]) # This is the decrypt function
11 0 LOAD_CONST 1 ('!@#$%^&*')
3 BUILD_SET 1
12 6 LOAD_CONST 2 ('abcdefgh')
9 BUILD_SET 2
13 12 LOAD_CONST 3 ('<>{}:"')
15 BUILD_SET 3
14 18 STORE_GLOBAL 1 (newrotor)
21 LOAD_CONST 4 (4)
24 PRINT_EXPR
25 LOAD_CONST 5 ('|')
28 <39>
29 STORE_GLOBAL 2 (decrypt)
32 STORE_GLOBAL 1 (newrotor)
35 <39>
36 STORE_GLOBAL 3---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
# < .. cut cut cut... >
Thinking back to what we learned from our first test script, there is an interesting opcode that we can permute with BUILD_SET. Within a function, local variables are not stored with STORE_NAME; STORE_FAST is used instead. Therefore, we can guess a new permutation here:
In [17]: decrypt_func = mtest2.co_consts[3]
In [18]: permute("BUILD_SET","STORE_FAST")
In [19]: store_code(decrypt_func.co_code, "crypt_test2.pyc", changes, "crypt_test3.pyc")
Let's follow the same process as before: figure out the structure of the original source code and try to reproduce something similar. First of all, let's see what we have now in this function: the disassembled opcodes and all the variables and constants:
In [23]: dis.opname = copy.deepcopy(original_map)
In [24]: mtest3 = marshal.loads(open("crypt_test3.pyc").read()[8:])
In [25]: decrypt_func = mtest3.co_consts[3]
In [26]: dis.dis(decrypt_func)
11 0 LOAD_CONST 1 ('!@#$%^&*')
3 STORE_FAST 1 (key_a)
12 6 LOAD_CONST 2 ('abcdefgh')
9 STORE_FAST 2 (key_b)
13 12 LOAD_CONST 3 ('<>{}:"')
15 STORE_FAST 3 (key_c)
14 18 STORE_GLOBAL 1 (newrotor)
21 LOAD_CONST 4 (4)
24 PRINT_EXPR
25 LOAD_CONST 5 ('|')
28 <39>
29 STORE_GLOBAL 2 (decrypt)
32 STORE_GLOBAL 1 (newrotor)
35 <39>
36 STORE_GLOBAL 3---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-26-769498bb44bf> in <module>()
----> 1 dis.dis(decrypt_func)
/usr/lib/python2.7/dis.pyc in dis(x)
41 print
42 elif hasattr(x, 'co_code'):
---> 43 disassemble(x)
44 elif isinstance(x, str):
45 disassemble_string(x)
/usr/lib/python2.7/dis.pyc in disassemble(co, lasti)
95 print '(' + repr(co.co_consts[oparg]) + ')',
96 elif op in hasname:
---> 97 print '(' + co.co_names[oparg] + ')',
98 elif op in hasjrel:
99 print '(to ' + repr(i + oparg) + ')',
IndexError: tuple index out of range
In [27]: decrypt_func.co_names
Out[27]: ('rotor', 'newrotor', 'decrypt')
In [28]: decrypt_func.co_varnames
Out[28]: ('data', 'key_a', 'key_b', 'key_c', 'secret', 'rot')
In [29]: decrypt_func.co_consts
Out[29]: (None, '!@#$%^&*', 'abcdefgh', '<>{}:"', 4, '|', 2, 'EOF')
In [30]:
With our last permutation, the beginning of the function makes more sense. We can see how some local variables are defined, and the output seems correct, at least up to offset 18. Now that we have those first opcodes, let's write a simple Python script that uses the rotor module and its newrotor function, and analyze what dis gives us. We have to consider all the information we have and try to figure out the code.
We may have to make some assumptions here. We are going to use the following code, saved as newrotor.py:
#!/usr/bin/python
import rotor

def decrypt(data):
    key_a = '!@#$%^&*'
    key_b = 'abcdefgh'
    key_c = '<>{}:"'
    secret = key_a + key_b + key_c
    rot = rotor.newrotor(secret)
    rot.decrypt(data)
Something must still be wrong, because some constants are not yet used by our script. Let's look at the disassembled output anyway. We should be able to understand a bit better how newrotor is called and how the secret key is prepared:
In [55]: import newrotor
In [56]: m_newrotor = marshal.loads(open("newrotor.pyc").read()[8:])
In [57]: dis.opname = copy.deepcopy(original_map)
In [58]: decr_nrotor = m_newrotor.co_consts[2]
In [59]: dis.dis(decr_nrotor)
5 0 LOAD_CONST 1 ('!@#$%^&*')
3 STORE_FAST 1 (key_a)
6 6 LOAD_CONST 2 ('abcdefgh')
9 STORE_FAST 2 (key_b)
7 12 LOAD_CONST 3 ('<>{}:"')
15 STORE_FAST 3 (key_c)
8 18 LOAD_FAST 1 (key_a)
21 LOAD_FAST 2 (key_b)
24 BINARY_ADD
25 LOAD_FAST 3 (key_c)
28 BINARY_ADD
29 STORE_FAST 4 (secret)
9 32 LOAD_GLOBAL 0 (rotor)
35 LOAD_ATTR 1 (newrotor)
38 LOAD_FAST 4 (secret)
41 CALL_FUNCTION 1
44 STORE_FAST 5 (rot)
10 47 LOAD_FAST 5 (rot)
50 LOAD_ATTR 2 (decrypt)
53 LOAD_FAST 0 (data)
56 CALL_FUNCTION 1
59 POP_TOP
60 LOAD_CONST 0 (None)
63 RETURN_VALUE
In [60]:
Nice, it looks similar... kind of. Looking at how it handles the string concatenation we can see an interesting fact: it uses LOAD_FAST twice in a row, followed by BINARY_ADD.
Remember that we have a similar situation in our crypt.pyc, in only one place:
14 18 STORE_GLOBAL 1 (newrotor)
21 LOAD_CONST 4 (4)
24 PRINT_EXPR
25 LOAD_CONST 5 ('|')
28 <39>
29 STORE_GLOBAL 2 (decrypt)
32 STORE_GLOBAL 1 (newrotor)
35 <39>
36 STORE_GLOBAL 3
Since this is the only place where the same opcode appears twice in a row, we can guess that it is the same situation. Therefore, we can make the following permutations:
- LOAD_FAST with STORE_GLOBAL
- <39> with BINARY_ADD
Let's try it and see whether we are right:
In [31]: permute("LOAD_FAST","STORE_GLOBAL")
In [32]: permute("<39>","BINARY_ADD")
In [33]: store_code(decrypt_func.co_code, "crypt_test3.pyc", changes, "crypt_test4.pyc")
In [34]: mtest4 = marshal.loads(open("crypt_test4.pyc").read()[8:])
In [35]: decrypt_func = mtest4.co_consts[3]
In [36]: dis.dis(decrypt_func)
11 0 LOAD_CONST 1 ('!@#$%^&*')
3 STORE_FAST 1 (key_a)
12 6 LOAD_CONST 2 ('abcdefgh')
9 STORE_FAST 2 (key_b)
13 12 LOAD_CONST 3 ('<>{}:"')
15 STORE_FAST 3 (key_c)
14 18 LOAD_FAST 1 (key_a)
21 LOAD_CONST 4 (4)
24 PRINT_EXPR
25 LOAD_CONST 5 ('|')
28 BINARY_ADD
29 LOAD_FAST 2 (key_b)
32 LOAD_FAST 1 (key_a)
35 BINARY_ADD
36 LOAD_FAST 3 (key_c)
39 BINARY_ADD
40 LOAD_CONST 6 (2)
43 PRINT_EXPR
44 BINARY_ADD
45 LOAD_CONST 5 ('|')
48 BINARY_ADD
49 LOAD_FAST 2 (key_b)
52 LOAD_CONST 6 (2)
55 PRINT_EXPR
56 BINARY_ADD
57 LOAD_CONST 7 ('EOF')
60 BINARY_ADD
61 STORE_FAST 4 (secret)
15 64 <155> 0
67 DELETE_ATTR 1 (newrotor)
70 LOAD_FAST 4 (secret)
73 CALL_FUNCTION 1
76 STORE_FAST 5 (rot)
16 79 LOAD_FAST 5 (rot)
82 DELETE_ATTR 2 (decrypt)
85 LOAD_FAST 0 (data)
88 CALL_FUNCTION 1
91 RETURN_VALUE
In [37]:
We are getting closer :) The code is now fully disassembled and separated into blocks, and it looks correct: the first three blocks are just local declarations, the big one at source line 14 builds the secret value (a string), the next one is the actual call to rotor.newrotor(secret), and the last block is the call to rot.decrypt(data) and the return of its value.
We are going to make the next permutations based on what we saw in our test script newrotor.py. Specifically, remember this piece of code from there:
9 32 LOAD_GLOBAL 0 (rotor)
35 LOAD_ATTR 1 (newrotor)
38 LOAD_FAST 4 (secret)
41 CALL_FUNCTION 1
44 STORE_FAST 5 (rot)
Looking only at the block where the newrotor function is called, and comparing it with the one from crypt, we can see the following matches, which give us the proper permutations:
- DELETE_ATTR with LOAD_ATTR
- LOAD_GLOBAL with <155>
Even with those permutations, we still have one wrong opcode, PRINT_EXPR. Looking at its position, we can see that it always follows an integer. Since the whole block concatenates different variables and constants to build the secret value, this opcode has to be replaced by something related to that operation.
After some checks, we can see that the only operator Python accepts between a string and an integer in this context is *. Now we need to know the opcode assigned to that operation, so let's make a simple modification to our script newrotor.py:
#!/usr/bin/python
import rotor

def decrypt(data):
    key_a = '!@#$%^&*'
    key_b = 'abcdefgh'
    key_c = '<>{}:"'
    secret = key_a + key_b + key_c * 4
    rot = rotor.newrotor(secret)
    rot.decrypt(data)
And now, we can disassemble it again:
In [52]: reload(newrotor)
Out[52]: <module 'newrotor' from 'newrotor.pyc'>
In [53]: m_newrotor = marshal.loads(open("newrotor.pyc").read()[8:])
In [54]: dis.dis(m_newrotor.co_consts[2])
# < .. cut cut cut... >
8 18 LOAD_FAST 1 (key_a)
21 LOAD_FAST 2 (key_b)
24 BINARY_ADD
25 LOAD_FAST 3 (key_c)
28 LOAD_CONST 4 (4)
31 BINARY_MULTIPLY
32 BINARY_ADD
33 STORE_FAST 4 (secret)
# < .. cut cut cut... >
In [55]:
And here we have the answer and the last permutation:
- BINARY_MULTIPLY with PRINT_EXPR
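The claim that * is the only binary operator fitting between one of these strings and an integer can be checked mechanically. The sketch below tries a set of binary operators from the operator module on a string and an integer and keeps the ones that do not raise TypeError. The list of candidates is my own selection, and % only fails here because the key strings contain no format specifier.

```python
import operator

candidates = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "//": operator.floordiv,
    "%": operator.mod,
    "**": operator.pow,
    "<<": operator.lshift,
    ">>": operator.rshift,
    "&": operator.and_,
    "|": operator.or_,
    "^": operator.xor,
}

accepted = []
for symbol, func in candidates.items():
    try:
        func("abcdefgh", 2)  # string on the left, integer on the right
    except TypeError:
        continue
    accepted.append(symbol)

print(accepted)
```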
Finally, let's perform all the changes, store the result and see if everything is ok.
In [57]: permute("BINARY_MULTIPLY","PRINT_EXPR")
In [58]: permute("DELETE_ATTR","LOAD_ATTR")
In [59]: permute("LOAD_GLOBAL","<155>")
In [60]: store_code(decrypt_func.co_code, "crypt_test4.pyc", changes, "crypt_test5.pyc")
In [61]: mtest5 = marshal.loads(open("crypt_test5.pyc").read()[8:])
In [62]: dis.opname = copy.deepcopy(original_map)
In [64]: dis.dis(mtest5.co_consts[3])
11 0 LOAD_CONST 1 ('!@#$%^&*')
3 STORE_FAST 1 (key_a)
12 6 LOAD_CONST 2 ('abcdefgh')
9 STORE_FAST 2 (key_b)
13 12 LOAD_CONST 3 ('<>{}:"')
15 STORE_FAST 3 (key_c)
14 18 LOAD_FAST 1 (key_a)
21 LOAD_CONST 4 (4)
24 BINARY_MULTIPLY
25 LOAD_CONST 5 ('|')
28 BINARY_ADD
29 LOAD_FAST 2 (key_b)
32 LOAD_FAST 1 (key_a)
35 BINARY_ADD
36 LOAD_FAST 3 (key_c)
39 BINARY_ADD
40 LOAD_CONST 6 (2)
43 BINARY_MULTIPLY
44 BINARY_ADD
45 LOAD_CONST 5 ('|')
48 BINARY_ADD
49 LOAD_FAST 2 (key_b)
52 LOAD_CONST 6 (2)
55 BINARY_MULTIPLY
56 BINARY_ADD
57 LOAD_CONST 7 ('EOF')
60 BINARY_ADD
61 STORE_FAST 4 (secret)
15 64 LOAD_GLOBAL 0 (rotor)
67 LOAD_ATTR 1 (newrotor)
70 LOAD_FAST 4 (secret)
73 CALL_FUNCTION 1
76 STORE_FAST 5 (rot)
16 79 LOAD_FAST 5 (rot)
82 LOAD_ATTR 2 (decrypt)
85 LOAD_FAST 0 (data)
88 CALL_FUNCTION 1
91 RETURN_VALUE
In [65]:
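Reading source line 14 of the listing as a stack program gives the expression used to build secret. The following sketch replays those instructions on a tiny hand-written stack machine and compares the result with the expression I reconstructed from them. The instruction list is transcribed by hand from the listing above, and the names program and run are hypothetical.

```python
key_a = '!@#$%^&*'
key_b = 'abcdefgh'
key_c = '<>{}:"'

# (operation, operand) pairs transcribed from offsets 18 to 60 of the listing
program = [
    ("load", key_a), ("load", 4), ("mul", None),
    ("load", "|"), ("add", None),
    ("load", key_b), ("load", key_a), ("add", None),
    ("load", key_c), ("add", None),
    ("load", 2), ("mul", None), ("add", None),
    ("load", "|"), ("add", None),
    ("load", key_b), ("load", 2), ("mul", None), ("add", None),
    ("load", "EOF"), ("add", None),
]

def run(program):
    """Execute load/add/mul instructions on a stack and return the final value."""
    stack = []
    for op, operand in program:
        if op == "load":
            stack.append(operand)
        else:
            right = stack.pop()  # TOS
            left = stack.pop()   # TOS1
            stack.append(left * right if op == "mul" else left + right)
    return stack.pop()

secret = run(program)
expected = key_a * 4 + "|" + (key_b + key_a + key_c) * 2 + "|" + key_b * 2 + "EOF"
print(secret == expected, len(secret))
```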
It seems that the whole thing makes sense now. Let's do the final test:
In [86]: import crypt_test5
In [87]: crypt_test5.decrypt(open("encrypted_flag").read())
Out[87]: 'flag{Gue55_opcode_G@@@me}'
In [88]:
Yey! We have the flag! ;)
EOF