0CTF-2017 : Py

This is a Chinese Capture The Flag competition hosted by Team 0ops. It lasted 48 hours: it started on 18th March at 00:00 GMT and ended on the 20th.

files: download .zip

This challenge consists of a precompiled python file generated with permuted opcodes. The description is as follows:

Challenge


Once we download the zip, we get two files:

Archive:  py_d5764c66f02cccdb356c532d60d4d079.zip
  inflating: crypt.pyc
  inflating: encrypted_flag
Looking at the output of the strings command on crypt.pyc, it seems that a decrypt function is implemented in the compiled file. We have to restore the opcodes to their standard values so that our standard interpreter can load and execute the file, because it cannot do so in the current format.

[kimera@kimera zip]$ strings crypt.pyc
f4oXc
!@#$%^&*t
abcdefghs
<>{}:"i
EOF(
rotort
newrotort
encrypt(
datat
key_at
key_bt
key_ct
secrett
rot(
/Users/hen/Lab/0CTF/py/crypt.pyR
!@#$%^&*R
<>{}:"i
decrypt(
/Users/hen/Lab/0CTF/py/crypt.pyR
/Users/hen/Lab/0CTF/py/crypt.pyt
<module>
[kimera@kimera zip]$ python2
Python 2.7.13 (default, Dec 21 2016, 07:16:46)
[GCC 6.2.1 20160830] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import crypt
XXX lineno: 1, opcode: 153
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/hen/Lab/0CTF/py/crypt.py", line 1, in <module>
SystemError: unknown opcode
>>>
As stated above, the module loading process fails with an unknown opcode in the first line. We will have to go through the precompiled Python opcodes to make it work.

Understanding .pyc files

The .pyc structure has basically three fields:

  • 4 bytes: Magic number. Two bytes that identify the bytecode version for the loading engine, followed by "\r\n", which guards against corruption if the file is processed as text, since the magic number would then change.
  • 4 bytes: Timestamp. Unix modification timestamp of the source file that generated the .pyc file.
  • N bytes: Marshaled object. The actual data to be loaded and executed.
In the following image, the first mark corresponds to the magic number, the second one is the timestamp and the rest of the data, unmarked, is the marshaled object.
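The three fields listed above can be separated with the struct module. This is a minimal sketch, assuming the 8-byte header layout used by Python 2 (newer Python 3 releases use a longer header, so the offsets would differ). The helper name split_pyc and the sample bytes are made up for illustration; 62211 is the magic word of CPython 2.7.

```python
import struct

def split_pyc(data):
    """Split a Python 2 style .pyc into (magic_word, timestamp, payload)."""
    if len(data) < 8:
        raise ValueError("too short to contain a .pyc header")
    # Two little-endian bytes of version magic, then the "\r\n" marker.
    magic_word, marker = struct.unpack("<H2s", data[:4])
    if marker != b"\r\n":
        raise ValueError("CR LF marker missing after the magic word")
    # Four little-endian bytes with the source modification time.
    (timestamp,) = struct.unpack("<I", data[4:8])
    return magic_word, timestamp, data[8:]

# Synthetic file: a 2.7 magic word, an arbitrary timestamp and a fake payload.
sample = struct.pack("<H2sI", 62211, b"\r\n", 1489795200) + b"marshal-bytes"
magic_word, timestamp, payload = split_pyc(sample)
print(magic_word, timestamp, len(payload))
```

With a real file, the payload returned here is what would be handed to marshal.loads.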

.pyc overview


Now that we know the existence of the marshaled object, we can try to load it and start working by mapping the content into actual objects in python.

Working with .pyc files

To work with .pyc files we are going to use two modules:

  • marshal: this module contains functions that can read and write Python values in a binary format.
  • dis: the dis module supports the analysis of CPython byte-code by disassembling it.
The way to load the .pyc in memory is using the marshal module. First of all we have to load just the marshaled data and not the whole file. Just like this:

In [4]: marshaled_data = open("crypt.pyc").read()[8:]
In [5]: m = marshal.loads(marshaled_data)

In the first line, the header metadata is stripped and only the marshal payload is kept, which is then loaded with the loads function of the marshal module. Now that we have the object, we can access all the mapped fields of m.
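To experiment with this without the challenge file, the same round trip can be reproduced in memory. The sketch below compiles a small made-up module (its source and the file name demo.py are assumptions for illustration), serializes it with marshal.dumps, which produces the kind of payload a .pyc stores after its header, and loads it back.

```python
import marshal

# Hypothetical module with the same overall shape as the challenge file.
source = (
    "import os\n"
    "\n"
    "def encrypt(data):\n"
    "    return data\n"
    "\n"
    "def decrypt(data):\n"
    "    return data\n"
)

code = compile(source, "demo.py", "exec")
payload = marshal.dumps(code)   # bytes as stored after the .pyc header
m = marshal.loads(payload)      # back to a code object

print(m.co_filename)
print(m.co_names)
# Nested code objects (the functions) live among the constants.
nested = [c for c in m.co_consts if hasattr(c, "co_code")]
print([c.co_name for c in nested])
```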

Marshal initial values


In the above example we can see different attributes:

  • co_filename original name of the source file
  • co_names names declared in the scope
  • co_varnames local variables, normally used in functions
  • co_consts all declared objects marshaled
  • co_code opcode of the scope, in this case, the __main__ scope
The same attributes exist for each function. For example, you may notice two elements of type code in the co_consts attribute. Those elements are the functions: each one is a code object carrying its own opcodes and meta information, just like our current object.

Let's see an example: we are going to get the decrypt function, which is at the fourth position of the co_consts array.

m_decrypt = m.co_consts[3]

As expected, we can now access all the attributes related to the decrypt function, and we can see that they differ from those of the previously loaded __main__ object.
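Indexing co_consts by position works here, but the position depends on the file. A hedged alternative is to look the function up by its co_name. The helper find_code below is a hypothetical name, and the module it searches is compiled in place for the example rather than taken from the challenge.

```python
def find_code(code, name):
    """Return the first nested code object called `name`, or None."""
    for const in code.co_consts:
        if hasattr(const, "co_code"):          # const is itself a code object
            if const.co_name == name:
                return const
            found = find_code(const, name)     # search inner functions too
            if found is not None:
                return found
    return None

# Stand-in module; with the challenge file this would be the object `m`.
module = compile(
    "def encrypt(d):\n    pass\n\ndef decrypt(d):\n    pass\n",
    "demo.py",
    "exec",
)
decrypt_code = find_code(module, "decrypt")
print(decrypt_code.co_name, decrypt_code.co_varnames)
```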

Marshal initial values

Understanding OpCodes

We now have everything we need to properly load a .pyc file and map it into an object. It's time to continue with the challenge. First of all, we are going to take a look at the current disassembled code.

In order to disassemble the opcodes we are going to use the dis module, whose main function is also called dis. If the parameter is a raw buffer, it just disassembles the whole data; if a code object is used, it cross-references names, constants and opcodes, which helps a lot to understand what is going on.

Let's see what the disassembled code looks like, starting with the __main__ scope. There is a full list of the opcodes' meanings on the dis module homepage:

__main__ marshaled object disassembled


The module fails to represent the opcodes. It happens because we are passing the whole object as parameter, and the dis module tries to match many attributes and values in a consistent way. Since the opcodes are permuted, a lot of inconsistencies appear, which make the disassembler crash.

Let's try to disassemble just the co_code property, which is the raw data, without attributes such as variable names or constants, and see if it works.

__main__ raw opcodes disassembled


This time we can see that all opcodes have been interpreted by dis, which may help to get an idea. But the current interpretation is still wrong.
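Reading the raw co_code does not require dis at all. The following sketch walks a Python 2.7 instruction stream by hand, assuming the 2.7 encoding in which opcodes numbered 90 or above are followed by a two-byte little-endian argument. The function name iter_instructions is hypothetical, EXTENDED_ARG prefixes are not combined with the next instruction, and the sample bytes are the first four instructions of the challenge's module scope as dis reports them in this write-up.

```python
HAVE_ARGUMENT = 90  # CPython 2.7: opcodes >= 90 carry a 2-byte argument

def iter_instructions(co_code):
    """Yield (offset, opcode, arg) for a Python 2.7 style co_code string."""
    code = bytearray(co_code)
    i = 0
    while i < len(code):
        op = code[i]
        if op >= HAVE_ARGUMENT:
            arg = code[i + 1] | (code[i + 2] << 8)  # little-endian argument
            yield i, op, arg
            i += 3
        else:
            yield i, op, None
            i += 1

# <153> 0, <153> 1, MAKE_CLOSURE (134) 0, EXTENDED_ARG (145) 0
raw = bytes(bytearray([153, 0, 0, 153, 1, 0, 134, 0, 0, 145, 0, 0]))
for offset, op, arg in iter_instructions(raw):
    print(offset, op, arg)
```

The offsets produced (0, 3, 6, 9) line up with the ones dis prints, which suggests the instruction boundaries are still intact and only the opcode numbers are scrambled.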

If we want to understand what is wrong, the easiest way IMHO is to create our own simple script and see what correctly disassembled code looks like. For example, let's create a file hello.py:

#!/usr/bin/python

def hello(name):
   print "Hi!" + name

if __name__ == "__main__":
   hello("foo")

We are going to reproduce the same steps as before to load the marshaled object from the .pyc file generated from the above code.

>>> import dis
>>> import marshal
>>> import hello # it generates the '.pyc' file
>>> m = marshal.loads(open("hello.pyc").read()[8:])
>>> dis.dis(m)
  3           0 LOAD_CONST               0 (<code object hello at 0x10ddfbb30, file "hello.py", line 3>)
              3 MAKE_FUNCTION            0
              6 STORE_NAME               0 (hello)

  6           9 LOAD_NAME                1 (__name__)
             12 LOAD_CONST               1 ('__main__')
             15 COMPARE_OP               2 (==)
             18 POP_JUMP_IF_FALSE       34

  7          21 LOAD_NAME                0 (hello)
             24 LOAD_CONST               2 ('foo')
             27 CALL_FUNCTION            1
             30 POP_TOP
             31 JUMP_FORWARD             0 (to 34)
        >>   34 LOAD_CONST               3 (None)
             37 RETURN_VALUE
>>>

Now it all makes sense: the opcodes are properly interpreted and disassembled, and the correlation made by dis works much better. Let's look at the above output in more detail. The opcodes seem to be separated into three different blocks.

> First Block

It is formed by the opcodes at offsets 0 to 6, both included. This block describes the global scope, in this case the declaration of the function hello, which makes sense given the opcodes we have:
LOAD_CONST(consti)
Pushes co_consts[consti] onto the stack.

MAKE_FUNCTION(argc)
Pushes a new function object on the stack. Top-of-stack (TOS) is the code associated with the function. The function object is defined to have argc default parameters, which are found below TOS when it is called.

STORE_NAME(namei)
Implements name = TOS. namei is the index of name in the attribute co_names of the code object. The compiler tries to use STORE_FAST or STORE_GLOBAL if possible.
The overall behavior is that a reference to the hello function code, which is stored in the co_consts array, is pushed onto the stack by the first opcode. Then, MAKE_FUNCTION creates a function object from the code reference loaded by the previous opcode and pushes it onto the stack. Finally, STORE_NAME assigns the TOS object to a name. Since the last opcode executed pushed a function object, this opcode binds that function to the name given as parameter.

> Second Block

It is formed by the opcodes at offsets 9 to 18, both included. This block is just the following comparison:

if __name__ == "__main__":

This block is useful to understand how a comparison works and how it is represented by the dis module. We also have some new opcodes:
LOAD_NAME(namei)
Pushes the value associated with co_names[namei] onto the stack.

COMPARE_OP(opname)
Performs a Boolean operation and put the result on TOS. The operation name can be found in cmp_op[opname].

POP_JUMP_IF_FALSE(target)
If TOS is false, sets the byte-code counter to target. TOS is popped.
This second block loads the variable __name__ using LOAD_NAME and the string constant __main__ with the LOAD_CONST opcode. Once both are on the stack, the comparison is executed by the COMPARE_OP opcode, which takes the comparison name as its argument and operates on the two values at the top of the stack.

Finally, the last opcode, POP_JUMP_IF_FALSE, changes the execution flow depending on the result of the comparison, as its name says. In this case, if the result is false, the execution flow moves to offset 34, just before the last opcode.

> Third Block

It is formed by the opcodes at offsets 21 to 37, both included. This is the last block of the code, which is executed if the file is run standalone by the interpreter. Again, let's see the new opcodes:
CALL_FUNCTION(argc)
Calls a function. The low byte of argc indicates the number of positional parameters, the high byte the number of keyword parameters. On the stack, the opcode finds the keyword parameters first. For each keyword argument, the value is on top of the key. Below the keyword parameters, the positional parameters are on the stack, with the right-most parameter on top. Below the parameters, the function object to call is on the stack. Pops all function arguments, and the function itself off the stack, and pushes the return value.

POP_TOP
Removes the top-of-stack (TOS) item.

JUMP_FORWARD(delta)
Increments byte-code counter by delta.

RETURN_VALUE
Returns with TOS to the caller of the function.
This block describes what happens if the script is executed standalone. First of all, the previously defined function is loaded using the LOAD_NAME opcode. Then, LOAD_CONST pushes the string foo onto the stack. Finally, CALL_FUNCTION calls the hello function using foo as argument.

Once the function is executed, its return value is left on the stack. In this case we don't assign it to any variable, which is why we see the POP_TOP opcode; STORE_NAME would be used instead if the returned value had to be assigned to a variable.

The JUMP_FORWARD opcode is added when the conditional block ends. You may notice that we are still analyzing the opcodes executed only if the script is called standalone.

The final two opcodes are just a return None instruction. The None constant is loaded and the module code finishes, returning that value with RETURN_VALUE. This is precisely the place where the flow jumps when the comparison returns false.
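The stack behavior described in these blocks can be replayed with a toy interpreter. This is only an illustrative sketch, not how CPython is implemented: it models just the opcodes discussed above, it treats COMPARE_OP as == only, and the run function and the recording hello stand-in are hypothetical. The program is the listing from offsets 9 to 37 of the hello.py disassembly.

```python
def run(instructions, consts, names, namespace):
    """Execute a tiny subset of Python 2.7 opcodes on an explicit stack."""
    index_of = {}
    for index, (offset, _, _) in enumerate(instructions):
        index_of[offset] = index
    stack = []
    pc = 0
    while True:
        offset, op, arg = instructions[pc]
        pc += 1
        if op == "LOAD_CONST":
            stack.append(consts[arg])
        elif op == "LOAD_NAME":
            stack.append(namespace[names[arg]])
        elif op == "STORE_NAME":
            namespace[names[arg]] = stack.pop()
        elif op == "COMPARE_OP":
            # Only cmp_op[2], i.e. '==', is modelled in this sketch.
            right = stack.pop()
            left = stack.pop()
            stack.append(left == right)
        elif op == "POP_JUMP_IF_FALSE":
            if not stack.pop():
                pc = index_of[arg]              # absolute target offset
        elif op == "JUMP_FORWARD":
            # delta is relative to the instruction that follows the jump
            pc = index_of[instructions[pc][0] + arg]
        elif op == "CALL_FUNCTION":
            args = [stack.pop() for _ in range(arg)]
            args.reverse()                      # right-most argument was on top
            func = stack.pop()
            stack.append(func(*args))
        elif op == "POP_TOP":
            stack.pop()
        elif op == "RETURN_VALUE":
            return stack.pop()
        else:
            raise ValueError("opcode not modelled: " + op)

program = [
    (9, "LOAD_NAME", 1), (12, "LOAD_CONST", 1), (15, "COMPARE_OP", 2),
    (18, "POP_JUMP_IF_FALSE", 34),
    (21, "LOAD_NAME", 0), (24, "LOAD_CONST", 2), (27, "CALL_FUNCTION", 1),
    (30, "POP_TOP", None), (31, "JUMP_FORWARD", 0),
    (34, "LOAD_CONST", 3), (37, "RETURN_VALUE", None),
]
consts = ("<code object hello>", "__main__", "foo", None)
names = ("hello", "__name__")

calls = []
def hello(name):
    calls.append(name)      # record the call instead of printing

result = run(program, consts, names, {"hello": hello, "__name__": "__main__"})
print(result, calls)
```

Running the same program with __name__ set to anything other than "__main__" takes the jump to offset 34 and never calls hello.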

All those opcodes, as stated above, only represent the global scope; we still have not seen the hello function itself. Remember that the marshaled object still has the co_consts array, which holds all the constants used and also the defined functions. Here are the opcodes of the function:

>>> m.co_consts
(<code object hello at 0x1004df930, file "hello.py", line 3>, 'foo', None)
>>> dis.dis(m.co_consts[0])
  4           0 LOAD_CONST               1 ('Hi!')
              3 LOAD_FAST                0 (name)
              6 BINARY_ADD
              7 PRINT_ITEM
              8 PRINT_NEWLINE
              9 LOAD_CONST               0 (None)
             12 RETURN_VALUE
>>>

In this case, since it's a function, it uses some different opcodes, but based on what we have seen, it's easy to understand how it works if we know the meaning of each opcode:
LOAD_FAST(var_num)
Pushes a reference to the local co_varnames[var_num] onto the stack.

BINARY_ADD
Implements TOS = TOS1 + TOS.

PRINT_ITEM
Prints TOS to the file-like object bound to sys.stdout. There is one such instruction for each item in the print statement.

PRINT_NEWLINE
Prints a new line on sys.stdout. This is generated as the last operation of a print statement, unless the statement ends with a comma.
It just concatenates two strings using BINARY_ADD, one loaded by LOAD_CONST and the other, the local variable, by LOAD_FAST. Finally, it prints the value with PRINT_ITEM and adds a newline character at the end with PRINT_NEWLINE. The function returns None.

OpCodes Permutation

Now that we have all the needed tools, we have to find a way to manipulate byte-codes and see the result. To simulate the opcode permutation we can just modify how dis interprets the data. To do it, we can use the following attribute available in the module.

  • dis.opname Sequence of operation names, indexable using the byte-code.
It seems that dis is using this attribute to represent the opcodes. Let's see its content:

>>> dis.opname
['STOP_CODE', 'POP_TOP', 'ROT_TWO', 'ROT_THREE', 'DUP_TOP', 'ROT_FOUR', '<6>', '<7>', '<8>', 'NOP', 'UNARY_POSITIVE', 'UNARY_NEGATIVE', 'UNARY_NOT', 'UNARY_CONVERT', '<14>', 'UNARY_INVERT', '<16>', '<17>', '<18>', 'BINARY_POWER', 'BINARY_MULTIPLY', 'BINARY_DIVIDE', 'BINARY_MODULO', 'BINARY_ADD', 'BINARY_SUBTRACT', 'BINARY_SUBSCR', 'BINARY_FLOOR_DIVIDE', 'BINARY_TRUE_DIVIDE', 'INPLACE_FLOOR_DIVIDE', 'INPLACE_TRUE_DIVIDE', 'SLICE+0', 'SLICE+1', 'SLICE+2', 'SLICE+3', '<34>', '<35>', '<36>', '<37>', '<38>', '<39>', 'STORE_SLICE+0', 'STORE_SLICE+1', 'STORE_SLICE+2', 'STORE_SLICE+3', '<44>', '<45>', '<46>', '<47>', '<48>', '<49>', 'DELETE_SLICE+0', 'DELETE_SLICE', .. ]

opname is just an array holding the name of each opcode at the position given by its number. Where there is no name related to an index, the content is <$IDX$>, where $IDX$ is the element position in the array. dis uses this array to put the proper name of the opcode in the output.
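The dis module also exposes the reverse lookup, dis.opmap, a dictionary from opcode names to numbers. The short sketch below shows how the two tables relate. The actual numbers depend on the interpreter version, so the values printed on a modern Python will not match the 2.7 ones used in this challenge.

```python
import dis

number = dis.opmap["LOAD_CONST"]    # name -> opcode number
name = dis.opname[number]           # opcode number -> name
print(number, name)

# Slots without an assigned opcode hold a "<N>" placeholder.
unused = [n for n in dis.opname if n.startswith("<")]
print(unused[:5])
```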

If we modify the name, we can see how the name of the opcode changes. For example, let's change the name of the opcode 100, which is LOAD_CONST:

>>> dis.opname[100]
'LOAD_CONST'
>>> dis.opname[100] = "TEST"
>>> dis.dis(m)
  3           0 TEST                     0 (<code object hello at 0x1004df930, file "hello.py", line 3>)
              3 MAKE_FUNCTION            0
              6 STORE_NAME               0 (hello)

  6           9 LOAD_NAME                0 (hello)
             12 TEST                     1 ('foo')
             15 CALL_FUNCTION            1
             18 STORE_NAME               1 (ret)
             21 TEST                     2 (None)
             24 RETURN_VALUE
>>>

We can see now the TEST string instead of LOAD_CONST, so it is working. Although we are only modifying how the opcode is represented and not the value itself, it is still useful for running some tests before actually saving the modified opcodes. To automate the permutation, we are going to create a function to make those changes in this array:

def permute(opname1, opname2):
    idx1 = dis.opname.index(opname1)
    idx2 = dis.opname.index(opname2)
    dis.opname[idx1],dis.opname[idx2] = dis.opname[idx2], dis.opname[idx1]

But this is only a representation: it is useful to see what we have done before actually saving the permutations into a file, but we still need to save them somehow. So we will need another function to do it.

The following function does exactly this: given the opcode string, the input filename, the changes to be made and the output filename, it replaces the opcodes in a really silly way (it could fail, since all bytes of a certain value are changed, including argument bytes, but it worked for this challenge):

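def store_code(code, ifile, changes, ofile):
    # Translate every byte of the code string through the changes dict;
    # bytes without an entry are kept as they are.
    ret = []
    for i in code:
        ret.append(changes.get(ord(i), ord(i)))
    ret_str = "".join(map(chr, ret))
    # Swap the old code string for the new one inside the .pyc (binary mode).
    with open(ifile, "rb") as f:
        nbuf = f.read()
    nbuf = nbuf.replace(code, ret_str)
    with open(ofile, "wb") as f:
        f.write(nbuf)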
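The risk mentioned above comes from translating argument bytes as if they were opcodes. A hedged alternative is to step through the instructions and rewrite only the opcode byte of each one. The sketch below assumes the Python 2.7 encoding (opcodes 90 and above take a two-byte argument) and also assumes that the permutation never moves an opcode across that boundary, which holds for the swaps found in this write-up. The names remap_opcodes and naive_remap are hypothetical.

```python
HAVE_ARGUMENT = 90  # CPython 2.7: opcodes >= 90 carry a 2-byte argument

def remap_opcodes(co_code, changes):
    """Translate opcode bytes only, leaving argument bytes untouched."""
    out = bytearray(co_code)
    i = 0
    while i < len(out):
        op = out[i]
        out[i] = changes.get(op, op)
        # Assumption: old and new opcode agree on whether an argument follows.
        i += 3 if op >= HAVE_ARGUMENT else 1
    return bytes(out)

def naive_remap(co_code, changes):
    """Byte-by-byte translation, as done by store_code."""
    return bytes(bytearray(changes.get(b, b) for b in bytearray(co_code)))

changes = {153: 100, 100: 153}
# <153> with argument 100, followed by RETURN_VALUE (83)
scrambled = bytes(bytearray([153, 100, 0, 83]))

print(list(bytearray(remap_opcodes(scrambled, changes))))  # [100, 100, 0, 83]
print(list(bytearray(naive_remap(scrambled, changes))))    # [100, 153, 0, 83]
```

In the second result the argument 100 was rewritten to 153, so the instruction would now reference a different constant.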

To use the above function properly we need to create a dict with all the permutations performed. Its format is simple: we have to create two entries for each permutation, for example:

>>> changes = {}
>>> changes[100] = 40
>>> changes[40] = 100

For usability, we are going to change the permute function a bit so that it automatically populates the changes variable:

def permute(opname1, opname2):
    global changes
    idx1 = dis.opname.index(opname1)
    idx2 = dis.opname.index(opname2)
    dis.opname[idx1],dis.opname[idx2] = dis.opname[idx2], dis.opname[idx1]

    # Add the changes to the dict
    changes[idx1] = idx2
    changes[idx2] = idx1

It's time to use both functions and see if they do what we expect. Let's make a simple modification, for example swapping the first opcode of the crypt.pyc global scope with the third one, and then store the modified opcodes in a new file.

>>> dis.dis(m)
  1           0 <153>                    0
              3 <153>                    1
              6 MAKE_CLOSURE             0
              9 EXTENDED_ARG             0
 # < .. cut cut cut... >

>>> permute("<153>","MAKE_CLOSURE")
>>> changes
{134: 153, 153: 134}

>>> store_code(m.co_code, "crypt.pyc", changes, "crypt2.pyc")

>>> permute("<153>","MAKE_CLOSURE") # To reverse the permutation of the opname attribute.

>>> m2 = marshal.loads(open("crypt2.pyc").read()[8:])
>>> dis.dis(m2)
  1           0 MAKE_CLOSURE             0
              3 MAKE_CLOSURE             1
              6 <153>                    0
              9 EXTENDED_ARG             0
 # < .. cut cut cut... >
>>> dis.dis(m)
  1           0 <153>                    0
              3 <153>                    1
              6 MAKE_CLOSURE             0
              9 EXTENDED_ARG             0
 # < .. cut cut cut... >

We can see how the permute function updated the changes variable, which store_code then used to modify the opcodes accordingly and save the file under the desired filename.

Challenge

We have just created our tools to solve the challenge. Let's analyze what information we have and see if we can reproduce parts of the code to understand which opcodes we need to permute. Let's recap our current information: variable names, constants and the initial opcode structure:

Attributes


It seems that the names encrypt and decrypt clearly refer to the functions we have in co_consts, but what about rotor? Searching a bit on the Internet, we can see that it is a deprecated Python module which implements Enigma-like rotor encryption.

As stated above, we are going to start by reconstructing the main scope, trying to imitate the original source code, which seems to be easier. Then, based on what we find, we will repeat the process with the decrypt function.

Fixing main

So, based on what we have seen, we are going to write similar source code, see how it is disassembled and compare it with the crypt.pyc one. Our script must include the rotor import and two functions, encrypt and decrypt, because we have already seen those particularities:

import rotor

def encrypt():
    pass

def decrypt():
    pass

Let's see the disassembled code and compare it with the original crypt.pyc code:

test1 and crypt disassemble comparison


Just looking at the output we can see direct matches. Let's look at our correct source code in more detail to understand the structure, disassembling the marshaled object instead of only the opcodes to let the library do the magic:

test1 disassembly with verbose


The output shows us the code divided into blocks: the rotor import, the declaration of encrypt, the declaration of decrypt and finally the return value. All the opcodes in our script should be exactly the same as in crypt.pyc, because those are only declarations. We haven't added anything extra, and the number and structure of the opcodes match.
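When two instruction streams are believed to come from the same source structure, the swaps can be read off by pairing them position by position. The sketch below does that with opcode numbers. The scrambled list is reconstructed from the listings and pairs given in this write-up, the reference list uses the standard Python 2.7 numbers, and infer_swaps is a hypothetical helper that refuses inconsistent pairings.

```python
def infer_swaps(scrambled_ops, reference_ops):
    """Map each scrambled opcode to the reference opcode at the same position."""
    if len(scrambled_ops) != len(reference_ops):
        raise ValueError("streams differ in length, cannot align them")
    mapping = {}
    for bad, good in zip(scrambled_ops, reference_ops):
        if mapping.setdefault(bad, good) != good:
            raise ValueError("opcode %d maps to two different opcodes" % bad)
    if len(set(mapping.values())) != len(mapping):
        raise ValueError("two scrambled opcodes map to the same opcode")
    # Keep only the entries that actually change something.
    return dict((bad, good) for bad, good in mapping.items() if bad != good)

# Module scope of crypt.pyc (scrambled) and of the test script (Python 2.7).
scrambled = [153, 153, 134, 145, 153, 136, 145, 153, 136, 145, 153, 83]
reference = [100, 100, 108, 90, 100, 132, 90, 100, 132, 90, 100, 83]

print(sorted(infer_swaps(scrambled, reference).items()))
```

The result only says how the opcodes seen in this scope translate; opcodes that never appear in the aligned streams remain unknown.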

Right now, we can make the following permutations:

  • LOAD_CONST <-> <153>
  • IMPORT_NAME <-> MAKE_CLOSURE
  • EXTENDED_ARG <-> STORE_NAME
  • MAKE_FUNCTION <-> LOAD_DEREF
We are going to make those changes and store the results using our functions:

>>> permute("LOAD_CONST","<153>")
>>> permute("IMPORT_NAME","MAKE_CLOSURE")
>>> permute("EXTENDED_ARG","STORE_NAME")
>>> permute("MAKE_FUNCTION","LOAD_DEREF")
>>> store_code(m_crypt.co_code, "crypt.pyc", changes, "crypt_test1.pyc")

Since we are directly changing an internal attribute of the module, we have to restore it every time we want to disassemble using the default values. Therefore, we need to keep a copy of the original list, taken before any permutation is applied:

>>> import copy
>>> original_map = copy.deepcopy(dis.opname)
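Keeping a copy and restoring it by hand is easy to forget. One possible alternative, sketched here under the assumption that only the names in dis.opname need to change temporarily, is a context manager that swaps them and always puts the original list back. The name swapped_names is hypothetical.

```python
import contextlib
import dis

@contextlib.contextmanager
def swapped_names(*pairs):
    """Temporarily swap pairs of entries in dis.opname."""
    saved = list(dis.opname)
    try:
        for first, second in pairs:
            i = dis.opname.index(first)
            j = dis.opname.index(second)
            dis.opname[i], dis.opname[j] = dis.opname[j], dis.opname[i]
        yield
    finally:
        dis.opname[:] = saved   # restore the original names in place

before = list(dis.opname)
with swapped_names(("LOAD_CONST", "RETURN_VALUE")):
    inside = list(dis.opname)
after = list(dis.opname)

print(inside[dis.opmap["LOAD_CONST"]])
print(before == after)
```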

To verify the result, we need to load the new crypt_test1.pyc file with modified opcodes and disassemble it:

In [37]: m_crypt_test1 = marshal.loads(open("crypt_test1.pyc").read()[8:])

In [38]: dis.dis(m_crypt_test1)
  1           0 LOAD_CONST               0 (-1)
              3 LOAD_CONST               1 (None)
              6 IMPORT_NAME              0 (rotor)
              9 STORE_NAME               0 (rotor)

  2          12 LOAD_CONST               2 (<code object encrypt at 0x7f2c8188c430, file "/Users/hen/Lab/0CTF/py/crypt.py", line 2>)
             15 MAKE_FUNCTION            0
             18 STORE_NAME               1 (encrypt)

 10          21 LOAD_CONST               3 (<code object decrypt at 0x7f2c8188c830, file "/Users/hen/Lab/0CTF/py/crypt.py", line 10>)
             24 MAKE_FUNCTION            0
             27 STORE_NAME               2 (decrypt)
             30 LOAD_CONST               1 (None)
             33 RETURN_VALUE

In [39]:

Finally we are able to see something that actually makes sense ;) We just started.

Fixing decrypt

Let's repeat the process with the decrypt function now, which is much bigger.

>>> decrypt_func = m_crypt_test1.co_consts[3]
>>> dis.dis(decrypt_func)
 11           0 <153>                    1
              3 BUILD_SET                1

 12           6 <153>                    2
              9 BUILD_SET                2

 13          12 <153>                    3
             15 BUILD_SET                3

 14          18 STORE_GLOBAL             1 (newrotor)
             21 <153>                    4
             24 PRINT_EXPR
             25 <153>                    5
             28 <39>
             29 STORE_GLOBAL             2 (decrypt)
             32 STORE_GLOBAL             1 (newrotor)
             35 <39>
             36 STORE_GLOBAL             3---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
 # < .. cut cut cut... >

As we can see, none of the previous permutations has been applied to those opcodes yet. We have to repeat the operations on those opcodes first.

 .. same permutations as before ...
In [34]: store_code(decrypt_func.co_code, "crypt_test1.pyc", changes, "crypt_test2.pyc")
In [35]: mtest2 = marshal.loads(open("crypt_test2.pyc").read()[8:])
In [36]: dis.opname = copy.deepcopy(original_map)
In [37]: dis.dis(mtest2.co_consts[3]) # This is the decrypt function
 11           0 LOAD_CONST               1 ('!@#$%^&*')
              3 BUILD_SET                1

 12           6 LOAD_CONST               2 ('abcdefgh')
              9 BUILD_SET                2

 13          12 LOAD_CONST               3 ('<>{}:"')
             15 BUILD_SET                3

 14          18 STORE_GLOBAL             1 (newrotor)
             21 LOAD_CONST               4 (4)
             24 PRINT_EXPR
             25 LOAD_CONST               5 ('|')
             28 <39>
             29 STORE_GLOBAL             2 (decrypt)
             32 STORE_GLOBAL             1 (newrotor)
             35 <39>
             36 STORE_GLOBAL             3---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
 # < .. cut cut cut... >

Recalling our first test script, used to understand how the opcodes work, there is an interesting opcode that we can permute with BUILD_SET: remember that within a function STORE_NAME is not used; STORE_FAST is used instead for local variables. Therefore, we can guess a new permutation here:

In [17]: decrypt_func = mtest2.co_consts[3]

In [18]: permute("BUILD_SET","STORE_FAST")

In [19]: store_code(decrypt_func.co_code, "crypt_test2.pyc", changes, "crypt_test3.pyc")

Let's follow the same process as before to figure out the structure of the original source code and try to reproduce something similar. First of all, let's see what we have now in this function: disassembled opcodes, all the variables and constants:

In [23]: dis.opname = copy.deepcopy(original_map)

In [24]: mtest3 = marshal.loads(open("crypt_test3.pyc").read()[8:])

In [25]: decrypt_func = mtest3.co_consts[3]

In [26]: dis.dis(decrypt_func)
 11           0 LOAD_CONST               1 ('!@#$%^&*')
              3 STORE_FAST               1 (key_a)

 12           6 LOAD_CONST               2 ('abcdefgh')
              9 STORE_FAST               2 (key_b)

 13          12 LOAD_CONST               3 ('<>{}:"')
             15 STORE_FAST               3 (key_c)

 14          18 STORE_GLOBAL             1 (newrotor)
             21 LOAD_CONST               4 (4)
             24 PRINT_EXPR
             25 LOAD_CONST               5 ('|')
             28 <39>
             29 STORE_GLOBAL             2 (decrypt)
             32 STORE_GLOBAL             1 (newrotor)
             35 <39>
             36 STORE_GLOBAL             3---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-26-769498bb44bf> in <module>()
----> 1 dis.dis(decrypt_func)

/usr/lib/python2.7/dis.pyc in dis(x)
     41                 print
     42     elif hasattr(x, 'co_code'):
---> 43         disassemble(x)
     44     elif isinstance(x, str):
     45         disassemble_string(x)

/usr/lib/python2.7/dis.pyc in disassemble(co, lasti)
     95                 print '(' + repr(co.co_consts[oparg]) + ')',
     96             elif op in hasname:
---> 97                 print '(' + co.co_names[oparg] + ')',
     98             elif op in hasjrel:
     99                 print '(to ' + repr(i + oparg) + ')',

IndexError: tuple index out of range

In [27]: decrypt_func.co_names
Out[27]: ('rotor', 'newrotor', 'decrypt')

In [28]: decrypt_func.co_varnames
Out[28]: ('data', 'key_a', 'key_b', 'key_c', 'secret', 'rot')

In [29]: decrypt_func.co_consts
Out[29]: (None, '!@#$%^&*', 'abcdefgh', '<>{}:"', 4, '|', 2, 'EOF')

In [30]:

With our last permutation, the beginning of the function makes more sense. We can see how some local variables are defined, and the output seems to be correct, at least until offset 18. Now that we have those first opcodes, let's write a simple Python script that uses rotor and its newrotor method, and analyze what dis gives us. We have to consider all the information we have and try to figure out the code.

We may have to make some assumptions here, for example:

  • secret may be the actual key when the rotor is created, and according to what we see, maybe it's derived from the other variables, something like:

    secret = key_a + key_b + key_c
    

  • rot seems to be the name of the variable used to store the rotor object:

    rot = rotor.newrotor(secret)
    

  • decrypt name indicates that the decrypt method of the rotor object is used in some way. Also, the data varname is probably the parameter used to receive the input data to decrypt.

    rot.decrypt(data)
    
We are going to use the following code, saved as newrotor.py.

#!/usr/bin/python
import rotor

def decrypt(data):
    key_a = '!@#$%^&*'
    key_b = 'abcdefgh'
    key_c = '<>{}:"'
    secret = key_a + key_b + key_c
    rot = rotor.newrotor(secret)
    rot.decrypt(data)

Something must be wrong for sure, because there are still some constants not used by our script. Let's see the disassembled output. We should be able to understand a bit better how newrotor is called and how the secret key is prepared:

In [55]: import newrotor

In [56]: m_newrotor = marshal.loads(open("newrotor.pyc").read()[8:])

In [57]: dis.opname = copy.deepcopy(original_map)

In [58]: decr_nrotor = m_newrotor.co_consts[2]

In [59]: dis.dis(decr_nrotor)
  5           0 LOAD_CONST               1 ('!@#$%^&*')
              3 STORE_FAST               1 (key_a)

  6           6 LOAD_CONST               2 ('abcdefgh')
              9 STORE_FAST               2 (key_b)

  7          12 LOAD_CONST               3 ('<>{}:"')
             15 STORE_FAST               3 (key_c)

  8          18 LOAD_FAST                1 (key_a)
             21 LOAD_FAST                2 (key_b)
             24 BINARY_ADD
             25 LOAD_FAST                3 (key_c)
             28 BINARY_ADD
             29 STORE_FAST               4 (secret)

  9          32 LOAD_GLOBAL              0 (rotor)
             35 LOAD_ATTR                1 (newrotor)
             38 LOAD_FAST                4 (secret)
             41 CALL_FUNCTION            1
             44 STORE_FAST               5 (rot)

 10          47 LOAD_FAST                5 (rot)
             50 LOAD_ATTR                2 (decrypt)
             53 LOAD_FAST                0 (data)
             56 CALL_FUNCTION            1
             59 POP_TOP
             60 LOAD_CONST               0 (None)
             63 RETURN_VALUE

In [60]:

Nice, it seems similar... kind of. Looking at how it handles the string concatenation we can see an interesting fact: it uses LOAD_FAST twice in a row and then BINARY_ADD.

Remember that we have a similar situation in our crypt.pyc, only in one place:

 14          18 STORE_GLOBAL             1 (newrotor)
             21 LOAD_CONST               4 (4)
             24 PRINT_EXPR
             25 LOAD_CONST               5 ('|')
             28 <39>
             29 STORE_GLOBAL             2 (decrypt)
             32 STORE_GLOBAL             1 (newrotor)
             35 <39>
             36 STORE_GLOBAL             3
      

Since it's the only place where we have seen the same opcode twice in a row, we can guess that it may be the same situation. Therefore, we can make the following permutations:

  • LOAD_FAST <-> STORE_GLOBAL
  • <39> <-> BINARY_ADD
Let's try, and then we will see if we are right or not:

In [31]: permute("LOAD_FAST","STORE_GLOBAL")

In [32]: permute("<39>","BINARY_ADD")

In [33]: store_code(decrypt_func.co_code, "crypt_test3.pyc", changes, "crypt_test4.pyc")

In [34]: mtest4 = marshal.loads(open("crypt_test4.pyc").read()[8:])

In [35]: decrypt_func = mtest4.co_consts[3]

In [36]: dis.dis(decrypt_func)
 11           0 LOAD_CONST               1 ('!@#$%^&*')
              3 STORE_FAST               1 (key_a)

 12           6 LOAD_CONST               2 ('abcdefgh')
              9 STORE_FAST               2 (key_b)

 13          12 LOAD_CONST               3 ('<>{}:"')
             15 STORE_FAST               3 (key_c)

 14          18 LOAD_FAST                1 (key_a)
             21 LOAD_CONST               4 (4)
             24 PRINT_EXPR
             25 LOAD_CONST               5 ('|')
             28 BINARY_ADD
             29 LOAD_FAST                2 (key_b)
             32 LOAD_FAST                1 (key_a)
             35 BINARY_ADD
             36 LOAD_FAST                3 (key_c)
             39 BINARY_ADD
             40 LOAD_CONST               6 (2)
             43 PRINT_EXPR
             44 BINARY_ADD
             45 LOAD_CONST               5 ('|')
             48 BINARY_ADD
             49 LOAD_FAST                2 (key_b)
             52 LOAD_CONST               6 (2)
             55 PRINT_EXPR
             56 BINARY_ADD
             57 LOAD_CONST               7 ('EOF')
             60 BINARY_ADD
             61 STORE_FAST               4 (secret)

 15          64 <155>                    0
             67 DELETE_ATTR              1 (newrotor)
             70 LOAD_FAST                4 (secret)
             73 CALL_FUNCTION            1
             76 STORE_FAST               5 (rot)

 16          79 LOAD_FAST                5 (rot)
             82 DELETE_ATTR              2 (decrypt)
             85 LOAD_FAST                0 (data)
             88 CALL_FUNCTION            1
             91 RETURN_VALUE

In [37]:
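The two swaps applied in this session come down to rewriting opcode bytes while leaving argument bytes alone. Here is a minimal sketch of that idea. It is not the permute/store_code helpers used in the session: swap_opcodes is a hypothetical name, the opcode numbers are those of a stock Python 2.7 (hard-coded so the sketch does not depend on the interpreter running it), and it assumes that both members of a swap either take an argument or do not.

```python
# Stock Python 2.7 opcode numbers; 39 has no name there and shows up as <39>.
HAVE_ARGUMENT = 90
BINARY_ADD, PRINT_EXPR, LOAD_CONST = 23, 70, 100
STORE_GLOBAL, LOAD_FAST = 97, 124


def swap_opcodes(code, pairs):
    """Return a copy of `code` with every (a, b) opcode pair exchanged."""
    table = {}
    for a, b in pairs:
        # Assumption: a swap never changes whether an opcode takes an argument,
        # otherwise the instruction boundaries below would be wrong.
        if (a >= HAVE_ARGUMENT) != (b >= HAVE_ARGUMENT):
            raise ValueError("swap crosses HAVE_ARGUMENT: %d <-> %d" % (a, b))
        table[a], table[b] = b, a

    out = bytearray(code)
    i = 0
    while i < len(out):
        op = out[i]
        out[i] = table.get(op, op)
        # Python 2.7 layout: opcodes >= 90 carry a 2-byte argument, which we skip
        # so that an argument byte is never mistaken for an opcode.
        i += 3 if op >= HAVE_ARGUMENT else 1
    return bytes(out)


# Start of line 14 as the disassembler showed it before the swap:
# STORE_GLOBAL 1, LOAD_CONST 4, PRINT_EXPR, LOAD_CONST 5, <39>
scrambled = bytes(bytearray([STORE_GLOBAL, 1, 0, LOAD_CONST, 4, 0,
                             PRINT_EXPR, LOAD_CONST, 5, 0, 39]))
pairs = [(LOAD_FAST, STORE_GLOBAL), (39, BINARY_ADD)]
fixed = swap_opcodes(scrambled, pairs)
```

Because every swap is symmetric, running swap_opcodes a second time with the same pairs gives back the original bytes, which is handy for undoing a wrong guess.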

We are getting closer :) The code is now fully disassembled and split into blocks by source line, and its structure looks right: the first three blocks are plain local assignments, the large block for line 14 builds the secret string, the next one is the actual call to rotor.newrotor(secret), and the last one calls rot.decrypt(data) and returns the value.

We are going to make the permutations based on what we saw in our test script newrotor.py. Specifically, recall this piece of its disassembly:

  9          32 LOAD_GLOBAL              0 (rotor)
             35 LOAD_ATTR                1 (newrotor)
             38 LOAD_FAST                4 (secret)
             41 CALL_FUNCTION            1
             44 STORE_FAST               5 (rot)

Looking only at the block where the newrotor method is called, and comparing it with the crypt one, we can see the following matches to make the proper permutations:

  • DELETE_ATTR
  • LOAD_ATTR

  • LOAD_GLOBAL
  • <155>
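This comparison can also be done mechanically: line up the scrambled block with the reference block and collect the positions where the opcode names differ. The sketch below does that for the two five-instruction blocks shown above. infer_swaps is a hypothetical helper written for illustration, and it assumes both blocks have the same number of instructions in the same order.

```python
def infer_swaps(scrambled, reference):
    """Map each opcode name seen in `scrambled` to the name expected at the
    same position in `reference`, for the positions where they differ."""
    if len(scrambled) != len(reference):
        raise ValueError("blocks must have the same number of instructions")
    swaps = {}
    for seen, expected in zip(scrambled, reference):
        if seen == expected:
            continue
        # The same scrambled opcode must always map to the same real opcode.
        if swaps.setdefault(seen, expected) != expected:
            raise ValueError("inconsistent mapping for %s" % seen)
    return swaps


# Line 15 of crypt.pyc as currently disassembled.
crypt_block = ["<155>", "DELETE_ATTR", "LOAD_FAST", "CALL_FUNCTION", "STORE_FAST"]
# Line 9 of our test script newrotor.py.
reference_block = ["LOAD_GLOBAL", "LOAD_ATTR", "LOAD_FAST", "CALL_FUNCTION", "STORE_FAST"]

swaps = infer_swaps(crypt_block, reference_block)
print(swaps)
```

The result pairs <155> with LOAD_GLOBAL and DELETE_ATTR with LOAD_ATTR, the same two permutations listed above.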
Even with those permutations, one opcode is still wrong: PRINT_EXPR. Looking at its position, we can see that it always follows an integer constant. Since the whole block concatenates variables and constants to build the secret value, this opcode must stand for an operation related to concatenation.
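One way to narrow down the candidates is to try each binary operator on a string and an integer and keep the ones that do not raise. The following is a small sketch of such a check. The operand values come from the disassembly (key_a and the constant 4), while the list of operators tried is our own choice.

```python
import operator

key_a = '!@#$%^&*'   # string operand, as seen in the disassembly
count = 4            # integer constant that precedes the unknown opcode

candidates = [
    ('+', operator.add), ('-', operator.sub), ('*', operator.mul),
    ('/', operator.truediv), ('//', operator.floordiv),
    ('%', operator.mod), ('**', operator.pow),
    ('<<', operator.lshift), ('>>', operator.rshift),
    ('&', operator.and_), ('|', operator.or_), ('^', operator.xor),
]

accepted = []
for symbol, func in candidates:
    try:
        func(key_a, count)
    except (TypeError, ValueError):
        # Unsupported operand types, or a failed %-format on the string.
        continue
    accepted.append(symbol)

print(accepted)  # ['*']
```

The % operator is rejected too: on a string it means formatting, and key_a is not a valid format string for a single integer.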

After some checks, we can see that the only operator Python accepts between a string and an integer in this context is *. Now we need to know which opcode that operator compiles to, so let's make a simple modification to our script newrotor.py:

#!/usr/bin/python
import rotor

def decrypt(data):
    key_a = '!@#$%^&*'
    key_b = 'abcdefgh'
    key_c = '<>{}:"'
    secret = key_a + key_b + key_c * 4
    rot = rotor.newrotor(secret)
    rot.decrypt(data)

And now, we can disassemble it again:

In [52]: reload(newrotor)
Out[52]: <module 'newrotor' from 'newrotor.pyc'>

In [53]: m_newrotor = marshal.loads(open("newrotor.pyc").read()[8:])

In [54]: dis.dis(m_newrotor.co_consts[2])
# < .. cut cut cut... >
8          18 LOAD_FAST                1 (key_a)
           21 LOAD_FAST                2 (key_b)
           24 BINARY_ADD
           25 LOAD_FAST                3 (key_c)
           28 LOAD_CONST               4 (4)
           31 BINARY_MULTIPLY
           32 BINARY_ADD
           33 STORE_FAST               4 (secret)
# < .. cut cut cut... >
In [55]:
      

And here we have the answer and the last permutation:

  • BINARY_MULTIPLY
  • PRINT_EXPR
Finally, let's perform all the changes, store the result and see if everything is ok.

In [57]: permute("BINARY_MULTIPLY","PRINT_EXPR")

In [58]: permute("DELETE_ATTR","LOAD_ATTR")

In [59]: permute("LOAD_GLOBAL","<155>")

In [60]: store_code(decrypt_func.co_code, "crypt_test4.pyc", changes, "crypt_test5.pyc")

In [61]: mtest5 = marshal.loads(open("crypt_test5.pyc").read()[8:])

In [62]: dis.opname = copy.deepcopy(original_map)

In [64]: dis.dis(mtest5.co_consts[3])
 11           0 LOAD_CONST               1 ('!@#$%^&*')
              3 STORE_FAST               1 (key_a)

 12           6 LOAD_CONST               2 ('abcdefgh')
              9 STORE_FAST               2 (key_b)

 13          12 LOAD_CONST               3 ('<>{}:"')
             15 STORE_FAST               3 (key_c)

 14          18 LOAD_FAST                1 (key_a)
             21 LOAD_CONST               4 (4)
             24 BINARY_MULTIPLY
             25 LOAD_CONST               5 ('|')
             28 BINARY_ADD
             29 LOAD_FAST                2 (key_b)
             32 LOAD_FAST                1 (key_a)
             35 BINARY_ADD
             36 LOAD_FAST                3 (key_c)
             39 BINARY_ADD
             40 LOAD_CONST               6 (2)
             43 BINARY_MULTIPLY
             44 BINARY_ADD
             45 LOAD_CONST               5 ('|')
             48 BINARY_ADD
             49 LOAD_FAST                2 (key_b)
             52 LOAD_CONST               6 (2)
             55 BINARY_MULTIPLY
             56 BINARY_ADD
             57 LOAD_CONST               7 ('EOF')
             60 BINARY_ADD
             61 STORE_FAST               4 (secret)

 15          64 LOAD_GLOBAL              0 (rotor)
             67 LOAD_ATTR                1 (newrotor)
             70 LOAD_FAST                4 (secret)
             73 CALL_FUNCTION            1
             76 STORE_FAST               5 (rot)

 16          79 LOAD_FAST                5 (rot)
             82 LOAD_ATTR                2 (decrypt)
             85 LOAD_FAST                0 (data)
             88 CALL_FUNCTION            1
             91 RETURN_VALUE

In [65]:
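Reading the repaired bytecode as a stack machine, line 14 can be written back as a single expression. The sketch below is our reconstruction of that source, not the original crypt.py. build_secret is a hypothetical name, and the rotor calls are left as comments because the rotor module may not be available in the interpreter running the sketch.

```python
def build_secret():
    """Rebuild the secret string exactly as lines 11 to 14 of the bytecode do."""
    key_a = '!@#$%^&*'
    key_b = 'abcdefgh'
    key_c = '<>{}:"'
    # Offsets 18-60: key_a*4, '|', (key_b+key_a+key_c)*2, '|', key_b*2, 'EOF'
    return key_a * 4 + '|' + (key_b + key_a + key_c) * 2 + '|' + key_b * 2 + 'EOF'


secret = build_secret()
print(len(secret))  # 97

# Lines 15 and 16 of the bytecode then do the following:
#   rot = rotor.newrotor(secret)
#   return rot.decrypt(data)
```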

It seems that the whole thing makes sense now. Let's do the final test:

In [86]: import crypt_test5

In [87]: crypt_test5.decrypt(open("encrypted_flag").read())
Out[87]: 'flag{Gue55_opcode_G@@@me}'

In [88]:

Yay! We have the flag! ;)


EOF