A Deep Dive into the Quarkslab 2014 Security Challenge: Python Virtual Machine Reversing and Code Obfuscation

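This article walks through the whole Quarkslab 2014 security challenge: how reverse engineering and code analysis crack a heavily obfuscated Python snippet, and how debugging a custom Python virtual machine yields the hidden URL and the final flag.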

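Finding the URL of the challenge

One line of code, countless lambdas, endless pain

The first part of the challenge is to retrieve a URL hidden in the following Python one-liner: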

(lambda g, c, d: (lambda _: (_.__setitem__('$', ''.join([(_['chr'] if ('chr' in _) else chr)((_['_'] if ('_' in _) else _)) for _['_'] in (_['s'] if ('s' in _) else s)[::(-1)]])), _)[-1])( (lambda _: (lambda f, _: f(f, _))((lambda __,_: ((lambda _: __(__, _))((lambda _: (_.__setitem__('i', ((_['i'] if ('i' in _) else i) + 1)),_)[(-1)])((lambda _: (_.__setitem__('s',((_['s'] if ('s' in _) else s) + [((_['l'] if ('l' in _) else l)[(_['i'] if ('i' in _) else i)] ^ (_['c'] if ('c' in _) else c))])), _)[-1])(_))) if (((_['g'] if ('g' in _) else g) % 4) and ((_['i'] if ('i' in _) else i)< (_['len'] if ('len' in _) else len)((_['l'] if ('l' in _) else l)))) else _)), _) ) ( (lambda _: (_. __setitem__('!', []), _.__setitem__('s', _['!']), _)[(-1)] ) ((lambda _: (_. __setitem__('!', ((_['d'] if ('d' in _) else d) ^ (_['d'] if ('d' in _) else d))), _.__setitem__('i', _['!']), _)[(-1)])((lambda _: (_.__setitem__('!', [ (_['j'] if ('j' in _) else j) for _[ 'i'] in (_['zip'] if ('zip' in _) else zip)((_['l0'] if ('l0' in _) else l0), (_['l1'] if ('l1' in _) else l1)) for _['j'] in (_['i'] if ('i' in _) else i)]), _.__setitem__('l', _['!']), _)[-1 ])((lambda _: (_.__setitem__('!', [1373, 1281, 1288, 1373, 1290, 1294, 1375, 1371,1289, 1281, 1280, 1293, 1289, 1280, 1373, 1294, 1289, 1280, 1372, 1288, 1375,1375, 1289, 1373, 1290, 1281, 1294, 1302, 1372, 1355, 1366, 1372, 1302, 1360, 1368, 1354, 1364, 1370, 1371, 1365, 1362, 1368, 1352, 1374, 1365, 1302 ]), _.__setitem__('l1',_['!']), _)[-1])((lambda _: (_.__setitem__('!',[1375, 1368, 1294, 1293, 1373, 1295, 1290, 1373, 1290, 1293, 1280, 1368, 1368,1294, 1293, 1368, 1372, 1292, 1290, 1291, 1371, 1375, 1280, 1372, 1281, 1293,1373, 1371, 1354, 1370, 1356, 1354, 1355, 1370, 1357, 1357, 1302, 1366, 1303,1368, 1354, 1355, 1356, 1303, 1366, 1371]), _.__setitem__('l0', _['!']), _)[(-1)])            ({ 'g': g, 'c': c, 'd': d, '$': None})))))))['$'])

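This was the first time I had seen obfuscated Python, and it was honestly confusing. With some patient analysis, though, we can get a much better idea of how it works.

Tidying up the last lambda

Looking closely at the snippet, a few things can be observed directly: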

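  • The function takes three arguments, whose values are unknown at this point.
  • The snippet reuses __setitem__ heavily, which probably means:
    • The only standard Python object I know of whose __setitem__ accepts string keys such as '$' is the dictionary.
    • Once one of the __setitem__ calls is understood, all the others follow.
  • The following standard functions are used: chr, len and zip, which means strings, integers and iterables are being manipulated.
  • There are two noticeable operators: modulo and XOR.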
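To make the two recurring idioms concrete, here is a minimal sketch (the names `env`, `result` and `lookup_chr` are mine, not part of the challenge). It shows that `__setitem__` is simply the expression form of `d[key] = value`, which is why it can live inside a lambda, and how the `X['name'] if ('name' in X) else name` pattern falls back to the ordinary name when the dictionary has no such key.

```python
# Illustrative only: `env` stands in for the dictionary the one-liner passes around.
env = {'g': 1}

# __setitem__ is the method behind `env[key] = value`. Being a plain method
# call, it is an expression, so it is allowed inside a lambda; it returns None.
result = env.__setitem__('i', 0)
print(result, env)        # None {'g': 1, 'i': 0}

# The lookup idiom: prefer the value stored in the dictionary, otherwise
# fall back to the ordinary (here: builtin) name.
lookup_chr = env['chr'] if ('chr' in env) else chr
print(lookup_chr(0x41))   # A
```

Since the dictionary never contains keys such as 'chr', 'len' or 'zip', those lookups always resolve to the builtins; the conditional is only noise added by the obfuscation.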

With this information, I started cleaning up the snippet from its last lambda. Here is the result:

tab0 = [
    1375, 1368, 1294, 1293, 1373, 1295, 1290, 1373, 1290, 1293,
    1280, 1368, 1368, 1294, 1293, 1368, 1372, 1292, 1290, 1291,
    1371, 1375, 1280, 1372, 1281, 1293, 1373, 1371, 1354, 1370,
    1356, 1354, 1355, 1370, 1357, 1357, 1302, 1366, 1303, 1368,
    1354, 1355, 1356, 1303, 1366, 1371
]

z = lambda x: (
    x.__setitem__('!', tab0),
    x.__setitem__('l0', x['!']),
    x
)[-1]

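This lambda takes a dictionary x, sets two items, and builds a tuple whose last element is a reference to that dictionary; indexing the tuple with [-1] makes the lambda return the dictionary. It also uses x['!'] as a temporary variable before assigning its value to x['l0']. In short, it takes a dictionary, updates it and hands it back to the caller: a neat trick for passing the same object from lambda to lambda. We can see this directly in Python: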

In [8]: d = {}
In [9]: z(d)
Out[9]:
{'!': [1375, ...], 'l0': [1375, ...]}
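For comparison, here is what the same lambda looks like as an ordinary function. This is only a sketch: the shortened table `tab0_demo` and the name `z_plain` are made up for illustration. It also checks two details that are easy to miss: the dictionary that comes back is the very object that went in, and '!' and 'l0' refer to the same list.

```python
tab0_demo = [1375, 1368, 1294]  # shortened stand-in for tab0

# The cleaned-up lambda, using the shortened table.
z = lambda x: (
    x.__setitem__('!', tab0_demo),
    x.__setitem__('l0', x['!']),
    x
)[-1]

def z_plain(x):
    """Statement-based equivalent of z."""
    x['!'] = tab0_demo
    x['l0'] = x['!']
    return x

d1, d2 = {}, {}
out = z(d1)
assert out is d1               # same object, mutated in place
assert out['!'] is out['l0']   # '!' is only a temporary alias
assert z(d1) == z_plain(d2)    # both versions build the same dictionary
```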

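The lambda is then called with a dictionary that contains, among other things, the three user-controlled variables g, c and d. That dictionary acts as a store that keeps track of every variable used across these lambdas.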

# returns { 'g': g, 'c': c, 'd': d, '$': None, '!': tab0, 'l0': tab0 }
last_res = (
    (
        lambda x: (
            x.__setitem__('!', tab0),
            x.__setitem__('l0', x['!']),
            x
        )[-1]
    )
    ({ 'g': g, 'c': c, 'd': d, '$': None})
)

Then the previous lambda

Repeating the same operation on the lambda just before the last one gives exactly the same pattern:

tab1 = [
    1373, 1281, 1288, 1373, 1290, 1294, 1375, 1371, 1289, 1281,
    1280, 1293, 1289, 1280, 1373, 1294, 1289, 1280, 1372, 1288,
    1375, 1375, 1289, 1373, 1290, 1281, 1294, 1302, 1372, 1355,
    1366, 1372, 1302, 1360, 1368, 1354, 1364, 1370, 1371, 1365,
    1362, 1368, 1352, 1374, 1365, 1302
]

zz = lambda x: (
    x.__setitem__('!', tab1),
    x.__setitem__('l1', x['!']),
    x
)[-1]

Perfect, now let's repeat the same operation over and over. At some point the whole thing becomes clear (sort of):

# returns {
#     'g': g, 'c': c, 'd': d,
#     '!': [],
#     's': [],
#     'l': [j for i in zip(tab0, tab1) for j in i],
#     'l1': tab1,
#     'l0': tab0,
#     'i': 0,
#     'j': 1302,
#     '$': None
# }
res_after_all_operations = (
    (
    lambda x: (
        x.__setitem__('!', []),
        x.__setitem__('s', x['!']),
        x
    )[-1]
    )
    # ..
    (
    (
        lambda x: (
            x.__setitem__('!', ((x['d'] if ('d' in x) else d) ^ (x['d'] if ('d' in x) else d))),
            x.__setitem__('i', x['!']),
            x
        )[-1]
    )
    # ..
    (
        (
        lambda x: (
            x.__setitem__('!', [(x['j'] if ('j' in x) else j) for x[ 'i'] in (x['zip'] if ('zip' in x) else zip)((x['l0'] if ('l0' in x) else l0), (x['l1'] if ('l1' in x) else l1)) for x['j'] in (x['i'] if ('i' in x) else i)]),
            x.__setitem__('l', x['!']),
            x
        )[-1]
        )
        # returns { 'g': g, 'c': c, 'd': d, '!': tab1, 'l1': tab1, 'l0': tab0, '$': None }
        (
        (
            lambda x: (
                x.__setitem__('!', tab1),
                x.__setitem__('l1', x['!']),
                x
            )[-1]
        )
        # returns { 'g': g, 'c': c, 'd': d, '!': tab0, 'l0': tab0, '$': None }
        (
            (
            lambda x: (
                x.__setitem__('!', tab0),
                x.__setitem__('l0', x['!']),
                x
            )[-1]
            )
            ({ 'g': g, 'c': c, 'd': d, '$': None})
        )
        )
    )
    )
)
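The least obvious step in the listing above is the comprehension that builds 'l', because its loop variables are dictionary items rather than plain names. A small sketch with made-up tables (the values below are illustrative, not the challenge data) shows that it interleaves the two tables, and why 'j' is left holding the last element (1302 with the real data) while 'i' holds the last pair until the next lambda resets it to 0:

```python
env = {'l0': [10, 30, 50], 'l1': [20, 40, 60]}  # made-up tables

# Any assignment target is allowed after `for`, including env['i'].
env['l'] = [env['j']
            for env['i'] in zip(env['l0'], env['l1'])
            for env['j'] in env['i']]

print(env['l'])   # [10, 20, 30, 40, 50, 60]
print(env['i'])   # (50, 60)
print(env['j'])   # 60
```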

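Putting it all together

After all that, we now know the types of the three variables the function needs to work properly (honestly, we don't need much more):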

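  • g is an integer that is taken modulo 4; if it is divisible by 4, the decoding loop never runs and the function returns an empty string, so we need to set it to something like 1.
  • c is another integer that looks like an XOR key; in the snippet it is XORed with every element of x['l'] (the table that interleaves tab0 and tab1). This is the interesting parameter.
  • d is another integer that we can ignore as well: it is only used to set x['i'] to zero by XORing x['d'] with itself.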
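Taken together, these three observations reduce the whole one-liner to a short loop. The function below is my own plain-Python reading of it, not code from the challenge: the name `decode` and the explicit `l0`/`l1` parameters are assumptions for illustration (the original hard-codes the tables), and it is exercised on made-up tables rather than the challenge data.

```python
def decode(g, c, d, l0, l1):
    """Plain-Python reading of the obfuscated one-liner (a sketch)."""
    l = [j for pair in zip(l0, l1) for j in pair]  # interleave the tables
    i = d ^ d                                      # always 0, whatever d is
    s = []
    # The self-applied recursive lambda boils down to this loop.
    while (g % 4) and i < len(l):
        s = s + [l[i] ^ c]
        i += 1
    # '$' receives the XORed values, reversed and turned into characters.
    return ''.join(chr(v) for v in s[::-1])

# Made-up tables that hide "abcd" under the key c = 1.
print(decode(1, 1, 0, [101, 99], [98, 96]))        # abcd
print(repr(decode(4, 1, 0, [101, 99], [98, 96])))  # '' (g is divisible by 4)
```

Only c influences the decoded text, which is why it is the single value worth searching for.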

We really don't need anything else now: no more lambdas, no more pain, no more tears. Time to write what I call an "educated brute forcer" to find the right value of c:

import sys

def main(argc, argv):
    tab0 = [1375, 1368, 1294, 1293, 1373, 1295, 1290, 1373, 1290, 1293, 1280, 1368, 1368,1294, 1293, 1368, 1372, 1292, 1290, 1291, 1371, 1375, 1280, 1372, 1281, 1293,1373, 1371, 1354, 1370, 1356, 1354, 1355, 1370, 1357, 1357, 1302, 1366, 1303,1368, 1354, 1355, 1356, 1303, 1366, 1371]
    tab1 = [1373, 1281, 1288, 1373, 1290, 1294, 1375, 1371,1289, 1281, 1280, 1293, 1289, 1280, 1373, 1294, 1289, 1280, 1372, 1288, 1375,1375, 1289, 1373, 1290, 1281, 1294, 1302, 1372, 1355, 1366, 1372, 1302, 1360, 1368, 1354, 1364, 1370, 1371, 1365, 1362, 1368, 1352, 1374, 1365, 1302]

    func = (
        lambda g, c, d: 
        (
            lambda x: (
                x.__setitem__('$', ''.join([(x['chr'] if ('chr' in x) else chr)((x['_'] if ('_' in x) else x)) for x['_'] in (x['s'] if ('s' in x) else s)[::-1]])),
                x
            )[-1]
        )
        (
            (
                lambda x: 
                    (lambda f, x: f(f, x))
                (
                    (
                        lambda __, x: 
                        (
                            (lambda x: __(__, x))
                            (
                                # i += 1
                                (
                                    lambda x: (
                                        x.__setitem__('i', ((x['i'] if ('i' in x) else i) + 1)),
                                        x
                                    )[-1]
                                )
                                (
                                    # s += [c ^ l[i]]
                                    (
                                        lambda x: (
                                            x.__setitem__('s', (
                                                    (x['s'] if ('s' in x) else s) +
                                                    [((x['l'] if ('l' in x) else l)[(x['i'] if ('i' in x) else i)] ^ (x['c'] if ('c' in x) else c))]
                                                )
                                            ),
                                            x
                                        )[-1]
                                    )
                                    (x)
                                )
                            )
                            # if ((x['g'] % 4) and (x['i'] < len(l))) else x
                            if (((x['g'] if ('g' in x) else g) % 4) and ((x['i'] if ('i' in x) else i)< (x['len'] if ('len' in x) else len)((x['l'] if ('l' in x) else l))))
                            else x
                        )
                    ),
                    x
                )
            )
            # returns { 'g': g, 'c': c, 'd': d, '$': None, 'l': [j for i in zip(tab0, tab1) for j in i], 'l1': tab1, 'l0': tab0, 'i': 0, 'j': 1302, '!': [], 's': [] }
            (
                (
                    lambda x: (
                        x.__setitem__('!', []),
                        x.__setitem__('s', x['!']),
                        x
                    )[-1]
                )
                # returns { 'g': g, 'c': c, 'd': d, '$': None, 'l': [j for i in zip(tab0, tab1) for j in i], 'l1': tab1, 'l0': tab0, 'i': 0, 'j': 1302, '!': 0 }
                (
                    (
                        lambda x: (
                            x.__setitem__('!', ((x['d'] if ('d' in x) else d) ^ (x['d'] if ('d' in x) else d))),
                            x.__setitem__('i', x['!']),
                            x
                        )[-1]
                    )
                    # returns { 'g': g, 'c': c, 'd': d, '$': None, '!': <same list as 'l'>, 'l': [j for i in zip(tab0, tab1) for j in i], 'l1': tab1, 'l0': tab0, 'i': (1371, 1302), 'j': 1302 }
                    (
                        (
                            lambda x: (
                                x.__setitem__('!', [(x['j'] if ('j' in x) else j) for x[ 'i'] in (x['zip'] if ('zip' in x) else zip)((x['l0'] if ('l0' in x) else l0), (x['l1'] if ('l1' in x) else
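
Stripped of the lambdas, the search itself is small. The sketch below is not the author's brute forcer: it reuses the two tables from the challenge, but the helper names (`decode`, `candidates`, `URL_CHARS`), the 16-bit search range and the filter (keep a key only if every decoded character looks like part of a lowercase URL) are assumptions of mine.

```python
import string

TAB0 = [
    1375, 1368, 1294, 1293, 1373, 1295, 1290, 1373, 1290, 1293,
    1280, 1368, 1368, 1294, 1293, 1368, 1372, 1292, 1290, 1291,
    1371, 1375, 1280, 1372, 1281, 1293, 1373, 1371, 1354, 1370,
    1356, 1354, 1355, 1370, 1357, 1357, 1302, 1366, 1303, 1368,
    1354, 1355, 1356, 1303, 1366, 1371,
]
TAB1 = [
    1373, 1281, 1288, 1373, 1290, 1294, 1375, 1371, 1289, 1281,
    1280, 1293, 1289, 1280, 1373, 1294, 1289, 1280, 1372, 1288,
    1375, 1375, 1289, 1373, 1290, 1281, 1294, 1302, 1372, 1355,
    1366, 1372, 1302, 1360, 1368, 1354, 1364, 1370, 1371, 1365,
    1362, 1368, 1352, 1374, 1365, 1302,
]

# Interleave the tables the same way the one-liner builds 'l'.
L = [j for pair in zip(TAB0, TAB1) for j in pair]

# Assumed filter: characters one would expect in a lowercase URL.
URL_CHARS = set(string.ascii_lowercase + string.digits + '/.-_:')

def decode(c):
    """XOR every element with c, reverse, and convert to characters."""
    return ''.join(chr(v ^ c) for v in reversed(L))

def candidates(limit=0x10000):
    """Yield (c, text) for every key below limit whose output passes the filter."""
    for c in range(limit):
        text = decode(c)
        if all(ch in URL_CHARS for ch in text):
            yield c, text

if __name__ == '__main__':
    for c, text in candidates():
        print(c, text)
```

Because the filter is only a heuristic, more than one key may survive it; the right one is the key whose output reads as a sensible URL.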