Il2cpp Reverse Engineering: Decrypting global-metadata


Preface

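  There is plenty of material about Il2cpp online. In short, Il2cpp is the build backend Unity uses in place of the original Mono virtual machine: the code is first compiled to IL (intermediate language) and then converted to C++ source, which improves runtime performance and also makes the game harder to reverse. Mono builds are extremely easy to reverse, so nearly all new games on the market are now built with Il2cpp. There are plenty of Il2cpp reversing tutorials too, but they all follow the same recipe: run Il2CppDumper, the well-known tool written by a Chinese developer, and call it a day, which teaches you very little. In practice, because the tool is so well known, many game vendors have adopted countermeasures, so following those tutorials fails more often than not. I therefore decided to study Il2cpp attack and defense techniques and found an Il2cpp CTF challenge online to practice on. Challenge source: n1ctf-2018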

baby unity3d

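  The task is clear: enter the correct flag. Since we already know this is a Unity program built with Il2cpp, we go straight for its libil2cpp.so and global-metadata.dat and try to parse them with Il2CppDumper, which fails as expected. The failure must come from these two files: at least one of them has been encrypted so that it no longer parses. This challenge is fairly basic and only encrypts global-metadata.dat. Opening the file in 010 Editor confirms it: a normal global-metadata.dat starts with the four bytes AF 1B B1 FA, and this one does not. Once the file is decrypted, parsing works and recovering the flag becomes straightforward.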
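
The header check described above can also be done without a hex editor. The following is a minimal sketch; the function name has_plain_magic is my own and not part of any tool.

```python
# The four header bytes a normal global-metadata.dat starts with (quoted above).
IL2CPP_METADATA_MAGIC = bytes.fromhex("AF1BB1FA")


def has_plain_magic(data: bytes) -> bool:
    """Return True if the buffer starts with the usual global-metadata.dat magic."""
    return data[:4] == IL2CPP_METADATA_MAGIC


# A header that begins with the expected bytes passes the check.
print(has_plain_magic(bytes.fromhex("AF1BB1FA18000000")))  # True
# Anything else suggests the file was encrypted or otherwise altered.
print(has_plain_magic(bytes.fromhex("0011223344556677")))  # False
```

To test a real file, read its first four bytes in binary mode and pass them to has_plain_magic.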

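  There are two ways to decrypt global-metadata.dat: dump the decrypted result from memory, or analyze the encryption algorithm. For the first approach, here is a frida script: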

function frida_Memory(pattern)
{
Java.perform(function ()
{
    console.log("Header signature: " + pattern);
    var addrArray = Process.enumerateRanges("r--");
    for (var i = 0; i < addrArray.length; i++)
    {
        var addr = addrArray[i];
        Memory.scan(addr.base, addr.size, pattern,
        {
            onMatch: function (address, size)
            {
                console.log('Found ' + pattern + " at address: " + address.toString());
                console.log(hexdump(address,
                    {
                        offset: 0,
                        length: 64,
                        header: true,
                        ansi: true
                    }
                    ));
                // If 0x108 / 0x10C do not work, try 0x100 / 0x104
                // Use NativePointer arithmetic instead of a parseInt round trip
                var DefinitionsOffset_size = address.add(0x108).readInt();
                var DefinitionsCount_size = address.add(0x10C).readInt();

                // The two header fields read above add up to the size of global-metadata.dat
                var global_metadata_size = DefinitionsOffset_size + DefinitionsCount_size;
                console.log("Size:", global_metadata_size);
                // get_self_process_name() is a helper that is not part of this snippet;
                // it is expected to return the package name of the target app
                var file = new File("/data/data/" + get_self_process_name() + "/global-metadata.dat", "wb");
                file.write(address.readByteArray(global_metadata_size));
                file.flush();
                file.close();
                console.log('Dump finished...');
            },
            onComplete: function ()
            {
                //console.log("Scan finished")
            }
        }
        );
    }
}
);
}
// Wrap the call so that setImmediate receives a function to run
setImmediate(function () { frida_Memory("AF 1B B1 FA"); }); // global-metadata.dat header signature

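  The rough flow is: locate the start address of the file in memory via the magic number, compute the file size by parsing the header, then dump. The script only works if the decrypted global-metadata.dat has the normal magic AF 1B B1 FA, otherwise it cannot be located, and if the header fields are correct, otherwise the size cannot be computed. The script is a useful reference, but it does not help with this challenge: it finds no start address, so apparently AF 1B B1 FA is not present in memory even after decryption. This generic dump approach is therefore out. We have to find the function that loads global-metadata.dat and dump the data once decryption has finished, which means analyzing the loading flow.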

global-metadata.dat loading flow

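  The article "IL2CPP Tutorial: Finding loaders for obfuscated global-metadata.dat files" covers the global-metadata.dat loading flow in detail and is well worth reading. In brief: libil2cpp.so contains a function il2cpp_init, which is the first function in the loader call chain. The whole chain looks like this: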

il2cpp_init
  -> il2cpp::vm::Runtime::Init
    -> il2cpp::vm::MetadataCache::Initialize
      -> il2cpp::vm::MetadataLoader::LoadMetadataFile

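  We can locate one of these functions in libil2cpp.so by searching for il2cpp_init or for other keywords from the call chain. The easiest way is to search for the string global-metadata.dat, which leads straight to MetadataCache::Initialize, but that does not work here because the challenge author deliberately encrypted that string. So we search for il2cpp_init instead and follow the source code down to MetadataCache::Initialize.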

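  What if none of the functions above can be found? The article mentioned above covers this case as well: libunity.so resolves the il2cpp_init symbol to obtain its address, so you can start from there; see the article for details. The article also has an example where il2cpp_init was ROT-5 encoded, turning the name into nq2huu_nsny. I found that searching for nq2huu_nsny also gets hits in some of the samples I collected myself, so that string is worth a try too.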
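
To see how il2cpp_init turns into nq2huu_nsny, here is a small sketch of the letter shift. It assumes only letters are rotated while digits and underscores are left alone, which is what the example name suggests; rot_letters is a name I made up.

```python
import string


def rot_letters(name: str, shift: int = 5) -> str:
    """Shift ASCII letters by `shift` positions; leave digits and '_' untouched."""
    out = []
    for ch in name:
        if ch in string.ascii_lowercase:
            out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)


print(rot_letters("il2cpp_init"))  # nq2huu_nsny
```

Running the same function with other shift values gives more candidate strings to search for when the plain symbol name is missing.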

il2cpp_init

int __fastcall il2cpp_init(int a1)
{
  setlocale(6, "");
  return sub_4C4770(a1, "v2.0.50727");
}

sub_4C4770

int __fastcall sub_4C4770(int a1)
{
.......

  v1 = nullsub_3();
  v2 = nullsub_1(v1);
  v3 = sub_514E34(v2);
  dword_695A80 = (int)"2.0";
  v4 = sub_4F8468(v3);
  v5 = sub_5171B8(v4);
  v6 = sub_4B5564(v5);
  v7 = sub_501A60(v6);
  v8 = sub_4FA8B8(v7);
  v9 = sub_4E0D84(v8);
  sub_4D566C(v9);
  memset(&dword_695AB0, 0, 0x13Cu);
  v10 = sub_5017E4("mscorlib.dll");
  dword_695AB0 = il2cpp_assembly_get_image_0(v10);
  dword_695AB4 = ((int (*)(void))il2cpp_class_from_name_0)();
  dword_695ABC = il2cpp_class_from_name_0(dword_695AB0, "System", "Void");
  dword_695AC0 = il2cpp_class_from_name_0(dword_695AB0, "System", "Boolean");
  dword_695AB8 = il2cpp_class_from_name_0(dword_695AB0, "System", "Byte");
  dword_695AC4 = il2cpp_class_from_name_0(dword_695AB0, "System", &unk_5BA5E1);
  dword_695AC8 = il2cpp_class_from_name_0(dword_695AB0, "System", "Int16");
  dword_695ACC = il2cpp_class_from_name_0(dword_695AB0, "System", &unk_5BA5E7);
  dword_695AD0 = il2cpp_class_from_name_0(dword_695AB0, "System", "Int32");

......

Going through them one by one, sub_4B5564 turns out to be MetadataCache::Initialize.

void sub_4B5564()
{
  void *v0; // r4
  int v1; // r4
  unsigned int v2; // r7
  int v3; // r0
  int v4; // lr
  int v5; // r2
  int v6; // r4
  int v7; // r3
  _DWORD *v8; // r1
  int v9; // r6
  int v10; // r0
  unsigned int v11; // r3
  int v12; // r7
  int v13; // r1
  unsigned int v14; // r1
  unsigned int v15; // r9
  int v16; // r6
  unsigned __int16 v17; // r0
  unsigned __int16 *v18; // r6
  int v19; // t1
  _DWORD *v20; // r7
  unsigned __int16 v21; // r4
  int v22; // r2
  int v23; // r1
  int v24; // r0
  int v25; // r6
  int v26; // r7
  int v27; // r1
  int v28; // [sp+8h] [bp-48h]
  unsigned int v29; // [sp+Ch] [bp-44h]
  int v30; // [sp+10h] [bp-40h]
  int v31; // [sp+14h] [bp-3Ch]
  int v32[2]; // [sp+18h] [bp-38h] BYREF
  int v33; // [sp+20h] [bp-30h] BYREF
  int v34; // [sp+24h] [bp-2Ch]
  double v35; // [sp+28h] [bp-28h] BYREF
  int v36; // [sp+30h] [bp-20h]

  v0 = (void *)sub_4B5518("CLKFIL\rMETIDITI\nDIT", 19);
  dword_6959CC = sub_513060();
  free(v0);
  dword_6959D0 = dword_6959CC;
  v28 = dword_6959CC + *(_DWORD *)(dword_6959CC + 184);
  if ( *(_DWORD *)(dword_6959CC + 188) >= 0x44u )
  {
    v1 = dword_6959CC + *(_DWORD *)(dword_6959CC + 184);
    v2 = 0;
    do
    {
      sub_5019F8(v1);
      v1 += 68;
      ++v2;
    }
    while ( v2 < *(_DWORD *)(dword_6959D0 + 188) / 0x44u );
  }
  dword_6959D4 = sub_5169D4(*(_DWORD *)(dword_6959C4 + 24), 4);
  dword_6959D8 = sub_5169D4(*(_DWORD *)(dword_6959D0 + 164) / 0x68u, 4);
  dword_6959DC = sub_5169D4(*(_DWORD *)(dword_6959D0 + 52) / 0x38u, 4);
  dword_6959E0 = sub_5169D4(*(_DWORD *)(dword_6959C4 + 32), 4);
  dword_6959E4 = *(_DWORD *)(dword_6959D0 + 180) / 0x18u;
  v3 = sub_5169D4(dword_6959E4, 28);
  dword_6959E8 = v3;
  if ( dword_6959E4 >= 1 )
  {
    v4 = dword_6959CC;
    v5 = 0;
    v6 = dword_6959D0;
    v7 = 12;
    v8 = (_DWORD *)(*(_DWORD *)(dword_6959D0 + 176) + dword_6959CC + 12);
    while ( 1 )
    {
      v9 = v3 + v7;
      ++v5;
      *(_DWORD *)(v9 - 12) = v4 + *(_DWORD *)(v6 + 24) + *(v8 - 3);
      *(_DWORD *)(v9 - 8) = *(v8 - 2);
      *(_DWORD *)(v9 - 4) = *(v8 - 1);
      *(_DWORD *)(v3 + v7) = *v8;
      *(_DWORD *)(v9 + 4) = v8[1];
      *(_DWORD *)(v9 + 12) = v8[2];
      if ( v5 >= dword_6959E4 )
        break;
      v7 += 28;
      v8 += 6;
      v6 = dword_6959D0;
      v4 = dword_6959CC;
      v3 = dword_6959E8;
    }
  }
  sub_4B5A28();
  v35 = 0.0;
  v36 = 0;
  v10 = dword_6959D0;
  if ( *(_DWORD *)(dword_6959D0 + 188) >= 0x44u )
  {
    v11 = 0;
    v31 = dword_6959CC + *(_DWORD *)(dword_6959D0 + 160);
    do
    {
      v12 = 0;
      v13 = *(_DWORD *)(v28 + 68 * v11);
      if ( v13 != -1 )
        v12 = dword_6959E8 + 28 * v13;
      v30 = v12;
      v14 = *(_DWORD *)(v12 + 12);
      if ( v14 )
      {
        v15 = 0;
        v29 = v11;
        do
        {
          v16 = v31 + 104 * (*(_DWORD *)(v12 + 8) + v15);
          v19 = *(unsigned __int16 *)(v16 + 80);
          v18 = (unsigned __int16 *)(v16 + 80);
          v17 = v19;
          if ( v19 )
          {
            v20 = (_DWORD *)(v31 + 104 * (*(_DWORD *)(v12 + 8) + v15) + 52);
            v21 = 0;
            do
            {
              v22 = *(_DWORD *)(dword_6959D0 + 48);
              v34 = *v20 + v21;
              v23 = *(_DWORD *)(dword_6959CC + v22 + 56 * v34 + 24);
              if ( v23 == -1 )
              {
                v33 = 0;
              }
              else
              {
                v33 = *(_DWORD *)(*(_DWORD *)(dword_6959C0 + 4) + 4 * v23);
                if ( v33 )
                {
                  sub_4B5CFC(&v35, &v33);
                  v17 = *v18;
                }
              }
              ++v21;
            }
            while ( v21 < (unsigned int)v17 );
            v12 = v30;
            v14 = *(_DWORD *)(v30 + 12);
          }
          ++v15;
        }
        while ( v15 < v14 );
        v11 = v29;
        v10 = dword_6959D0;
      }
      ++v11;
    }
    while ( v11 < *(_DWORD *)(v10 + 188) / 0x44u );
  }
  v24 = dword_6959C4;
  if ( *(int *)(dword_6959C4 + 16) >= 1 )
  {
    v25 = 0;
    v26 = 0;
    do
    {
      v27 = *(_DWORD *)(v24 + 20);
      v32[1] = *(_DWORD *)(*(_DWORD *)(v24 + 36) + 12 * *(_DWORD *)(v27 + v25));
      v32[0] = *(_DWORD *)(*(_DWORD *)(dword_6959C0 + 20) + 4 * *(_DWORD *)(v27 + v25 + 4));
      sub_4B5CFC(&v35, v32);
      v24 = dword_6959C4;
      v25 += 12;
      ++v26;
    }
    while ( v26 < *(_DWORD *)(dword_6959C4 + 16) );
  }
  sub_4C70FC(&v35);
  if ( LODWORD(v35) )
    operator delete((void *)LODWORD(v35));
}

Here, the call sub_4B5518("CLKFIL\rMETIDITI\nDIT", 19); is where the encrypted string is decoded into global-metadata.dat.

_BYTE *__fastcall sub_4B5518(char *a1, int a2)
{
  _BYTE *result; // r0
  int v5; // r1
  _BYTE *v6; // r2
  char v7; // t1

  result = malloc(a2 + 1);
  if ( a2 >= 1 )
  {
    v5 = a2;
    v6 = result;
    do
    {
      v7 = *a1++;
      --v5;
      *v6++ = (v7 - 2) ^ 0x26;
    }
    while ( v5 );
  }
  result[a2] = 0;
  return result;
}
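
The decoding loop in sub_4B5518 is easy to check offline. A minimal Python sketch of the same per-byte transform (decode_name is my own name for it):

```python
def decode_name(enc: bytes) -> str:
    """Mirror sub_4B5518: subtract 2 from each byte, then XOR the result with 0x26."""
    return bytes(((b - 2) & 0xFF) ^ 0x26 for b in enc).decode("ascii")


# The 19-byte constant passed to sub_4B5518 in MetadataCache::Initialize.
print(decode_name(b"CLKFIL\rMETIDITI\nDIT"))  # global-metadata.dat
```

This confirms that the string search for global-metadata.dat fails only because the file name is stored in this encoded form.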

And sub_513060 is in fact MetadataLoader::LoadMetadataFile.

int __fastcall sub_513060(const char *a1)
{
  void *v2; // r0
  int v3; // r4
  int v4; // r6
  size_t v5; // r5
  int v6; // r8
  unsigned int *v7; // r2
  int v8; // r1
  void *v9; // r0
  void *v10; // r0
  unsigned int *v12; // r2
  int v13; // r1
  unsigned int *v14; // r2
  int v15; // r1
  int v16; // [sp+Ch] [bp-4Ch] BYREF
  int v17[2]; // [sp+10h] [bp-48h] BYREF
  int v18; // [sp+18h] [bp-40h] BYREF
  int v19[2]; // [sp+1Ch] [bp-3Ch] BYREF
  int v20; // [sp+24h] [bp-34h] BYREF
  int v21; // [sp+28h] [bp-30h] BYREF
  int v22[2]; // [sp+2Ch] [bp-2Ch] BYREF
  int v23[2]; // [sp+34h] [bp-24h] BYREF

  sub_4C5B40(&v20);
  v19[0] = (int)"Metadata";
  v19[1] = 8;
  v22[0] = v20;
  v22[1] = *(_DWORD *)(v20 - 12);
  sub_4C7F74(&v21, v22, v19);
  v2 = (void *)(v20 - 12);
  if ( (_UNKNOWN *)(v20 - 12) != &unk_6A25F4 )
  {
    v7 = (unsigned int *)(v20 - 4);
    __dmb(0xBu);
    do
      v8 = __ldrex(v7);
    while ( __strex(v8 - 1, v7) );
    __dmb(0xBu);
    if ( v8 <= 0 )
      j_operator delete(v2);
  }
  v17[0] = (int)a1;
  v17[1] = strlen(a1);
  v23[0] = v21;
  v23[1] = *(_DWORD *)(v21 - 12);
  sub_4C7F74(&v18, v23, v17);
  v3 = 0;
  v16 = 0;
  v4 = sub_4CDA80(&v18, 3, 1, 1, 0, &v16);
  if ( !v16 )
  {
    v5 = sub_4CDE4C(v4, &v16);
    if ( !v16 )
    {
      v6 = sub_5163A8(v4, 0, 0);
      sub_4CDCF4(v4, &v16);
      if ( v16 )
      {
        v3 = 0;
        sub_516540(v6, 0);
      }
      else
      {
        v3 = sub_512FDC(v6, v5);
      }
    }
  }
  v9 = (void *)(v18 - 12);
  if ( (_UNKNOWN *)(v18 - 12) != &unk_6A25F4 )
  {
    v12 = (unsigned int *)(v18 - 4);
    __dmb(0xBu);
    do
      v13 = __ldrex(v12);
    while ( __strex(v13 - 1, v12) );
    __dmb(0xBu);
    if ( v13 <= 0 )
      j_operator delete(v9);
  }
  v10 = (void *)(v21 - 12);
  if ( (_UNKNOWN *)(v21 - 12) != &unk_6A25F4 )
  {
    v14 = (unsigned int *)(v21 - 4);
    __dmb(0xBu);
    do
      v15 = __ldrex(v14);
    while ( __strex(v15 - 1, v14) );
    __dmb(0xBu);
    if ( v15 <= 0 )
      j_operator delete(v10);
  }
  return v3;
}

Comparing with the source code shows that sub_512FDC is the decryption function.

char *__fastcall sub_512FDC(int a1, size_t size)
{
  char *result; // r0
  size_t v5; // r2

  result = (char *)malloc(size);
  if ( size )
  {
    v5 = 0;
    do
    {
      *(_DWORD *)&result[v5 & 0xFFFFFFFC] = *(_DWORD *)(a1 + (v5 & 0xFFFFFFFC)) ^ dword_5DCF6C[(v5 + v5 / 0x84) % 0x84];
      v5 += 4;
    }
    while ( v5 < size );
  }
  return result;
}
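
One detail in sub_512FDC is easy to misread: v5 is a byte offset that advances by 4, yet it is used directly (plus v5 / 0x84) as the index into dword_5DCF6C. The sketch below only prints the resulting key indices for a few offsets; KEY_LEN and key_index are names I chose.

```python
KEY_LEN = 0x84  # modulus used in sub_512FDC, i.e. the number of key dwords


def key_index(offset: int) -> int:
    """Index into the key table for the dword at byte offset `offset`."""
    return (offset + offset // KEY_LEN) % KEY_LEN


for off in (0, 4, 8, 128, 132, 136):
    print(off, key_index(off))
```

Within the first 0x84 bytes only every fourth key entry is used; after each further 0x84 bytes the selection moves along by one entry.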

Write the decryption script:

import struct

# Read the whole encrypted file into memory
with open('global-metadata.dat', 'rb') as f:
    a = f.read()
key = [0xF83DA249, 0x15D12772, 0x40C50697, 0x984E2B6B, 0x14EC5FF8, 0xB2E24927,
       0x3B8F77AE, 0x472474CD, 0x5B0CE524, 0xA17E1A31, 0x6C60852C, 0xD86AD267, 0x832612B7, 0x1CA03645, 0x5515ABC8,
       0xC5FEFF52, 0xFFFFAC00, 0x0FE95CB6, 0x79CF43DD, 0xAA48A3FB, 0xE1D71788, 0x97663D3A, 0xF5CFFEA7, 0xEE617632,
       0x4B11A7EE, 0x040EF0B5, 0x0606FC00, 0xC1530FAE, 0x7A827441, 0xFCE91D44, 0x8C4CC1B1, 0x7294C28D, 0x8D976162,
       0x8315435A, 0x3917A408, 0xAF7F1327, 0xD4BFAED7, 0x80D0ABFC, 0x63923DC3, 0xB0E6B35A, 0xB815088F, 0x9BACF123,
       0xE32411C3, 0xA026100B, 0xBCF2FF58, 0x641C5CFC, 0xC4A2D7DC, 0x99E05DCA, 0x9DC699F7, 0xB76A8621, 0x8E40E03C,
       0x28F3C2D4, 0x40F91223, 0x67A952E0, 0x505F3621, 0xBAF13D33, 0xA75B61CC, 0xAB6AEF54, 0xC4DFB60D, 0xD29D873A,
       0x57A77146, 0x393F86B8, 0x2A734A54, 0x31A56AF6, 0x0C5D9160, 0xAF83A19A, 0x7FC9B41F, 0xD079EF47, 0xE3295281,
       0x5602E3E5, 0xAB915E69, 0x225A1992, 0xA387F6B2, 0x7E981613, 0xFC6CF59A, 0xD34A7378, 0xB608B7D6, 0xA9EB93D9,
       0x26DDB218, 0x65F33F5F, 0xF9314442, 0x5D5C0599, 0xEA72E774, 0x1605A502, 0xEC6CBC9F, 0x7F8A1BD1, 0x4DD8CF07,
       0x2E6D79E0, 0x6990418F, 0xCF77BAD9, 0xD4FE0147, 0xFEF4A3E8, 0x85C45BDE, 0xB58F8E67, 0xA63EB8D7, 0xC69BD19B,
       0xDA442DCA, 0x3C0C1743, 0xE6F39D49, 0x33568804, 0x85EB6320, 0xDA223445, 0x36C4A941, 0xA9185589, 0x71B22D67,
       0xF59A2647, 0x3C8B583E, 0xD7717DED, 0xDF05699C, 0x4378367D, 0x1C459339, 0x85133B7F, 0x49800CE2, 0x3666CA0D,
       0xAF7AB504, 0x4FF5B8F1, 0xC23772E3, 0x3544F31E, 0x0F673A57, 0xF40600E1, 0x7E967417, 0x15A26203, 0x5F2E34CE,
       0x70C7921A, 0xD1C190DF, 0x5BB5DA6B, 0x60979C75, 0x4EA758A4, 0x078FE359, 0x1664639C, 0xAE14E73B, 0x2070FF03]
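with open('decrypt', 'wb') as fp:
    n = 0
    # Walk the file one little-endian dword at a time, like sub_512FDC
    # (assumes the file length is a multiple of 4)
    while n < len(a):
        num = struct.unpack("<I", a[n:n + 4])[0]
        # Same index as the native code: (offset + offset / 0x84) % 0x84
        num = num ^ key[(n + n // 0x84) % 0x84]
        d = struct.pack('<I', num)
        fp.write(d)
        n = n + 4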

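After decryption Il2CppDumper still fails. Opening the decrypted file in 010 Editor shows that the magic number is wrong; changing it to AF 1B B1 FA fixes that. It turns out the author removed the magic check from the loader, so the file can carry a different magic, which in turn defeats the generic frida dump script mentioned earlier.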
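
Instead of editing the bytes by hand, the magic can be patched with a few lines of Python. This is only a sketch: patch_magic and patch_file are my own helper names, and the output file name in the last comment is an arbitrary choice.

```python
METADATA_MAGIC = bytes.fromhex("AF1BB1FA")


def patch_magic(data: bytes) -> bytes:
    """Return a copy of `data` with its first four bytes replaced by the usual magic."""
    if len(data) < 4:
        raise ValueError("buffer too short to hold a magic number")
    return METADATA_MAGIC + data[4:]


def patch_file(src: str, dst: str) -> None:
    """Read `src`, fix the magic, and write the result to `dst`."""
    with open(src, "rb") as fin:
        fixed = patch_magic(fin.read())
    with open(dst, "wb") as fout:
        fout.write(fixed)


# Example: patch_file("decrypt", "global-metadata.fixed.dat")
```

Only the first four bytes change; the rest of the decrypted file is copied through untouched.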

Solution

  After Il2CppDumper finishes parsing, we find the following functions:

public Void .ctor() { }
// RVA: 0x518834 VA: 0xc575a834
private Void Start() { }
// RVA: 0x518838 VA: 0xc575a838
private Void Update() { }
// RVA: 0x51883c VA: 0xc575a83c
public Void Click() { }
// RVA: 0x518a24 VA: 0xc575aa24
private Boolean CheckFlag(String input) { }
// RVA: 0x518b54 VA: 0xc575ab54
public static String AESEncrypt(String text, String password, String iv) { }
// RVA: 0x518ee4 VA: 0xc575aee4
public static String AESDecrypt(String text, Byte[] password, Byte[] iv) { }
// RVA: 0x5191f0 VA: 0xc575b1f0
private static Void .cctor() { }
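
In the listing above, each // RVA comment belongs to the method on the line after it. The sketch below records those pairs, checks that VA minus RVA is the same for every method, and looks up a method by RVA. The dictionary is typed in by hand from the listing, and name_at is a helper name of my own.

```python
# (RVA, VA) pairs copied from the dump above, keyed by method name.
METHODS = {
    "Start": (0x518834, 0xC575A834),
    "Update": (0x518838, 0xC575A838),
    "Click": (0x51883C, 0xC575A83C),
    "CheckFlag": (0x518A24, 0xC575AA24),
    "AESEncrypt": (0x518B54, 0xC575AB54),
    "AESDecrypt": (0x518EE4, 0xC575AEE4),
    ".cctor": (0x5191F0, 0xC575B1F0),
}

# Every VA is the RVA plus one constant: the address libil2cpp.so was loaded at.
bases = {va - rva for rva, va in METHODS.values()}
print([hex(b) for b in bases])  # ['0xc5242000']


def name_at(rva: int) -> str:
    """Return the method whose RVA equals `rva`, or '?' if there is none."""
    for name, (method_rva, _va) in METHODS.items():
        if method_rva == rva:
            return name
    return "?"


print(name_at(0x518A24))  # CheckFlag
print(name_at(0x518B54))  # AESEncrypt
```

The RVA is the value to jump to in IDA when libil2cpp.so is loaded at base 0, which matches the sub_XXXXXX names in the decompiled listings.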

  The RVA of CheckFlag is 0x518a24. Load libil2cpp.so into IDA, press G to jump to that address, and look at the function:

int __fastcall sub_518A24(int a1, int a2)
{
  int v3; // r0
  int v4; // r4

  if ( !byte_69C825 )
  {
    sub_4B82BC(1279);
    byte_69C825 = 1;
  }
  v3 = dword_698140;
  if ( (*(_BYTE *)(dword_698140 + 178) & 1) != 0 && !*(_DWORD *)(dword_698140 + 96) )
  {
    il2cpp_runtime_class_init_0();
    v3 = dword_698140;
  }
  v4 = sub_518B54(
         *(_DWORD *)(v3 + 80),
         a2,
         *(_DWORD *)(*(_DWORD *)(v3 + 80) + 4000),
         *(_DWORD *)(*(_DWORD *)(v3 + 80) + 2364));
  if ( (*(_BYTE *)(dword_696FB8 + 178) & 1) != 0 && !*(_DWORD *)(dword_696FB8 + 96) )
    il2cpp_runtime_class_init_0();
  return sub_7D644(0, v4, dword_69B7F0, 0);
}

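The sub_518B54 called here is AESEncrypt (RVA 0x518b54 in the dump above). You can restore the function names with the IDA script that ships with Il2CppDumper; I was lazy and skipped that. The logic is roughly: encrypt your input and compare the result with the flag. So we just need to print the AES key and the flag and decrypt it.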

Other approaches

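A few more advanced tools can help with Il2cpp reverse engineering, and they are very convenient for this challenge as well:

  • Zygisk-Il2CppDumper: a Magisk module that dumps function names and offsets at runtime. It used to require setting up an Android development environment yourself; the author has since switched to GitHub Actions, so you can fork the repository, fill in the package name, and build it directly, which is very convenient.
  • frida-il2cpp-bridge: a frida library with a rich feature set. Besides dumping function names and offsets at runtime, it also supports tracing, hooking and more, and is well worth trying.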

Reference

unity3d il2cpp原理解析及逆向分析 (unity3d il2cpp: how it works and how to reverse it)
IL2CPP Tutorial: Finding loaders for obfuscated global-metadata.dat files
Unity之IL2CPP解析 (An analysis of Unity's IL2CPP)
Baby unity3D
Il2cpp source code


Author: 大A
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under CC BY 4.0. Please credit 大A when reposting!