/---------------------------------------------------------------------------------------\
|>............[ A Study of RFG (Return Flow Guard) in Windows 10 x64 ]..................<|
|>............................[ by nEINEI/vxjump.net ]..................................<|
|>............................[ first update: 2017-02-11 ]..............................<|
|>............................[ last update: 2017-03-21 ]...............................<|
\>............................[ neineit_AT_gmail.com ]..................................</

...
::RegisterWinRTObject(wchar_t const *, wchar_t const * *, struct _unnamed_type_RO_REGISTRATION_COOKIE_ * *, unsigned int)
.text:00000001400026C0 xchg ax, ax <******<
.text:00000001400026C2 nop dword ptr [rax+00000000h] <******<
.text:00000001400026C9 sub rsp, 28h
.text:00000001400026CD xor edx, edx
.text:00000001400026CF mov ecx, 80004001h
.text:00000001400026D4 call RoOriginateError_0
.text:00000001400026D9 mov eax, 80004001h
.text:00000001400026DE add rsp, 28h
.text:00000001400026E2 retn
.text:00000001400026E2 ; ---------------------------------------------------------------------------
.text:00000001400026E3 db 0Eh dup(90h) <******<
.text:00000001400026F1 ; ---------------------------------------------------------------------------
.text:00000001400026F1 retn

Once Edge is running, the function prologue and epilogue are dynamically patched with the instructions marked with asterisks below:

MicrosoftEdgeCP!Microsoft::WRL::Module<1,Platform::Details::InProcModule>::RegisterCOMObject:
00007ff7`58e526c0 488b0424 mov rax,qword ptr [rsp] <******<
00007ff7`58e526c4 6448890424 mov qword ptr fs:[rsp],rax <******<
00007ff7`58e526c9 4883ec28 sub rsp,28h
00007ff7`58e526cd 33d2 xor edx,edx
00007ff7`58e526cf b901400080 mov ecx,80004001h
00007ff7`58e526d4 e8ddfbffff call MicrosoftEdgeCP!RoOriginateError (00007ff7`58e522b6)
00007ff7`58e526d9 b801400080 mov eax,80004001h
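00007ff7`58e526de 4883c428 add rsp,28h
00007ff7`58e526e2 644c8b1c24 mov r11,qword ptr fs:[rsp] <******<
00007ff7`58e526e7 4c3b1c24 cmp r11,qword ptr [rsp] <******<
00007ff7`58e526eb 0f85ef350100 jne MicrosoftEdgeCP!_guard_ss_verify_failure (00007ff7`58e65ce0) <******<
00007ff7`58e526f1 c3 ret

// execution arrives here when the RFG check fails
MicrosoftEdgeCP!_guard_ss_verify_failure:
00007ffb`c52b0580 4d33db xor r11,r11
00007ffb`c52b0583 ff25e7f33800 jmp qword ptr [chakra!_guard_ss_verify_failure_fptr (00007ffb`c563f970)]
00007ffb`c52b0589 cc int 3

// execution arrives here when the RFG check fails
ntdll!LdrpHandleInvalidReturnAddress:
00007ffb`e754e8c0 498bc3 mov rax,r11
00007ffb`e754e8c3 4883e007 and rax,7
00007ffb`e754e8c7 85c0 test eax,eax
00007ffb`e754e8c9 7511 jne ntdll!LdrpHandleInvalidReturnAddress+0x1c (00007ffb`e754e8dc) [br=0]
00007ffb`e754e8cb 488b1424 mov rdx,qword ptr [rsp]
00007ffb`e754e8cf 644c8b0424 mov r8,qword ptr fs:[rsp]
00007ffb`e754e8d4 b92c000000 mov ecx,2Ch
00007ffb`e754e8d9 cd29 int 29h
00007ffb`e754e8db 90 nop

Let us walk through the basic idea behind the RFG protection:
1) At the start of every function, the value at the current stack pointer rsp is read into rax. This value is the function's return address; we call it return_value.
2) The value in rax is then written to fs:[rsp], which saves return_value into the "shadow stack" (the Thread Control Stack).
3) Just before the function returns, the current return address is compared with the copy saved at fs:[rsp]. If they match, the function returns normally. If they differ, execution jumps to xxx!_guard_ss_verify_failure_rdx, which indicates a possible attack, raises the int 29h exception, crashes the process, and stops the exploit.

Which mechanisms decide whether RFG is enabled?
1 The kernel global variable nt!MmEnableRfg decides whether the system supports RFG at all.
2 The PE.LoadConfigDirectory.GuardFlag field in a process's PE header decides whether that process supports RFG.
3 SetProcessMitigationPolicy can be used to set dynamically whether a process enables RFG.

For many of the details of RFG, see this blog post from Tencent Xuanwu Lab:
http://xlab.tencent.com/cn/2016/11/02/return-flow-guard/

[0x02] Analysis of the protection:
We now need to answer three questions, and explain why this design gives strong protection.
1 Can fs be controlled and rewritten by an attacker?
2 Where does the shadow stack addressed through fs actually point?
3 How is the location of the "shadow stack" addressed through fs kept random, so that an attacker cannot predict it?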
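Before turning to those questions, the three-step check described above can be modelled in a few lines. This is only a sketch under stated assumptions: the `Thread` class, its sparse `memory` dictionary, and the `RfgViolation` exception are hypothetical names introduced for illustration, and the addresses are the ones visible in the debugger output of this article. It is a toy model of the prologue and epilogue instructions, not Microsoft's implementation.

```python
class RfgViolation(Exception):
    """Raised where the real epilogue would jump to _guard_ss_verify_failure."""


class Thread:
    """Toy model of one thread's stack and shadow stack (hypothetical helper)."""

    def __init__(self, fs_base):
        self.fs_base = fs_base  # offset between real stack and shadow stack
        self.memory = {}        # sparse memory model: address -> 64-bit value

    def call(self, rsp, return_address):
        # A CALL instruction leaves the return address at [rsp].
        self.memory[rsp] = return_address

    def prologue(self, rsp):
        # mov rax, [rsp] ; mov fs:[rsp], rax
        self.memory[self.fs_base + rsp] = self.memory[rsp]

    def epilogue(self, rsp):
        # mov r11, fs:[rsp] ; cmp r11, [rsp] ; jne _guard_ss_verify_failure
        if self.memory[self.fs_base + rsp] != self.memory[rsp]:
            raise RfgViolation(hex(rsp))
        return self.memory[rsp]


def overwrite_is_detected(fs_base):
    """Return True if overwriting the return address trips the check."""
    thread = Thread(fs_base)
    rsp = 0x0000007311DFB588
    thread.call(rsp, 0x00007FFD5BB757B5)
    thread.prologue(rsp)
    thread.memory[rsp] = 0x4141414141414141  # attacker-controlled overwrite
    try:
        thread.epilogue(rsp)
    except RfgViolation:
        return True
    return False
```

With a non-zero offset the overwrite is caught. With an offset of 0 the two stacks coincide, the overwrite changes both copies at once, and the check passes, which is the failure mode discussed in the user-level analysis.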
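Below I analyze these three questions from the user level and from the kernel level.

[0x02.1] User-level analysis

First, at user level the fs base itself cannot be rewritten. (More precisely, doing so would take an unusually convenient exploitable bug. One possibility is that fs gets cleared to 0, in which case the two stacks coincide and the defense fails.) So an attacker cannot forge the contents of the "shadow stack" by changing the fs register and thereby bypass the check in step 3 above.
At user level the value of the fs selector carries no meaning. Consider the fs value in the register dump below:

0:018> r
rax=00007ffd5bb757b5 rbx=0000023b69a2cea0 rcx=0000023b69a2cea0
rdx=00007ffd5c09dff8 rsi=0000007311dfb730 rdi=0000023b69a2cea0
rip=00007ffd5bb3d250 rsp=0000007311dfb588 rbp=0000023b6a736050
 r8=0000007311dfb600  r9=0000023b5092ce50 r10=00000fffab76eaf2
r11=0000007311dfb5c8 r12=fffc000000000000 r13=0000000000000124
r14=0000023b50943040 r15=0000023b6a736050
iopl=0 nv up ei pl nz na po nc
cs=0033 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000206
chakra!Js::ExternalObject::IsObjectAlive+0x4:
00007ffd`5bb3d250 6448890424 mov qword ptr fs:[rsp],rax fs:00000073`11dfb588=00007ffd5bb757b5

Here fs=53 has no operational meaning. 53 is only a selector; you can change it to any other value without affecting the program (on x86 it cannot be changed this way). Likewise, the debugger's dg command does not reveal what fs really refers to.

0:018> dg fs
                                                    P Si Gr Pr Lo
Sel        Base              Limit          Type    l ze an es ng Flags
---- ----------------- ----------------- ---------- - -- -- -- -- --------
0053 00000000`10181000 00000000`00000fff Data RW Ac 3 Bg By P  Nl 000004f3

In other words, 00000000`10181000 above is not the real base address used for fs. In my debugging environment it always points into unmapped address space.

0:018> !address 00000000`10181000
Usage:          Free
Base Address:   00000000`00000000
End Address:    00000000`7ffe0000
Region Size:    00000000`7ffe0000 ( 2.000 GB)
State:          00010000 MEM_FREE
Protect:        00000001 PAGE_NOACCESS
Type:

The address fs points to does not itself have to be mapped, but the address at fs base + rsp must be mapped, because that is the real memory of the shadow stack.
Also, on x64 the cs, ds, ss and es segments are flat, meaning their base is 0; gs and fs are the exceptions. On x64, Windows uses the gs segment register in the role that fs had on x86. For example, the TEB self pointer is read from gs:[30h] instead of from fs:[18h] as on x86. In the kernel, gs is normally handled with the swapgs instruction, which loads the IA32_KERNEL_GS_BASE value into GS.base so that the service routine can reach the kernel data structures, as in the code below.

.text:0000000140172A80 KiSystemService proc near
..
.text:0000000140172A80
.text:0000000140172A80 cmp [rsp+arg_0], 23h
.text:0000000140172A86 jz KiSystemService32User
.text:0000000140172A8C swapgs <******<
.text:0000000140172A8F mov rcx, r10
.text:0000000140172A92 sub rsp, 8
...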
.text:0000000140172ABB lea r11, KiSystemServiceUser
.text:0000000140172AC2 jmp r11
.text:0000000140172AC2 KiSystemService endp

There is no corresponding "swap fs" instruction for fs, however, and until now the fs segment register had simply been reserved. With the design of RFG it was put back into use and made to point at the "shadow stack".
In Insider Preview build 14986, Windows 10 enabled RFG for only a few applications, for example svchost.exe; Edge did not support RFG in that build. The instruction below can still be seen at the start of functions, but the linear base address of fs is 0 at that point.

mov qword ptr fs:[rsp],rax fs:00000073`11dfb588=00007ffd5bb757b5

In other words, the fs base is 0 here, so the "shadow stack" coincides with the real stack. From Insider Preview build 15002 onward Edge enabled RFG and fs points at a real "shadow stack" address.

To sum up, an attacker cannot forge the "shadow stack" from user mode by controlling the fs segment register. The exception is the case where the attacker can construct a write such as mov fs:[...],eax and also gain control of execution (in Edge this first requires bypassing CFG). The point we stress is that the address fs points to cannot be manipulated from user mode.

[0x02.2] Kernel-level analysis

First, the selector value fs = 53 is assigned by the kernel, as can be seen inside KiSystemStartup:

PAGELK:00000001403B10C1 assume ds:nothing
PAGELK:00000001403B10C1 mov es, ax
PAGELK:00000001403B10C4 assume es:nothing
PAGELK:00000001403B10C4 mov ax, 53h
PAGELK:00000001403B10C8 mov fs, ax
PAGELK:00000001403B10CB assume fs:nothing
PAGELK:00000001403B10CB test cs:VslVsmEnabled, 0FFh
PAGELK:00000001403B10D2 jnz short loc_1403B10D9

1: kd> r gdtr
gdtr=ffffd68137fd0fb0

What we care about is the actual address fs refers to. On x86, that address can be derived from selector 53 and the address in gdtr. In Windows x64 long mode, however, the fs and gs bases are not taken from the GDT; they are read and set through model-specific registers (MSRs).
This can be seen in the kernel function KiSwapThreadControlStack:

KiSwapThreadControlStack proc near
...
.text:00000001401726FE test r8, r8
.text:0000000140172701 jz loc_14017285B
.text:0000000140172707 mov rcx, rsi
.text:000000014017270A mov rdx, [rbp+0E8h+var_128]
.text:000000014017270E call KiSwapThreadControlStackDispatch
.text:0000000140172713 test al, al
.text:0000000140172715 jz loc_14017285B
.text:000000014017271B mov eax, [rsi+7A0h]
.text:0000000140172721 mov edx, [rsi+7A4h]
.text:0000000140172727 mov ecx, 0C0000100h <<********<<
.text:000000014017272C wrmsr <<********<<
.text:000000014017272E cli
.text:000000014017272F test [rbp+0E8h+arg_0], 1
.text:0000000140172736 jz loc_140172810
...
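In the listing above, wrmsr takes the MSR index in ecx and the 64-bit value split across edx:eax, which is why the kernel loads the two 32-bit halves from [rsi+7A0h] and [rsi+7A4h]. The sketch below shows that split and its inverse. The function names are hypothetical, and the sample value is the userfs value that appears in the debugging log later in this article; nothing here touches a real MSR.

```python
IA32_FS_BASE = 0xC0000100  # MSR index placed in ecx in the listing above
IA32_GS_BASE = 0xC0000101  # the neighbouring MSR that holds the gs base


def split_for_wrmsr(value):
    """Split a 64-bit MSR value into the (eax, edx) halves wrmsr expects."""
    if not 0 <= value < (1 << 64):
        raise ValueError("MSR values are 64-bit unsigned integers")
    eax = value & 0xFFFFFFFF  # low 32 bits, read from [rsi+7A0h]
    edx = value >> 32         # high 32 bits, read from [rsi+7A4h]
    return eax, edx


def join_from_wrmsr(eax, edx):
    """Rebuild the 64-bit value that wrmsr writes from its two halves."""
    return (edx << 32) | eax


if __name__ == "__main__":
    eax, edx = split_for_wrmsr(0x00007A4773300000)
    print(f"ecx={IA32_FS_BASE:08x} edx={edx:08x} eax={eax:08x}")
```

Reading the two halves back with `join_from_wrmsr` gives the same 64-bit value, which is the fs base the user-mode `fs:[rsp]` accesses are then relative to.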
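ecx = 0xc0000100 selects the MSR for the fs base; the one for gs is 0xc0000101. From the code above we can infer that KiSwapThreadControlStackDispatch obtains the address of the thread's "shadow stack" region. That 64-bit value is held in eax and edx and is then written to the fs base with wrmsr, so from this point on the address fs refers to is the 64-bit value formed by edx and eax together.
At this point rsi points to the start of the ETHREAD of the thread being switched to, so offset 0x7A0 corresponds to a 64-bit address value called UserFsBase.

0: kd> dt _ethread fffff801c0a2da40
ntdll!_ETHREAD
+0x000 Tcb : _KTHREAD
+0x5e8 CreateTime : _LARGE_INTEGER 0x0
...
+0x798 PicoContext : (null)
+0x7a0 UserFsBase : 0
+0x7a8 UserGsBase : 0
...

For a process with RFG enabled, UserFsBase holds a concrete address value, although that value by itself does not mean the memory at that address is mapped.

3) Memory layout of the "shadow stack"

In Win10 Insider Preview build 14986, KiSwapThreadControlStack is never called; it is only called from build 15002 onward. By comparing builds 14986, 15002 and 15016 we can see how the design of the "shadow stack" changed. For build 14986 we use the svchost process to show where the "shadow stack" actually lives (Edge did not enable RFG in that build).
Analysis of KiSwapThreadControlStack and PspAllocateThread shows that the "shadow stack" region is obtained through MmSwapThreadControlStack. Inside MmSwapThreadControlStack we can see that the "shadow stack" address comes from the stack information recorded in the process's eprocess.vadroot structure.
MmSwapThreadControlStack walks the process's vadroot tree to find the "shadow stack" memory region. Win10 has updated the corresponding vadroot structures, but I ran into problems when querying them, possibly because the MS symbols are incomplete. The logic is nevertheless clear, and we can now answer question 3: the randomness of the "shadow stack" is guaranteed by the randomness built into the Win10 memory manager. Given the huge address space of the x64 platform, guessing where the "shadow stack" lives is still very hard.
What needs to be understood here is that ethread.UserFsBase is simply the difference (diff) between the base of the "shadow stack" and the base of the "real stack". The fs base is the value of ethread.UserFsBase.
For any process with RFG enabled, every thread has the same UserFsBase. A user-mode fs:[rsp] access "transparently" computes the address inside the "shadow stack", that is:

shadow stack address region = ethread.UserFsBase + rsp = diff + rsp ;

Why do I call it a region?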
Because the corresponding "shadow stack" address changes as rsp on the real stack changes. The base of the "shadow stack" is stored in the vadroot tree.

Vadroot->_RTL_AVL_TREE-> _RTL_BALANCED_NODE-> _MMVAD_SHORT->_ MI_VAD_EVENT_BLOCK-> _MI_RFG_PROTECTED_STACK (holds the details of the "shadow stack")

Here we can see that ControlStackBase is the base address of the "shadow stack".

kd> dt _MI_RFG_PROTECTED_STACK
nt!_MI_RFG_PROTECTED_STACK
+0x000 ControlStackBase : Ptr64 Void
+0x008 ControlStackVad : Ptr64 _MMVAD_SHORT
+0x010 Busy : Int4B

With some debugging we can then find the "shadow stack" region of the svchost process; see Figure 2.

"Process:" "svchost.exe"
"PID:" "3596"
"Type" "Size" "Committed" "Private" "Total WS" "Private WS" "Shareable WS" "Shared WS" "Locked WS" "Blocks" "Largest"
"Total" "2,684,450,952" "63,752" "4,368" "14,904" "2,932" "11,972" "11,280" "" "541" ""
"Image" "44,936" "44,936" "1,476" "11,888" "1,088" "10,800" "10,128" "" "331" "7,136"
"Mapped File" "4,104" "4,104" "" "384" "" "384" "384" "" "4" "3,292"
"Shareable" "2,147,508,760" "11,756" "" "780" "" "780" "760" "" "83" "2,147,483,648"
"Heap" "3,220" "768" "704" "700" "696" "4" "4" "" "29" "1,024"
"Managed Heap" "" "" "" "" "" "" "" "" "" ""
"Stack" "13,312" "664" "664" "176" "176" "" "" "" "45" "1,024"
-----------------------------
"Private Data" "536,873,168" "812" "812" "264" "260" "4" "4" "" "49" "536,870,912"
-----------------------------
"Page Table" "712" "712" "712" "712" "712" "" "" "" "" ""
"Unusable" "2,740" "" "" "" "" "" "" "" "" "60"
"Free" "134,754,503,168" "" "" "" "" "" "" "" "54" "133,102,038,528"
--------------------------------------------------
"0000018000000000" "Private Data" "536,870,912" "664" "664" "124" "124" "" "" "" "31" "Read/Write" ""
--------------------------------------------------
" 0000018000000000" "Private Data" "279,882,192" "" "" "" "" "" "" "" "" "Reserved" ""
" 000001C2BAA74000" "Private Data" "48" "48" "48" "16" "16" "" "" "" "" "Read/Write" ""
" 000001C2BAA80000" "Private Data" "5,588" "" "" "" "" "" "" "" "" "Reserved" ""
" 000001C2BAFF5000" "Private Data" "44" "44" "44" "20" "20" "" "" "" "" "Read/Write" ""
" 000001C2BB000000" "Private Data" "980" "" "" "" "" "" "" "" "" "Reserved" ""
" 000001C2BB0F5000" "Private Data" "44" "44" "44" "4" "4" "" "" "" "" "Read/Write" ""
" 000001C2BB100000" "Private Data" "980" "" "" "" "" "" "" "" "" "Reserved" ""
" 000001C2BB1F5000" "Private Data" "44" "44" "44" "20" "20" "" "" "" "" "Read/Write" ""
" 000001C2BB200000" "Private Data" "980" "" "" "" "" "" "" "" "" "Reserved" ""
" 000001C2BB2F5000" "Private Data" "44" "44" "44" "8" "8" "" "" "" "" "Read/Write" ""
" 000001C2BB300000" "Private Data" "2,004" "" "" "" "" "" "" "" "" "Reserved" ""
" 000001C2BB4F5000" "Private Data" "44" "44" "44" "16" "16" "" "" "" "" "Read/Write" ""
" 000001C2BB500000" "Private Data" "980" "" "" "" "" "" "" "" "" "Reserved" ""
" 000001C2BB5F5000" "Private Data" "44" "44" "44" "16" "16" "" "" "" "" "Read/Write" ""
" 000001C2BB600000" "Private Data" "980" "" "" "" "" "" "" "" "" "Reserved" ""
" 000001C2BB6F5000" "Private Data" "44" "44" "44" "4" "4" "" "" "" "" "Read/Write" ""
" 000001C2BB700000" "Private Data" "2,004" "" "" "" "" "" "" "" "" "Reserved" ""
" 000001C2BB8F5000" "Private Data" "44" "44" "44" "4" "4" "" "" "" "" "Read/Write" ""
" 000001C2BB900000" "Private Data" "980" "" "" "" "" "" "" "" "" "Reserved" ""
" 000001C2BB9F5000" "Private Data" "44" "44" "44" "4" "4" "" "" "" "" "Read/Write" ""
" 000001C2BBA00000" "Private Data" "2,004" "" "" "" "" "" "" "" "" "Reserved" ""
" 000001C2BBBF5000" "Private Data" "44" "44" "44" "4" "4" "" "" "" "" "Read/Write" ""
" 000001C2BBC00000" "Private Data" "980" "" "" "" "" "" "" "" "" "Reserved" ""
" 000001C2BBCF5000" "Private Data" "44" "44" "44" "8" "8" "" "" "" "" "Read/Write" ""
" 000001C2BBD00000" "Private Data" "468" "" "" "" "" "" "" "" "" "Reserved" ""
" 000001C2BBD75000" "Private Data" "44" "44" "44" "" "" "" "" "" "" "Read/Write" ""
" 000001C2BBD80000" "Private Data" "980" "" "" "" "" "" "" "" "" "Reserved" ""
" 000001C2BBE75000" "Private Data" "44" "44" "44" "" "" "" "" "" "" "Read/Write" ""
" 000001C2BBE80000" "Private Data" "980" "" "" "" "" "" "" "" "" "Reserved" ""
" 000001C2BBF75000" "Private Data" "44" "44" "44" "" "" "" "" "" "" "Read/Write" ""
" 000001C2BBF80000" "Private Data" "256,967,168" "" "" "" "" "" "" "" "" "Reserved" ""
"00007FFFFFFE0000" "Private Data" "64" "" "" "" "" "" "" "" "1" "Reserved" ""
(Figure 2)

The Private Data entry is a 512G block of address space starting at 0x00000180~00000000. It contains many memory blocks with Read/Write protection, with many reserved regions inserted between them. A closer look shows that the 44k and 48k regions are the "shadow stacks" of the individual threads. One trick MS uses here is a reserved region of about 279G that starts at 180~00000000. The size of this reserved region differs from process to process. I call it the "random reserved" region; its purpose is to make the "shadow stack" regions that follow it more random, because its varying size shifts the addresses at which the "shadow stacks" are allocated.

Since these "shadow stack" regions are recorded in the vadroot tree, a user-mode API that also gets its information by walking vadroot could give us the memory layout of the shadow stacks, and modifying the data in those regions would then bypass RFG. NtQueryVirtualMemory and VirtualQueryEx both fit this description; VirtualQueryEx ultimately calls NtQueryVirtualMemory.
With such an API, a "shadow stack" region can be located by checking the following two conditions:
1 The consecutive memory blocks are of type MEM_PRIVATE with READ/WRITE protection.
2 The consecutive blocks are 44k ~ 48k in size, with reserved memory in between.
A block that meets both conditions can be taken to be a "shadow stack" region.

In practice, however, NtQueryVirtualMemory/VirtualQueryEx are both CFG-protected functions in the Edge browser, so these two APIs cannot be used to find the "shadow stack" region.
I describe one scenario here for reference (later builds changed this shadow stack layout, so the guess is no longer of much use). Suppose I have an arbitrary address read/write bug in the browser. How hard is it to guess the address of the "shadow stack"?

We know that this 512G "shadow stack" range is allocated when the process is created. With RFG enabled, every new thread commits part of this range as its "shadow stack". After repeated tests I found that the large range usually starts at one of the following addresses:

ULONG64 nShadowAddr[] =
{
    0x0000020000000000,
    0x0000028000000000,
    0x0000030000000000,
    0x0000018000000000
};

Take the 44k "shadow stack" at 0x000001c2~BAFF5000 as an example.
1 Clearly the leading 5 digits of a "shadow stack" address are 0 and need no guessing, and the trailing 3 digits are also 0 and need no testing. The upper boundary is therefore a region starting at 200, 280, 180 or 300, which takes only 4 guesses.
2 Most "shadow stacks" end in F5 (before the trailing zeros), so I do not guess these 2 digits; I assume by default that the region ends in F5.
3 Because the total size is fixed at about 512G, the "random reserved" region can only fall in 0~100G, 100~200G, 200~300G or 300G~400G.
512G = 0x80~00000000, so the size of the "random reserved" region can only fall on boundaries such as 10~00000000, 20~00000000, 30~00000000, 40~00000000, 50~00000000, 60~00000000 and 70~00000000.
Taking 180~00000000 as an example, the possible "shadow stack" boundary is 18+(0~7), i.e. within {18,1F}, so at most 8 values need to be tried.

In summary, I do not need to guess the leading digits, nor the 1c2 part, nor the trailing F5. In fact I only need to test the middle three digits, that is, to find out that they are BAF. This does not avoid walking through the upper boundaries starting at 200, 280 and 300 in turn until the 180 boundary is reached, and then guessing readable and writable memory ending in F5 within {18~1F}. With a few tricks, a "shadow stack" boundary such as 0x000001c2~BAFF5000 can be found in roughly under one minute. A test program produced the following output:

not found a readable memory of target process at 000001C2B11F5000
not found a readable memory of target process at 000001C2B21F5000
not found a readable memory of target process at 000001C2B31F5000
not found a readable memory of target process at 000001C2B41F5000
not found a readable memory of target process at 000001C2B51F5000
not found a readable memory of target process at 000001C2B61F5000
not found a readable memory of target process at 000001C2B71F5000
not found a readable memory of target process at 000001C2B81F5000
not found a readable memory of target process at 000001C2B91F5000
not found a readable memory of target process at 000001C2BA1F5000
not found a readable memory of target process at 000001C2BB1F5000
.....>>> found a readable memory of target process at 000001C2BC1F5000
Message:[10]
Message:[11]
PID:596,read process handle:0
read a buff of target process:

This kind of guessing only guarantees that we find the boundary of some "shadow stack"; it does not tell us which thread corresponds to the bug we want to exploit. In the ideal case you would of course modify every "shadow stack" so that the shellcode runs when the bug triggers, after which the process crashes in a very "ideal" way.

Let us move on to the builds after 14986. From build 15002 Microsoft changed the "shadow stack" memory region, and Edge began to support RFG. The 512G "shadow stack" area as a whole became a single reserved region, instead of the individual "shadow stack" regions with visible boundaries shown above. As a result, NtQueryVirtualMemory/VirtualQueryEx and other tools show only one monolithic region, and the layout inside it is not visible. See the layout below, taken from Edge on build 15002:

"Process:" "MicrosoftEdgeCP.exe"
"PID:" "6100"
"Type" "Size" "Committed" "Private" "Total WS" "Private WS" "Shareable WS" "Shared WS" "Locked WS" "Blocks" "Largest"
"Total" "2,718,077,236" "119,620" "5,632" "24,340" "4,996" "19,344" "19,308" "" "617" ""
"Image" "95,100" "95,100" "1,896" "19,388" "1,544" "17,844" "17,808" "" "393" "24,504"
"Mapped File" "4,080" "4,080" "" "352" "" "352" "352" "" "2" "3,292"
"Shareable" "2,147,513,376" "16,640" "" "1,364" "224" "1,140" "1,140" "" "106" "2,147,483,648"
"Heap" "4,304" "1,276" "1,212" "1,208" "1,204" "4" "4" "" "37" "1,024"
"Managed Heap" "" "" "" "" "" "" "" "" "" ""
"Stack" "17,408" "424" "424" "216" "216" "" "" "" "51" "1,024"
-------------------------------
"Private Data" "570,438,044" "512" "512" "224" "220" "4" "4" "" "28" "536,870,912"
-------------------------------
"Page Table" "1,588" "1,588" "1,588" "1,588" "1,588" "" "" "" "" ""
"Unusable" "3,336" "" "" "" "" "" "" "" "" "60"
"Free" "134,720,877,760" "" "" "" "" "" "" "" "51" "131,739,939,968"
"Address" "Type" "Size" "Committed" "Private" "Total WS" "Private WS" "Shareable WS" "Shared WS" "Locked WS" "Blocks" "Protection" "Details"
"000000007FFE0000" "Private Data" "4" "4" "4" "4" "" "4" "4" "" "1" "Read" ""
" 000000007FFE0000" "Private Data" "4" "4" "4" "4" "" "4" "4" "" "" "Read" ""
"000000007FFE1000" "Private Data" "60" "" "" "" "" "" "" "" "1" "Reserved" ""
" 000000007FFE1000" "Private Data" "60" "" "" "" "" "" "" "" "" "Reserved" ""
"0000006435A00000" "Private Data" "2,048" "140" "140" "140" "140" "" "" "" "3" "Read/Write" "Thread Environment Block ID: 6104"
" 0000006435A00000" "Private Data" "956" "" "" "" "" "" "" "" "" "Reserved" "Thread Environment Block ID: 6104"
" 0000006435AEF000" "Private Data" "140" "140" "140" "140" "140" "" "" "" "" "Read/Write" "Thread Environment Block ID: 6104"
" 0000006435B12000" "Private Data" "952" "" "" "" "" "" "" "" "" "Reserved" "Thread Environment Block ID: 6104"
...
" 000001C6BD4C0000" "Private Data" "4" "4" "4" "4" "4" "" "" "" "" "Read/Write" ""
"000001C6BD4D0000" "Private Data" "4" "4" "4" "4" "4" "" "" "" "1" "Read/Write" ""
" 000001C6BD4D0000" "Private Data" "4" "4" "4" "4" "4" "" "" "" "" "Read/Write" ""
"000001C6BD620000" "Private Data" "128" "128" "128" "4" "4" "" "" "" "1" "Read/Write" ""
" 000001C6BD620000" "Private Data" "128" "128" "128" "4" "4" "" "" "" "" "Read/Write" ""
"000001C6BF400000" "Private Data" "33,554,432" "12" "12" "" "" "" "" "" "7" "Read/Write" ""
" 000001C6BF400000" "Private Data" "465,652" "" "" "" "" "" "" "" "" "Reserved" ""
" 000001CEBF601000" "Private Data" "8,188" "" "" "" "" "" "" "" "" "Reserved" ""
"000001CEBFE00000" "Private Data" "1,024" "4" "4" "4" "4" "" "" "" "2" "Read/Write" ""
" 000001CEBFE00000" "Private Data" "4" "4" "4" "4" "4" "" "" "" "" "Read/Write" ""
" 000001CEBFE01000" "Private Data" "1,020" "" "" "" "" "" "" "" "" "Reserved" ""
----------------------------------------------------
"00007A0000000000" "Private Data" "536,870,912" "" "" "" "" "" "" "" "1" "Reserved" ""
" 00007A0000000000" "Private Data" "536,870,912" "" "" "" "" "" "" "" "" "Reserved" ""
----------------------------------------------------
"00007FFFFFFE0000" "Private Data" "64" "" "" "" "" "" "" "" "1" "Reserved" ""
" 00007FFFFFFE0000" "Private Data" "64" "" "" "" "" "" "" "" "" "Reserved" ""

As we can see, the whole "shadow stack" area is one reserved region starting at 0x00007A00~00000000. There must be some trick hidden here. NtQueryVirtualMemory/VirtualQueryEx obtain the memory layout of the current process by walking vadroot; if the "shadow stack" stored in the VAD were really just one 512G block, how could the kernel tell the individual "shadow stack" boundaries of each thread apart? Clearly the information returned by these APIs is incomplete. We can probe the layout of this monolithic shadow stack area by debugging. We catch the newly created Edge process in nt!PspAllocateProcess, then obtain the "shadow stack" address and the real thread stack address in MmSwapThreadControlStack. The measured values are:

shadowstackbase:00007adcbab00000 - threadstack:0000009547800000 = userfs:00007a4773300000
shadowstackbase:00007adcbac00000 - threadstack:0000009547900000 = userfs:00007a4773300000
shadowstackbase:00007adcbad00000 - threadstack:0000009547a00000 = userfs:00007a4773300000
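shadowstackbase:00007adcbae00000 - threadstack:0000009547b00000 = userfs:00007a4773300000
shadowstackbase:00007adcbaf00000 - threadstack:0000009547c00000 = userfs:00007a4773300000
shadowstackbase:00007adcbb000000 - threadstack:0000009547d00000 = userfs:00007a4773300000
shadowstackbase:00007adcbb100000 - threadstack:0000009547e00000 = userfs:00007a4773300000
shadowstackbase:00007adcbb200000 - threadstack:0000009547f00000 = userfs:00007a4773300000
shadowstackbase:00007adcbb300000 - threadstack:0000009548000000 = userfs:00007a4773300000
shadowstackbase:00007adcbb400000 - threadstack:0000009548100000 = userfs:00007a4773300000
shadowstackbase:00007adcbb500000 - threadstack:0000009548200000 = userfs:00007a4773300000
shadowstackbase:00007adcbb600000 - threadstack:0000009548300000 = userfs:00007a4773300000
shadowstackbase:00007adcbb700000 - threadstack:0000009548400000 = userfs:00007a4773300000
shadowstackbase:00007adcbb800000 - threadstack:0000009548500000 = userfs:00007a4773300000
shadowstackbase:00007adcbb900000 - threadstack:0000009548600000 = userfs:00007a4773300000
shadowstackbase:00007adcbba00000 - threadstack:0000009548700000 = userfs:00007a4773300000
shadowstackbase:00007adcbbb00000 - threadstack:0000009548800000 = userfs:00007a4773300000
shadowstackbase:00007adcbbc00000 - threadstack:0000009548900000 = userfs:00007a4773300000

From this output we can tell that from build 15002 onward the "shadow stack" layout became a series of memory regions aligned on 100000 boundaries. Which part is actually writable can be computed from the current thread's userfs + rsp.

[0x03] Possibilities for breaking RFG

To recap, the main purpose of RFG is to prevent malicious tampering with the user stack. It allocates a 512G range and places the "shadow stack" of every thread inside it, then compares the stack data just before a protected function returns. The strength of RFG lies in two points:
1) User mode cannot control where the fs segment register points; this is decided by the kernel.
2) It is hard for an attacker to guess where the "shadow stack" is located in memory.
For point 1 there is little we can do from user mode. For point 2, I describe some attacks I have considered.

Against build 14986, the optimized search for the "shadow stack" described earlier is feasible. The next step is to find a usable arbitrary address read/write capability. In Edge we first have to consider how to read and write arbitrary memory. A plain brute-force scan of memory is a problem, because reserved memory is inserted between the "shadow stacks" and a read instruction that touches it raises an exception. One trick here is that in this build the lstrcpyA function is not protected by CFG, so it can be used to test whether a given memory region is readable or writable. The function takes only 2 parameters, which makes it easy to control through a bug, and it does not raise an exception when it reads reserved memory.

LPTSTR WINAPI lstrcpy(
    _Out_ LPTSTR lpString1,
    _In_ LPTSTR lpString2
);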
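We can point parameter 2 at the memory to be tested and have it copied into a buffer we allocated, which is parameter 1. If the call succeeds the tested memory is readable; otherwise it is not. There is one risk: if the tested memory happens to hold a very long string and parameter 1 is not large enough, things go wrong.
The GlobalLock function can be used for the same purpose. It can also bypass CFG protection, and it takes only one parameter, which makes it even easier to control.

LPVOID WINAPI GlobalLock(
    _In_ HGLOBAL hMem
);

In addition, if we could allocate 512G of memory inside Edge, the newly allocated range would lie close to the "shadow stack" area, which would reduce the amount of address data we have to predict during the search.

Unfortunately, by build 15016 Edge had added all of these functions to the CFG protection list, so functions of this kind can no longer be used to read memory. For the same reason, NtQueryVirtualMemory/VirtualQueryEx were added to the CFG sensitive API list early on, which also suggests that no higher-level call path exists that would reach these two APIs to obtain details of a memory block. Exploitation from the script level has therefore become very difficult.

Another option is to change where RFG jumps when the check fails. When the RFG comparison fails, execution jumps to the corresponding _guard_ss_verify_failure:

00007ff7`58e526e2 644c8b1c24 mov r11,qword ptr fs:[rsp]
00007ff7`58e526e7 4c3b1c24 cmp r11,qword ptr [rsp]
00007ff7`58e526eb 0f85ef350100 jne MicrosoftEdgeCP!_guard_ss_verify_failure (00007ff7`58e65ce0)
00007ff7`58e526f1 c3 ret

// jumps here
MicrosoftEdgeCP!_guard_ss_verify_failure:
00007ffb`c52b0580 4d33db xor r11,r11
00007ffb`c52b0583 ff25e7f33800 jmp qword ptr [chakra!_guard_ss_verify_failure_fptr (00007ffb`c563f970)]
00007ffb`c52b0589 cc int 3

// inspect the pointer here
0:011> x chakra!_guard_ss_verify_failure_fptr
00007ffa`0495f970 chakra!_guard_ss_verify_failure_fptr = // replace

0:011> dqs chakra!_guard_ss_verify_failure_fptr
00007ffa`0495f970 00007ffa`238fe8c0 ntdll!LdrpHandleInvalidReturnAddress // replace this pointer
00007ffa`0495f978 00007ffa`238fe910 ntdll!RtlGuardVerifyReachableStackPointer
00007ffa`0495f980 00000000`00000000
00007ffa`0495f988 00007ffa`04323a60 chakra!jscriptinfo_IID_Lookup+0x2a60
00007ffa`0495f990 00007ffa`04323a70 chakra!jscriptinfo_IID_Lookup+0x2a70

Attributes of the memory region that holds chakra!_guard_ss_verify_failure_fptr:

Usage:              Image
Base Address:       00007ffa`04904000
End Address:        00007ffa`04a9e000
Region Size:        00000000`0019a000 ( 1.602 MB)
State:              00001000 MEM_COMMIT
Protect:            00000002 PAGE_READONLY
Type:               01000000 MEM_IMAGE
Allocation Base:    00007ffa`04320000
Allocation Protect: 00000080 PAGE_EXECUTE_WRITECOPY

If there is a way to make the memory holding chakra!_guard_ss_verify_failure_fptr writable and to replace its LdrpHandleInvalidReturnAddress pointer with a function under our control, RFG can be bypassed this way as well.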
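The obstacle to this last idea is visible in the !address output above: the page holding the pointer currently has Protect 00000002 (PAGE_READONLY), even though it was allocated with 00000080 (PAGE_EXECUTE_WRITECOPY). The sketch below decodes those values using the documented Windows memory protection constants. The helper name `is_writable` is hypothetical, and masking off the modifier bits above the low byte is a simplification made for this illustration.

```python
# Windows memory protection constants (low byte of the Protect field).
PAGE_NOACCESS = 0x01
PAGE_READONLY = 0x02
PAGE_READWRITE = 0x04
PAGE_WRITECOPY = 0x08
PAGE_EXECUTE = 0x10
PAGE_EXECUTE_READ = 0x20
PAGE_EXECUTE_READWRITE = 0x40
PAGE_EXECUTE_WRITECOPY = 0x80

# Protections under which a plain store to the page can succeed.
WRITABLE_MASK = (PAGE_READWRITE | PAGE_WRITECOPY |
                 PAGE_EXECUTE_READWRITE | PAGE_EXECUTE_WRITECOPY)


def is_writable(protect):
    """Return True if the base protection value permits writes.

    Modifier bits such as PAGE_GUARD (0x100) are ignored here; this is a
    simplification for illustration, not a full decoder.
    """
    return bool((protect & 0xFF) & WRITABLE_MASK)


if __name__ == "__main__":
    current = 0x00000002     # Protect from the !address output
    allocation = 0x00000080  # Allocation Protect from the same output
    print("current protection writable:   ", is_writable(current))
    print("allocation protection writable:", is_writable(allocation))
```

Under the current protection a store to the pointer faults, so an attacker would first need some primitive that changes the page protection, which is exactly the precondition stated in the paragraph above.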
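In conclusion, after this detailed analysis, RFG as a whole provides a very strong defense against ROP, control-flow hijacking and similar attacks. As we all know, the weakness of CFG is that it can be sidestepped by directly modifying the return address on a function's stack, which spares the attacker from having to find a direct bypass. Only with RFG enabled does CFG become a complete integrity protection in the true sense. With DEP+ASLR+CFG+RFG working together, exploits based purely on memory corruption are bound to become harder and harder to pull off.

[0x04] Acknowledgements

Finally, many thanks to my colleague Sun Bing for his very important hints and help with the RFG research. We debugged and studied a great many technical details together, which made it possible to organize and write up this article.

Note:
Microsoft has removed RFG from its bug bounty program and now treats it only as a Research Project. This suggests that the design of RFG itself has a serious problem. Nobody knows what will happen next. We already know of several points where it might be bypassed, but we cannot be sure whether these are the root cause of Microsoft withdrawing RFG.

References:
http://xlab.tencent.com/cn/2016/11/02/return-flow-guard/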