Linux x86 Egghunter

By August 27, 2018 August 30th, 2018 SLAE-x86

This blog post has been created for completing the requirements of the SecurityTube Linux Assembly Expert certification:
http://securitytube-training.com/online-courses/securitytube-linux-assembly-expert/
Student ID: SLAE-1134
Assignment number: 3
Github repo: https://github.com/kkirsche/SLAE


Introduction

Hey everyone! Today, we’re going to discuss the egghunter payload. Egg hunters are a unique type of payload that are good when the space you have for shellcode is limited. We’ll explain more about this payload type in a bit. First, let’s cover the SLAE egghunter assignment requirements

Requirements

  • Study about the Egg Hunter shellcode
  • Create a working demo of the Egg Hunter
  • Should be configurable and allow for different payloads to be used.

We can definitely meet these.

So, what is an Egg Hunter anyway?

Reference Document for this section

Skape Egghunter Shellcode Reference

Summary

An egg hunter is a small payload which can be used to find a second, larger payload, elsewhere within memory. This is ideal for space constrained exploits.

Full Explanation

An egg hunter is a small payload designed to search a system’s virtual address space, commonly abbreviated VAS, for an egg denoting the beginning of another, larger payload. This egg is usually two instances of four bytes (e.g. we may use the egg “eggs” as “\x73\x67\x67\x65” (note this is in reverse for little-endian architectures), or “w00t” as “\x74\x30\x30\x77”) which are used to first identify that we’ve found the larger payload, and then validate that the entry we’ve found is in fact our payload.

If like me, one of the first things you’ll think is can we just use four bytes, rather than eight? At first, this would seem to make sense as we want to keep space of both payloads (most likely) as small as possible. If we use only four bytes though, the egg hunter may inadvertently detect itself, as it has to store the egg in memory itself. Thus, by using the key twice, totalling eight bytes, we bypass this risk and ensure that we have actually found our payload. In addition, using a single four byte key, repeated twice, allows us to reduce the size of our egghunter by four bytes, as we do not need to store two unique keys in memory.

The other thing to consider is what bytes are you choosing and are then something which would not be unique in memory. If you are not careful, it is possible for the chosen key to already exist in memory. An example of a bad key would be “\x90\x90\x90\x90” which constitutes four no operation (NOP) operation codes. These can exist in memory anyway, and as such should not be used.

Requirements

So with a knowledge of what an egg hunter is, what things do we need to take into account? Generally speaking, there are two core requirements for egg hunter shellcode:

  1. Small size
  2. Durable

The first requirement, small size, means that we want the egg hunter payload to be as small as possible. This is because we want the egg hunter to be a viable payload when a full-sized payload such as a reverse shell or bind shell won’t work. This can occur in situations where we need a staged payload or our payload has to meet certain filtering requirements. As such, many egg hunters are between 20 and 40 bytes. Some, will go as low as 11 bytes, but do not offer the durability we expect.

The second requirement, durable, means that we want the egg hunter payload to withstand finding unallocated memory or locations in memory which our user doesn’t have permission to read. Normally, if we attempted to read this memory without the proper permissions, we’d end up with a segmentation fault (SIGSEGV). Luckily, Skape, from the previously referenced paper, has a solution to this using the access system call.

Skape explains why this was chosen well, so to quote Skape from page 7:

The real purpose of this system call is to check and see if the
current process has the specific access rights to a given file on disk. The reason this system call was selected was for two reasons. First, the system call had to have a pointer for just one argument, as multiple pointer arguments would require more register initialization, and thus violate requirement #2 regarding size. Secondly, the system call had to not attempt to write to the pointer supplied, as it could lead to bad things happening if the memory were indeed writable, which in all likelihood would be the case for the buffer that would hold the egg being searched for.

In addition, there is one other characteristic about the access system call which is helpful for us. In the event that we do not have permission to read memory from the location, rather than hitting a segmentation fault it instead returns an EFAULT return value (0xf2 being the low byte of the return value).

Converting our knowledge into a payload

Skape provides quality implementation details inside of the paper as well. Rather than reinventing the wheel, I worked to understand the implementation, and re-implement it myself with better inline documentation. The result was:

This egg hunter is implemented using Skape’s second method access (revised). This helped ensure that we were approaching the solution from an optimized manner.

How does it work?

Let’s break down the egg hunter to understand what’s happening here.


global _start
section .text
_start:
xor edx, edx ; we can't use cdq because we don't know the sign of EAX :(

We start off with our standard boilerplate definitions and then XOR EDX with itself. This differs from other payloads where we can use cdq, but since we are entering the payload from an unknown location, we can’t guarantee what state the signed flag would be in. Thus, we need to use XOR to zero out the register.

If we wanted this to be a bit more polymorphic, we could look at instead using SUB to subtract EDX from itself.

With a zero register, we then implement our page looping functionality


loop_page:
; view your page size with `getconf PAGESIZE`
; getconf PAGESIZE == 4096 decimal / 0x1000 hex
or dx, 0xfff ; 0xfff hex == 4065 decimal

Page size is normally 4 Kb (4096 bytes). As such, our page loop does a bitwise or operation on DX, the lower 16-bits of the current memory pointer. At our start, this is going to be 0, which is not accessible memory. This is why we have to perform the loop at the very start, so that we’re within the first page. Note that this is one less than the full page size, we’ll explain why in the next code segment. If you want to ensure that the page size on your machine is also 4096, you may run getconf PAGESIZE to query your localhost for this value.


next_addr:
inc edx ; increment our pointer by one
lea ebx, [edx+0x4] ; load where our potential eggs are
push byte 0x21 ; access system call
pop eax ; push / pop used to avoid XOR
int 0x80 ; check if we have access to the EBX address(es)
cmp al, 0xf2 ; if 0xf2, we have an EFAULT
jz loop_page ; we don't have access :( next page!

With our page looper done, we immediately drop into our next_addr section, which starts by incrementing EDX. This allows us to hit the first byte of the page (that one byte makes our 4095 into 4096, thus a full page size has been incremented). Now that we’re within the page, load the next eight bytes of memory into EBX where we’ll check if we have access to them. We load eight to avoid having to repeat this call for the second four bytes of our egg.

Since we haven’t zeroed out EAX, we push 0x21 onto the stack, representing the access system call and then pop it into EAX, clearing anything else that was within it. With our function setup, we then call int 0x80 to execute the function.

We compare the lower portion of EAX (represented via al) with 0xf2 which as we mentioned earlier is the lower bits of the EFAULT (aka we don’t have access). As 0 is true (we have an EFAULT in al), 1 is false (we have access), we check if the value is zero, and then jump to loop page if we don’t have access. We do this because if we can’t read the memory location, we can assume that all addresses within the page are also inaccessible. If we have access though to a section, we’ll drop into compare and begin looping over each byte within the page.


compare:
mov eax, 0x73676733 ; 3ggs in little-endian, representing our egg!
mov edi, edx ; put the address to compare into EDI for the scasd operation
; about scasd - https://c9x.me/x86/html/file_module_x86_id_287.html
scasd ; if [EDI] == EAX - note: scasd checks a double word (4 bytes) then increments EDI
jnz next_addr ; this isn't our egg :( let's move on
scasd ; if [EDI] == EAX then we found our egg!
jnz next_addr ; nope :( unlucky it seems
matched:
jmp edi ; let's execute our egg!

So we have access to the memory, it’s now time to check for our eggs.

We first move our egg (in little-endian format because we’re on x86) into EAX. In my example, I used an egg of “3ggs” which directly converted to hex is “\x33\x67\x67\x73”. I chose this as a play off of the name of the payload (egg hunters should look for eggs, duh!).

We then move our current memory address (stored in EDX) into EDI. This seems a bit weird at first if you are used to system call operations where the function arguments usually start with EAX (what function), EBX (arg 1), ECX (arg 2), etc. Luckily, there is a good reference explaining a bit about scasd (scan string double-word) here. Note that this states:

After the comparison, the (E)DI register is incremented or decremented automatically according to the setting of the DF flag in the EFLAGS register. (If the DF flag is 0, the (E)DI register is incremented; if the DF flag is 1, the (E)DI register is decremented.) The (E)DI register is incremented or decremented by 1 for byte operations, by 2 for word operations, or by 4 for doubleword operations.

Essentially, the scasd operation is allowing us to compare the value of four bytes (address of the start of this four bytes is in EDI) to the value in EAX (our egg). It then increments EDX for us, so we don’t have to do so before our next comparison (which is awesome!).

With our scasd operation performed, if we got a false return value (remember that 0 is true, 1 is false), we use jump not zero to jump to our next_addr section. Since we have access to this page, we want to increment each individual byte within the page and then in the event we’ve left our current page, we jump to the next page. This is why our next_addr and loop_page sections are separate.

On the other hand, if we did find our egg, we need to try another comparison to be sure we found the payload (which has EGG EGG PAYLOAD) rather than the egg hunter itself. If we didn’t find the second egg, we jump back to next_addr to continue our search through the page.

If we did find both eggs though, we jump directly to the payload (because scasd increments our memory pointer by 4, we jump directly to the payload, not to the eggs).

Validating our egg hunter

Awesome! We have an egg hunter, but now we need to make sure it actually works. We can do this using a shellcode harness similar to what we’ve used before. In my example, I’m using the IPv6 reverse shell payload which I created for assignment 2.

Compiling / Dumping Shellcode

Test harness

This is easily customizable because the payload is separated from the eggs. Thus, anyone can easily replace the payload following EGG EGG within the secret variable. This is named secret, as it would normally be located somewhere else in memory where our egg hunter doesn’t know about, thus, it’s a secret to the egg hunter.

We then include our egg hunter shellcode which again also includes a variable for our egg, thus if we needed or wanted to switch it out for something else, we could.

With our harness in place, we see that this worked:

working egg hunter

Can we make this easier to customize?

At this point, we’ve completed the assignment, but we want to make this more convenient to generate shellcode for. What if someone using this doesn’t understand how egg hunters work, and tries to use a 5 or 6 byte egg, or a 2 byte egg? This is where our generator comes in!

This allows someone to generate an egg hunter, provide the four bytes they want to use as their egg, and we’ll validate that what they asked to be their egg is actually four hex bytes. Sometimes, unicode characters like ü or é can cause problems as they often use two bytes per character. We also want to ensure we maintain no nulls. Thus, we check the egg length (has to be four), convert it to hex, check if we have a null value in the egg, and then print what the egg is (in little-endian hex) and the shellcode for the egghunter itself.

And with that, three assignments complete! Let’s keep the momentum up and attack assignment 4.

Kevin Kirsche

Author Kevin Kirsche

Kevin is a Principal Security Architect with Verizon. He holds the OSCP, OSWP, OSCE, and SLAE certifications. He is interested in learning more about building exploits and advanced penetration testing concepts.

More posts by Kevin Kirsche

Leave a Reply