Problem Statement

Using GDB, how can we determine exactly how many bytes it takes to overflow a buffer up to a given point, such as the return address? As we saw in the previous reading on using GDB as a debugger, GDB can tell us where a program is when it crashes.

Theory

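A De Bruijn sequence is a sequence of characters in which every run of n consecutive characters appears only once, so any n-character chunk identifies its own position in the sequence. To see why that helps, suppose we send a simpler input that looks like this: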

input: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD
                         ^ crash!

and the program crashes because our return address has been overwritten with 0x4343434343434343 (’CCCCCCCC’ in hex).

How many characters did it take for us to reach the return address? Because “CCCCCCCC” is the third 8-byte block of the input, we know that the first two blocks were consumed before the overflow reached the return address.

Since each block is 8 bytes, it takes 16 bytes of padding to reach the return address; the next 8 bytes overwrite it.
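The arithmetic above can be checked with a few lines of Python. This is only a sketch: it assumes the spaces in the example input are visual separators and are not actually sent, and that the target is little-endian (as x86-64 is), so the integer GDB reports has to be turned back into bytes before searching for it.

```python
# The example input: four 8-byte blocks (spaces in the text are assumed
# to be visual separators only, so they are not part of the payload).
payload = b"A" * 8 + b"B" * 8 + b"C" * 8 + b"D" * 8

# The value that ended up in the return address, as reported at the crash.
crashed_value = 0x4343434343434343

# Convert the integer back into the 8 bytes that were in memory
# (little-endian byte order is assumed here).
needle = crashed_value.to_bytes(8, "little")

# The position of those bytes in the input is the number of padding bytes
# needed before the return address.
offset = payload.find(needle)
print(needle, offset)  # b'CCCCCCCC' 16
```

Repeated-letter blocks like these only give a clean answer when the return address happens to line up with a block boundary, which is why a cyclic sequence is used in practice.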

Practice

We can generate a cyclic sequence using a command line utility that comes with pwntools.

❯ cyclic -n8 500
aaaaaaaabaaaaaaacaaaaaaadaaaaaaaeaaaaaaafaaaaaaagaaaaaaahaaaaaaaiaaaaaaajaaaaaaakaaaaaaalaaaaaaamaaaaaaanaaaaaaaoaaaaaaapaaaaaaaqaaaaaaaraaaaaaasaaaaaaataaaaaaauaaaaaaavaaaaaaawaaaaaaaxaaaaaaayaaaaaaazaaaaaabbaaaaaabcaaaaaabdaaaaaabeaaaaaabfaaaaaabgaaaaaabhaaaaaabiaaaaaabjaaaaaabkaaaaaablaaaaaabmaaaaaabnaaaaaaboaaaaaabpaaaaaabqaaaaaabraaaaaabsaaaaaabtaaaaaabuaaaaaabvaaaaaabwaaaaaabxaaaaaabyaaaaaabzaaaaaacbaaaaaaccaaaaaacdaaaaaaceaaaaaacfaaaaaacgaaaaaachaaaaaaciaaaaaacjaaaaaackaaaaaaclaaaaaacmaaa

In the output above, we generate a 500-byte cyclic sequence in which every run of 8 consecutive bytes is unique.
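To make the uniqueness property concrete, here is a minimal pure-Python sketch of a De Bruijn generator over the lowercase alphabet. It uses the standard recursive Lyndon-word construction; it is not claimed to be the pwntools implementation, and the function name de_bruijn is chosen here only for illustration. The check at the end confirms that no 8-byte window repeats within the first 500 characters.

```python
from itertools import islice
import string


def de_bruijn(alphabet, n):
    """Lazily yield a De Bruijn sequence over `alphabet` with window size n."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            # Emit the current word only if its period divides n.
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)


# Take the first 500 characters, as in the command above.
pattern = "".join(islice(de_bruijn(string.ascii_lowercase, 8), 500))

# Collect every 8-character window and confirm none of them repeats.
windows = [pattern[i:i + 8] for i in range(len(pattern) - 8 + 1)]
print(pattern[:24])                      # aaaaaaaabaaaaaaacaaaaaaa
print(len(windows), len(set(windows)))   # 493 493
```

Because every window is unique, any 8 bytes recovered from a crash map back to exactly one position in the input.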

Let’s try to run our ‘PIEthagoras Theorem’ challenge in GDB and provide the generated sequence above as our input.

[Screenshot: running the program in GDB and providing the cyclic sequence as input]

GDB stops because the program crashes with SIGSEGV (segmentation fault) when it tries to return to an address that is not mapped in memory. Note that the instruction being executed is ret, which means the return address is the value currently at the top of the stack.
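Once the 8-byte value at the top of the stack is known, it can be mapped back to an offset in the input. The sketch below shows the idea using only the first 48 characters of the cyclic output shown earlier. The value 0x6161616161616166 is a made-up example, not the value this challenge actually produces, and little-endian byte order is assumed.

```python
# First 48 characters of the `cyclic -n8 500` output shown earlier.
pattern = b"aaaaaaaabaaaaaaacaaaaaaadaaaaaaaeaaaaaaafaaaaaaa"

# HYPOTHETICAL value read from the top of the stack at the crash
# (not the real result for this challenge).
top_of_stack = 0x6161616161616166

# Undo the little-endian integer view to recover the original input bytes.
needle = top_of_stack.to_bytes(8, "little")

# Every 8-byte window of the pattern is unique, so find() gives one answer.
offset = pattern.find(needle)
print(needle, offset)  # b'faaaaaaa' 40
```

pwntools also provides a lookup for this step (cyclic_find in Python, and the -l option of the cyclic command), which avoids doing the byte-order conversion by hand.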