TransWikia.com

How to know when a subroutine starts when reversing an ARM64 file?

Reverse Engineering Asked by user11144725 on August 12, 2021

I’m kinda new in the ARM world, and a question suddenly came up in my mind: how does (your favorite reverse engineering tool) know where a subroutine of an ARM64 file is starting (because I’d like to write a tool to know how many functions the file has)?

thanks in advance.

One Answer

I don't know how my tools are doing it under the hood, but if I had to do it myself, I would search for some function's prologues related to my architecture of choice.

This is a good 'signature' to identify a function, as this is something standard (even if you may find some cases where a function doesn't have any prologue/epilogue).

As you may know, the prologue of a function is designed to save the previous stack frame, and setup a fresh new one that is able to handle the function's local variables, without messing with what was stored one the stack by the previous function. It is also used to store the address where to return to at the end of the function (basically saving the old instruction pointer + x, and overwriting it at the end of your function).

In the opposite, the end of a function contains an epilogue, witch should restore the previous stack frame, and return to the previous routine.

It all depend on which compilator was used to generate your target binary, but a typical ARM64 function's epilogue looks like this:

sub    sp, sp, #32          ; make rooms on the stack for the new stack frame
stp    x29, x30, [sp, #16]  ; save what is needed to restore the previous stack frame and where to return (respectively x29 the previous SP, and x30 the return address)
add    x29, sp, #16         ; new frame pointer

If you have a binary blob of code that contains some functions, try to look for the opcodes of theses instructions. It should indicate where functions starts.

Now for the implementation itself: take advantage of your tools and what's already existing. You added the 'IDA' tag, so I guess that's what you are taking about when referring to your 'favorite tool'.

When IDA is not able to find any functions (for instance if I try to disassemble a big blob of ARM data that have no entrypoint and some spaced functions inside), I use this small IDAPython snippet that is able to search for a given binary pattern, disassemble everything that start by this, and add it to the functions and disassembly list. The IDA auto-analysis will be trigger by this, and it should find other functions that are called inside the first one, and so on.

from idaapi import *
from ida_search import *
from ida_funcs import *

cnt = 0
my_pattern = '' # The hex value of the opcodes you are looking for

def is_function(start_addr):
   content = get_bytes(start_addr, 4, False).hex()
   if content == my_pattern:
      return True
   return False

addr = find_unknown(0, 1)
while addr != BADADDR
   is_valid = is_function(addr)
   if is_valid:
      add_func(addr)
      cnt += 1
   addr = find_unknown(addr, 1)

print('A total of ' + str(cnt) + ' new functions where defined')

This one is very aggressive, and may find 'false positives' if a blob of junk data contains something that looks like your binary pattern.

If it find nothing, it means that your pattern is not a good one. In this case, disassemble at random locations in IDA until you find a valid function. When you have one, check how this function prologue looks like. It may be slightly different, and your signature was not matching it. Update the script with this new signatures, and you should be good.

Correct answer by Guillaume on August 12, 2021

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP