Unrestricting Android Native Dynamic Library Linking
Bypassing linker namespaces to dynamically link libraries.
Introduction
Hacking around Android sometimes requires getting your hands dirty at the native level. While I was on one such escapade, I discovered that Android is quite restrictive about what you can do or interact with on the system. Rightfully so, perhaps: the developer website documents that, starting with Android 7.0, apps are restricted from dynamically linking against non-NDK libraries for stability reasons. The exact level of restriction on certain libraries varies between API levels, as they show in the following diagram.
Well, this is troublesome for the things I want to do, so I had to investigate further. There weren't many helpful references online for bypassing these restrictions; only this relatively old (2019) Quarkslab post on Android Runtime Restrictions Bypass gave some idea of how it would be possible. That approach seems quite involved, but there is a simpler workaround that doesn't tamper with any internal data structures - or so I think. This is a short post on how it's possible to bypass these restrictions and definitely not stealing from Frida's codebase.
Android dlopen
Android's dlopen is quite different from the base Linux implementation in that it gets the namespace of the calling function's module and checks it against the namespaces of the target library to open. To do this, the dlopen function captures its own return address (on x86-64, the value rsp points at on entry) and passes it as an additional argument to void* __loader_dlopen(const char* filename, int flags, const void* caller_addr).
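To make the caller-address mechanism concrete, here is a minimal sketch of the pattern rather than bionic's actual source. The names fake_loader_dlopen, my_dlopen, and g_last_caller are hypothetical stand-ins; the only real facility used is the GCC/Clang builtin __builtin_return_address(0).

```cpp
// Hypothetical stand-in for the loader entry point: it only records the
// caller address it was given so the flow can be observed.
static const void* g_last_caller = nullptr;

void* fake_loader_dlopen(const char* /*filename*/, int /*flags*/,
                         const void* caller_addr) {
  g_last_caller = caller_addr;
  return nullptr;  // A real loader would return a library handle here.
}

// Hypothetical public wrapper. It is marked noinline so that the return
// address really belongs to whoever called my_dlopen.
__attribute__((noinline)) void* my_dlopen(const char* filename, int flags) {
  // Address of the instruction following the call to my_dlopen, which lies
  // inside the calling module.
  const void* caller_addr = __builtin_return_address(0);
  return fake_loader_dlopen(filename, flags, caller_addr);
}
```

The point to notice is that the wrapper decides which address identifies the caller. Anyone able to call the inner function directly gets to choose that address themselves.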
In do_dlopen, it will use the caller address to get the calling module's namespace.
There is a list of soinfo data structures that contain information about each loaded module, some of which are the module's base address, its size in memory, and its primary namespace.
struct soinfo {
#if defined(__work_around_b_24465209__)
 private:
  char old_name_[SOINFO_NAME_LEN];
#endif
 public:
  const ElfW(Phdr)* phdr;
  size_t phnum;
#if defined(__work_around_b_24465209__)
  ElfW(Addr) unused0; // DO NOT USE, maintained for compatibility.
#endif
  ElfW(Addr) base;
  size_t size;
#if defined(__work_around_b_24465209__)
  uint32_t unused1; // DO NOT USE, maintained for compatibility.
#endif
  ElfW(Dyn)* dynamic;
#if defined(__work_around_b_24465209__)
  uint32_t unused2; // DO NOT USE, maintained for compatibility
  uint32_t unused3; // DO NOT USE, maintained for compatibility
#endif
  soinfo* next;
  ...
  // version >= 3
  std::vector<std::string> dt_runpath_;
  android_namespace_t* primary_namespace_;
  android_namespace_list_t secondary_namespaces_;
  ...
The find_containing_library function will enumerate the list of soinfo to find the module's soinfo by checking if the caller address lies within the module's address range. The head of this list is a global variable called solist which can be obtained using the solist_get_head function.
soinfo* find_containing_library(const void* p) {
  // Addresses within a library may be tagged if they point to globals. Untag
  // them so that the bounds check succeeds.
  ElfW(Addr) address = reinterpret_cast<ElfW(Addr)>(untag_address(p));
  for (soinfo* si = solist_get_head(); si != nullptr; si = si->next) {
    if (address < si->base || address - si->base >= si->size) {
      continue;
    }
    ElfW(Addr) vaddr = address - si->load_bias;
    for (size_t i = 0; i != si->phnum; ++i) {
      const ElfW(Phdr)* phdr = &si->phdr[i];
      if (phdr->p_type != PT_LOAD) {
        continue;
      }
      if (vaddr >= phdr->p_vaddr && vaddr < phdr->p_vaddr + phdr->p_memsz) {
        return si;
      }
    }
  }
  return nullptr;
}
Using the soinfo, it will get its primary namespace.
static android_namespace_t* get_caller_namespace(soinfo* caller) {
  return caller != nullptr ? caller->get_primary_namespace() : g_anonymous_namespace;
}
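The two lookups above can be modelled off-device with a simplified sketch. MockSoinfo, find_containing, and caller_namespace are hypothetical names, the struct is not the real soinfo layout, and the per-segment PT_LOAD check and address untagging are deliberately left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Simplified, hypothetical model of a loaded module (NOT the real soinfo).
struct MockSoinfo {
  std::uintptr_t base;            // Load address of the module.
  std::size_t size;               // Size of the mapping in bytes.
  std::string primary_namespace;  // Namespace name, for illustration only.
  MockSoinfo* next;               // Next module in the list.
};

// Walk the list and return the module whose range contains `address`.
const MockSoinfo* find_containing(const MockSoinfo* head, std::uintptr_t address) {
  for (const MockSoinfo* si = head; si != nullptr; si = si->next) {
    // Same bounds check as the linker: subtracting first avoids an overflow
    // that `base + size` could cause.
    if (address < si->base || address - si->base >= si->size) {
      continue;
    }
    return si;
  }
  return nullptr;
}

// Fall back to an anonymous namespace when no module owns the address.
std::string caller_namespace(const MockSoinfo* head, std::uintptr_t caller_addr) {
  const MockSoinfo* si = find_containing(head, caller_addr);
  return si != nullptr ? si->primary_namespace : "(anonymous)";
}
```

In this model, a caller address of 0x2010 against a module mapped at 0x2000 with size 0x1000 resolves to that module's namespace, regardless of where the call really came from.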
And somewhere down the call hierarchy, if the calling module's namespace is incompatible with the library's, then the loader will reject the request.
Bypassing Restrictions
A namespace is obviously compatible with itself - I hope! So to load a library as if the request came from its own namespace, the solution is blindingly trivial: pass an address that lies inside an already-loaded library from that namespace as the caller_addr argument of __loader_dlopen. An unrestricted dlopen has to call __loader_dlopen directly, which means its address has to be found by parsing the symbols of linker64. Procfs can be used to get the initial information, such as where each module is mapped.
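As a rough idea of the procfs step, the sketch below scans the text of /proc/self/maps for a module and returns its lowest mapped address. The helper names find_module_base and find_module_base_in_self are hypothetical, and treating the lowest mapping as the load base is an assumption that holds for typical layouts but is not guaranteed.

```cpp
#include <cstdint>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Returns the lowest start address of any mapping whose path ends with
// `suffix`, or 0 if nothing matches. Takes a stream so it can be fed either
// the real maps file or canned text.
std::uint64_t find_module_base(std::istream& maps, const std::string& suffix) {
  std::uint64_t best = 0;
  std::string line;
  while (std::getline(maps, line)) {
    // Line format: start-end perms offset dev inode [path]
    std::istringstream fields(line);
    std::string range, perms, offset, dev, inode, path;
    if (!(fields >> range >> perms >> offset >> dev >> inode)) {
      continue;  // Skip malformed or empty lines.
    }
    std::getline(fields >> std::ws, path);  // Path may be absent.
    if (suffix.empty() || path.size() < suffix.size() ||
        path.compare(path.size() - suffix.size(), suffix.size(), suffix) != 0) {
      continue;
    }
    const std::uint64_t start =
        std::stoull(range.substr(0, range.find('-')), nullptr, 16);
    if (best == 0 || start < best) {
      best = start;
    }
  }
  return best;
}

// Convenience wrapper for the current process.
std::uint64_t find_module_base_in_self(const std::string& suffix) {
  std::ifstream maps("/proc/self/maps");
  return find_module_base(maps, suffix);
}
```

Calling find_module_base_in_self with a suffix such as "/linker64" or "/libart.so" would give candidate values for the two base addresses that the code further down needs.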
Here is code that parses the symtab symbols to find __loader_dlopen and then uses it as an unrestricted dlopen.
uint64_t linker64_base = ...;
// `file`, `ehdr`, and `shdr` (not shown) are assumed to point at the linker64
// image read from disk, its ELF header, and its section header table.
const ElfW(Sym)* ssym = nullptr;
size_t ssym_size = 0;
const char* sstr_table = nullptr;
void* (*unrestricted_dlopen)(const char*, int, const void*) = nullptr;

// Get the string table and symbol headers.
for (ElfW(Half) i = 0; i < ehdr->e_shnum; i++) {
    if (shdr[i].sh_type == SHT_SYMTAB) {
        ssym = reinterpret_cast<const ElfW(Sym)*>(file + shdr[i].sh_offset);
        // Number of entries in the symbol table.
        ssym_size = shdr[i].sh_size / shdr[i].sh_entsize;
        // sh_link of a symbol table section is the index of its string table.
        sstr_table = reinterpret_cast<const char*>(file + shdr[shdr[i].sh_link].sh_offset);
    }
}

// Enumerate the string table and symbols.
for (size_t i = 0; i < ssym_size; i++) {
    if (ssym[i].st_name) {
        const char* sym_name = &sstr_table[ssym[i].st_name];
        // find() returns npos when there is no match, so compare explicitly.
        if (std::string(sym_name).find("__dl___loader_dlopen") != std::string::npos) {
            // Calculate the memory address of __loader_dlopen.
            unrestricted_dlopen = reinterpret_cast<void* (*)(const char*, int, const void*)>(
                linker64_base + ssym[i].st_value);
        }
    }
}

// Use unrestricted dlopen.
void* libart_base = ...;
void* libart_handle = unrestricted_dlopen("libart.so", RTLD_LAZY, libart_base);
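The symbol search can also be pulled out into a small helper that matches the name exactly instead of by substring, which avoids picking up a longer symbol that merely contains the wanted name. This is only a sketch: find_symbol_value and MiniSym are hypothetical, and the function is templated on the symbol type so that it can be tried off-device. On Android it would be instantiated with ElfW(Sym).

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in carrying only the two ELF symbol fields the lookup reads.
struct MiniSym {
  std::uint32_t st_name;   // Offset of the symbol name in the string table.
  std::uint64_t st_value;  // Symbol address relative to the module's load bias.
};

// Returns st_value of the first symbol named exactly `wanted`, or 0 if absent.
template <typename Sym>
std::uint64_t find_symbol_value(const Sym* syms, std::size_t count,
                                const char* strtab, std::size_t strtab_size,
                                const char* wanted) {
  for (std::size_t i = 0; i < count; ++i) {
    // Skip unnamed symbols and name offsets that fall outside the table.
    if (syms[i].st_name == 0 || syms[i].st_name >= strtab_size) {
      continue;
    }
    if (std::strcmp(strtab + syms[i].st_name, wanted) == 0) {
      return syms[i].st_value;
    }
  }
  return 0;
}
```

Adding the returned value to the linker64 base address gives the callable address, in the same way as the loop above.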
The standard dlsym function works in the same way, but once I had the handle to the library, it didn't seem to have issues with namespaces... yet.
Conclusion
It might seem roundabout and unnecessary to go after an unrestricted dlopen once you already have code to parse symbols for an arbitrary library. I suppose the other solution is to modify the namespace of your own module by tampering with the solist. But it works!
I also hacked together an LLDB - why have you done this to me, Google - Python script that dumps the soinfo list and each primary namespace if you give it any (ideally solist) starting address.
#!/usr/bin/env python3
import argparse
import lldb
import re
import shlex
from pathlib import Path
SIZEOF_POINTER = 8
# [name, size, pad_size]
SOINFO_DEF = [
["phdr", 8, 0],
["phnum", 8, 0],
["base", 8, 0],
["size", 8, 0],
["dyn", 8, 0],
["next", 8, 0],
["flags", 4, 4],
["strtab", 8, 0],
["symtab", 8, 0],
["nbucket", 8, 0],
["nchain", 8, 0],
["bucket", 8, 0],
["chain", 8, 0],
["plt_relx", 8, 0],
["plt_relx_count", 8, 0],
["relx", 8, 0],
["relx_count", 8, 0],
["preinit_array", 8, 0],
["preinit_array_count", 8, 0],
["init_array", 8, 0],
["init_array_count", 8, 0],
["fini_array", 8, 0],
["fini_array_count", 8, 0],
["init_func", 8, 0],
["fini_func", 8, 0],
["ref_count", 8, 0],
["link_map.l_addr", 8, 0],
["link_map.l_name", 8, 0],
["link_map.l_ld", 8, 0],
["link_map.l_next", 8, 0],
["link_map.l_prev", 8, 0],
["constructors_called", 1, 7],
["load_bias", 8, 0],
["has_dt_symbolic", 1, 3],
["version", 4, 0],
["st_dev", 8, 0],
["st_ino", 8, 0],
["children", 8, 0],
["parents", 8, 0],
["file_offset", 8, 0],
["rtld_flags", 4, 0],
["dt_flags_1", 4, 0],
["strtab_size", 8, 0],
["gnu_nbucket", 8, 0],
["gnu_bucket", 8, 0],
["gnu_chain", 8, 0],
["gnu_maskwords", 4, 0],
["gnu_shift2", 4, 0],
["gnu_bloom_filter", 8, 0],
["local_group_root", 8, 0],
["android_relocs", 8, 0],
["android_relocs_size", 8, 0],
["soname", 8*3, 0],
["realpath", 8*3, 0],
["versym", 8, 0],
["verdef_ptr", 8, 0],
["verdef_cnt", 8, 0],
["verneed_ptr", 8, 0],
["verneed_cnt", 8, 0],
["target_sdk_version", 4, 4],
["dt_runpath", 8*3, 0],
["primary_namespace", 8, 0],
["secondary_namespace.head", 8, 0],
["secondary_namespace.tail", 8, 0],
["handle", 8, 0]
]
# Total size is the sum of every member's size plus its trailing padding.
SIZEOF_SOINFO = sum(size + pad for _, size, pad in SOINFO_DEF)
ANDROID_NAMESPACE_DEF = [
["name", 8*3, 0],
# C++ bools are one byte each; the last one is padded to the next 8-byte boundary.
["is_isolated", 1, 0],
["is_exempt_list_enabled", 1, 0],
["is_also_used_as_anonymous", 1, 5],
["ld_library_paths", 8*3, 0],
["default_library_paths", 8*3, 0],
["permitted_paths", 8*3, 0],
["allowed_libs", 8*3, 0],
["linked_namespaces", 8*3, 0],
["soinfo_list.head", 8, 0],
["soinfo_list.tail", 8, 0]
]
SIZEOF_ANDROID_NAMESPACE = sum(size + pad for _, size, pad in ANDROID_NAMESPACE_DEF)
def resolve_module_name(debugger, addr: int):
    # Ask LLDB which module file contains the given address.
    result = lldb.SBCommandReturnObject()
    debugger.GetCommandInterpreter().HandleCommand(f"im loo -va {addr}", result)
    m = re.search(r'file = "(.*)?",', result.GetOutput())
    return m.group(1) if m is not None else None
def parse_std_string(process, data: bytes):
    # First try to treat the 24 bytes as an inline (short) string.
    try:
        return data.decode('utf-8').split('\0')[0].strip()
    except UnicodeDecodeError:
        pass
    # Otherwise treat it as a heap string: a length followed by a data pointer.
    error = lldb.SBError()
    length = int.from_bytes(data[0:8], 'little')
    addr = int.from_bytes(data[0x10:0x18], 'little')
    s_bytes = process.ReadMemory(addr, length, error)
    if not error.Success():
        print(f"Error reading memory: {error}")
        return ""
    return s_bytes.decode('utf-8').split('\0')[0].strip()
def parse_android_namespace(process, addr: int, verbose: bool):
    error = lldb.SBError()
    data = process.ReadMemory(addr, SIZEOF_ANDROID_NAMESPACE, error)
    if not error.Success():
        return
    # start at 1st index (the name at index 0 is printed by the caller)
    android_namespace_index = ANDROID_NAMESPACE_DEF[0][1] + ANDROID_NAMESPACE_DEF[0][2]
    for i in range(1, len(ANDROID_NAMESPACE_DEF)):
        member_size = ANDROID_NAMESPACE_DEF[i][1]
        pad_size = ANDROID_NAMESPACE_DEF[i][2]
        value = int.from_bytes(data[android_namespace_index:android_namespace_index+member_size], 'little')
        if verbose:
            print(f"\t[{int(android_namespace_index / SIZEOF_POINTER)}] {ANDROID_NAMESPACE_DEF[i][0]}: {hex(value)}")
        android_namespace_index += member_size + pad_size
def parse_soinfo(debugger, data: bytes, verbose: bool):
    process = debugger.GetSelectedTarget().process
    soinfo_index = 0
    soname = parse_std_string(process, data[49*8:49*8+8*3])
    mod_base = int.from_bytes(data[2*8:2*8+8], 'little')
    mod_size = int.from_bytes(data[3*8:3*8+8], 'little')
    print(f"Module: {soname} [{hex(mod_base)}-{hex(mod_base + mod_size)}]")
    for i in range(len(SOINFO_DEF)):
        member_size = SOINFO_DEF[i][1]
        pad_size = SOINFO_DEF[i][2]
        value = int.from_bytes(data[soinfo_index:soinfo_index+member_size], 'little')
        if SOINFO_DEF[i][0] == "primary_namespace": # parse primary namespace
            error = lldb.SBError()
            # Read into a separate buffer so the soinfo bytes are not overwritten mid-loop.
            ns_name_bytes = process.ReadMemory(value, 8*3, error)
            if error.Success():
                name = parse_std_string(process, ns_name_bytes)
                print(f"[{int(soinfo_index / SIZEOF_POINTER)}] {SOINFO_DEF[i][0]}: {name} [{hex(value)}]")
            else:
                print(f"[{int(soinfo_index / SIZEOF_POINTER)}] {SOINFO_DEF[i][0]}: [{hex(value)}]")
            parse_android_namespace(process, value, verbose)
        elif verbose:
            print(f"[{int(soinfo_index / SIZEOF_POINTER)}] {SOINFO_DEF[i][0]}: {hex(value)}")
        soinfo_index += member_size + pad_size
    if verbose:
        print()
def enum_solist(debugger, command, result, dict):
    comm_args = shlex.split(command)
    desc = """Enumerate and print information of the solist."""
    parser = argparse.ArgumentParser(
        description=desc,
        prog='enum_solist'
    )
    parser.add_argument(
        'address',
        help='address of an solist'
    )
    parser.add_argument(
        '-v',
        "--verbose",
        action='store_true',
        help='verbose output of soinfo structure'
    )
    try:
        args = parser.parse_args(comm_args)
    except SystemExit:
        # argparse prints its own message and raises SystemExit on bad input or --help.
        return
    start_addr = int(args.address, 0)
    target = debugger.GetSelectedTarget()
    if not target:
        print("Error: invalid target", file=result)
        return
    process = target.process
    if not process:
        print("Error: invalid process", file=result)
        return
    error = lldb.SBError()
    curr_soinfo_addr = start_addr
    # Follow the `next` pointers until the end of the list.
    while curr_soinfo_addr != 0:
        data = process.ReadMemory(curr_soinfo_addr, SIZEOF_SOINFO, error)
        if not error.Success():
            print(f"Error: {error.GetCString()}", file=result)
            return
        parse_soinfo(debugger, data, args.verbose)
        curr_soinfo_addr = int.from_bytes(data[5*8:5*8+8], 'little')
lldb.debugger.HandleCommand(
"command script add -f enum_solist.enum_solist enum_solist")
print("A new command called 'enum_solist' was added, type 'enum_solist --help' for more information.")