March 9, 2017 | Author: Tuxology.net | Category: N/A
Crash and burn: writing Linux application fault handlers Complementing the standard Linux fault handler ("Segmen...
Crash N' Burn OR
Version 1.1
When bad things happen to good programs...
Gilad Ben-Yossef Chief Coffee Drinker Codefidence Ltd.
http://codefidence.com
What is this tutorial about?
Segmentation fault: core dumped
Dealing with faults
What's wrong with core dumps?
● Instant gratification
● No space left on device for 753Mb core dump
● No source, no (network) access but working code needed for paycheck
● Access to external state (e.g. FPGA)
● Easier access to internal state machine
● Custom fault behavior
● Haiku error messages
Haiku error messages?
    First smoke, then silence.
    This thousand dollar router
    dies so beautifully.
    Segmentation fault: core dumped
The Plan
● We shall:
  – Trap signals sent by the kernel in response to faults (SIGSEGV and friends)
  – Print back trace and custom state information (Haiku form optional)
  – ???
  – Profit!
● Easy to do
● Difficult to do right
Signals
● Signals are asynchronous notifications sent to a process by the kernel, another process or itself
● A process can register a signal handler function to respond to a signal
● Process faults make the kernel generate a signal
● ... which the process can catch and respond to
Signals Worth Catching
• SIGQUIT - Quit from keyboard
• SIGILL - Illegal Instruction
• SIGABRT - Abort signal from abort(3)
• SIGFPE - Floating point exception
• SIGSEGV - Invalid memory reference
• SIGBUS - Bus error (bad memory access)
Catching Signals
● int sigaction(int signum,
                const struct sigaction *act,
                struct sigaction *oldact);
  Register a signal handler.
  – signum: signal number.
  – act: pointer to new struct sigaction.
  – oldact: pointer to buffer to be filled with current sigaction (or NULL, if not interested).
Catching Signals cont.
● The sigaction structure is defined as:
    struct sigaction {
        void (*sa_handler)(int);
        void (*sa_sigaction)(int, siginfo_t *, void *);
        sigset_t sa_mask;
        int sa_flags;
        ...
    };
● Where:
  – sa_handler and sa_sigaction are two forms of signal handler callback. We'll use the SA_SIGINFO flag to choose the sa_sigaction form.
  – sa_mask holds the mask of signals which will be blocked while the callback runs. We'll set all bits.
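Registering Handler Example
    struct sigaction act;
    memset(&act, 0, sizeof (act));
    act.sa_sigaction = my_handler;   /* three-argument handler form */
    sigfillset(&act.sa_mask);        /* block all signals while the handler runs */
    act.sa_flags = SA_SIGINFO;       /* select sa_sigaction over sa_handler */
    return sigaction(SIGSEGV, &act, NULL);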
Signal Handler
● Signal handler prototype:
    void handler(int signal, siginfo_t *siginfo, void *context)
● Where:
  – signal is the signal number
  – siginfo is a pointer to a siginfo_t structure
  – context is a pointer to an architecture specific structure holding the context of the interrupted program.
Signal info
● siginfo_t holds information about the signal delivered.
● Interesting fields for exceptions include:
  – si_errno: errno value
    ● Not always filled on all platforms/versions
  – si_code: Error description code
    ● It's an index to a list of specific error descriptions. See sigaction(2).
  – si_addr: Fault address
    ● For SIGILL, SIGFPE, SIGSEGV, and SIGBUS only.
Signal Context
● A structure that saves the hardware context which the signal interrupted
  – Architecture specific
  – Undocumented
  – Changes between releases
  – e.g. getting IP in various architectures:
    ● x86: context->uc_mcontext.gregs[REG_EIP]
    ● PPC: context->uc_mcontext.regs->nip
    ● Check out sys/ucontext.h for your favorite architecture
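
A hedged sketch of wrapping the per-architecture lookup in one helper. The name fault_ip is hypothetical. The x86 branch is the one from the slide; the x86-64 (REG_RIP) and AArch64 (pc) branches are additions based on glibc's sys/ucontext.h, and the PPC variant from the slide is left out because it is not verified here. _GNU_SOURCE is needed for the REG_* names on glibc.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <ucontext.h>

/* Return the instruction pointer stored in a signal context, or 0 on
 * an architecture this sketch does not know about. `context` is the
 * third argument passed to an SA_SIGINFO handler. */
static uintptr_t fault_ip(void *context)
{
    ucontext_t *uc = (ucontext_t *)context;

#if defined(__i386__)
    return (uintptr_t)uc->uc_mcontext.gregs[REG_EIP];
#elif defined(__x86_64__)
    return (uintptr_t)uc->uc_mcontext.gregs[REG_RIP];
#elif defined(__aarch64__)
    return (uintptr_t)uc->uc_mcontext.pc;
#else
    (void)uc;
    return 0;   /* add a branch after checking sys/ucontext.h */
#endif
}
```

Keeping the #if ladder in one function means the rest of the handler stays architecture neutral, which helps given that the layout changes between releases.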
Getting a Backtrace
● glibc back trace support:
    #include <execinfo.h>
    int backtrace(void **buffer, int size);
      Fills the buffer with call stack addresses.
    char **backtrace_symbols(void *const *buffer, int size);
      Returns a malloc-ed array of strings of function names. Returned buffer needs to be free()-ed.
    void backtrace_symbols_fd(void *const *buffer, int size, int fd);
      Prints function names to file descriptor fd.
● Symbols are taken from the dynamic symbol table; use -rdynamic to populate it.
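
A minimal sketch of the two calls that do not allocate, used together. The helper name dump_backtrace and the frame limit MAX_FRAMES are illustrative choices, not taken from the talk.

```c
#include <execinfo.h>

#define MAX_FRAMES 64   /* arbitrary cap for this sketch */

/* Write the current call stack to fd, one frame per line.
 * Returns the number of frames captured. */
static int dump_backtrace(int fd)
{
    void *frames[MAX_FRAMES];
    int count = backtrace(frames, MAX_FRAMES);

    /* Unlike backtrace_symbols(), this variant does not return
     * a malloc-ed array; it writes straight to the descriptor. */
    backtrace_symbols_fd(frames, count, fd);
    return count;
}
```

For example, dump_backtrace(2) prints to standard error. Function names only show up when the binary is linked with -rdynamic; otherwise the output is mostly raw addresses.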
Naïve Example
● WARNING! The code you are about to see is wrong
● It is also very common...
What's Wrong?
● Async-signal non safe functions
● Heap usage after malloc arena corruption
● Not thread safe
● Signal handler induced stack munging is hiding real fault location
  – On some architectures at least.
Async-signal Safety
● Signal handlers run asynchronously
● Can't share locks between signal handler and main program
  – If the lock is taken and the signal handler is called we have a deadlock.
● Can only use the list of async-signal-safe functions defined in POSIX.1-2003
  – See signal-safety(7) (signal(7) in older man-pages releases) for the list.
● fprintf, malloc, backtrace_symbols, fflush are not on the list
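
Since fprintf is off the table, output from the handler has to be built by hand and sent with write(2), which is on the list. A sketch of what that can look like follows; format_hex and safe_puts are hypothetical helper names.

```c
#include <stddef.h>
#include <unistd.h>

/* Format value as "0x..." into buf (NUL-terminated). Returns the string
 * length, or 0 if buf is too small. Touches no locks and no heap. */
static size_t format_hex(unsigned long value, char *buf, size_t len)
{
    static const char digits[] = "0123456789abcdef";
    char tmp[2 * sizeof value];   /* two hex digits per byte */
    size_t n = 0, pos = 0;

    do {                          /* least significant digit first */
        tmp[n++] = digits[value & 0xf];
        value >>= 4;
    } while (value != 0);

    if (len < n + 3)              /* "0x" + digits + NUL */
        return 0;
    buf[pos++] = '0';
    buf[pos++] = 'x';
    while (n > 0)                 /* reverse into the output buffer */
        buf[pos++] = tmp[--n];
    buf[pos] = '\0';
    return pos;
}

/* Write a NUL-terminated string using only write(2). */
static void safe_puts(int fd, const char *s)
{
    size_t left = 0;

    while (s[left] != '\0')
        left++;
    while (left > 0) {
        ssize_t done = write(fd, s, left);
        if (done <= 0)
            return;               /* give up quietly inside a handler */
        s += done;
        left -= (size_t)done;
    }
}
```

A handler could then print a fault address with format_hex((unsigned long)siginfo->si_addr, buf, sizeof buf) followed by safe_puts(STDERR_FILENO, buf), using a buffer on its own stack.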
Heap Usage
● The fault may have occurred due to malloc arena corruption
● Trying to malloc() / free() memory may lead to a double fault.
● So don't ...
  – Do not malloc / free anything
  – Do not call functions that do
● free, backtrace_symbols obviously not good
Detecting Heap Usage
● Poison __malloc_hook and friends:
    void *kill_malloc(size_t size, const void *caller)
    {
        printf("Malloc called from %p\n", caller);
        abort();
    }
    __malloc_hook = kill_malloc;
● Poison the heap:
    char *p = sbrk(0);
    memset(p - 1024, 42, 1024);
Dynamic linker heap usage
● backtrace and friends are dynamically loaded from libgcc.so
● The dynamic linker calls malloc to load the new library
● So...
  – make a dummy call to backtrace when installing the handler, to force the linker to load libgcc with a sane heap.
  – Or statically link libgcc in.
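
The dummy call can be as small as the sketch below. The function name preload_backtrace is hypothetical; the idea is simply to call it once at startup, next to the sigaction() calls, while the heap is still healthy.

```c
#include <execinfo.h>

/* Call once, early. Any lazy library loading that backtrace() triggers
 * (and the malloc calls that go with it) happens here, not later inside
 * the signal handler. Returns the number of frames captured. */
static int preload_backtrace(void)
{
    void *frame[1];

    return backtrace(frame, 1);
}
```

The return value is not interesting in itself; it is returned only so the call is not optimized away as dead code.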
Thread Safety
● Multiple threads can fault together
  – Will garble our output
● Use a spin lock in the signal handler to block concurrent faulting threads
● Can't spin on the lock if the contending thread is of higher RT priority
● Use pthread_spin_trylock() and sleep with pselect() if it failed.
Handler Stack Munging
[Diagram: the original user mode stack (frames foo(...), bar(...)) next to the munged user mode stack after the handler is called, where the kernel's signal handling code and the trampoline in the vsyscall page sit between the interrupted frames and signal_handler(); the stack is restored when the handler returns.]
Putting It All Together
● Fork a “watchdog” process sleeping on a pipe to handle faults
  – System wide daemon also possible
● Collect information in the signal handler and send it over the pipe to the watchdog process for analysis, printing etc.
● Finalize by sending the backtrace down the pipe with backtrace_symbols_fd()
● Use EIP from the signal context to overcome stack munging
Questions?
Slides & code at: http://tuxology.net
© 2008 Codefidence Ltd. Released under a CC-by-sa 2.5 License.