The following is an archive of a post made to our 'vox-tech mailing list' by one of its subscribers.

Re: [vox-tech] late night musings: stripping

On Thursday, Feb 26, 2004, at 09:22 US/Pacific, Peter Jay Salzman wrote:
we finished right into glibc.  shouldn't GDB have known when myfunction
returned to main, even if there's no debugging information?
Hmm.. I'm not sure, but I have a suspicion. It has to do with how breakpoints are implemented. The debugger isn't simply using "stepi" to go instruction by instruction. It's modifying the code to return control to the debugger when certain things happen.

Since it has the address of the function, what it does is replace the first instruction(s) with something that returns control to the debugger.. probably a jump or a trap or something else short (on x86 the usual choice is the one-byte int3 trap, opcode 0xCC). It takes the code that was overwritten by that trap, and puts it in a separate place that's executed before control is returned to the program.
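
To make the patching idea concrete, here is a minimal sketch in C. It only simulates the trick on a byte array; a real debugger would do the same thing to the target's memory through something like ptrace. The struct and function names (breakpoint, bp_insert, bp_remove) are made up for illustration and are not gdb's internals.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INT3 0xCC  /* one-byte x86 breakpoint (trap) opcode */

/* Hypothetical bookkeeping for one breakpoint. */
struct breakpoint {
    size_t  offset;  /* where in the code buffer the trap was planted */
    uint8_t saved;   /* original byte that the trap overwrote */
};

/* Plant a trap at `offset`, remembering the byte it replaces. */
struct breakpoint bp_insert(uint8_t *code, size_t len, size_t offset)
{
    assert(offset < len);
    struct breakpoint bp = { offset, code[offset] };
    code[offset] = INT3;
    return bp;
}

/* Put the original byte back so the real instruction can run again. */
void bp_remove(uint8_t *code, const struct breakpoint *bp)
{
    code[bp->offset] = bp->saved;
}
```

Applied to the first bytes of a typical prologue (0x55 is push %ebp), bp_insert turns the 0x55 into 0xCC and bp_remove turns it back.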

But.... without detailed debugging information it's very hard to know when you're returning from a stack frame. Since the argument cleanup is done by the caller after the return, a function can "return" from just about anywhere.. and there may be more than one return in a function. The return that gets taken is not necessarily the first one linearly after the entry point, and replacing ALL the RETs after the current entry point might break other functions.

It COULD try to implement this by playing with the return address in the stack frame, having it "return" into the debugger, but I suspect that they just didn't bother. If you want that functionality, compile with debugging info turned on.
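
The "play with the return address" idea can be sketched on a simulated frame. This assumes the classic 32-bit layout produced by "push %ebp; mov %esp,%ebp", where the saved base pointer sits at 0(%ebp) and the return address at 4(%ebp). The names (frame, hijack, hijack_return) and any addresses you feed it are hypothetical; this is not a claim about how gdb does it.

```c
#include <stdint.h>

/* Simulated 32-bit stack frame, as seen from %ebp upward. */
struct frame {
    uint32_t saved_ebp;    /* at 0(%ebp): caller's base pointer */
    uint32_t return_addr;  /* at 4(%ebp): where ret will jump */
};

/* What the debugger must remember to resume the program later. */
struct hijack {
    uint32_t real_return;
};

/* Redirect the frame's return into a debugger-owned trampoline. */
struct hijack hijack_return(struct frame *f, uint32_t trampoline)
{
    struct hijack h = { f->return_addr };
    f->return_addr = trampoline;
    return h;
}
```

When the function executes ret, control lands on the trampoline, and the debugger can then send the program on to real_return.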

what would be useful would be something like GDB which can follow a
process and collect information about:

1. control flow (what functions call what).
2. get the parameters and return values of the function calls.

the only way i know how to get #1 is to sit there, using stepi (and
possibly nexti over uninteresting libc functions) with a pencil and
paper in hand.

as for #2, seems like the only way to do that would be to disassemble
the code.  i don't know a lick of x86 assembly, but i did notice that
%eax appears to be the register for returning integers.
gdb has a pretty extensive scripting language, and I imagine that you could do that by checking whether the next instruction is a "CALL", though you also need to trap "INT 0x80" because that's the system call interrupt.
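
As a rough sketch of what "look at whether the next instruction is a CALL" means at the byte level, the C below recognizes only two encodings: the direct near call (opcode 0xE8 followed by a 32-bit relative displacement) and int $0x80 (bytes 0xCD 0x80). Indirect calls (call *%eax and friends) are not handled, and the pointer is assumed to sit on an instruction boundary. The names insn_kind, classify and call_target are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

enum insn_kind { INSN_OTHER, INSN_CALL_REL32, INSN_INT80 };

/* Classify the instruction starting at p; avail is the number of bytes left. */
enum insn_kind classify(const uint8_t *p, size_t avail)
{
    if (avail >= 5 && p[0] == 0xE8)
        return INSN_CALL_REL32;            /* call rel32 */
    if (avail >= 2 && p[0] == 0xCD && p[1] == 0x80)
        return INSN_INT80;                 /* int $0x80 */
    return INSN_OTHER;
}

/* Target of a call rel32 located at address addr (32-bit arithmetic). */
uint32_t call_target(const uint8_t *p, uint32_t addr)
{
    uint32_t rel = (uint32_t)p[1]
                 | (uint32_t)p[2] << 8
                 | (uint32_t)p[3] << 16
                 | (uint32_t)p[4] << 24;   /* little-endian displacement */
    return addr + 5 + rel;                 /* relative to the next instruction */
}
```

For a five-byte call at 0x804838d whose bytes are E8 F6 FE FF FF, call_target gives 0x8048288.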

i rewrote myfunction to return an int of 1.
For those unfamiliar with the C call stack I'll add some comments

   0x8048380 <myfunction>: push   %ebp
Pushes the base pointer onto the stack, to save the stack frame of the calling function

   0x8048381 <myfunction+1>:       mov    %esp,%ebp
Moves the current stack pointer into the base pointer to make a new "frame"

   0x8048383 <myfunction+3>:       sub    $0x8,%esp
Make room on the stack for 8 bytes of automatic variables and arguments.

   0x8048386 <myfunction+6>:       movl   $0x80484b4,(%esp,1)
"Push" the address of the format string "Hello World" onto the stack (notice that it doesn't use push.. and so leaves the stack pointer untouched.. I don't know why that is.)

   0x804838d <myfunction+13>:      call   0x8048288 <printf>
call printf

   0x8048392 <myfunction+18>:      mov    $0x1,%eax
Move the return value into eax

   0x8048397 <myfunction+23>:      leave
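leave cleans up the stack frame in one instruction: it does the same thing as "mov %ebp,%esp" followed by "pop %ebp", undoing the first two instructions of the function. (It's not a post-386 extension; it has been around since the 80186.)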

   0x8048398 <myfunction+24>:      ret
And return.
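
For reference, the listing above is consistent with C source along these lines. This is a reconstruction from the disassembly and the "Hello World" format string, not the original file, so treat the exact text as an assumption.

```c
#include <stdio.h>

/* Hypothetical reconstruction of the function in the listing. */
int myfunction(void)
{
    printf("Hello World");  /* the movl + call pair */
    return 1;               /* mov $0x1,%eax */
}
```

The caller finds the 1 in %eax right after the call returns.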

It's pretty much the same with a floating point return value, except for the instruction that loads the return value.

here, myfunction returns a float of 1.0:
   0x8048394 <myfunction+18>:      fld1
   0x8048396 <myfunction+20>:      leave
   0x8048397 <myfunction+21>:      ret

i'll need to google for fld1, but its general idea is clear.
fld1 pushes the constant 1.0 onto the floating point stack. In olden days, floating point operations were handled by a separate chip (the 80{,1,2,3}87), and so you needed to load values onto a separate floating point "stack" in the coprocessor before you could perform operations.. I'm guessing that when a function returns double/float, it leaves the return value on the top of the floating point stack for efficiency, since functions that return a float are often used as part of a larger expression (e.g. x = 4.0*ln(y+sqrt(z)); )
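
A small C sketch of the float case, with the return value used directly inside a larger expression. The names myfunction_f and scaled are made up for the example. On 32-bit x86 the value comes back on top of the x87 stack as described; on x86-64 the usual convention returns it in %xmm0 instead, so don't expect to see fld1 there.

```c
/* Hypothetical float-returning variant of myfunction. */
float myfunction_f(void)
{
    return 1.0f;  /* on 32-bit x86 this can compile to fld1 */
}

/* The return value feeds straight into further arithmetic. */
float scaled(void)
{
    return 4.0f * myfunction_f() + 2.0f;
}
```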

anyway, it would be neat to have a program that automated all this
I'd be surprised if it hadn't already been done, though it might be "underground", since it's got a lot of quasi-ethical aspects to it.

-- Mitch
