LUGOD: Linux Users' Group of Davis
 
Page last updated:
2004 Oct 23 15:48

The following is an archive of a post made to our 'vox-tech mailing list' by one of its subscribers.

[vox-tech] C question: Determining where a signal was raised
My apologies to clc readers; I'm aware this really isn't a C question.  ;-)



I have some code that traps floating point errors.  The signal is trapped
with this code:


      struct sigaction action;

      memset(&action, 0, sizeof(action));
      action.sa_sigaction = fpe_callback;    /* which callback function  */
      sigemptyset(&action.sa_mask);          /* other signals to block   */
      action.sa_flags = SA_SIGINFO;          /* give details to callback */

      if (sigaction(SIGFPE, &action, 0))
         die("Failed to register signal handler.");


and, when the signal is raised, the callback function is:


   void fpe_callback(int sig_number, siginfo_t *info, void *data)
   {
      (void) data;      /* unused here; points to a ucontext_t for the interrupted code */

      if (sig_number != SIGFPE) {
         fprintf(stderr, "%s:%d %s error: "
            "received wrong signal number %d not %d\n",
            __FILE__, __LINE__, __FUNCTION__, sig_number, SIGFPE);
         exit(2);
      }

      fprintf(stderr, "%s:%d %s warn: ", __FILE__, __LINE__, __FUNCTION__);
      fpe_print_cause(stderr, info);

      exit(1);
   }


The function fpe_print_cause() does nothing more than print the cause of the
floating point error:


   void fpe_print_cause(FILE *file, siginfo_t *info)
   {
      if (info->si_signo != SIGFPE) {      // should never happen
         die("Somehow got a wrong signo = %d\n", info->si_signo);
      } else {
         fprintf(file,
            "FPE reason %d = \"%s\", from address %p\n",   /* %p: no pointer truncation */
            info->si_code,
            info->si_code == FPE_INTDIV ? "integer divide by zero"     :
            info->si_code == FPE_INTOVF ? "integer overflow"           :
            info->si_code == FPE_FLTDIV ? "FP divide by zero"          :
            info->si_code == FPE_FLTOVF ? "FP overflow"                :
            info->si_code == FPE_FLTUND ? "FP underflow"               :
            info->si_code == FPE_FLTRES ? "FP inexact result"          :
            info->si_code == FPE_FLTINV ? "FP invalid operation"       :
            info->si_code == FPE_FLTSUB ? "subscript out of range"     :
            "unknown",
            info->si_addr
         );
      }
   }



The *intent* of fpe_callback() is to print the function and line number that
was executing when the FPE signal was raised.  However, the function and line
number that get printed are those of fpe_callback() itself.  Useless information.
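
The handler's own location shows up because __FILE__, __LINE__ and
__FUNCTION__ are expanded by the compiler at the spot where they are written,
not where execution was interrupted.  A minimal sketch of that behaviour (the
names line_seen_by_helper and CALL_SITE_LINE are made up for illustration):

```c
/* Always reports the line of its own return statement, whoever calls it. */
static int line_seen_by_helper(void)
{
   return __LINE__;
}

/* A macro is expanded at each use, so it reports the line of the caller. */
#define CALL_SITE_LINE() (__LINE__)
```

Two calls to line_seen_by_helper() from different places return the same
number, while two uses of CALL_SITE_LINE() on different lines do not.  A
signal handler is in the same position as the helper function.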

Is there a way to grab the function, file, and line number of the code that
was executing when the FPE signal was raised?

Running the executable under GDB is not an option because sometimes it can
take many, many hours for the FPE to be raised.  Also, I thought I was being
crafty by replacing:

   exit(1);

with:

   abort();

A core file is generated, which should've given me details of where the code
was when the FPE was generated, but it looks like the stack blew chunks:

   p@lucifer$ gdb avataralt core 
   Using host libthread_db library "/lib/tls/libthread_db.so.1".
   Core was generated by `./avataralt'.
   Program terminated with signal 6, Aborted.

   warning: current_sos: Can't read pathname for load map: Input/output error

   Reading symbols from /lib/tls/libm.so.6...done.
   Loaded symbols for /lib/tls/libm.so.6
   Reading symbols from /lib/tls/libc.so.6...done.
   Loaded symbols for /lib/tls/libc.so.6
   Reading symbols from /lib/ld-linux.so.2...done.
   Loaded symbols for /lib/ld-linux.so.2
   #0  0x4006cee9 in raise () from /lib/tls/libc.so.6
   (gdb) bt
   #0  0x4006cee9 in raise () from /lib/tls/libc.so.6
   #1  0x4017aedc in ?? () from /lib/tls/libc.so.6
   #2  0x00003ffe in ?? ()
   #3  0x4006e781 in abort () from /lib/tls/libc.so.6
   #4  0x00000000 in ?? ()
      (snip)
   #46 0x40016c40 in ?? () from /lib/ld-linux.so.2
   #47 0x000000a3 in ?? ()
   #48 0x40016e78 in _r_debug ()
   #49 0xbfff8b74 in ?? ()
   #50 0x4000ba16 in _dl_map_object_deps () from /lib/ld-linux.so.2
   Previous frame inner to this frame (corrupt stack?)

To be honest, I don't have the slightest clue what happens to the stack
when an asynchronous signal handler executes or when a long jump happens.
I assume this is why GDB thinks the stack was corrupt...


Trace code will work, but I'm looking for something more elegant than
sprinkling trace code all over the place.  I'm so busy, the last thing I want
to do is start putting junk in my code that needs to be taken out.  If I'm
going to spend time on this, I at least want a return on my investment and
learn something I didn't know when I woke up this morning...   ;-)

Pete
_______________________________________________
vox-tech mailing list
vox-tech@lists.lugod.org
http://lists.lugod.org/mailman/listinfo/vox-tech
