Expat-IT Tech Bits

    This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

    Wed, 07 Jan 2009


    /Coding/gdb: Analyzing a Daemon Segfault with GDB

    For the past little while cron on my laptop has been completely broken[1], dumping an endless succession of segfaults (segmentation faults: the program attempts to access memory it is not allowed to touch) to the syslog. Since there was no apparent movement on the bug report, I set about trying to do something about it.

    First I ran strace and ltrace on cron to see if it was failing for some obvious reason: no luck, nothing interesting in the output.

    Next thing to try: isolate exactly what line of code was failing using GDB[2].

    First build, install, and restart a debug version of cron and libpam-mount[3] (otherwise gdb's output will just be a mess of numbers). Then start "gdb" in a root terminal (cron runs as root) and, at the gdb prompt, attach to the running cron:

    (gdb) attach 3489
    Attaching to process 3489
    Reading symbols from /usr/sbin/cron...done.
    Reading symbols from /lib/libpam.so.0...done.
    Loaded symbols for /lib/libpam.so.0
    Reading symbols from /lib/libselinux.so.1...done.
    Loaded symbols for /lib/libselinux.so.1
    Reading symbols from /lib/i686/cmov/libc.so.6...done.
    Loaded symbols for /lib/i686/cmov/libc.so.6
    Reading symbols from /lib/i686/cmov/libdl.so.2...done.
    Loaded symbols for /lib/i686/cmov/libdl.so.2
    Reading symbols from /lib/ld-linux.so.2...done.
    Loaded symbols for /lib/ld-linux.so.2
    Reading symbols from /lib/i686/cmov/libnss_compat.so.2...done.
    Loaded symbols for /lib/i686/cmov/libnss_compat.so.2
    Reading symbols from /lib/i686/cmov/libnsl.so.1...done.
    Loaded symbols for /lib/i686/cmov/libnsl.so.1
    Reading symbols from /lib/i686/cmov/libnss_nis.so.2...done.
    Loaded symbols for /lib/i686/cmov/libnss_nis.so.2
    Reading symbols from /lib/i686/cmov/libnss_files.so.2...done.
    Loaded symbols for /lib/i686/cmov/libnss_files.so.2
    0xb7f5e424 in __kernel_vsyscall ()
    

    where the number after "attach" is the pid (process id) of cron (obtainable by running "ps -ef | grep cron"). Then:

    (gdb) set follow-fork-mode child
    (gdb) cont
    Continuing.
    

    The follow-fork-mode parameter has to be set to "child" because cron is in the habit of spawning child processes, and it is in a child process that the error occurs. Then let cron (frozen since the "attach") continue running with "cont", and wait for a segfault:

    Continuing.
    
    Program received signal SIGSEGV, Segmentation fault.
    [Switching to process 19892]
    0x00000000 in ?? ()
    (gdb) 
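
A crash like this can be practised on something less vital than cron. The following is a minimal sketch, not cron's code: run_in_child, handler_fn and ok_handler are hypothetical names made up for illustration. The parent forks, the child calls through a function pointer, and the parent reports which signal (if any) killed the child. Passing NULL as the pointer should make the child jump to address 0, which on a typical Linux system ends in SIGSEGV, much like the "0x00000000 in ?? ()" above.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical callback type: stands in for any handler the child calls. */
typedef int (*handler_fn)(void);

/* A well-behaved handler, for comparison with the NULL case. */
int ok_handler(void)
{
    return 0;
}

/*
 * Fork a child that calls fn.
 * Returns the number of the signal that killed the child,
 * 0 if the child exited on its own, or -1 if fork/waitpid failed.
 */
int run_in_child(handler_fn fn)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child. The volatile copy keeps the compiler from reasoning
         * about the pointer's value; if it is NULL, the call below
         * transfers control to address 0 (typically SIGSEGV on Linux). */
        handler_fn volatile target = fn;
        _exit(target() & 0xff);
    }

    /* Parent: wait for the child and see how it ended. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Compiled with -g and called from a main() of your own, run_in_child(NULL) gives gdb a forked child that dies at address 0, so "set follow-fork-mode child" has something to follow.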
    

    Now dump out some interesting information about the error:

    (gdb) info frame 0
    Stack frame at 0xbfba1aa0:
     eip = 0x0; saved eip 0xb7b4cda1
     called by frame at 0xbfba1af0
     Arglist at 0xbfba1a98, args: 
     Locals at 0xbfba1a98, Previous frame's sp is 0xbfba1aa0
     Saved registers:
      eip at 0xbfba1a9c
    
    (gdb) up
    #1  0xb7b4cda1 in read_password (pamh=0x8841b00, 
        prompt=0x8846278 "reenter password for pam_mount:", pass=0xbfba1b38)
        at pam_mount.c:160
    160           retval = conv->conv(nargs, message, resp, conv->appdata_ptr);
    
    (gdb) up
    #2  0xb7b4ddf3 in pam_sm_open_session (pamh=0x8841b00, flags=32768, argc=1, 
        argv=0x8843ce0) at pam_mount.c:511
    511           ret = read_password(pamh, Config.msg_sessionpw, &system_authtok);
    
    (gdb) up
    #3  0xb7f693c1 in _pam_dispatch (pamh=0x8841b00, flags=32768, choice=4)
        at pam_dispatch.c:108
    108           retval = h->func(pamh, flags, h->argc, h->argv);
    
    (gdb) up
    #4  0xb7f6cfeb in pam_open_session (pamh=0x8841be8, flags=32768)
        at pam_session.c:23
    23          retval = _pam_dispatch(pamh, flags, PAM_OPEN_SESSION);
    
    (gdb) up
    #5  0x0804e848 in child_process (e=0x88418f8, u=0x88418d8) at ../do_command.c:228
    228             retcode = pam_open_session(pamh, PAM_SILENT);
    
    (gdb) up
    #6  0x0804e36d in do_command (e=0x88418f8, u=0x88418d8) at ../do_command.c:102
    102                     child_process(e, u);
    
    (gdb) up
    #7  0x0804e1e3 in job_runqueue () at ../job.c:68
    68                      do_command(j->e, j->u);
    
    (gdb) up
    #8  0x0804a777 in main (argc=142875624, argv=0x0) at ../cron.c:270
    270                     job_runqueue();
    
    (gdb) up
    Initial frame selected; you cannot go up.
    

    Above I used the "up" command to walk up the backtrace and see the stack of procedure calls (and which line in each procedure) that was active at the moment of the segfault. From this we can divine that the problem seems to happen in libpam-mount, specifically at pam_mount.c:160 in frame #1. (Note that frame 0 is a mess of numbers without symbolic information: its address, 0x00000000, does not belong to any loaded code, so gdb has no symbol to show for it.)

    (gdb) frame 0
    #0  0x00000000 in ?? ()
    
    (gdb) up
    #1  0xb7b4cda1 in read_password (pamh=0x8841b00, 
        prompt=0x8846278 "reenter password for pam_mount:", pass=0xbfba1b38)
        at pam_mount.c:160
    160           retval = conv->conv(nargs, message, resp, conv->appdata_ptr);
    
    (gdb) print *resp
    Cannot access memory at address 0x0
    
    (gdb) print resp
    $3 = (struct pam_response *) 0x0
    (gdb) 
    

    Above I used the "print" command to show the value of the pointer resp, which turns out to still be set to NULL and is being passed on to another procedure (frame 0), which barfs. This is the likely problem.
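
One way for a call such as the one on line 160 to end up at address 0 is for the function pointer itself to be NULL. The sketch below shows the general defensive pattern of checking a callback (and the output pointer) before calling it. It is an illustration only: the demo_* types and safe_converse are hypothetical stand-ins, not the real PAM definitions, and this is not the patch that went upstream.

```c
#include <stddef.h>

/* Stand-in types, loosely modelled on the shape of the call on line 160.
 * They are NOT the real PAM structures. */
struct demo_message  { const char *text; };
struct demo_response { char *text; };

typedef int (*demo_conv_fn)(int nargs, const struct demo_message **msg,
                            struct demo_response **resp, void *appdata);

struct demo_conv {
    demo_conv_fn conv;      /* callback supplied by the application */
    void *appdata_ptr;      /* opaque data handed back to the callback */
};

enum { DEMO_OK = 0, DEMO_CONV_ERR = 1 };

/*
 * Call the application's conversation callback only if it is usable.
 * Returns DEMO_CONV_ERR instead of crashing when the callback, the
 * structure holding it, or the output pointer is missing.
 */
int safe_converse(const struct demo_conv *c, int nargs,
                  const struct demo_message **msg,
                  struct demo_response **resp)
{
    if (resp == NULL)
        return DEMO_CONV_ERR;
    *resp = NULL;                       /* defined value on every path */

    if (c == NULL || c->conv == NULL)   /* nothing to call */
        return DEMO_CONV_ERR;

    return c->conv(nargs, msg, resp, c->appdata_ptr);
}
```

With a check of this kind, a caller that supplies no callback gets an error code back rather than a segfault in the child.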

    I filed a bug report against libpam-mount[4] suggesting a simple patch (which turned out to have bad side effects). That prompted another developer to jump in and document an existing patch already applied upstream. The fix is on the way....

    [1] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=484122
    [2] http://www.gnu.org/software/gdb/documentation/
    [3] http://blog.langex.net/index.cgi/Linux/Debian/How-to-modify-source.html
    [4] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=510990

    posted at: 02:18 | path: /Coding/gdb