Bug 36525 - mod_jk 1.2.14.1 core dump
Summary: mod_jk 1.2.14.1 core dump
Status: RESOLVED FIXED
Alias: None
Product: Tomcat Connectors
Classification: Unclassified
Component: Common
Version: unspecified
Hardware: Other IRIX
Importance: P2 normal
Target Milestone: ---
Assignee: Tomcat Developers Mailing List
URL:
Keywords:
Depends on:
Blocks: 36281
Reported: 2005-09-06 20:26 UTC by David Rees
Modified: 2008-10-05 03:09 UTC (History)

Attachments
Correct misalignment (4.39 KB, patch)
2005-09-16 01:03 UTC, Rainer Jung
unnamed-union.patch (2.82 KB, patch)
2005-09-17 12:32 UTC, David Rees

Description David Rees 2005-09-06 20:26:22 UTC
On SGI IRIX, mod_jk 1.2.14.1 crashes on every request at jk_lb_worker.c line 605.
Going back to 1.2.11 works fine. Stack trace below:

>  0 service(e = 0x1035afe8, s = 0x7fff2b28, l = 0x10281f68, is_error =
0x7fff1af4)
["/tmp/jakarta-tomcat-connectors-1.2.14.1-src/jk/native/common/jk_lb_worker.c":605,
0x4315080]
  1 jk_handler(r = 0x103719a0)
["/tmp/jakarta-tomcat-connectors-1.2.14.1-src/jk/native/apache-2.0/mod_jk.c":1889,
0x4305648]
  2 ap_run_handler(r = 0x103719a0)
["/tmp/httpd-2.0.54/server/config.c":152, 0x1008b624]
  3 ap_invoke_handler(r = 0x103719a0)
["/tmp/httpd-2.0.54/server/config.c":364, 0x1008c390]
  4 ap_process_request(r = 0x103719a0)
["/tmp/httpd-2.0.54/modules/http/http_request.c":249, 0x10071990]
  5 ap_process_http_connection(c = 0x10365460)
["/tmp/httpd-2.0.54/modules/http/http_core.c":251, 0x10070f78]
  6 ap_run_process_connection(c = 0x10365460)
["/tmp/httpd-2.0.54/server/connection.c":43, 0x100a3e94]
  7 ap_process_connection(c = 0x10365460, csd = 0x10365378)
["/tmp/httpd-2.0.54/server/connection.c":176, 0x100a4568]
  8 child_main(child_num_arg = 0)
["/tmp/httpd-2.0.54/server/mpm/prefork/prefork.c":610, 0x1007c444]
  9 make_child(s = 0x102941c0, slot = 0)
["/tmp/httpd-2.0.54/server/mpm/prefork/prefork.c":704, 0x1007c6ac]
  10 startup_children(number_to_start = 5)
["/tmp/httpd-2.0.54/server/mpm/prefork/prefork.c":722, 0x1007c75c]
  11 ap_mpm_run(_pconf = 0x1025aac0, plog = 0x1028cb88, s =
0x102941c0) ["/tmp/httpd-2.0.54/server/mpm/prefork/prefork.c":941,
0x1007cda0]
  12 main(argc = 3, argv = 0x7fff2f44)
["/tmp/httpd-2.0.54/server/main.c":618, 0x100b1c48]
  13 __start()
["/xlv55/kudzu-apr12/work/irix/lib/libc/libc_n32_M3/csu/crt1text.s":177,
0x1004b9e8]
Comment 1 David Rees 2005-09-06 20:28:27 UTC
First email sent to tomcat-dev:
http://marc.theaimsgroup.com/?l=tomcat-dev&m=112501659012202&w=2

Another user has reported what appears to be the exact same crash on Solaris. I
am guessing that this is either some sort of 64-bit or big-endian bug that does
not show up on i386 (where mod_jk 1.2.14.1 works fine for me).

http://marc.theaimsgroup.com/?l=tomcat-user&m=112569118927613&w=2
Comment 2 Mladen Turk 2005-09-07 07:44:18 UTC
Hi,

Can you comment out lines 605 and 606 in jk_lb_worker.c and see if
it still core dumps?
Also, did you try stopping the previous version and deleting the .shm file?
Not sure, but even a reboot might be required if the OS cached the shared memory.
The shared memory slot was enlarged in version 1.2.14, so the structures
are different, and if the old ones are cached then the new version will core dump.
Comment 3 David Rees 2005-09-07 11:25:27 UTC
I can confirm that I fully shut down Apache before installing the new mod_jk and
starting Apache back up.

Removing the mod_jk.shm after shutting down, installing the new module and
starting back up doesn't make a difference.

Using ipcs to view shared memory after Apache shutdown shows that Apache has
correctly released the shared memory that was in use.

Hope to try commenting out the lines in the next day or 2.
Comment 4 Mladen Turk 2005-09-12 15:03:44 UTC
Fixed in the CVS.
This was really strange. It seems that shared memory gets corrupted
when 64-bit access is used.
Can you try the current HEAD?
Comment 5 David Rees 2005-09-13 00:38:56 UTC
Checked out from CVS today, and can confirm that the new build appears to work
properly.

Thanks!
Comment 6 Rainer Jung 2005-09-16 01:03:51 UTC
Created attachment 16424
Correct misalignment

There is an alignment problem in the shared memory. The bug only shows up when
gcc is used with "-O" or "-O2".
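
The kind of alignment fix described here can be sketched as follows. This is only an illustration and not the contents of the attached patch: the 8-byte boundary, the `EXAMPLE_ALIGN` macro, the `example_record` struct, and the helper functions are all hypothetical names chosen for this sketch. The idea is that every record carved out of a shared memory buffer starts at an offset that is a multiple of the alignment boundary, so a 64-bit member is never read from a misaligned address (which can fault on strict-alignment architectures).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical alignment boundary; the value mod_jk really uses is not
 * shown in this report. */
#define EXAMPLE_ALIGN_BYTES ((size_t)8)

/* Round x up to the next multiple of EXAMPLE_ALIGN_BYTES (a power of two). */
#define EXAMPLE_ALIGN(x) \
    (((size_t)(x) + EXAMPLE_ALIGN_BYTES - 1) & ~(EXAMPLE_ALIGN_BYTES - 1))

/* Hypothetical shared memory record containing a 64-bit counter. */
struct example_record {
    int      id;
    uint64_t transferred;
};

/* Offset of the n-th record in a buffer that begins with a header of
 * header_size bytes. Both the header and each record are padded to the
 * alignment boundary, so every record offset is a multiple of it. */
static size_t example_record_offset(size_t header_size, size_t n)
{
    return EXAMPLE_ALIGN(header_size)
         + n * EXAMPLE_ALIGN(sizeof(struct example_record));
}

/* Sanity check: record offsets stay aligned even for an odd header size. */
static void example_check(void)
{
    size_t n;
    for (n = 0; n < 4; n++) {
        assert(example_record_offset(13, n) % EXAMPLE_ALIGN_BYTES == 0);
    }
}
```

Without the rounding, a 13-byte header would place the first record at offset 13, and the 64-bit counter inside it would be misaligned regardless of how the struct itself is laid out.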
Comment 7 Rainer Jung 2005-09-16 01:05:34 UTC
I am reopening the bug because:

- I think the above patch will resolve the problem and still allow the use of
64-bit counters
- I think that without the patch, further core dumps might result when the
members of the structs in the shared memory are changed in the future (even
without 64-bit members)
Comment 8 Rainer Jung 2005-09-16 08:55:09 UTC
Mladen Turk applied the patch. Thanks!
Comment 9 David Rees 2005-09-17 11:50:06 UTC
I just tried compiling on IRIX using latest CVS, but the modified jk_shm.c does
not compile using the SGI CC compiler:

"jk_shm.c", line 50: warning(1040): expected an identifier

Apparently the compiler doesn't like the union inside of the struct.  Removing
the union makes it compile and seems to function OK as well (sorry, even though
my C is a bit rusty my change sure looks like a hack!):

diff -u -r1.20 jk_shm.c
--- common/jk_shm.c     16 Sep 2005 05:52:26 -0000      1.20
+++ common/jk_shm.c     17 Sep 2005 09:43:10 -0000
@@ -43,10 +43,8 @@
 /** jk shm header record structure */
 struct jk_shm_header
 {
-    union {
-        jk_shm_header_data_t data;
-        char alignbuf[JK_SHM_ALIGN(sizeof(jk_shm_header_data_t))];
-    };
+    jk_shm_header_data_t data;
+    char alignbuf[JK_SHM_ALIGN(sizeof(jk_shm_header_data_t))];
     char   buf[1];
 };

Comment 10 Mladen Turk 2005-09-17 12:00:48 UTC
The compiler probably does not like unnamed unions, rather than
unions themselves.

Can you try something like:

union {
  ...
  ...
} h;

Of course you will need to change the jk_shm.c code
for each 'hdr->data.XXX' to the 'hdr->h.data.XXX'
 
Tell me if that helps.
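
A minimal sketch of the named-union layout suggested above, next to the variant that drops the union. The fields of `example_header_data_t`, the 8-byte boundary, and every `example_` name are assumptions made for illustration; the real `jk_shm_header_data_t` and `JK_SHM_ALIGN` definitions are not shown in this report.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for JK_SHM_ALIGN, assuming an 8-byte boundary. */
#define EXAMPLE_ALIGN_BYTES ((size_t)8)
#define EXAMPLE_ALIGN(x) \
    (((size_t)(x) + EXAMPLE_ALIGN_BYTES - 1) & ~(EXAMPLE_ALIGN_BYTES - 1))

/* Hypothetical header fields; the real header data is not shown here. */
typedef struct {
    unsigned int magic;
    unsigned int size;
    unsigned int workers;
} example_header_data_t;

/* Variant without the union: data and alignbuf are stored one after the
 * other, so buf starts at sizeof(data) + sizeof(alignbuf). */
struct example_header_separate {
    example_header_data_t data;
    char alignbuf[EXAMPLE_ALIGN(sizeof(example_header_data_t))];
    char buf[1];
};

/* Variant with a named union: data and alignbuf overlap, so the union is
 * padded to the aligned size and buf starts on the boundary. Naming the
 * union member "h" avoids relying on anonymous unions. */
struct example_header_union {
    union {
        example_header_data_t data;
        char alignbuf[EXAMPLE_ALIGN(sizeof(example_header_data_t))];
    } h;
    char buf[1];
};

/* Member access goes through the union name: hdr->h.data.XXX. */
static unsigned int example_header_size(const struct example_header_union *hdr)
{
    return hdr->h.data.size;
}

/* Sanity check: with the union, buf begins at the padded header size. */
static void example_check_layout(void)
{
    assert(offsetof(struct example_header_union, buf)
           == EXAMPLE_ALIGN(sizeof(example_header_data_t)));
}
```

With a 4-byte `unsigned int`, `buf` would start at offset 16 in the union variant and at offset 28 in the variant without the union, so only the union variant keeps the data that follows the header on an 8-byte boundary.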
Comment 11 David Rees 2005-09-17 12:32:53 UTC
Created attachment 16437
unnamed-union.patch

Yes, that appears to compile and run OK.  Here is the attached diff I used
against jk_shm.c.
Comment 12 Mladen Turk 2005-09-17 13:21:38 UTC
Hi,

I have committed the patch. Can you double-check with the current HEAD
and close the issue if all is working?

Thanks.
Comment 13 David Rees 2005-09-21 07:19:06 UTC
CVS head from earlier today works for me on SGI IRIX and Fedora Core 4.  Thanks!
Comment 14 Rainer Jung 2005-09-21 21:04:37 UTC
The user in

http://marc.theaimsgroup.com/?l=tomcat-user&m=112569118927613&w=2

reported that CVS fixes the bug for him too.