View Issue Details

ID: 0002760
Project: FSSCP
Category: HUD
View Status: public
Last Update: 2013-01-03 06:24
Reporter: oldlaptop
Assigned To: niffiwan
Priority: low
Severity: crash
Reproducibility: always
Status: resolved
Resolution: fixed
Platform: x86_64
OS: Debian
OS Version: Wheezy
Product Version: 3.6.14
Fixed in Version: 3.6.15
Summary: 0002760: Game segfaults on targeting anything when compiled with the GCC optimization option -ftree-vectorize
Description: When the game is built using GCC 4.7.2 with the -ftree-vectorize option, the game segfaults when any object is targeted by the player.
Steps To Reproduce:
1) Configure the game with: CFLAGS="-ftree-vectorize" CXXFLAGS="$CFLAGS"
2) Run the resulting executable
3) Enter any mission
4) Target anything
Tags: No tags attached.

Activities

Valathil

2012-12-28 06:25

developer   ~0014581

I'd really like to close this with a suggestion of "Then don't use -ftree-vectorize", but people wouldn't like that, I think.

niffiwan

2012-12-28 08:50

developer   ~0014582

Last edited: 2012-12-28 11:44

I'm unable to reproduce the issue. I compiled with this:

make distclean && CXXFLAGS="-ftree-vectorize" sh autogen.sh && V=1 make -j4

Here's the output from the link showing all the options:

g++ -m64 -march=athlon64 -ansi -DLUA_USE_LINUX -O2 -Wall -Wno-write-strings -funroll-loops -I/usr/include/SDL -D_GNU_SOURCE=1 -D_REENTRANT -I/usr/include/libpng12 -I/usr/include/lua5.1 -fsigned-char -Wno-unknown-pragmas -Wno-deprecated -Wno-char-subscripts -ftree-vectorize -o fs2_open_3.6.15 freespace.o levelpaging.o libcode.a -L/usr/lib/x86_64-linux-gnu -lSDL -lvorbis -lm -lvorbisfile -ltheora -logg -lopenal -lpng -lGL -lGLU -lpng12 -llua5.1 -ljpeg

SVN version 9470 / trunk


I tried with no mods enabled, played the 1st training mission, and pressed "t" to target the instructor. No crash.

And lastly, I have an AMD CPU (in case the -march stuff has anything to do with it):

$ grep 'name' /proc/cpuinfo | sort -u
model name : AMD Phenom(tm) II X2 550 Processor

edit: hrmmm - just realised my version of GCC doesn't match the report
gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)
I'll have to retest...

oldlaptop

2012-12-28 15:41

reporter   ~0014583

I am seeing the crash on a Core 2 Duo T7600, with or without the appropriate -march option:

model name : Intel(R) Core(TM)2 CPU T7600 @ 2.33GHz

Full gcc -v output:

$ gcc -v
Using built-in specs.
COLLECT_GCC=/usr/bin/gcc-4.7.real
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.7/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 4.7.2-4' --with-bugurl=file:///usr/share/doc/gcc-4.7/README.Bugs --enable-languages=c,c++,go,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.7 --enable-shared --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.7 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --enable-plugin --enable-objc-gc --with-arch-32=i586 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.7.2 (Debian 4.7.2-4)

Valathil

2012-12-28 18:16

developer   ~0014584

Hmm, compiling for 64-bit even works? With all the pointer magic we're doing, I'm surprised.

chief1983

2012-12-28 22:06

administrator   ~0014585

Actually, I think it runs pretty well on 64-bit Linux.

oldlaptop

2012-12-28 23:39

reporter   ~0014586

This issue does not occur when compiling on i386 Wheezy with the same GCC version (4.7.2). CPU is a Pentium M @ 1.70 GHz.

oldlaptop

2012-12-29 15:41

reporter   ~0014587

Last edited: 2012-12-29 15:42

This issue *does* occur when compiling on x86_64 openSUSE 12.2 with gcc 4.7.1 and an Athlon 64 3200+, therefore this is probably not distro or CPU specific.

niffiwan

2013-01-02 04:11

developer  

mantis2760-svn.patch (955 bytes)   
Index: code/hud/hudshield.h
===================================================================
--- code/hud/hudshield.h	(revision 9479)
+++ code/hud/hudshield.h	(working copy)
@@ -22,10 +22,10 @@
 #define HULL_HIT_OFFSET				4		// used to access the members in shield_hit_info that pertain to the hull
 typedef struct shield_hit_info
 {
+	int shield_hit_timers[NUM_SHIELD_HIT_MEMBERS];	// timestamps that get set for SHIELD_FLASH_TIME when a quadrant is hit
+	int shield_hit_next_flash[NUM_SHIELD_HIT_MEMBERS];
 	int shield_hit_status;		// bitfield, if offset for shield quadrant is set, that means shield is being hit
 	int shield_show_bright;		// bitfield, if offset for shield quadrant is set, that means play bright frame
-	int shield_hit_timers[NUM_SHIELD_HIT_MEMBERS];	// timestamps that get set for SHIELD_FLASH_TIME when a quadrant is hit
-	int shield_hit_next_flash[NUM_SHIELD_HIT_MEMBERS];
 } shield_hit_info;
 
 extern ubyte Quadrant_xlate[4];

niffiwan

2013-01-02 04:12

developer   ~0014606

Last edited: 2013-01-02 04:20

More testing has shown that the issue also occurs with GCC 4.7.0 on CentOS 6. Also, gdb says that the segfault occurs at hud/hudshield.cpp:468.

I've had a look at the assembler that GCC is producing and I think that this is being caused by a GCC bug. The reason I think it's a bug is that GCC is issuing a MOVDQA instruction, which requires its memory operand to be 16-byte aligned (1). However, the data structure is *NOT* aligned (unless by accident), so a MOVDQU should have been used instead (I think...)
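To make the alignment requirement concrete, here is a minimal sketch (not code from FSO) that checks whether a 16-byte store at a given offset would satisfy MOVDQA. The names is_aligned_16, demo_buffer and movdqa_safe_at are made up for illustration, and the 48-byte aligned buffer merely stands in for the real global object.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of the FSO code base): true if p is a
// multiple of 16, which is what MOVDQA needs for its memory operand.
bool is_aligned_16(const void* p)
{
    return (reinterpret_cast<std::uintptr_t>(p) % 16) == 0;
}

// A 16-byte aligned buffer standing in for the object being reset.
// The size of 48 bytes is an assumption made for this sketch.
alignas(16) unsigned char demo_buffer[48];

// True if a 16-byte store at demo_buffer + offset would be safe for MOVDQA.
// Offsets that fail this check would need MOVDQU instead.
bool movdqa_safe_at(std::size_t offset)
{
    assert(offset + 16 <= sizeof(demo_buffer));
    return is_aligned_16(demo_buffer + offset);
}
```

For an aligned base address, movdqa_safe_at(0x10) and movdqa_safe_at(0x20) hold, while movdqa_safe_at(0xc) and movdqa_safe_at(0x14) do not.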

Here's the assembler from 4.7.3 when it segfaults (note that shield_info_reset has been inlined into hud_shield_hit_reset, and => points to the active instruction when the segfault occurred):

(gdb) disas
Dump of assembler code for function hud_shield_hit_reset(int):
   0x000000000054d2e0 <+0>: movdqa 0x347d18(%rip),%xmm0 # 0x895000
   0x000000000054d2e8 <+8>: mov $0xe1b460,%eax
   0x000000000054d2ed <+13>: test %edi,%edi
   0x000000000054d2ef <+15>: mov $0xe1b490,%edx
   0x000000000054d2f4 <+20>: cmove %rdx,%rax
   0x000000000054d2f8 <+24>: movl $0x0,(%rax)
   0x000000000054d2fe <+30>: movl $0x0,0x4(%rax)
   0x000000000054d305 <+37>: movl $0x1,0x8(%rax)
   0x000000000054d30c <+44>: movl $0x1,0x1c(%rax)
=> 0x000000000054d313 <+51>: movdqa %xmm0,0xc(%rax)
   0x000000000054d318 <+56>: movdqa %xmm0,0x20(%rax)
   0x000000000054d31d <+61>: retq
End of assembler dump.
(gdb)

Here's the assembler from 4.6.3 (no segfault)

(gdb) disas
Dump of assembler code for function hud_shield_hit_reset(int):
   0x00000000005527e0 <+0>: movdqa 0x35ecb8(%rip),%xmm0 # 0x8b14a0
   0x00000000005527e8 <+8>: mov $0xe3a140,%eax
   0x00000000005527ed <+13>: test %edi,%edi
   0x00000000005527ef <+15>: mov $0xe3a170,%edx
=> 0x00000000005527f4 <+20>: movdqa 0x383e14(%rip),%xmm1 # 0x8d6610
   0x00000000005527fc <+28>: cmove %rdx,%rax
   0x0000000000552800 <+32>: movdqa %xmm1,(%rax)
   0x0000000000552804 <+36>: movdqa %xmm0,0x10(%rax)
   0x0000000000552809 <+41>: movdqa %xmm0,0x20(%rax)
   0x000000000055280e <+46>: retq
End of assembler dump.
(gdb)

Here's the assembler from 4.7.3 after I implemented a workaround (i.e. simply re-ordered the structure members):

(gdb) disas
Dump of assembler code for function hud_shield_hit_reset(int):
   0x000000000054d290 <+0>: movdqa 0x347d48(%rip),%xmm0 # 0x894fe0
=> 0x000000000054d298 <+8>: mov $0xe1b460,%eax
   0x000000000054d29d <+13>: test %edi,%edi
   0x000000000054d29f <+15>: mov $0xe1b490,%edx
   0x000000000054d2a4 <+20>: cmove %rdx,%rax
   0x000000000054d2a8 <+24>: movl $0x0,0x28(%rax)
   0x000000000054d2af <+31>: movl $0x0,0x2c(%rax)
   0x000000000054d2b6 <+38>: movdqa %xmm0,(%rax)
   0x000000000054d2ba <+42>: movl $0x1,0x10(%rax)
   0x000000000054d2c1 <+49>: movl $0x1,0x24(%rax)
   0x000000000054d2c8 <+56>: movdqu %xmm0,0x14(%rax)
   0x000000000054d2cd <+61>: retq
End of assembler dump.
(gdb)

(I was wondering if the right-hand operand holds a clue to the memory alignment issue: the segfaulting store has an offset of 0xc (12), while the other two have 0x10 (16, aligned) and 0x14 (20, but with MOVDQU...))
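The offsets in the disassembly can be cross-checked against the structure with offsetof. This is only a sketch: the struct names shield_hit_info_before and shield_hit_info_after and the helper functions are hypothetical, and NUM_SHIELD_HIT_MEMBERS = 5 is an assumption inferred from HULL_HIT_OFFSET being 4 and from the offsets gdb shows (the real header is not quoted in full here). It also assumes a 4-byte int.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: five members (four quadrants plus the hull).
const std::size_t NUM_SHIELD_HIT_MEMBERS = 5;

// Member order before the patch.
struct shield_hit_info_before {
    int shield_hit_status;
    int shield_show_bright;
    int shield_hit_timers[NUM_SHIELD_HIT_MEMBERS];
    int shield_hit_next_flash[NUM_SHIELD_HIT_MEMBERS];
};

// Member order after the patch (arrays first).
struct shield_hit_info_after {
    int shield_hit_timers[NUM_SHIELD_HIT_MEMBERS];
    int shield_hit_next_flash[NUM_SHIELD_HIT_MEMBERS];
    int shield_hit_status;
    int shield_show_bright;
};

// Byte offset of shield_hit_timers[index] in the old layout.
std::size_t timers_offset_before(std::size_t index)
{
    assert(index < NUM_SHIELD_HIT_MEMBERS);
    return offsetof(shield_hit_info_before, shield_hit_timers) + index * sizeof(int);
}

// Byte offset of shield_hit_next_flash[index] in the old layout.
std::size_t next_flash_offset_before(std::size_t index)
{
    assert(index < NUM_SHIELD_HIT_MEMBERS);
    return offsetof(shield_hit_info_before, shield_hit_next_flash) + index * sizeof(int);
}

// Byte offset of shield_hit_timers[index] in the new layout.
std::size_t timers_offset_after(std::size_t index)
{
    assert(index < NUM_SHIELD_HIT_MEMBERS);
    return offsetof(shield_hit_info_after, shield_hit_timers) + index * sizeof(int);
}

// Byte offset of shield_hit_next_flash[index] in the new layout.
std::size_t next_flash_offset_after(std::size_t index)
{
    assert(index < NUM_SHIELD_HIT_MEMBERS);
    return offsetof(shield_hit_info_after, shield_hit_next_flash) + index * sizeof(int);
}
```

Under those assumptions, timers_offset_before(1) is 0xc and next_flash_offset_before(1) is 0x20, which matches the two MOVDQA stores in the crashing build, and timers_offset_after(0) is 0x0 and next_flash_offset_after(0) is 0x14, which matches the MOVDQA and MOVDQU stores after the workaround.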

Anyway, background info out of the way, could someone please test my workaround patch? It just reorders the relevant data structure, largest to smallest members (that's most space efficient anyway?)
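On the space efficiency question, a rough sketch of when member order affects sizeof. All struct and function names below are hypothetical and none of this is FSO code; the array length of 5 is an assumption. Ordering members from largest to smallest usually saves padding only when the members have different alignment requirements. A structure made entirely of int members and int arrays, like this one, would be expected to have the same size either way.

```cpp
#include <cassert>
#include <cstddef>

// Mixed member sizes: here the order changes how much padding is needed.
struct small_first { char a; double b; char c; };
struct large_first { double b; char a; char c; };

// Structs shaped like the two shield_hit_info orderings: only int members,
// so no padding is expected in either order.
struct ints_small_first { int status; int bright; int timers[5]; int next_flash[5]; };
struct ints_large_first { int timers[5]; int next_flash[5]; int status; int bright; };

// Bytes saved by putting the largest member first in the mixed-size case.
std::size_t padding_saved_mixed()
{
    return sizeof(small_first) - sizeof(large_first);
}

// Bytes saved by reordering the all-int struct.
std::size_t padding_saved_ints()
{
    return sizeof(ints_small_first) - sizeof(ints_large_first);
}
```

On a typical x86_64 ABI the mixed-size structs come out at 24 and 16 bytes, while both all-int structs stay at 48 bytes, so the reordering in the patch changes the layout but should not change the size.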

(I'd also like to submit a GCC bug, but I can't get my test case to reproduce the issue yet...)

(1) (to find the reference just google search for MOVDQA - site was www-jaist-ac-jp)

(kinda weird, but mantis didn't like the link I'd inserted... it just made the entire comment blank?)

Valathil

2013-01-02 19:09

developer   ~0014608

You know, the first thing I thought was "it's probably a compiler bug". If the reordering doesn't interfere with anything else, like multi or something, I'd say we make the change, because why the hell not.

chief1983

2013-01-02 20:18

administrator   ~0014609

Unless there's anywhere where struct member order matters, makes sense to me.
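As an illustration of the kind of code where member order would matter, here is a small sketch with made-up types (order_a, order_b, raw_copy). It is not taken from FSO, and nothing in this thread establishes that shield_hit_info is ever handled this way. Copying or sending a struct as raw bytes ties the data to the member order, so a reader built with a different order sees the fields swapped.

```cpp
#include <cassert>
#include <cstring>

// Two hypothetical layouts holding the same fields in a different order.
struct order_a { int status; int timer; };
struct order_b { int timer; int status; };

// Copies the raw bytes of one layout into the other, as a byte-wise save
// file or network packet would if writer and reader disagreed on the order.
order_b raw_copy(const order_a& src)
{
    static_assert(sizeof(order_a) == sizeof(order_b), "layouts must have equal size");
    order_b dst;
    std::memcpy(&dst, &src, sizeof(dst));
    return dst;
}
```

With order_a{1, 99} as input, the result has timer == 1 and status == 99, so the value written as the status ends up in the timer. Code that only accesses members by name is unaffected by the reordering.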

oldlaptop

2013-01-02 22:14

reporter   ~0014610

niffiwan's patch fixes the crash for me (and does not appear to cause problems with i386).

niffiwan

2013-01-03 06:24

developer   ~0014611

Workaround for probable GCC bug committed in r9480.

Issue History

Date Modified Username Field Change
2012-12-27 20:17 oldlaptop New Issue
2012-12-28 06:25 Valathil Note Added: 0014581
2012-12-28 08:46 niffiwan Assigned To => niffiwan
2012-12-28 08:46 niffiwan Status new => assigned
2012-12-28 08:50 niffiwan Note Added: 0014582
2012-12-28 08:50 niffiwan Status assigned => feedback
2012-12-28 11:44 niffiwan Note Edited: 0014582
2012-12-28 15:41 oldlaptop Note Added: 0014583
2012-12-28 15:41 oldlaptop Status feedback => assigned
2012-12-28 18:16 Valathil Note Added: 0014584
2012-12-28 22:06 chief1983 Note Added: 0014585
2012-12-28 23:39 oldlaptop Note Added: 0014586
2012-12-29 15:41 oldlaptop Note Added: 0014587
2012-12-29 15:42 oldlaptop Note Edited: 0014587
2013-01-02 04:11 niffiwan File Added: mantis2760-svn.patch
2013-01-02 04:12 niffiwan Note Added: 0014606
2013-01-02 04:12 niffiwan Status assigned => feedback
2013-01-02 04:12 niffiwan Note Edited: 0014606
2013-01-02 04:13 niffiwan Note Edited: 0014606
2013-01-02 04:14 niffiwan Note Edited: 0014606
2013-01-02 04:14 niffiwan Note Edited: 0014606
2013-01-02 04:15 niffiwan Note Edited: 0014606
2013-01-02 04:16 niffiwan Note Edited: 0014606
2013-01-02 04:17 niffiwan Note Edited: 0014606
2013-01-02 04:18 niffiwan Note Edited: 0014606
2013-01-02 04:18 niffiwan Note Edited: 0014606
2013-01-02 04:19 niffiwan Note Edited: 0014606
2013-01-02 04:20 niffiwan Note Edited: 0014606
2013-01-02 19:09 Valathil Note Added: 0014608
2013-01-02 20:18 chief1983 Note Added: 0014609
2013-01-02 22:14 oldlaptop Note Added: 0014610
2013-01-02 22:14 oldlaptop Status feedback => assigned
2013-01-03 06:24 niffiwan Note Added: 0014611
2013-01-03 06:24 niffiwan Status assigned => resolved
2013-01-03 06:24 niffiwan Fixed in Version => 3.6.15
2013-01-03 06:24 niffiwan Resolution open => fixed