Source Code Project Mantis - FSSCP
View Issue Details
0002205FSSCPAIpublic2010-05-09 03:442010-07-10 11:03
ReporterEchelon9 
Assigned ToSushi_CW 
PrioritynormalSeveritycrashReproducibilityrandom
StatusresolvedResolutionfixed 
PlatformOSOS Version
Product Version3.6.12 RC2 
Target VersionFixed in Version3.6.13 
Summary0002205: Division by zero in do_random_sidethrust()
DescriptionCrash in game occurred because of a division by zero in do_random_sidethrust(). I'd assume extra checks that strafeHoldDirAmount != zero (although it's a bit odd that static_rand_range() would return a zero in this case.

Additional InformationTop of the stack:

Exception Type: EXC_ARITHMETIC (SIGFPE)
Exception Codes: EXC_I386_DIV (divide by zero)
Crashed Thread: 0 Dispatch queue: com.apple.main-thread

Thread 0 Crashed: Dispatch queue: com.apple.main-thread
0 FS2_Open (debug) 0x00091ed3 do_random_sidethrust(ai_info*, ship_info*) + 139
1 FS2_Open (debug) 0x000921ee attack_set_accel(ai_info*, ship_info*, float, float, float) + 604
2 FS2_Open (debug) 0x000944a3 ai_chase_attack(ai_info*, ship_info*, vec3d*, float, int) + 739
3 FS2_Open (debug) 0x000a8d70 ai_chase() + 2304
4 FS2_Open (debug) 0x000aecb1 ai_execute_behavior(ai_info*) + 189
5 FS2_Open (debug) 0x000af98a ai_frame(int) + 2524
6 FS2_Open (debug) 0x000afb5e ai_process(object*, int, float) + 356
TagsNo tags attached.
Attached Filespatch staticrand.patch (295) 2010-07-04 10:46
http://scp.indiegames.us/mantis/file_download.php?file_id=1529&type=bug
txt static_rand_range_test.cpp.txt (1,708) 2010-07-05 17:28
http://scp.indiegames.us/mantis/file_download.php?file_id=1531&type=bug

Notes
(0011948)
Echelon9   
2010-05-09 03:45   
Sushi_CW, contact me if you'd like directions on which (non-public) mission I experienced this in.
(0011949)
The_E   
2010-05-09 09:11   
There is no do_random_sidethrust() call in attack_set_accel(), at least not on my checkout... (I could just be blind, however).

Also, the function's signature reads "void attack_set_accel(ai_info *aip, float dist_to_enemy, float dot_to_enemy, float dot_from_enemy)", so the question is whether this actually happens on current trunk as well?
(0011950)
Echelon9   
2010-05-09 09:28   
Sorry to clarify, I am using a trunk build, being the latest r6094.

In trunk, we can see how we get to this point within aicode.cpp:

6695: void attack_set_accel(ai_info *aip, ship_info *sip, float dist_to_enemy, float dot_to_enemy, float dot_from_enemy)
{
...
6737: do_random_sidethrust(aip, sip);
...
(0011951)
The_E   
2010-05-09 09:36   
(Last edited: 2010-05-09 09:42)
Ahh, my mistake. My checkout script apparently failed to run....

EDIT: Right, looking at it, this should only ever happen if sip->slide_accel is -1, and static_rand_range returns the defined minimum (1, in this case).

Considering past experiences, I would recommend tracing further back to see if sip->slide_accel actually is valid at that point, is there something in the mission that would cause that variable to become NaN?

(0011952)
Sushi_CW   
2010-05-09 10:05   
Yeah, that is surprising: the static_rand_range should go from 1 to 3, inclusive. What is the slide accel on the ship this triggers for? It isn't negative, is it?
(0012153)
Echelon9   
2010-06-28 10:42   
Apologies for not getting better testing details to you Sushi, I've been flat out these two weeks.

I can confirm this division by zero crash still occurs regularly with trunk and the same missions.
(0012154)
The_E   
2010-06-28 14:19   
Yes, well, can you get those details to all of us? A test scenario using retail assets would be good....
(0012182)
Echelon9   
2010-07-04 04:31   
In testing strafeHoldDirAmount was equal to 0 (thus causing the crash) when:

(int)(sip->slide_accel) = 0, and
static_rand_range((Missiontime + static_rand(aip->shipnum)) >> 18, 1, 3) = 0.

static_rand_range() returned zero when (Missiontime + static_rand(aip->shipnum)) >> 18 was equal to -6813 is this case.
(0012183)
Echelon9   
2010-07-04 05:02   
Sushi will know this section of code much better, but could this be because of loss of precision changing the float to an int? The '$Slide accel' on the relevant ship is 0.05.
(0012184)
The_E   
2010-07-04 10:29   
(Last edited: 2010-07-04 10:45)
Umm.

Actually, that would hint more towards a failure in static_rand_range(), as that one should return a number >= 1 && <= 3, but not 0.

Which raises the question whether static_rand_range (and by extension, static_rand) deal with negative numbers properly.

Does the same error appear if you add a check for negative numbers to static_rand, like in the attached patch?

(0012191)
Sushi_CW   
2010-07-05 17:26   
So, I exhaustively tested static_rand_range by feeding it every possible int value and verifying that the resulting value was in the 1-3 range (same usage scenario as in the FSO code under discussion). Code for the test is attached. It passed the test, at least for me. Manual inspection of the static_rand_range code doesn't see anything objectionable either.

I'm not sure where to go from here, since I can't reproduce the error. Diaspora uses circle strafe, which invokes do_random_sidethrust, and also uses small values for slide accel (which shouldn't make a difference anyway: the slide accel value doesn't even go into static_rand_range). Despite a lot of AI testing in Diaspora, I've never run into this error.

The only way I see for the divide-by-zero to occur is if slide_accel < 0. Do you get a different error if you add

Assert(sip->slide_accel) >= 0;

to the beginning of do_random_sidethrust?

(0012208)
The_E   
2010-07-09 09:54   
Could we please get a sample tbl and sample mission with which to reproduce this?
(0012217)
Echelon9   
2010-07-10 08:15   
static_rand_range() was returning zero, which is fixed by Mondragon's patch to check for negative numbers in static_rand.

This may be caused by the underlying system's generation of rand(). I'm testing on a Mac here, and in Diaspora missions the AI was causing this crash regularly after 2-4mins. With the patch it is rock solid again.

I assume Sushi and Mondragon aren't on Mac or Linux (i.e. on Windows)?
(0012219)
The_E   
2010-07-10 09:52   
Well, yeah, I am on Windows. But it's weird, is there something in the random number functions that works differently on Linux/MacOS?
(0012220)
Sushi_CW   
2010-07-10 10:56   
Yeah, I'm on Windows.

This is indeed very weird. :) I'm guessing Mondragon's patch should work to solve the issue, though.
(0012221)
The_E   
2010-07-10 11:03   
Patch committed in revision 6289. If anyone can make it clear why this happens, I'd be grateful.

Issue History
2010-05-09 03:44Echelon9New Issue
2010-05-09 03:44Echelon9Statusnew => assigned
2010-05-09 03:44Echelon9Assigned To => Sushi_CW
2010-05-09 03:45Echelon9Note Added: 0011948
2010-05-09 09:11The_ENote Added: 0011949
2010-05-09 09:28Echelon9Note Added: 0011950
2010-05-09 09:36The_ENote Added: 0011951
2010-05-09 09:39The_ENote Edited: 0011951
2010-05-09 09:42The_ENote Edited: 0011951
2010-05-09 10:05Sushi_CWNote Added: 0011952
2010-06-28 10:42Echelon9Note Added: 0012153
2010-06-28 14:19The_ENote Added: 0012154
2010-07-04 04:31Echelon9Note Added: 0012182
2010-07-04 05:02Echelon9Note Added: 0012183
2010-07-04 10:29The_ENote Added: 0012184
2010-07-04 10:36The_ENote Edited: 0012184
2010-07-04 10:45The_ENote Edited: 0012184
2010-07-04 10:46The_EFile Added: staticrand.patch
2010-07-05 17:26Sushi_CWNote Added: 0012191
2010-07-05 17:26Sushi_CWNote Edited: 0012191
2010-07-05 17:28Sushi_CWFile Added: static_rand_range_test.cpp.txt
2010-07-09 09:54The_ENote Added: 0012208
2010-07-10 08:15Echelon9Note Added: 0012217
2010-07-10 09:52The_ENote Added: 0012219
2010-07-10 10:56Sushi_CWNote Added: 0012220
2010-07-10 11:03The_ENote Added: 0012221
2010-07-10 11:03The_EStatusassigned => resolved
2010-07-10 11:03The_EFixed in Version => 3.6.13
2010-07-10 11:03The_EResolutionopen => fixed