View Issue Details

IDProjectCategoryView StatusLast Update
0002988FSSCPgraphicspublic2013-12-30 08:02
ReporterEchelon9 Assigned Toniffiwan  
PrioritynormalSeveritymajorReproducibilityalways
Status resolvedResolutionfixed 
Product Version3.7.0 
Target Version3.7.1Fixed in Version3.7.1 
Summary0002988: Crashes on Linux with Intel 965 integrated drivers - do_blit_readpixels()
DescriptionAlthough support for Intel integrated graphics has been patchy at best, there has been a recent influx of crash reports on Ubuntu 13.10 (64bit) with this hardware.

Investigating to see if there is a recent change in the underlying platform which can be worked around.

Relevant HLP forum threads (includes core dump);
http://www.hard-light.net/forums/index.php?topic=85985.20
http://www.hard-light.net/forums/index.php?topic=86435.0
http://www.hard-light.net/forums/index.php?topic=81380.msg1726499#msg1726499
http://www.hard-light.net/forums/index.php?topic=86145.0
Additional Information#0 0x00007fffeb6b082e in do_blit_readpixels (pixels=0x0, pack=0x25b3b28, type=33639, format=32993,
        height=1080, width=1920, y=0, x=0, ctx=0x25a4a30) at intel_pixel_read.c:94
    0000001 intelReadPixels (ctx=0x25a4a30, x=0, y=0, width=1920, height=1080, format=32993, type=33639,
        pack=0x25b3b28, pixels=0x0) at intel_pixel_read.c:174
    0000002 0x00007fffeb19ab36 in _mesa_ReadnPixelsARB (x=0, y=0, width=1920, height=1080, format=32993, type=33639,
        bufSize=bufSize@entry=2147483647, pixels=pixels@entry=0x0) at ../../../src/mesa/main/readpix.c:1042
    0000003 0x00007fffeb19ad2a in _mesa_ReadPixels (x=<optimized out>, y=<optimized out>, width=<optimized out>,
        height=<optimized out>, format=<optimized out>, type=<optimized out>, pixels=0x0)
        at ../../../src/mesa/main/readpix.c:1050
...
0000003 0x00000000004d5167 in gr_opengl_save_screen () at graphics/gropengl.cpp:1028
0000004 0x00000000007ae8ee in popup_do (pi=0x147e2a0 <Popup_info>, flags=67175424) at popup/popup.cpp:821
0000005 0x00000000007aef88 in popup (flags=67175424, nchoices=3) at popup/popup.cpp:1013
0000006 0x00000000005d1187 in player_tips_popup () at menuui/playermenu.cpp:1347
0000007 0x00000000005b856a in main_hall_do (frametime=0.25) at menuui/mainhallmenu.cpp:840
0000008 0x00000000004175fe in game_do_state (state=1) at freespace2/freespace.cpp:6466
0000009 0x00000000004bbea5 in gameseq_process_events () at gamesequence/gamesequence.cpp:409
0000010 0x000000000041841b in game_main (cmdline=0x223acc0 "") at freespace2/freespace.cpp:7062
#11 0x00000000004185ca in main (argc=1, argv=0x7fffffffe628) at freespace2/freespace.cpp:7196
TagsNo tags attached.

Activities

Echelon9

2013-12-29 22:17

developer   ~0015536

This is the relevant Intel Mesa driver source file: http://cgit.freedesktop.org/mesa/mesa/log/src/mesa/drivers/dri/i965/intel_pixel_read.c

Echelon9

2013-12-29 22:25

developer   ~0015537

Last edited: 2013-12-29 22:32

Within FS2Open, the relevant OpenGL call is glReadPixels(), and can be found here https://github.com/scp-fs2open/fs2open.github.com/blob/master/code/graphics/gropengl.cpp#L1028

That use of glReadPixels() is rather odd, as the parameter for the returned pixel data is NULL. i.e. we aren't doing anything with the data returned.

niffiwan

2013-12-29 22:29

developer   ~0015538

Last edited: 2013-12-29 22:30

I've been having a look at this as well; this seems to be the offending code in FSO:
glReadPixels(0, 0, gr_screen.max_w, gr_screen.max_h, GL_read_format, GL_UNSIGNED_INT_8_8_8_8_REV, NULL);

graphics/gropengl.cpp:1028 gr_opengl_save_screen()

According to this:
http://www.opengl.org/sdk/docs/man/xhtml/glReadPixels.xml

The last param is the return data from the function, so I wonder why we're passing NULL to it. From looking at the linked mesa code, I think this is being dereferenced here which is probably causing the crash:

dst_offset = (GLintptr)pixels; (line 114)

Having said that, I wonder what on earth the Nvidia/AMD drivers do in the same instance?

Anyway - I'm going to try hacking up a patch to feed a valid pointer to that function and see what happens (i.e. in FSO, make the glReadPixels call the same as the one when Use_PBOs is not true). Of course, since I lack an Intel card to test this on, I'll have to rely on someone else testing it for me :)

Lastly, I thought that since when Use_PBOs is true, it seems to use a PBO (which is the calls; glGenBuffersARB() + glBindBufferARB() + glBufferDataARB()) and therefore that the glReadPixels() call is redundant, but removing that call causes graphics corruption when popups occur both in the mainhall and in-game.

Echelon9

2013-12-29 22:39

developer   ~0015539

Last edited: 2013-12-29 22:43

Yes, I agree that passing as data a NULL pointer to glReadPixels is odd if a buffer is not bound to GL_PIXEL_PACK_BUFFER?

If you make a change, you should also apply the same fix to lines 528 and 968 in the same file.

niffiwan

2013-12-29 23:02

developer   ~0015540

Last edited: 2013-12-30 00:02

Yeah, I just found that other reference which mentions GL_PIXEL_PACK_BUFFER
http://www.opengl.org/wiki/GLAPI/glReadPixels

It seems to me like all three instances of glReadPixels with NULL have previously created a GL_PIXEL_PACK_BUFFER. i.e. from the mouse cursor save (line 968):

    if ( Use_PBOs ) {
        // since this is used a lot, and is pretty small in size, we just create it once and leave it until exit
        if (!GL_cursor_pbo) {
            vglGenBuffersARB(1, &GL_cursor_pbo);
            vglBindBufferARB(GL_PIXEL_PACK_BUFFER_ARB, GL_cursor_pbo);
            vglBufferDataARB(GL_PIXEL_PACK_BUFFER_ARB, cursor_size * 4, NULL, GL_STATIC_READ);
        }

        vglBindBufferARB(GL_PIXEL_PACK_BUFFER_ARB, GL_cursor_pbo);
        glReadBuffer(GL_BACK);
        glReadPixels(x, gr_screen.max_h-y-1-h, w, h, GL_read_format, GL_UNSIGNED_INT_8_8_8_8_REV, NULL);
        vglBindBufferARB(GL_PIXEL_PACK_BUFFER_ARB, 0);
    } else {
...

This makes me think that the Linux Intel drivers drivers are not following the standard correctly, or they have a bug in their implementation. I might try a different approach then to see if this is correct by just disabling Use_PBOs to force following the alternate code path.

Update - added a command line flag to disable PBO; code is here:
https://github.com/niffiwan/fs2open.github.com/commits/mantis2988

niffiwan

2013-12-30 02:15

developer   ~0015543

Looks like the workaround "works": http://www.hard-light.net/forums/index.php?topic=85985.msg1726655#msg1726655

Any comments on adding another command line flag to alter this behaviour? Given that this looks like a driver issue, I didn't want to change our code too much.

Is it worthwhile trying to autodetect & autodisable PBO in this situation? The driver reports that the extension "GL_ARB_pixel_buffer_object" is available, however it seems that it's not correctly implemented.

Echelon9

2013-12-30 05:40

developer   ~0015544

From my code review looks fine -- an elegant solution as is.

Given the number of Linux users who are reporting this lately I think we should implement it as a commandline switch and instruct its use as required.

Linux users are generally fine "tinkering" and it's a huge positive if we can improve the adoption of FS2Open by those with Intel integrated graphics cards. The benefits of wider user adoption outweigh the minor code bloat -- it seems not to be our error to fix, but look, let's just implement the no PBO option and move on I say.

niffiwan

2013-12-30 08:02

developer   ~0015545

Thanks for the review - committed in r10268

Issue History

Date Modified Username Field Change
2013-12-29 22:06 Echelon9 New Issue
2013-12-29 22:13 Echelon9 Additional Information Updated
2013-12-29 22:14 Echelon9 Description Updated
2013-12-29 22:14 Echelon9 Summary Crashes on Linux with Intel 965 integrated drivers => Crashes on Linux with Intel 965 integrated drivers - do_blit_readpixels()
2013-12-29 22:17 Echelon9 Note Added: 0015536
2013-12-29 22:25 Echelon9 Note Added: 0015537
2013-12-29 22:29 niffiwan Note Added: 0015538
2013-12-29 22:30 niffiwan Note Edited: 0015538
2013-12-29 22:32 Echelon9 Note Edited: 0015537
2013-12-29 22:39 Echelon9 Note Added: 0015539
2013-12-29 22:43 Echelon9 Note Edited: 0015539
2013-12-29 22:44 Echelon9 Assigned To => niffiwan
2013-12-29 22:44 Echelon9 Status new => assigned
2013-12-29 23:02 niffiwan Note Added: 0015540
2013-12-30 00:02 niffiwan Note Edited: 0015540
2013-12-30 02:15 niffiwan Note Added: 0015543
2013-12-30 02:15 niffiwan Status assigned => code review
2013-12-30 05:40 Echelon9 Note Added: 0015544
2013-12-30 08:02 niffiwan Note Added: 0015545
2013-12-30 08:02 niffiwan Status code review => resolved
2013-12-30 08:02 niffiwan Fixed in Version => 3.7.1
2013-12-30 08:02 niffiwan Resolution open => fixed