View Issue Details

ID: 0002746    Project: FSSCP    Category: graphics    View Status: public    Last Update: 2014-06-30 15:51
Reporter: Valathil    Assigned To: Valathil
Priority: normal    Severity: tweak    Reproducibility: always
Status: closed    Resolution: won't fix
Product Version: 3.6.16
Target Version: 3.7.0
Summary: 0002746: Geometry batcher is highly inefficient
Description: As it stands now, every time a particle or other batched geometry item is added to the rendering queue, the batcher has to iterate over every entry looking for the right texture id. Since missions with many effects (e.g. Artemis Station) can have up to 1k entries in the vector, this wastes many CPU cycles. I therefore propose, and have implemented, a hash-mapped approach to the texture-id lookup. I declared a new template class SCP_unordered_map, which replaces Swifty's SCP_hash_map; it follows the other SCP_* templates and actually works with our debug memory-allocation scheme. Performance measurements before and after show an approximate gain of 3.0% CPU time in heavy-load situations, e.g. Artemis Station as the Karuna is heavily pounded by beam fire and explodes. I attached a patch, tested on all compatible platforms, and kindly call for code review before committing to trunk.
Tags: No tags attached.

Relationships

related to 0002749 (closed, Zacam): unordered_map breaks compiling on VS2008

Activities

Valathil

2012-12-03 15:55

developer  

unordered_map.patch (10,063 bytes)   
Index: code/globalincs/vmallocator.h
===================================================================
--- code/globalincs/vmallocator.h	(revision 9388)
+++ code/globalincs/vmallocator.h	(working copy)
@@ -9,22 +9,13 @@
 #include <string>
 #include <queue>
 
-#if defined __GNUC__
-#define GCC_VERSION (__GNUC__ * 10000 + __GNUC_MINOR__ * 100 + __GNUC_PATCHLEVEL__)
-#if GCC_VERSION >= 40300
+
+#ifndef WIN32
 #include <tr1/unordered_map>
-#define SCP_hash_map std::tr1::unordered_map
-#elif GCC_VERSION < 40300 || __clang__
-#include <ext/hash_map>
-#define SCP_hash_map __gnu_cxx::hash_map
-#endif // GCC_VERSION || __clang__
-#endif // __GNUC__
+#else // WIN32
+#include <unordered_map>
+#endif // WIN32
 
-#if ! defined __GNUC__
-#include <hash_map>
-#define SCP_hash_map stdext::hash_map
-#endif // ! defined __GNUC__
-
 #if defined(_MSC_VER) && _MSC_VER >= 1400 || !defined(_MSC_VER)
 
 #define DESTROY( type, p ) (p)->~type( )
@@ -123,6 +114,9 @@
 template< typename T >
 class SCP_queue : public std::queue< T, std::deque< T, SCP_vm_allocator< T > > > { };
 
+template< typename T, typename U >
+class SCP_unordered_map : public std::tr1::unordered_map<T, U, std::tr1::hash<T>, std::equal_to<T>, SCP_vm_allocator<std::pair<const T, U> > > { }; 
+
 template <class T1, class T2>
 bool operator==(const SCP_vm_allocator<T1>&, const SCP_vm_allocator<T2>&) throw()
 {
@@ -143,6 +137,7 @@
 #define SCP_queue std::queue
 #define SCP_vector std::vector
 #define SCP_list std::list
+#define SCP_unordered_map std::unordered_map
 
 #endif
 
Index: code/graphics/grbatch.cpp
===================================================================
--- code/graphics/grbatch.cpp	(revision 9388)
+++ code/graphics/grbatch.cpp	(working copy)
@@ -591,8 +591,8 @@
 	bool laser;
 };
 
-static SCP_vector<batch_item> geometry_map;
-static SCP_vector<batch_item> distortion_map;
+static SCP_unordered_map<int, batch_item> geometry_map;
+static SCP_unordered_map<int, batch_item> distortion_map;
 
 // Used for sending verts to the vertex buffer
 effect_vertex *Batch_buffer = NULL;
@@ -600,40 +600,18 @@
 
 static size_t find_good_batch_item(int texture)
 {
-	size_t max_size = geometry_map.size();
-
-	for (size_t i = 0; i < max_size; i++) {
-		if (geometry_map[i].texture == texture)
-			return i;
+	if(geometry_map[texture].texture != texture) {
+		geometry_map[texture].texture = texture;
 	}
-
-	// don't have an existing match so add a new entry
-	batch_item new_item;
-
-	new_item.texture = texture;
-
-	geometry_map.push_back(new_item);
-
-	return (geometry_map.size() - 1);
+	return texture;
 }
 
 static size_t find_good_distortion_item(int texture)
 {
-	size_t max_size = distortion_map.size();
-
-	for (size_t i = 0; i < max_size; i++) {
-		if (distortion_map[i].texture == texture)
-			return i;
+	if(distortion_map[texture].texture != texture) {
+		distortion_map[texture].texture = texture;
 	}
-
-	// don't have an existing match so add a new entry
-	batch_item new_item;
-
-	new_item.texture = texture;
-
-	distortion_map.push_back(new_item);
-
-	return (distortion_map.size() - 1);
+	return texture;
 }
 
 float batch_add_laser(int texture, vec3d *p0, float width1, vec3d *p1, float width2, int r, int g, int b)
@@ -728,71 +706,71 @@
 
 void batch_render_lasers(bool stream_buffer)
 {
-	for (SCP_vector<batch_item>::iterator bi = geometry_map.begin(); bi != geometry_map.end(); ++bi) {
+	for (SCP_unordered_map<int, batch_item>::iterator bi = geometry_map.begin(); bi != geometry_map.end(); ++bi) {
 
-		if ( !bi->laser )
+		if ( !bi->second.laser )
 			continue;
 
-		if ( !bi->batch.need_to_render() )
+		if ( !bi->second.batch.need_to_render() )
 			continue;
 
-		Assert( bi->texture >= 0 );
-		gr_set_bitmap(bi->texture, GR_ALPHABLEND_FILTER, GR_BITBLT_MODE_NORMAL, 0.99999f);
+		Assert( bi->second.texture >= 0 );
+		gr_set_bitmap(bi->second.texture, GR_ALPHABLEND_FILTER, GR_BITBLT_MODE_NORMAL, 0.99999f);
 		if ( stream_buffer ) {
-			bi->batch.render_buffer(TMAP_FLAG_TEXTURED | TMAP_FLAG_XPARENT | TMAP_HTL_3D_UNLIT | TMAP_FLAG_RGB | TMAP_FLAG_GOURAUD | TMAP_FLAG_CORRECT);
+			bi->second.batch.render_buffer(TMAP_FLAG_TEXTURED | TMAP_FLAG_XPARENT | TMAP_HTL_3D_UNLIT | TMAP_FLAG_RGB | TMAP_FLAG_GOURAUD | TMAP_FLAG_CORRECT);
 		} else {
-			bi->batch.render(TMAP_FLAG_TEXTURED | TMAP_FLAG_XPARENT | TMAP_HTL_3D_UNLIT | TMAP_FLAG_RGB | TMAP_FLAG_GOURAUD | TMAP_FLAG_CORRECT);
+			bi->second.batch.render(TMAP_FLAG_TEXTURED | TMAP_FLAG_XPARENT | TMAP_HTL_3D_UNLIT | TMAP_FLAG_RGB | TMAP_FLAG_GOURAUD | TMAP_FLAG_CORRECT);
 		}
 	}
 }
 
 void batch_load_buffer_lasers(effect_vertex* buffer, int *n_verts)
 {
-	for (SCP_vector<batch_item>::iterator bi = geometry_map.begin(); bi != geometry_map.end(); ++bi) {
+	for (SCP_unordered_map<int, batch_item>::iterator bi = geometry_map.begin(); bi != geometry_map.end(); ++bi) {
 
-		if ( !bi->laser )
+		if ( !bi->second.laser )
 			continue;
 
-		if ( !bi->batch.need_to_render() )
+		if ( !bi->second.batch.need_to_render() )
 			continue;
 
-		Assert( bi->texture >= 0 );
-		bi->batch.load_buffer(buffer, n_verts);
+		Assert( bi->second.texture >= 0 );
+		bi->second.batch.load_buffer(buffer, n_verts);
 	}
 }
 
 void batch_render_geometry_map_bitmaps(bool stream_buffer)
 {
-	for (SCP_vector<batch_item>::iterator bi = geometry_map.begin(); bi != geometry_map.end(); ++bi) {
+	for (SCP_unordered_map<int, batch_item>::iterator bi = geometry_map.begin(); bi != geometry_map.end(); ++bi) {
 
-		if ( bi->laser )
+		if ( bi->second.laser )
 			continue;
 
-		if ( !bi->batch.need_to_render() )
+		if ( !bi->second.batch.need_to_render() )
 			continue;
 
-		Assert( bi->texture >= 0 );
-		gr_set_bitmap(bi->texture, GR_ALPHABLEND_FILTER, GR_BITBLT_MODE_NORMAL, bi->alpha);
+		Assert( bi->second.texture >= 0 );
+		gr_set_bitmap(bi->second.texture, GR_ALPHABLEND_FILTER, GR_BITBLT_MODE_NORMAL, bi->second.alpha);
 		if ( stream_buffer ) {
-			bi->batch.render_buffer(bi->tmap_flags);
+			bi->second.batch.render_buffer(bi->second.tmap_flags);
 		} else {
-			bi->batch.render( bi->tmap_flags);
+			bi->second.batch.render( bi->second.tmap_flags);
 		}
 	}
 }
 
 void batch_load_buffer_geometry_map_bitmaps(effect_vertex* buffer, int *n_verts)
 {
-	for (SCP_vector<batch_item>::iterator bi = geometry_map.begin(); bi != geometry_map.end(); ++bi) {
+	for (SCP_unordered_map<int, batch_item>::iterator bi = geometry_map.begin(); bi != geometry_map.end(); ++bi) {
 
-		if ( bi->laser )
+		if ( bi->second.laser )
 			continue;
 
-		if ( !bi->batch.need_to_render() )
+		if ( !bi->second.batch.need_to_render() )
 			continue;
 
-		Assert( bi->texture >= 0 );
-		bi->batch.load_buffer(buffer, n_verts);
+		Assert( bi->second.texture >= 0 );
+		bi->second.batch.load_buffer(buffer, n_verts);
 	}
 }
 
@@ -898,54 +876,54 @@
 
 void batch_render_distortion_map_bitmaps(bool stream_buffer)
 {
-	for (SCP_vector<batch_item>::iterator bi = distortion_map.begin(); bi != distortion_map.end(); ++bi) {
+	for (SCP_unordered_map<int, batch_item>::iterator bi = distortion_map.begin(); bi != distortion_map.end(); ++bi) {
 
-		if ( bi->laser )
+		if ( bi->second.laser )
 			continue;
 
-		if ( !bi->batch.need_to_render() )
+		if ( !bi->second.batch.need_to_render() )
 			continue;
 
-		Assert( bi->texture >= 0 );
-		gr_set_bitmap(bi->texture, GR_ALPHABLEND_NONE, GR_BITBLT_MODE_NORMAL, bi->alpha);
+		Assert( bi->second.texture >= 0 );
+		gr_set_bitmap(bi->second.texture, GR_ALPHABLEND_NONE, GR_BITBLT_MODE_NORMAL, bi->second.alpha);
 
 		if ( stream_buffer ) {
-			bi->batch.render_buffer(bi->tmap_flags);
+			bi->second.batch.render_buffer(bi->second.tmap_flags);
 		} else {
-			bi->batch.render( bi->tmap_flags);
+			bi->second.batch.render( bi->second.tmap_flags);
 		}
 	}
 }
 
 void batch_load_buffer_distortion_map_bitmaps(effect_vertex* buffer, int *n_verts)
 {
-	for (SCP_vector<batch_item>::iterator bi = distortion_map.begin(); bi != distortion_map.end(); ++bi) {
+	for (SCP_unordered_map<int, batch_item>::iterator bi = distortion_map.begin(); bi != distortion_map.end(); ++bi) {
 
-		if ( bi->laser )
+		if ( bi->second.laser )
 			continue;
 
-		if ( !bi->batch.need_to_render() )
+		if ( !bi->second.batch.need_to_render() )
 			continue;
 
-		Assert( bi->texture >= 0 );
-		bi->batch.load_buffer(buffer, n_verts);
+		Assert( bi->second.texture >= 0 );
+		bi->second.batch.load_buffer(buffer, n_verts);
 	}
 }
 
 int batch_get_size()
 {
 	int n_to_render = 0;
-	SCP_vector<batch_item>::iterator bi;
+	SCP_unordered_map<int, batch_item>::iterator bi;
 
 	for (bi = geometry_map.begin(); bi != geometry_map.end(); ++bi) {
-		n_to_render += bi->batch.need_to_render();
+		n_to_render += bi->second.batch.need_to_render();
 	}
 
 	for (bi = distortion_map.begin(); bi != distortion_map.end(); ++bi) {
-		if ( bi->laser )
+		if ( bi->second.laser )
 			continue;
 
-		n_to_render += bi->batch.need_to_render();
+		n_to_render += bi->second.batch.need_to_render();
 	}
 
 	return n_to_render * 3;
Index: code/object/objcollide.cpp
===================================================================
--- code/object/objcollide.cpp	(revision 9388)
+++ code/object/objcollide.cpp	(working copy)
@@ -42,7 +42,7 @@
 obj_pair pair_free_list;
 
 SCP_vector<int> Collision_sort_list;
-SCP_hash_map<uint, collider_pair> Collision_cached_pairs;
+SCP_unordered_map<uint, collider_pair> Collision_cached_pairs;
 
 struct checkobject;
 extern checkobject CheckObjects[MAX_OBJECTS];
@@ -1039,7 +1039,7 @@
 			opp = opp->next;
 		}
 	} else {
-		SCP_hash_map<uint, collider_pair>::iterator it;
+		SCP_unordered_map<uint, collider_pair>::iterator it;
 		collider_pair *pair_obj;
 
 		for ( it = Collision_cached_pairs.begin(); it != Collision_cached_pairs.end(); ++it ) {
@@ -1180,7 +1180,7 @@
 
 void obj_collide_retime_cached_pairs(int checkdly)
 {
-	SCP_hash_map<uint, collider_pair>::iterator it;
+	SCP_unordered_map<uint, collider_pair>::iterator it;
 
 	for ( it = Collision_cached_pairs.begin(); it != Collision_cached_pairs.end(); ++it ) {
 		it->second.next_check_time = timestamp(checkdly);

Swifty

2012-12-06 04:26

developer   ~0014326

This is a good idea.

MjnMixael

2012-12-06 04:29

manager   ~0014328

I've been running a build with this for two days and haven't noticed any adverse effects so far.

Echelon9

2012-12-09 00:56

developer   ~0014365

Tested a compile of this patch here on Mac (Xcode 4.5.2, LLVM compiler 4.1, 64bit) against trunk r9409.

A small change was required (see attached patch), as the isnan() function was no longer resolved correctly after this change.

Echelon9

2012-12-09 00:56

developer  

unordered_map_mac_fix.patch (451 bytes)   
Index: code/windows_stub/config.h
===================================================================
--- code/windows_stub/config.h	(revision 9409)
+++ code/windows_stub/config.h	(working copy)
@@ -277,7 +277,7 @@
 char *strnset( char *string, int fill, size_t count);
 
 // other stuff
-#define _isnan(f)     isnan(f)
+#define _isnan(f)     std::isnan(f)
 #define _hypot(x, y)  hypot(x, y)
 
 int MulDiv(int number, int numerator, int denominator);

Zacam

2012-12-09 02:20

administrator   ~0014368

Last edited: 2012-12-09 02:30

Looks good.
And because I keep breaking things, the commit to Trunk was r9410.

Zacam

2012-12-09 02:35

administrator   ~0014369

Putting to feedback as a last minute safety check for this commit.

Just in case.

Valathil

2012-12-09 05:39

developer   ~0014377

I thought the isnan stuff was a trunk issue? Is it directly caused by this?

Echelon9

2012-12-09 11:16

developer   ~0014383

To the extent it was caused by a rebuild of certain parts of the code, it might have been a trunk issue.

Either way, it's been addressed in the commit to trunk in the last 6 hours.

The_E

2012-12-09 21:38

administrator   ~0014391

Has been committed to trunk in r9410 by Zacam

niffiwan

2012-12-10 01:57

developer   ~0014397

I think the isnan stuff needs another look; the current setup breaks compilation on Linux. See the attached patch for a suggested fix.

niffiwan

2012-12-10 01:58

developer  

2746-isnan.svn.patch (485 bytes)   
Index: code/windows_stub/config.h
===================================================================
--- code/windows_stub/config.h	(revision 9419)
+++ code/windows_stub/config.h	(working copy)
@@ -277,7 +277,11 @@
 char *strnset( char *string, int fill, size_t count);
 
 // other stuff
+#ifdef APPLE_APP
 #define _isnan(f)     std::isnan(f)
+#else
+#define _isnan(f)     isnan(f)
+#endif
 #define _hypot(x, y)  hypot(x, y)
 
 int MulDiv(int number, int numerator, int denominator);

Zacam

2012-12-10 08:01

administrator   ~0014407

And it apparently has issues with some people's VS2008 and older.

Pulled back out of trunk to do review and discussion internally.

MjnMixael

2014-06-30 15:51

manager   ~0015971

Per discussion with The_E, closing as Swifty's work will likely supersede this.

Issue History

Date Modified Username Field Change
2012-12-03 15:55 Valathil New Issue
2012-12-03 15:55 Valathil Status new => assigned
2012-12-03 15:55 Valathil Assigned To => Valathil
2012-12-03 15:55 Valathil File Added: unordered_map.patch
2012-12-03 15:57 Valathil Status assigned => code review
2012-12-03 15:57 Valathil Description Updated
2012-12-06 04:26 Swifty Note Added: 0014326
2012-12-06 04:29 MjnMixael Note Added: 0014328
2012-12-09 00:56 Echelon9 Note Added: 0014365
2012-12-09 00:56 Echelon9 File Added: unordered_map_mac_fix.patch
2012-12-09 02:20 Zacam Note Added: 0014368
2012-12-09 02:30 Zacam Note Edited: 0014368
2012-12-09 02:35 Zacam Note Added: 0014369
2012-12-09 02:35 Zacam Status code review => feedback
2012-12-09 05:39 Valathil Note Added: 0014377
2012-12-09 05:39 Valathil Status feedback => assigned
2012-12-09 11:16 Echelon9 Note Added: 0014383
2012-12-09 21:38 The_E Note Added: 0014391
2012-12-09 21:38 The_E Status assigned => resolved
2012-12-09 21:38 The_E Resolution open => fixed
2012-12-10 01:57 niffiwan Note Added: 0014397
2012-12-10 01:57 niffiwan Status resolved => feedback
2012-12-10 01:57 niffiwan Resolution fixed => reopened
2012-12-10 01:57 niffiwan File Added: 2746-isnan.svn.patch
2012-12-10 01:58 niffiwan File Deleted: 2746-isnan.svn.patch
2012-12-10 01:58 niffiwan File Added: 2746-isnan.svn.patch
2012-12-10 01:58 niffiwan Status feedback => code review
2012-12-10 03:03 MjnMixael Relationship added related to 0002749
2012-12-10 08:01 Zacam Note Added: 0014407
2014-06-30 15:51 MjnMixael Note Added: 0015971
2014-06-30 15:51 MjnMixael Status code review => closed
2014-06-30 15:51 MjnMixael Resolution reopened => suspended
2014-06-30 15:51 The_E Resolution suspended => won't fix