Chipmunk breaking, but only on one platform

Official forum for the Chipmunk2D Physics Library.
mcc
Posts: 27
Joined: Sun Mar 30, 2008 9:00 pm

Chipmunk breaking, but only on one platform

Post by mcc »

So here's one. I am porting my code to Android. As soon as I run on Android, Chipmunk (it's 3.5.4) breaks. It gets as far as trying to add an object to a space, then freaks out. There are two possibilities here--

1. My Android code runs through different initialization code from my other compile targets, then hands control over to a common program_init function that sets up chipmunk. Maybe I did something in the initialization that breaks things for Chipmunk later.

2. Somehow Chipmunk behaves differently on Android.

(1) seems more likely, but either way, the code is failing in a very odd way.

Here's what happens. I have some simple code which in its early stages is actually fairly similar to the old "MoonBuggy" demo. Loosely, it

- calls cpInitChipmunk
- creates a couple of spaces
- creates, then adds to one of the spaces, four cpSegmentShapes

I'm finding that the code gets as far as the first cpSpaceAddStaticShape, but never exits that function.

Inside cpSpaceAddStaticShape, it's getting as far as cpSpaceActivateShapesTouchingShape, wherein it does a cpSpaceShapeQuery to look for any objects that need to be woken up by the newly inserted object. As soon as cpSpaceHashQuery is called for the first time (on the active hash), disaster strikes because, for some reason, the cpBB that Chipmunk computes for the shape is totally invalid. I added a

ERR("bb lrbt %lf %lf %lf %lf; dim %lf", (double)bb.l, (double)bb.r, (double)bb.b, (double)bb.t, (double)dim);
to the start of cpSpaceHashQuery, and on Android it prints out:

03-18 22:24:21.332: ERROR/Jumpcore(517): bb lrbt -Inf -Inf NaN NaN; dim 0.100000
As an unpleasant coincidence would have it, this particular combination of bad values causes cpSpaceHashQuery to go into an infinite loop. The reason: floor_int, at least in the Android emulator, winds up converting all those bad values to INT_MAX. Because r is thus equal to INT_MAX, the i<=r condition below can never be false, so i just wraps around forever.

(A note, unrelated to the actual problem: You might want to consider doing a cpAssert(isfinite(bb.l) && ... ) in this function, or at the end of cpShapeCacheBB or in some similar places? You might also want to consider adding a quick check-and-bail-out for r==INT_MAX before performing the loop because even without having actually invalid floats a user could believably hit the infinite-loop condition simply by having a coordinate which is extremely large. Though I think at that point they likely would have other problems...)

ANYWAY when I run this same code on OS X it prints out

bb lrbt -1.700000 -1.500000 -1.100000 1.100000; dim 0.100000
I feel a little stuck on figuring out how the two versions diverge because I don't think I understand where the cpBB in cpSpaceShapeQuery (obtained by calling cpShapeCacheBB) is coming from.

Have you seen anything like this before? Why do you suppose I am winding up with this invalid BB on Android only? Is there some further testing I could do?

Thanks!
ndizazzo
Posts: 15
Joined: Thu Feb 10, 2011 7:53 pm

Re: Chipmunk breaking, but only on one platform

Post by ndizazzo »

mcc wrote:1. My Android code runs through different initialization code from my other compile targets, then hands control over to a common program_init function that sets up chipmunk. Maybe I did something in the initialization that breaks things for Chipmunk later.
Is this your own build process? or the standard Android one? I'm not familiar with the platform, but I'd say throw it in a simple hello world application (forget rendering) and see if you can simulate the space.
slembcke
Site Admin
Posts: 4166
Joined: Tue Aug 14, 2007 7:13 pm

Re: Chipmunk breaking, but only on one platform

Post by slembcke »

Hmm. My initial reaction was that cpShapeCacheBB() was not being called before the query somehow, but it's called on the line before it. Have you tried stepping the debugger into cpShapeCacheBB()? Each shape type has a little vtable-like struct with function pointers in it. Basically all cpShapeCacheBB() does is call cpPolyShapeCacheData(), cpCircleShapeCacheData(), or cpSegmentShapeCacheData() based on the type of the shape. Try looking at what is going on in those functions. Maybe the issue is happening before that even?
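The "vtable-like struct with function pointers" pattern can be sketched as below. This is an illustration under assumptions, not Chipmunk's actual definitions: Shape, ShapeClass, segment_cache_data and shape_cache_bb are hypothetical names, and only a segment type is shown.

```c
#include <assert.h>

typedef struct { double l, b, r, t; } BBox;

struct Shape;

/* Per-type table of function pointers (the "vtable-like struct"). */
typedef struct ShapeClass {
    BBox (*cacheData)(const struct Shape *shape);
} ShapeClass;

typedef struct Shape {
    const ShapeClass *klass;   /* selects the per-type implementation */
    double ax, ay, bx, by;     /* segment endpoints */
    double radius;             /* segment thickness */
    BBox bb;                   /* cached bounding box */
} Shape;

static double min2(double a, double b) { return a < b ? a : b; }
static double max2(double a, double b) { return a > b ? a : b; }

/* Segment-specific bounding box: endpoint extents padded by the radius. */
static BBox segment_cache_data(const Shape *s)
{
    BBox bb = {
        min2(s->ax, s->bx) - s->radius,
        min2(s->ay, s->by) - s->radius,
        max2(s->ax, s->bx) + s->radius,
        max2(s->ay, s->by) + s->radius
    };
    return bb;
}

static const ShapeClass SEGMENT_CLASS = { segment_cache_data };

/* Generic entry point: dispatch through the table and cache the result. */
static BBox shape_cache_bb(Shape *s)
{
    s->bb = s->klass->cacheData(s);
    return s->bb;
}
```

With this layout, a printf placed in the per-type function shows exactly which inputs produced a bad box, which is handy when a debugger is not available.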

My next best guess is that you are compiling for an Android phone without an FPU and using the -ffast-math flag. The -ffast-math flag lets the compiler assume that NaNs and infinities never occur, so it is free to generate code that does not handle them correctly. Every FPU I've ever heard of handles infinity just fine, and only chokes when dealing with NaNs. Chipmunk uses infinity heavily, but has no need to use NaNs anywhere as an error condition. When using software FPU emulation and -ffast-math, all the calculations that involve infinity get screwed up.
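One way to check whether a build is affected is sketched below. It relies on the __FAST_MATH__ macro, which GCC and Clang define when -ffast-math is in effect; the function names are hypothetical, and the runtime probe is only a rough sanity check, not a full conformance test.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Compile-time check: was this translation unit built with -ffast-math? */
static bool built_with_fast_math(void)
{
#ifdef __FAST_MATH__
    return true;
#else
    return false;
#endif
}

/* Runtime probe: does dividing by zero yield an infinity that the
 * comparison and classification routines recognise? The volatile
 * qualifiers keep the compiler from folding the division away. */
static bool infinity_survives(void)
{
    volatile double zero = 0.0;
    volatile double inf = 1.0 / zero;
    return isinf(inf) && inf > 1e308;
}
```

Logging both values once at startup on each platform makes it easy to rule this cause in or out before digging further.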
Can't sleep... Chipmunks will eat me...
Check out our latest projects! -> http://howlingmoonsoftware.com/wordpress/
mcc
Posts: 27
Joined: Sun Mar 30, 2008 9:00 pm

Re: Chipmunk breaking, but only on one platform

Post by mcc »

ndizazzo wrote:
mcc wrote:1. My Android code runs through different initialization code from my other compile targets, then hands control over to a common program_init function that sets up chipmunk. Maybe I did something in the initialization that breaks things for Chipmunk later.
Is this your own build process? or the standard Android one? I'm not familiar with the platform, but I'd say throw it in a simple hello world application (forget rendering) and see if you can simulate the space.
So, I was just trying to express the two codebases are not identical. Desktop version is built with xcode, enters a normal-type "void main", inits SDL, and then calls my "real program" init functions; Android version is built with standard Android build process (ndk-build + eclipse), enters something based on a Google code sample, inits Android stuff, and then calls my "real program" init functions. The code should behave identical in either case, I was just trying to raise the spectre I'd like, dunno, corrupted the heap or something in some really crazy way before Chipmunk even gets going?

I think "hello world" has been shown to work, I do some somewhat complex things including reading and writing files before I get to the Chipmunk init. I also do not think rendering is relevant because on either target Chipmunk does crash during init (although, after I initialize openGL, maybe I should test without that).
Have you tried stepping the debugger into cpShapeCacheBB()?
Sorry I should have mentioned this, I have been trying very hard to get GDB working but have not met with success yet :( the sequence of events I describe above was reconstructed with an enormous number of strategic printfs*. I see I have some new responses on the mailing list today so maybe I can get that working soon...
slembcke wrote:Hmm. My initial reaction was that cpShapeCacheBB() was not being called before the query somehow, but it's called on the line before it. Have you tried stepping the debugger into cpShapeCacheBB()? Each shape type has a little vtable-like struct with function pointers in it. Basically all cpShapeCacheBB() does is call cpPolyShapeCacheData(), cpCircleShapeCacheData(), or cpSegmentShapeCacheData() based on the type of the shape. Try looking at what is going on in those functions. Maybe the issue is happening before that even?

My next best guess is that you are compiling for an Android phone without an FPU and using the -ffast-math flag. The -ffast-math flag lets the compiler assume that NaNs and infinities never occur, so it is free to generate code that does not handle them correctly. Every FPU I've ever heard of handles infinity just fine, and only chokes when dealing with NaNs. Chipmunk uses infinity heavily, but has no need to use NaNs anywhere as an error condition. When using software FPU emulation and -ffast-math, all the calculations that involve infinity get screwed up.
Actually I am running on the emulator, so it's very possible I don't have "an FPU" or am using a software FPU. Hm. However I'd still be a little confused how a NaN and an inf are being generated out of an attempt to bound just a segment.

Let me take another pass at fixing GDB and get back to you...
mcc
Posts: 27
Joined: Sun Mar 30, 2008 9:00 pm

Re: Chipmunk breaking, but only on one platform

Post by mcc »

Still working on gdb; however, #android-dev claims and web research confirms that the Android emulator in fact uses soft fpu because it's supposed to mimic the least-common-denominator phone (which is really weird, since you instantiate emulators for specific versions of android and the recent versions of android don't even run on the softfpu phones!). Hm...
mcc
Posts: 27
Joined: Sun Mar 30, 2008 9:00 pm

Re: Chipmunk breaking, but only on one platform

Post by mcc »

[REDACTED]

Update 2: Problem was in my code... due to an error in init, the aspect ratio was equal to 0 and I was creating segments that actually did have NaN as coordinates!!! (cpv(-1/aspect,-1), etc) Experimenting further...
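A small sketch of how a zero aspect ratio can turn into both -Inf and NaN under IEEE 754 arithmetic. Vec2, corner and rotate are hypothetical stand-ins (not Chipmunk's cpVect or its rotation code), and the identity rotation is an assumption about one plausible route to the NaNs, not something traced through the Chipmunk source.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

typedef struct { double x, y; } Vec2;   /* hypothetical stand-in for cpVect */

/* Mirrors the cpv(-1/aspect, -1) pattern from the post. */
static Vec2 corner(double aspect)
{
    Vec2 v = { -1.0 / aspect, -1.0 };   /* aspect == 0 gives x == -Inf */
    return v;
}

/* Rotate v by the unit vector rot (complex multiplication). */
static Vec2 rotate(Vec2 v, Vec2 rot)
{
    Vec2 out = { v.x * rot.x - v.y * rot.y,
                 v.x * rot.y + v.y * rot.x };   /* Inf * 0 gives NaN */
    return out;
}

/* Cheap validation that could run before handing points to the library. */
static bool vec_is_finite(Vec2 v)
{
    return isfinite(v.x) && isfinite(v.y);
}
```

With aspect equal to 0 and an identity rotation of (1, 0), the x component stays -Inf while the y component becomes NaN because it includes the product -Inf * 0. Calling vec_is_finite on each endpoint before creating the segments would flag the bad input at its source.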

It does make me wonder about cpAsserts checking isfinite on user input inside say the cpshapenew functions... that would have caught this.

Update 3: Fixed aspect ratio (and, thus, input to Chipmunk) and it works now. Thanks for the help
slembcke
Site Admin
Posts: 4166
Joined: Tue Aug 14, 2007 7:13 pm

Re: Chipmunk breaking, but only on one platform

Post by slembcke »

Yeah, I sprinkle new asserts in as I think of them or as people find ways to get bad data into the system. I've added a bunch more in Chipmunk 6, but I don't really go out of my way to cover every conceivable bad input.