Collision callbacks slow

Official forum for the Chipmunk2D Physics Library.
Post Reply
J .
Posts: 8
Joined: Tue Mar 09, 2010 4:10 am

Collision callbacks slow

Post by J . »

In profiling my program using Pygame, I've found that my pre-solve callback, while only running a total 0.221s overall (including called functions) (in a 5s period), is causing an increase in running time of several internal Pymunk functions amounting to over a second (the difference when replacing the function with a return).

I guess this could be a bug in the profiler and it's not counting all the functions called by my function in its total time, considering this is Python calling some C code calling a Python function calling more Chipmunk stuff (or whatever Pymunk does). It would be nice to know for sure, though.

Here are the profiling results for major (running time) Pymunk internal functions and my pre_collision function, before and after:

Code: Select all

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    11067    0.639    0.000    1.795    0.000 pymunk/__init__.py:419(cf)
    44028    0.260    0.000    0.260    0.000 pymunk/__init__.py:1170(_get_shape)
    22014    0.251    0.000    0.511    0.000 pymunk/__init__.py:1167(_get_shapes)
      600    0.225    0.000    2.021    0.003 pymunk/__init__.py:274(step)
23726/16872    0.141    0.000    0.185    0.000 pymunk/__init__.py:1158(_get_contacts)
    10410    0.106    0.000    0.169    0.000 pymunk/vec2d.py:149(__sub__)
    11007    0.080    0.000    0.221    0.000 game.py:22(pre_collision)
    30270    0.067    0.000    0.067    0.000 pymunk/__init__.py:616(_get_velocity)
    16710    0.062    0.000    0.062    0.000 pymunk/vec2d.py:46(__init__)
    23876    0.049    0.000    0.049    0.000 pymunk/vec2d.py:58(__getitem__)
    11067    0.048    0.000    0.048    0.000 pymunk/__init__.py:1146(__init__)
     6854    0.031    0.000    0.031    0.000 pymunk/__init__.py:1108(__init__)
     6300    0.025    0.000    0.043    0.000 pymunk/vec2d.py:41(from_param)
    13466    0.023    0.000    0.023    0.000 pymunk/__init__.py:1127(_get_normal)
    12600    0.022    0.000    0.022    0.000 pymunk/__init__.py:610(_get_position)
    10410    0.019    0.000    0.019    0.000 pymunk/__init__.py:863(<lambda>)

Code: Select all

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      600    0.273    0.000    0.483    0.001 pymunk/__init__.py:274(step)
    11167    0.143    0.000    0.209    0.000 pymunk/__init__.py:419(cf)
     6300    0.041    0.000    0.074    0.000 pymunk/vec2d.py:41(from_param)
    11167    0.040    0.000    0.040    0.000 pymunk/__init__.py:1146(__init__)
    12600    0.035    0.000    0.035    0.000 pymunk/__init__.py:610(_get_position)
     6300    0.033    0.000    0.033    0.000 pymunk/vec2d.py:46(__init__)
     9450    0.030    0.000    0.030    0.000 pymunk/__init__.py:616(_get_velocity)
    11104    0.025    0.000    0.025    0.000 game.py:22(pre_collision)
     7574    0.020    0.000    0.020    0.000 pymunk/__init__.py:588(_get_angle)
Is anything else needed?
User avatar
slembcke
Site Admin
Posts: 4166
Joined: Tue Aug 14, 2007 7:13 pm
Contact:

Re: Collision callbacks slow

Post by slembcke »

Well, callback are slow in that they tend to happen very often. 100 objects sitting in a pile can generate several hundred callbacks every frame. I'm sure viblio could give you a better answer about how Pymunk forwards the callbacks on to the interpreter, but there is almost certainly going to be plenty of overhead calling back and forth across the interpreter/C boundary. You should be wary of the Python profiler (and all profilers that work by wrapping individual calls) in that it tends to artificially inflate the runtime of small functions. The same thing can happen with Ruby's profiler, Unity's profiler, and gproff. Sometimes old fashioned benchmarks can give you more accurate profiling information.

Performance issues with callbacks is part of why I implemented the begin/separate collision messages. For people coding using bindings or people that are forwarding collision callbacks to expensive method calls (like Obj-C on the iPhone), having a simple callback that returns a static value can be almost as expensive or more than the collision detection itself. Receiving a single message on collision begin/separate was often all I wanted and putting the logic to track that into a Ruby callback was a very expensive way to track a boolean.

I've done a fair amount of Ruby game dev, and I wouldn't want to write a whole game in C. With Ruby at least, it's so slow that I've found that it's impractical to write a whole game in it. I often found that I had to rewrite a few functions in native code in order to get acceptable performance. If 5% of the code is using 50% of the CPU time, you don't have to do too much work to double the performance by rewriting it in a language that is 10x faster.
Can't sleep... Chipmunks will eat me...
Check out our latest projects! -> http://howlingmoonsoftware.com/wordpress/
J .
Posts: 8
Joined: Tue Mar 09, 2010 4:10 am

Re: Collision callbacks slow

Post by J . »

I wasn't focusing so much on that as on the fact that the profiler seems to indicate the overhead is coming from Chipmunk stuff running outside of the callback itself (that is, not called by it).

Edit: related, though not the reason I made the topic: after reading your post, I've just realised that part of the code in the pre-solve callback would be fine in a begin one, despite having considered it before and for some reason deciding to leave it alone. The rest only does stuff every fourth step of the space ATM, so it's not too bad. (I'm sure I'll find out why at a later date, after a few hours of debugging...)

While it would be nice to put some rewrite some intensive stuff in C, I think it's all quite spread out over a few functions, and a lot of the stuff taking the longest interacts with instances of Python classes, I've written. I haven't done bindings between languages before, but something tells me that might make it tricky. I'll look into it, though; performance hasn't been a priority up to now, but it makes sense to look into things before I end up at the point where it'll take a lot of refactoring to get good performance.
viblo
Posts: 206
Joined: Tue Aug 21, 2007 3:12 pm
Contact:

Re: Collision callbacks slow

Post by viblo »

I did some basic profiling myself of a static L shape with a couple of circles on it with a pre_solve callback, and from what I can see the timings reported back from cProfile looks correct, the time is where it should be and it changes accordingly if I add or withdraw some code..

With some expensive loop inside pre_solve() looking up the position on arbiter.shapes[0].body:

Code: Select all

     ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       7388    8.469    0.001   28.302    0.004 callback_benchmark.py:5(pre_solve)
  1403720    8.253    0.000   15.602    0.000 __init__.py:1167(_get_shapes)
  2807440    7.349    0.000    7.349    0.000 __init__.py:1170(_get_shape)
  1403720    2.297    0.000    2.297    0.000 __init__.py:610(_get_position)
  1403720    1.911    0.000    1.911    0.000 __init__.py:863(<lambda>)
       1000    0.149    0.000   28.517    0.029 __init__.py:274(step)
       7388    0.046    0.000   28.366    0.004 __init__.py:419(cf)
       8388    0.025    0.000    0.025    0.000 {range}
       7388    0.017    0.000    0.017    0.000 __init__.py:1145(__init__)


Without the expensive loop:

Code: Select all

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1000    0.128    0.000    0.221    0.000 __init__.py:274(step)
     7388    0.040    0.000    0.092    0.000 __init__.py:419(cf)
     7388    0.025    0.000    0.037    0.000 callback_benchmark.py:5(pre_solve)
     7388    0.015    0.000    0.015    0.000 __init__.py:1145(__init__)
The source is here: http://pastebin.com/Fm3yiP0H
Maybe you can explain what you expected from your own profile, its a bit hard to see for me when I don't know which methods are called where.

Anyway, the cf(419) and __init__(1145) functions are wrappers to convert/hide the C types from chipmunk, cf is called from chipmunk and in turn calls your callback method, __init__ is the constructor for the Arbiter class that is passed to the callback.

I think it would be fairly simple to code a callback directly in C and pass that in instead and bypass the whole python callback wrapping code. But only as long as you don't need any information from the python side of things and are happy with just the arguments that are given by Chipmunk. If you need something from the python side you will have to do something more complicated.
http://www.pymunk.org - A python library built on top of Chipmunk to let you easily get cool 2d physics in your python game/app
J .
Posts: 8
Joined: Tue Mar 09, 2010 4:10 am

Re: Collision callbacks slow

Post by J . »

If you look at the difference in cumulative time for your pre_solve between the examples, that's pretty much exactly the same as the difference in total time for itself and other functions between them. In my results, the callback's cumulative time varies by ~.2s, but Pymunk functions are varying by over a second in total, and I have no idea why that would happen. The number of calls to cf hardly varies, but it takes much longer.
You should be wary of the Python profiler (and all profilers that work by wrapping individual calls) in that it tends to artificially inflate the runtime of small functions.
Considering the functions called by the callback (_get_shape(s), _get_contacts) are reported as taking a lot longer than the callback itself, I guess it must be this. I just don't see why it's happening for me and not in your example. The function how it was at the time of that profile is here (I've since cleaned bits up, especially that silly duplicate call to arbiter.shapes...). I'm not sure how much sense it makes without the rest of the source...
viblo
Posts: 206
Joined: Tue Aug 21, 2007 3:12 pm
Contact:

Re: Collision callbacks slow

Post by viblo »

Ah, I think I understand it now, your question is why cf have so much cumtime when pre_collision doesnt, and why for example _get_shapes have more cumtime than pre_collision even when its only called from pre_collision?

I don't know, and I cant get it to happen with the simple test code I made.. Have you tried to profile with the profile module instead of cProfile?
http://www.pymunk.org - A python library built on top of Chipmunk to let you easily get cool 2d physics in your python game/app
J .
Posts: 8
Joined: Tue Mar 09, 2010 4:10 am

Re: Collision callbacks slow

Post by J . »

Hmm, there is still some difference in the wrong direction, but not as much as with cProfile. I guess if different profilers can have such a discrepancy, the whole thing could be one?
viblo
Posts: 206
Joined: Tue Aug 21, 2007 3:12 pm
Contact:

Re: Collision callbacks slow

Post by viblo »

Maybe.. Guess the best would be to construct a small example and ask on the python list. Or just let it go ;)
http://www.pymunk.org - A python library built on top of Chipmunk to let you easily get cool 2d physics in your python game/app
J .
Posts: 8
Joined: Tue Mar 09, 2010 4:10 am

Re: Collision callbacks slow

Post by J . »

Mhm...

Thanks for the replies, both of you - and the great software.
Post Reply

Who is online

Users browsing this forum: No registered users and 15 guests