mewert wrote:
I came across this optimization on my own, even though I have actually seen code that uses it. I missed it before because I was busy keeping my Jacobians straight, and it's usually kind of obfuscated. Going back now I've noticed it's in ODE, it's in Erin Catto's paper, it's all over! My question is this: Is everyone arriving at this optimization on their own, or has it been detailed somewhere before?
Most of these things is something you pick up over the years. ODE is a great source for studying many of these kind of "optimizations".
mewert wrote:
If so, what other order of magnitude optimizations am I missing out on!

Well for Gauss-Seidel I would say that the next logical step is a two stage LCP solver. Say the full problem is defined by the matrix equation (w and lambda being complementary):
w = A lambda + b
where A = J M^-1 J^T and b = J u + F_ext. Then for n-contact points you get n 3x3 LCPs looking something like this
w_i = J_ii^T M_i^-1 J_ii lambda_i + b_i + sum_j J_ij F_j
which we can simplify to drop the F-terms
w_i = J_ii^T M_i^-1 J_ii lambda_i + b_i + sum_j (w_j - b_j)
Setting A_i' = J_ii^T M_i J_ii and b_i'= b_i + sum_j w_j - b_j, you end up with an LCP defined by the
w_i = A_i' lambda_i + b_i'
These smaller LCPs can be solved cheaply using some direct method. These kinds of solvers have been described by M.Jean and J.J.Moreau, so it is really nothing new.
There is a more recent paper by Renouf, Acary, and Dumont describing the same thing, As I recall they refer to it as a blocked non-smooth projected gauss-seidel. However, I do not think they reported any convergence results.
Notice that keeping b_i' from iteration to iteration, you can update this really fast once you have solved for a lambda_j. I believe with such a method you exploit data-locality better, besides some of the nasty index book-keeping and cache worries seem to disappear.
However, you can solve all the sub-problems in parallel and need I then say GPU or PPU.... One can even solve the smaller problems in a non-linear manner, this really allows one to use anisotropic friction etc. Personally I would favor using a friction splitting method. I.e. first solve for the normal force, then solve for the frictional forces by projecting onto some limit surface.