Lattice -> Joints

It’s not perfect, but here’s a small script that samples a lattice and tries to set joint weights based on the influence of each lattice point.

Given a set of lattice vertices and a model influenced by those vertices, it will create joints at every lattice point, bind a skin and set the weights.

Usage: just edit the variables at the top & run the script. It’s slapped together really quickly.

It moves every lattice point one by one & stores the amount of movement that occurred per vertex, which is essentially the weight of that point for that vertex.

Issues: small weights vanish completely. You could try dividing the sampled movement by the distance the lattice point was moved to get a 0-1 weight, then apply an inverse s-curve or a pow/sqrt to that value and use that as the weight instead.
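
As a sketch of that idea (a hypothetical helper, not part of the script below): divide by the distance the lattice point was moved, raise the 0-1 result to a power below 1 to lift small values, then renormalize per vertex:

```python
def remap_weights(raw, distance=1000.0, exponent=0.5):
    """Remap raw sampled movement into usable weights.

    raw: per-influence movement amounts for one vertex.
    distance: how far each lattice point was moved (1000 in the script).
    exponent: 0.5 (a square root) boosts small weights; 1.0 is a no-op.
    """
    # normalize the sampled movement to a 0-1 range, then lift small values
    scaled = [pow(v / distance, exponent) for v in raw]
    # renormalize so the weights sum to 1 again
    total = sum(scaled)
    if not total:
        return scaled
    return [v / total for v in scaled]
```

With `remap_weights([1000.0, 10.0])` the second weight survives as roughly 0.09 instead of the 0.0099 a linear normalization would give it.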

Requirements: to set all weights really fast I use a custom “skinWeightsHandler” command; you can write your own ‘set all weights for all joints and then normalize’ routine, or get the plugin by installing Perry Leijten’s skinning tools, for which I originally made this plugin.

from maya import cmds

model = r'polySurface1'
# list every lattice point to use as an influence; the lattice name below is
# only an example, edit these to match your scene:
influences = (r'ffd1Lattice.pt[0][0][0]',
              r'ffd1Lattice.pt[1][0][0]')

def sample(model):
    # world-space positions of all vertices; [1::3] keeps only the Y values,
    # since the lattice points are only moved along Y
    return cmds.xform(model + '.vtx[*]', q=True, ws=True, t=True)[1::3]

def difference(a, b):
    # per-element absolute difference between two equal-length lists
    return [abs(y - x) for x, y in zip(a, b)]

def gather(model, influences):
    original = sample(model)
    weights = {}
    for influence in influences:
        cmds.xform(influence, ws=True, r=True, t=[0, 1000, 0])
        weights[influence] = difference(sample(model), original)
        # move the point back so it does not affect the next sample
        cmds.xform(influence, ws=True, r=True, t=[0, -1000, 0])
    return weights

weights = gather(model, influences)
# generate joints
joints = []
for influence in influences:
    pos = cmds.xform(influence, q=True, ws=True, t=True)
    # clear the selection so the new joint is not parented under the previous one
    joints.append(cmds.joint())
    cmds.xform(joints[-1], ws=True, t=pos)
# concatenate weights in the right way
vertexCount = len(weights.values()[0])
influenceCount = len(influences)
vertexWeights = [0] * (vertexCount * influenceCount)
for i in xrange(vertexCount):
    tw = 0
    for j, influence in enumerate(influences):
        vertexWeights[i * influenceCount + j] = weights[influence][i]
        tw += weights[influence][i]
    if not tw:
        # total weight is 0 for this vertex, nothing to normalize
        continue
    for j in xrange(influenceCount):
        vertexWeights[i * influenceCount + j] /= tw
# expand to shape
if not, type='mesh'):
    model = cmds.listRelatives(model, c=True, type='mesh')[0]
# bind skin, joints)
skinCluster = cmds.skinCluster()[0]
# set weights; SkinWeights is the custom command from the plugin mentioned above
cmds.SkinWeights([model, skinCluster], nwt=vertexWeights)

Maya discovery of the day

If you’re looking for all objects with a specific attribute, it is nice to know that ls and its wildcards also work on attributes! It does not even care whether you supply the long or the short name. To get all objects with a translateX attribute you can simply use:

'*.tx')

Wildcards do not work with some other modifiers however, so you can not do this:

'*.myMetaData', l=True, type='mesh', sl=True)

because the returned type is not a mesh, but an attribute; you can of course do this instead (notice the o=True to return object names rather than attribute names):

'*.myMetaData', o=True), l=True, type='mesh', sl=True)

Just wanted to share that bit of information! And while we’re at it, python supports ‘or’ in arbitrary expressions, so if you wish to find all transforms that contain a mesh (or get the transforms of selected meshes at the same time), you’ll often find yourself doing this:

selected_transforms ='transform', sl=True, l=True)
selected_meshes ='mesh', sl=True, l=True)
if selected_transforms is not None:
    meshes = cmds.listRelatives(selected_transforms, c=True, type='mesh', f=True)
    if meshes is not None:
        if selected_meshes is not None:
            selected_meshes += meshes
        else:
            selected_meshes = meshes
selected_mesh_transforms = []
if selected_meshes is not None:
    selected_mesh_transforms = cmds.listRelatives(selected_meshes, p=True)

Just because ls and listRelatives return None instead of an empty list, this code is super complicated. With ‘or’ we can simply do this:

meshes = ('mesh', sl=True, l=True) or []) + (cmds.listRelatives('transform', sl=True, l=True) or [], c=True, type='mesh', f=True) or [])
selected_mesh_transforms = cmds.listRelatives(meshes, p=True, f=True) or []

Admittedly it is a bit less readable, so my advice is to wrap it in a utility function or name your variables appropriately!
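
The trick hinges on Python’s ‘or’ evaluating to its first operand when that operand is truthy and to the second one otherwise, so `None or []` collapses a None result into an empty list. A tiny Maya-free illustration (the `ls_stub` stand-in is mine, mimicking ls returning None for no matches):

```python
def ls_stub(result):
    # stand-in for, which returns None instead of an empty list
    return result

# None collapses to [], a real list passes through unchanged,
# so the two results can be concatenated without any None checks
meshes = (ls_stub(None) or []) + (ls_stub(['pCube1']) or [])
```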

Simple Maya mesh save/load

I recently wanted to capture some frames of an animation into a single mesh, and really the easiest way to ditch any dependencies & materials was to export some OBJs, import them and then combine them! This is rather slow though, especially when reading gigantic models, and I did not need a lot of the data stored in an OBJ.

So here I have a small utility that stores a model’s position & triangulation and nothing else in a binary format closely resembling the Maya API, allowing for easy reading, writing and even combining during I/O.

Use write() with a mesh (full) name and use read() with a filepath to serialize
and deserialize maya meshes:

import struct
from maya.OpenMaya import MSelectionList, MDagPath, MFnMesh, MGlobal, MPointArray, MIntArray, MSpace, MPoint

def _named_mobject(path):
    # resolve a (full) node name to an MDagPath
    li = MSelectionList()
    MGlobal.getSelectionListByName(path, li)
    p = MDagPath()
    li.getDagPath(0, p)
    return p

def writeCombined(meshes, file_path):
    # start streaming into the file
    with open(file_path, 'wb') as fh:
        # cache function sets
        fns = []
        for mesh in meshes:
            fns.append(MFnMesh(_named_mobject(mesh)))
        # get resulting mesh data sizes
        vertex_count = 0
        poly_count = 0
        index_count = 0
        meshPolygonCounts = []
        meshPolygonConnects = []
        for fn in fns:
            vertex_count += fn.numVertices()
            # we need to get these now in order to keep track of the index_count,
            # we cache them to avoid copying these arrays three times during this function.
            meshPolygonCounts.append(MIntArray())
            meshPolygonConnects.append(MIntArray())
            fn.getVertices(meshPolygonCounts[-1], meshPolygonConnects[-1])
            poly_count += meshPolygonCounts[-1].length()
            index_count += meshPolygonConnects[-1].length()

        # write num-vertices as uint32
        fh.write(struct.pack('<L', vertex_count))

        for fn in fns:
            vertices = MPointArray()
            fn.getPoints(vertices, MSpace.kWorld)

            # write all vertex positions as triplets of three float64s
            for i in xrange(vertices.length()):
                fh.write(struct.pack('<d', vertices[i].x))
                fh.write(struct.pack('<d', vertices[i].y))
                fh.write(struct.pack('<d', vertices[i].z))

        # write num-polygonCounts as uint32
        fh.write(struct.pack('<L', poly_count))

        for i, fn in enumerate(fns):
            # write each polygonCounts as uint32
            for j in xrange(meshPolygonCounts[i].length()):
                fh.write(struct.pack('<L', meshPolygonCounts[i][j]))

        # write num-polygonConnects as uint32
        fh.write(struct.pack('<L', index_count))

        # keep track of how many vertices there are to offset the polygon-vertex indices
        offset = 0
        for i, fn in enumerate(fns):
            # write each polygonConnects as uint32
            for j in xrange(meshPolygonConnects[i].length()):
                fh.write(struct.pack('<L', meshPolygonConnects[i][j] + offset))
            offset += fn.numVertices()

def write(mesh, file_path):
    writeCombined([mesh], file_path)

def readCombined(file_paths):
    numVertices = 0
    numPolygons = 0
    vertices = MPointArray()
    polygonCounts = MIntArray()
    polygonConnects = MIntArray()

    for file_path in file_paths:
        with open(file_path, 'rb') as fh:
            # read all vertices
            n = struct.unpack('<L',[0]
            for i in xrange(n):
                x, y, z = struct.unpack('<3d',
                vertices.append(MPoint(x, y, z))
            file_vertex_count = n

            # read all polygon counts
            n = struct.unpack('<L',[0]
            numPolygons += n
            for v in struct.unpack('<%sL' % n, * 4)):
                polygonCounts.append(v)

            # read all polygon-vertex indices, offsetting them
            # so they match the merged mesh vertex IDs
            n = struct.unpack('<L',[0]
            for v in struct.unpack('<%sL' % n, * 4)):
                polygonConnects.append(v + numVertices)

            numVertices += file_vertex_count

    new_object = MFnMesh()
    new_object.create(numVertices, numPolygons, vertices, polygonCounts, polygonConnects)
    return new_object.fullPathName()

def read(file_path):
    with open(file_path, 'rb') as fh:
        numVertices = struct.unpack('<L',[0]
        vertices = MPointArray()
        for i in xrange(numVertices):
            x, y, z = struct.unpack('<3d',
            vertices.append(MPoint(x, y, z))
        numPolygons = struct.unpack('<L',[0]
        polygonCounts = MIntArray()
        for v in struct.unpack('<%sL' % numPolygons, * 4)):
            polygonCounts.append(v)
        n = struct.unpack('<L',[0]
        polygonConnects = MIntArray()
        for v in struct.unpack('<%sL' % n, * 4)):
            polygonConnects.append(v)

    new_object = MFnMesh()
    new_object.create(numVertices, numPolygons, vertices, polygonCounts, polygonConnects)
    return new_object.fullPathName()

I basically used a snippet like this to snapshot my animation:

import os
import tempfile
from maya import cmds

tempfiles = []
for f in (0, 4, 8, 12):
    # jump to the frame to capture (the frame numbers are just an example;
    # the temp file naming scheme here is illustrative)
    cmds.currentTime(f)
    tempfiles.append(os.path.join(tempfile.gettempdir(), 'snapshot_%04d.bin' % f))
    writeCombined('mesh', l=True), tempfiles[-1])
newmesh = readCombined(tempfiles)
for p in tempfiles:
    os.remove(p)

Important notice: I have found some random crashes when using a large amount of memory (high polycount per frame) in the writeCombined function (which may be solvable when ported to C++ and receiving proper error data).

Parameter to nurbs surface node

A simple deformer that reprojects a source mesh (considered as UVW coordinates)
onto a (series of) nurbs surfaces.

Inspired by “It’s a UVN Face Rig”

It takes an array of nurbs surfaces which must be at least length 1,
a polygonal mesh where the point positions are considered parameters on the nurbs surface; Z being an offset in the normal direction (hence UVN),
and an optional int array where there can be one entry per input vertex, stating which nurbs surface this vertex should project onto.

The default surface for every vertex is 0, so for a single nurbs surface projection no array is needed and only overrides have to be specified.
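
The projection idea can be sketched outside Maya: treat a vertex position as (u, v, n), two surface parameters plus an offset along the surface normal. This illustrative code (not the plugin’s actual evaluation) uses a unit sphere as a stand-in parametric surface, whose normal is simply the normalized position:

```python
import math

def sphere_point(u, v):
    # parametric unit sphere, u and v in [0, 1]
    theta = u * 2.0 * math.pi
    phi = v * math.pi
    return (math.sin(phi) * math.cos(theta),
            math.sin(phi) * math.sin(theta),
            math.cos(phi))

def uvn_project(u, v, n):
    # surface point plus an offset of n along the surface normal;
    # for a unit sphere the normal equals the (unit-length) position
    px, py, pz = sphere_point(u, v)
    return (px * (1.0 + n), py * (1.0 + n), pz * (1.0 + n))
```

A vertex at (0.25, 0.5, 0.5) lands half a unit above the sphere’s equator surface, i.e. at radius 1.5 from the center.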

This includes full source + a project compiled using VS2015 for Maya2015 x64.
Download zip

Python test code:

from maya import cmds

PLUGIN = r'UVNDeformer.mll'
# the plugin must be loaded before its node type exists
cmds.loadPlugin(PLUGIN)

node = cmds.createNode('UVNNurbsToPoly')
nurbs = cmds.sphere()[0]
uvn = cmds.polyPlane()[0] + '.vtx[*]')
cmds.rotate(90, 0, 0, r=True)
cmds.move(0.5001, 0.5001, r=True)
result = cmds.createNode('mesh')
cmds.connectAttr(nurbs + '.worldSpace[0]', node + '.ins[0]')
cmds.connectAttr(uvn + '.outMesh', node + '.iuvnm')
cmds.connectAttr(node + '.outMesh', result + '.inMesh') + '.vtx[*]')

SIMD Matrix math for Python

Long story short: scroll down for a downloadable DLL and python file that do matrix math using optimized SIMD functions.

Recently I was messing around with some 3D in PyOpenGL and found my most notable slowdowns occurring due to matrix math (multiplications being the most common).

So I decided to try and implement some fast matrix functions and call those from python, using C98 limitations and ctypes as explained here by my friend Jan Pijpers.

I won’t go into detail about the math, you can download the project files at the end; any sources used are referenced in there.

I do have some profile results to compare! Doing 100,000 calls for each action listed, time displayed in seconds.

Pure python implementation.

identity: 0.0331956952566
rotateY: 0.0617851720355
transform: 1.70942981948
inverse: 15.095287772
multiply: 0.492130726156
vector: 0.160486968636
perspective: 0.107690428216
transpose: 0.452984656091

Note that the inverse is matrix size agnostic (and not normalized!), therefore no loop unrolling is done by the python compiler. It is not representative of a proper python matrix4x4 inverse.
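
For context, the pure-Python multiply being benchmarked is presumably something along these lines (my reconstruction using row-major flat lists of 16 floats, not the author’s exact code):

```python
def mat44_multiply(a, b):
    # both matrices are flat lists of 16 floats, row-major
    out = [0.0] * 16
    for row in range(4):
        for col in range(4):
            s = 0.0
            for k in range(4):
                s += a[row * 4 + k] * b[k * 4 + col]
            out[row * 4 + col] = s
    return out

IDENTITY = [1.0, 0.0, 0.0, 0.0,
            0.0, 1.0, 0.0, 0.0,
            0.0, 0.0, 1.0, 0.0,
            0.0, 0.0, 0.0, 1.0]
```

The triple loop over Python floats is exactly what the SIMD version collapses into a handful of `_mm_mul_ps`/`_mm_add_ps` instructions.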

Using VC++ 14.0 MSBUILD, compiling release with -O2 and running without the debugger.

identity: 0.0333827514946
rotateY: 0.0857555184901
transform: 0.251571437936
inverse: 0.0439880125093
multiply: 0.0420022367291
vector: 0.288415226444
perspective: 0.156626988673
transpose: 0.0889596428649
perspective no SIMD: 0.160488955074

Using LLVM 14.0 from Visual Studio (not sure which linker is used there), compiling release with -O2 and running without the debugger (-O3 doesn’t change the results).

identity: 0.0323709924443
rotateY: 0.0845113462024
transform: 0.23958858222
inverse: 0.0395744785104
multiply: 0.0437013033019
vector: 0.286256299491
perspective: 0.150614703216
transpose: 0.0877707597662
perspective no SIMD: 0.156242612934

Interestingly, not all operations are faster in C, due to type conversions. For a simple axis-aligned rotation all we need is a sin, a cos and a list. The sin/cos of python are not going to be any slower than those in C, so all we did was complicate the program.

But in a practical example, represented by the transform function (which is a separate rotateX, rotateY, rotateZ and translation matrix call, then all four of them multiplied together), we see a very worthwhile performance gain.

The math executes using SIMD instructions, so all data is converted to 16-byte memory-aligned “__m128” structures (from “xmmintrin.h”). We need the C identity and rotate constructors to get the proper type of data; when we actually need this data we must call storeMat44() to get an actual c_float[16] for python usage.
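
The conversion storeMat44() implies on the Python side boils down to filling a c_float[16]; with the standard-library ctypes module that looks roughly like this (illustrative, not the project’s actual code):

```python
import ctypes

def to_c_float16(values):
    # pack 16 python floats into a contiguous c_float[16] array,
    # the kind of buffer PyOpenGL-style APIs accept directly
    if len(values) != 16:
        raise ValueError('expected 16 values')
    return (ctypes.c_float * 16)(*values)
```

Each such conversion copies all 16 values and truncates them to 32-bit floats, which is where the extra cost in the “convert” rows below comes from.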

From my current usage, 1 in 3 matrices requires a conversion back to python floats in order to be passed to PyOpenGL, so here is another multiply performance test with every third multiplication stored back into a float[16]…

python multiply: 0.492130726156
MVC raw: 0.0436761417549
MVC multiply convert every third: 0.06491612928
MVC convert all: 0.0925153667527

So while our raw implementation is about 11 times faster, the fully converting implementation is only 5 times faster. 7.5 times for our real world example. That’s more than 30% lost again… still much better than pure python though!

Download the visual studio project with x86 and x64 binaries here! Tested on Win10 x64 with Python 2.7.10 x64.
Math3Dx64.Dll and the accompanying python file are the end user files.

One important thing I wish to look into is to pass data by copy instead, currently all functions allocate a new matrix on the heap, and the user has to delete these pointers by hand from python using the deleteMat44() helper function. I do not know enough about DLLs or python’s memory allocation to know whether I can copy data from the stack instead, and if so whether that would be any faster.

I do know that __vectorcall is not compatible with __declspec(dllexport), which kind of makes sense… but more direct data passing could be nice.