Python dependency graph

Beware, code heavy post
Node graphs are a great way to express relationships and logical flow between objects, both for code design and 3D scenes. Where a tree can only express single parent-child relationships, a node graph can show a hierarchy and then display how one object has a look-at logic to look at another object entirely.

This can be done for almost any type of data, and a dependency graph is an evaluation model that tries to evaluate as little of the graph as possible. An evaluation model is the way our code traverses the graph’s data to compute the end result.

A dependency graph works by lazily computing data and by storing the “dirty” state of each attribute (dirty meaning something has been changed since the last calculation of this attribute). I’ll try to illustrate with this animation of 2 colors being mixed. When an input changes it propagates that change by “dirtying” all downstream dependencies (red).

When a value is requested, input wires are followed and for outputs the node is computed (yellow), but only if the attribute is “dirty”, else we just use the cached state (green).

So to sum it up in a few simple rules:

To keep it simple I’ll say “if an input on a node changes all its outputs are dirty”, which means nodes should be quite granular as having inputs that only change part of the outputs will waste computation.

Also any connection going out of an attribute (whether it is an input feeding another input directly or an output feeding another input) will dirty all targets. These 2 simple rules result in recursive dirtying of all the right inputs and outputs throughout the graph.

When a value is requested its cached value is returned unless it’s “dirty“ in which case (for outputs) the node will be computed or (for inputs) the other side of the connection will be queried, which results in this rule applying recursively and the minimal graph being computed to get the right value in the end.

Coding it

Now with that theory out of the way, I’ve written some python to do exactly this. First let’s take a look at the initial data structure. A plug must understand connections, ownership, dirty-state and retain an internal value (e.g. the result of computing an output, or just a constant value for an input that is not connected to anything).

class Plug(object):
    def __init__(self, name, node, isComputed):
        self.__name = name  # really used for debugging only
        self.__node = node  # the parent node this plug belongs to
        self.__isComputed = isComputed  # make this an output-plug
        self.__isDirty = isComputed  # upstream dependencies changed? computed plug always have dependencies on creation
        self.__cache = None  # last computed value for outputs, user set value for inputs
        self.__sources = []  # plugs we depend on
        self.__targets = []  # plugs we feed into

    def clean(self):
        self.__isDirty = False

class DependNode(object):
    def __init__(self, computeFunc=None):
        self.__plugs = OrderedDict()  # list of node plugs, dict key should match plug name
        self._computeFunc = computeFunc  # logic implementation, should be a callable that accepts this DependNode instance as only argument

    def compute(self):
        # call compute function if node has logic
        if self._computeFunc is not None:
            self._computeFunc(self)

        # clean all plugs
        for plug in self.__plugs.itervalues():
            plug.clean()

    def addInput(self, name):
        self.__plugs[name] = Plug(name, self, isComputed=False)

    def addOutput(self, name):
        self.__plugs[name] = Plug(name, self, isComputed=True)

This template allows us to set up a basic node with inputs and outputs and to provide a custom function to calculate this node’s outputs.

const = DependNode()
const.addInput('value')

The next step is to allow setting a value to a specific plug. To do this, I’ll use some meta-programming to make the resulting code more readable. Implementing __getattr__ and __setattr__ on the DependNode as well as value() and setValue() on the plug to alter the Plug.__cache:

# Added to the end of the Plug class:
    def value(self):
        return self.__cache

    def setValue(self, value):
        self.__cache = value

# Added to the end of the DependNode class:
    def __getattr__(self, name):
        return self.__plugs[name].value()

    def __setattr__(self, name, value):
        if name not in ('_DependNode__plugs', '_DependNode__computeFunc'): # To enable setting attrs in __init__ we must revert to default behaviour for those atrtibute names, notice how mangled names are required for private attributes.
            return self.__plugs[name].setValue(value)
        return super(DependNode, self).__setattr__(name, value)

Now we can do this and see the internal state reflected properly:

const.value = 2.0

The next step I suppose is to create a simple “add” node which sums up its inputs. To do this we’ll have to add connections. I’ll implement this on the Plug class, where a plugs sources and targets are managed implicitly:

    def addSource(self, sourcePlug):
        if sourcePlug not in self.__sources:
            self.__sources.append(sourcePlug)
            sourcePlug.__targets.append(self)

    def removeSource(self, sourcePlug):
        if sourcePlug in self.__sources:
            self.__sources.remove(sourcePlug)
            sourcePlug.__targets.remove(self)

The next step is to start implementing some of our rules. First rule we need is: compute when we are dirty. This I do in Plug.value().

    def value(self):
        if self.__isDirty:
            # we are going to get clean, set clean beforehand so the compute function can get other plug values without recursively triggering compute again.
            self.__isDirty = False
            # compute dirty output
            if self.__isComputed:
                self.__node.compute()
            # fetch dirty input connection
            elif self.__sources:
                self.__cache = [source.value() for source in self.__sources]
            # plug is clean now
            self.clean()
        # return internal state
        return self.__cache

So now outputs are computed, inputs return their sources, all this is cached on the plug so it only computes when dirty is true.
The next step is to actually set and propagate the dirty state. So whenever we set a value, set a source, or a plug gets dirtied: all outgoing connections are dirtied. When the dirty happens to an input, the node’s outputs are dirtied. See the _dirty implementation below for the Plug class. The setValue and add- removeSource functions just get a _dirty call in the end.

# modifications to Plug
    def setValue(self, value):
        self.__cache = value
        self._dirty()

    def addSource(self, sourcePlug):
        if sourcePlug not in self.__sources:
            self.__sources.append(sourcePlug)
            sourcePlug.__targets.append(self)
            self._dirty()

    def removeSource(self, sourcePlug):
        if sourcePlug in self.__sources:
            self.__sources.remove(sourcePlug)
            sourcePlug.__targets.remove(self)
            self._dirty()

# added to Plug
    def _isComputed(self):
        return self.__isComputed

    def _dirty(self):
        if self.__isDirty:
            return
        self.__isDirty = True

        # dirty plugs that are computed based on this plug
        if not self.__isComputed:
            for plug in self.__node.iterPlugs():
                if plug._isComputed():
                    plug._dirty()

        # dirty plugs that use this plug as source
        for plug in self.__targets:
            plug._dirty()

# the DependNode class also gets this iterPlugs implementation
    def iterPlugs(self):
        return self.__plugs.itervalues()

Whew, now after all that we should be able to run this test to add 2 and 4 together using 3 nodes:

const = DependNode()
const.addInput('value')
const.value = 2.0

const2 = DependNode()
const2.addInput('value')
const2.value = 4.0


def addFunc(node):
    node.output = sum(node.inputs)


add = DependNode(addFunc)
add.addInput('inputs')
add.addOutput('output')
add.plug('inputs').addSource(const.plug('value'))
add.plug('inputs').addSource(const2.plug('value'))

print add.output

Here is a slightly more interesting example of a pointOnCurve node that computes a point on a bezier segment and feeds it to a different node:

def pointOnCurve(node):
    # interpolate a 2D bezier segment with t=node.parameter
    cvs = node.cvs2[0]
    t = node.parameter[0]
    r = [0, 0]
    for i in xrange(2):
        a, b, c = cvs[1][i] - cvs[0][i], cvs[2][i] - cvs[1][i], cvs[3][i] - cvs[2][i]
        d = b - a
        a, b, c, d = c - b - d, d + d + d, a + a + a, cvs[0][i]
        r[i] = (t * (t * (t * a + b) + c) + d)
    node.point = tuple(r)

timeNode = DependNode()
timeNode.addInput('time')
timeNode.time = 0.0

curveNode = DependNode()
curveNode.addInput('cvs')
curveNode.cvs = ((0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (0.2, 0.2))

pointOnCurveNode = DependNode(pointOnCurve)
pointOnCurveNode.addInput('cvs2')
pointOnCurveNode.addInput('parameter')
pointOnCurveNode.plug('parameter').addSource(timeNode.plug('time'))
pointOnCurveNode.plug('cvs2').addSource(curveNode.plug('cvs'))
pointOnCurveNode.addOutput('point')

transform = DependNode()
transform.addInput('translate')
transform.plug('translate').addSource(pointOnCurveNode.plug('point'))

print transform.translate
timeNode.time = 0.5
print transform.translate

Now there are 2 more improvements I want to add. First whenever a plug has sources, there is no real need to cache the result. We can just directly read from the sources and save ourselves a bunch of copying. Second I want to be able to alter an input plug temporarily. Imagine an interface with a point moving over time, it may be nice to alter that point by hand to e.g. feed that back in some animation system. In this case it is important for us to set the translate of a transform without it jumping back to the source time as soon as we redraw (which would read the translate attribute).

First I edit Plug.value() to use the cache only for computed data.
Now that __cache is freed up I plan to use it for the user-override. So if a cache is available I want to use it at all times.
Next I return sources if available, else cache again which should in this case be computed.

Next in Plug._dirty() I only dirty computed plugs, and I set the cache to None if the plug as sources coming in. This will result in a small problem with Plug.setValue: it currently caches the value and then dirties the plug. That means that on input plugs the cache is set to None immediately. I just swap those 2 lines so my setValue sticks, and the user override also works.

    def value(self):
        if self.__isDirty and self.__isComputed:
            # compute dirty output
            if self.__isComputed:
                self.__node.compute()
            # plug is clean now
            self.clean()

        # override cache for input attributes to intervene with connections?
        if self.__cache is not None:
            return self.__cache

        # fetch input connection
        if self.__sources:
            return [source.value() for source in self.__sources]

        return self.__cache

    def _dirty(self):
        if self.__isComputed:
            # don't dirty again
            if self.__isDirty:
                return
            self.__isDirty = True
        if self.__sources:
            self.__cache = None

        # dirty plugs that are computed based on this plug
        if not self.__isComputed:
            for plug in self.__node.iterPlugs():
                if plug._isComputed():
                    plug._dirty()

        # dirty plugs that use this plug as source
        for plug in self.__targets:
            plug._dirty()

    def setValue(self, value):
        self._dirty()
        self.__cache = value

Using the previous example I can now see how overriding the parameter works until time is changed again making it use the source connection again:

print transform.translate
timeNode.time = 0.5
print transform.translate
pointOnCurveNode.parameter = 0.25
print transform.translate
timeNode.time = 0.5
print transform.translate

Full code:

from collections import OrderedDict


class Plug(object):
    def __init__(self, name, node, isComputed):
        self.__name = name  # really used for debugging only
        self.__node = node  # the parent node this plug belongs to
        self.__isComputed = isComputed  # make this an output-plug
        self.__isDirty = isComputed  # upstream dependencies changed?
        self.__cache = None  # last computed value for outputs, user set value for inputs
        self.__sources = []  # plugs we depend on
        self.__targets = []  # plugs we feed into

    def clean(self):
        self.__isDirty = False

    def value(self):
        if self.__isDirty and self.__isComputed:
            # compute dirty output
            if self.__isComputed:
                self.__node.compute()
            # plug is clean now
            self.clean()

        # override cache for input attributes to intervene with connections?
        if self.__cache is not None:
            return self.__cache

        # fetch input connection
        if self.__sources:
            return [source.value() for source in self.__sources]

        return self.__cache

    def setValue(self, value):
        self._dirty()
        self.__cache = value

    def addSource(self, sourcePlug):
        if sourcePlug not in self.__sources:
            self.__sources.append(sourcePlug)
            sourcePlug.__targets.append(self)
            self._dirty()

    def removeSource(self, sourcePlug):
        if sourcePlug in self.__sources:
            self.__sources.remove(sourcePlug)
            sourcePlug.__targets.remove(self)
            self._dirty()

    def _isComputed(self):
        return self.__isComputed

    def _dirty(self):
        if self.__isComputed:
            # don't dirty again
            if self.__isDirty:
                return
            self.__isDirty = True
        if self.__sources:
            self.__cache = None

        # dirty plugs that are computed based on this plug
        if not self.__isComputed:
            for plug in self.__node.iterPlugs():
                if plug._isComputed():
                    plug._dirty()

        # dirty plugs that use this plug as source
        for plug in self.__targets:
            plug._dirty()


class DependNode(object):
    def __init__(self, computeFunc=None):
        self.__plugs = OrderedDict()  # list of node plugs, dict key should match plug name
        self.__computeFunc = computeFunc  # logic implementation, should be a callable that accepts this DependNode instance as only argument

    def iterPlugs(self):
        return self.__plugs.itervalues()

    def compute(self):
        # call compute function if node has logic
        if self.__computeFunc is not None:
            self.__computeFunc(self)

        # clean all plugs
        for plug in self.__plugs.itervalues():
            plug.clean()

    def addInput(self, name):
        self.__plugs[name] = Plug(name, self, isComputed=False)

    def addOutput(self, name):
        self.__plugs[name] = Plug(name, self, isComputed=True)

    def plug(self, name):
        return self.__plugs[name]

    def __getattr__(self, name):
        return self.__plugs[name].value()

    def __setattr__(self, name, value):
        if name not in ('_DependNode__plugs', '_DependNode__computeFunc'):
            return self.__plugs[name].setValue(value)
        return super(DependNode, self).__setattr__(name, value)


def pointOnCurve(node):
    print 'compute'
    # interpolate a 2D bezier segment with t=node.parameter
    cvs = node.cvs2[0]
    t = node.parameter[0]
    r = [0, 0]
    for i in xrange(2):
        a, b, c = cvs[1][i] - cvs[0][i], cvs[2][i] - cvs[1][i], cvs[3][i] - cvs[2][i]
        d = b - a
        a, b, c, d = c - b - d, d + d + d, a + a + a, cvs[0][i]
        r[i] = (t * (t * (t * a + b) + c) + d)
    node.point = tuple(r)

Now I’ve also been working on sort of an entity-component based system on top of this graph, based on the PyOpenGL mesh renderer from a few posts back.

It allows me to create a scene graph based on Transforms and exported logic (using DependNodes like above) but also logical relationships which are not necessarily customizable, e.g. when rendering a camera I can query it’s entity, whose transform will provide the view matrix, without connecting these things. This creates a more classical (and less customizable) framework for render logic, but gives full control over the scene logic behind it through the dependency graph.

Using the point on curve example above and feeding it to a cube’s transform – while also implementing a timeline UI & 3D gizmo drawing – this fun thing came out:

Now the next steps would probably involve adding shader support, file watchers and creating a more elaborate Maya exporter that supports various Maya nodes (Maya is also dependency graph based so it fits very well with this!) but I’m not sure if I’m going to keep working on this.

Leave a Reply

Your email address will not be published. Required fields are marked *