« January 2007 | Main | March 2007 »

February 27, 2007

Core Data Double Linked List How To

I have been working on a core data app lately and stumbled upon some data that should maintain relative order. Since core data is a collection of objects stored in a database of sorts, this is not supported out of the box. There are a couple of ways to accomplish this.

Most of us who find ourselves in this situation will keep an incremental index as an instance variable in each member of our collection and sort based on that index number. Under certain circumstances, this is the best way to do it. However, A kind of nasty problem arises when we want to reorder the collection of objects. Either we're forced to auto-increment the indices by something other than 1 to allow us a little wiggle room for local insertion or reordering of objects or we're going to have to re-index most (or all) of the objects in our collection any time the order changes or an object is inserted. For instance, if I were to insert an object at the front of an indexed list, I would probably have to iterate through the entire collection to increment the index number. No fun.

While reading the long back and forth on the cocoadev list from June of 2005, I stumbled upon a post by Bill Bumgarner that I found quite interesting.

Actually, depending on context of implementation, using a linked list 
is both quite easy and very efficient.  Obviously, inserting/deleting 
objects into a linked list is trivial and fast if you have both a 
next and a previous pointer.  Given that Core Data will maintain the 
inverse relationship automatically, it will also maintain the linked 
list pointers automatically.


Given that many UI type usage contexts will require all of the objects to be faulted into memory anyway, walking the "next" relations and filling an NSMutableArray for display purposes is about three lines of code.

... In this particular example, the amount of overhead in the form of additional lines of code incurred by the developer is pretty trivial. ... b.bum
He's writing about a common data structure called a double (or doubly) linked list wherein each object in the list holds a pointer to the next and previous objects that can, in certain situations, be more useful for ordering data than an indexed set. We can enumerate in either direction very easily, asking each object in the chain for next or previous until we get nil and we can insert, remove and reorder objects with little hassle. All we have to do is make sure that next and previous values of adjacent objects are set appropriately (more on this in a second). Unfortunately, We can't (efficiently) get an object at an index like we can with an indexed list because we're purposely not indexing the collection. Also, remember that much of cocoa will want a data structure that responds to objectAtIndex which, as Bill mentiones above, will require us to iterate the chain and put it into an array. We also can't place an object in the chain at two different positions (but two objects with identical attributes can coexist in the chain).

To use core data for ordered collections we have a decision to make. Either we can make insertion and reordering fast or we can make getting an object at an index fast. If you like the indexed set idea you can go here:

Jesse Grosjean's implementation and Uli Kusterer's as well

If you like the double linked list idea, stick around. You can download the whole package from my svn repository by getting subversion and typing the following into the terminal:
svn co https://jonathansaggau.com/svn/MOLinkedList
Data Model

Model.jpg There are four entities here:
JSMOLLContainer (abstract entity):
  Contains most of the logic of inserting, deleting, reordering, etc.
  Relationship:
     Nodes: To-Many relationship
            Delete Rule: cascade (removing the container removes the nodes)
ThingContainer (subentity of JSMOLLContainer):
  You can replace this with your subentity, including any attributes you might want to add.
JSMONode(abstract entity)::
  Contains only getters and setters for next and previous.
  Relationships:
     Container (JSMOLLContainer):
            Delete Rule: Nullify
     Next (JSMONode):
            Delete Rule: Nullify
     Prev (JSMONode):
            Delete Rule: Nullify
Thing (subentity of JSMONode):
  You can replace this with your subentity, including any attributes you might want to add.  I've chosen to add a "name" attribute (a string) as a demonstration.
Implementation

I chose to use Jonathan "Wolf" Rentzsch's mogenerator to create default implementation files. It's awesome. Default accessors are built for you and placed in a machine-generated superclass for each of your objects. Mogenerator's code keeps the managed object model consistent for us and allows us to implement specific functionality in our class files. This is magic.

First tries are always throw away code, right? When I tried to implement a double linked list through messaging the node objects (say calling setNext directly on a node object to insert a new node), things became difficult in a hurry. Basic functionality was simple; the problems started creeping in when I realized that I had to guard against a lot of strange edge cases when inserting and removing objects. Coredata objects like to stick around for undo / redo purposes, so I was constantly making sure I didn't call any objects that responded to isDeleted with YES. Needless to say, I couldn't get the unit tests to work reliably talking just to the nodes.

If one tries to implement a double linked list by forcing the user to message a managing container (something we're used to doing with arrays anyway), life gets much simpler. Here we quickly rub up against a very nice feature of coredata (and also the most frustrating thing with regard to debugging). It will happily fire inverse relationships for us to keep the object graph consistent. When we call setNext on our JSMONode subclass, coredata calls setPrevious on the inverse entity. If we forget that core data is doing that for us, we get into infinite loops.

Here's the interface file for the custom parts of the Container

 
#import "_JSMOLLContainer.h"
//_JSMOLLContainer is the machine generated code from mogenerator.  It has only
//the accessors and the KVO notifications within.

@interface JSMOLLContainer : _JSMOLLContainer 
{
    //we cache first and last values because the fetch takes a while
    @private
    JSMONode *_cachedFirst;
    JSMONode *_cachedLast;
}

- (void)addNodesObjectAtFront:(JSMONode*)value_;
- (void)addNodesObjectAtEnd:(JSMONode*)value_;
- (void) moveNode: (JSMONode *)moveMe
           before: (JSMONode *)beforeMe;
- (void) moveNode: (JSMONode *)moveMe
            after: (JSMONode *)afterMe;
- (JSMONode *)first;
- (JSMONode *)last;
- (NSArray *)nodesArray;
- (NSEnumerator *)nodesEnumerator;
- (NSEnumerator *)reverseNodesEnumerator;
@end
Adding a node


To add a node object we first add it to our collection of nodes by calling the mogenerater generated superclass method. This fires all of the appropriate KVO/KVC notifications. You'll no doubt notice that there are ivars (_cachedFirst and _cachedLast) for caching the first and last nodes in the chain. It turns out that we need to know first and last pretty often. Fetching this from core data can be painfully slow, so we cache this information when we can.
- (void)addNodesObject: (JSMONode*)value_
{
    //add node object at the END of the chain.
    [super addNodesObject: value_];
    JSMONode *last = [self last];
    JSMONode *newNext = [last next];  // should be null
    [value_ setPrev: last];
    [value_ setNext: newNext];
    [self set_cachedLast: value_];
    if (IsEmpty(last))
        [self set_cachedFirst:value_];
}
There is a similar method for putting a node object at the head of the chain.

Removing a node


To remove a node object we must first remove it from the superclass (this removes it from our collection) and set the bordering objects' next and previous values appropriately to keep the chain consistent. The appropriate inverse relationship methods are fired for us. (Remember those infinite loops I was talking about?).
- (void)removeNodesObject: (JSMONode*)value_ 
{
    [super removeNodesObject: value_];
    //set next and prev as needed on bordering objs
    if (!IsEmpty([value_ prev]))
        [[value_ prev] setNext: [value_ next]];
    else
        [[value_ next] setPrev: [value_ prev]];
}
Reordering
The easiest way to reorder a node is first to remove it (thus keeping the chain consistent with regard to adjacent objects), then tell the new previous node to set its next object to the node being moved and tell the node being moved to set its next object accordingly. (At 3:00 AM, this gets hard on the mind).
- (void) moveNode: (JSMONode *)moveMe
           before: (JSMONode *)beforeMe
{
    [self removeNodesObject: moveMe];
    
    JSMONode *myNewPrev = [beforeMe prev];
    if(!IsEmpty(myNewPrev))
    {
        //if we are not moving moveMe to the front of the chain, this works
        //otherwise there IS NO myNewPrev and we have to use the method below.
        [myNewPrev setNext: moveMe];
        [moveMe setNext: beforeMe];
    }
    else
    {
        [moveMe setPrev: myNewPrev];
        [beforeMe setPrev: moveMe];
    }
}
The rest of the implementation deals with building up simple enumerators and also with fetching the first and last nodes (where prev is Null or next is Null respectively) from the Managed Object Context when it turns out we haven't cached it appropriately. Eventually, the fetch requests will go away. Right now, they'll only likely be needed when we've done some reordering as that code doesn't update the cached first and last values as it should. To make sure that we're really returning the first node, we simply check that the _cachedFirst node's prev value is Null (actually [NSNull null]); that's what IsEmpty does here.
- (NSArray *)_nodesFilteredUsingPredicate: (NSPredicate *)predicate
{
    if(0 == [[self nodesSet] count]) return [NSArray array];
    NSEntityDescription * entity = [NSEntityDescription entityForName: @"JSMONode" 
                                               inManagedObjectContext: [self managedObjectContext]];
    NSFetchRequest * fetch = [[NSFetchRequest alloc] init]; 
    [fetch setEntity: entity]; 
    [fetch setPredicate: predicate]; 
    
    NSManagedObjectContext *context = [self managedObjectContext];
    NSArray * results = [context executeFetchRequest: fetch error: nil];
    [fetch release];
    return results;
}

- (JSMONode *)first
{
    if(0 == [[self nodesSet] count]) return nil;
    //If we haven't yet cachedFirst we need to find it.
    //IF cachedFirst previous value is not null, cachedFirst is no longer actually first.
    if (!IsEmpty(_cachedFirst) && IsEmpty([_cachedFirst prev]))
        return _cachedFirst;
    
    JSMONode *first;
    NSPredicate *predicate = [NSPredicate predicateWithFormat: @"container == %@ AND prev == %@", self, [NSNull null]];
    NSArray *results = [self _nodesFilteredUsingPredicate: predicate];
    
    if (0 == [results count])
    {
        [self set_cachedFirst: nil];
        return nil;
    }
    
    NSAssert(([results count] == 1), @"There should be one or fewer firstThing(s)");
    first = [results objectAtIndex: 0];
    [self set_cachedFirst: first];
    return first;
}

- (JSMONode *)last
{
    if(0 == [[self nodesSet] count]) return nil;
    if (!IsEmpty(_cachedLast) && IsEmpty([_cachedLast next]))
        return _cachedLast;
    
    JSMONode *last;
    NSPredicate *predicate = [NSPredicate predicateWithFormat: @"container == %@ AND next == %@", self, [NSNull null]];
    NSArray *results = [self _nodesFilteredUsingPredicate: predicate];    
    if (0 == [results count])
    {
        [self set_cachedLast: nil];
        return nil;
    }
    
    NSAssert(([results count] == 1), @"There should be one or fewer lastThing(s)");
    last = [results objectAtIndex: 0];
    [self set_cachedLast: last];
    return last;
}
Meanwhile, back at the Hall of Justice: usage So how do we use it? We simply insert a JSMOLLContainer subclass (in the case of the diagram above, it's a ThingContainer) into the managed object context like so:

container = [NSEntityDescription insertNewObjectForEntityForName:@"ThingContainer" 
                                                               inManagedObjectContext:moc];


Then we start putting node objects into it like so:

thingZero = [[NSEntityDescription insertNewObjectForEntityForName:@"Thing" 
                                                inManagedObjectContext:moc]

     //perhaps we want to set an attribute
     [thingZero setName:@"Thing Zero Here"];
     [container addNodesObject:thingZero];

     thingOne = [[NSEntityDescription insertNewObjectForEntityForName:@"Thing" 
                                                inManagedObjectContext:moc]

     //perhaps we want to set an attribute
     [thingOne setName:@"Thing One Here"];
     [container addNodesObject: thingOne];
Now we've a container with two things in this order: thingZero -> thingOne.

If we wanted to iterate through a list, we have a couple of options: 1. We can ask the container for its object enumerator and do it "cocoa style"
    JSMONode *currentThing;
    NSEnumerator *nodesEnumerator = [container nodesEnumerator];
    while (currentThing = [nodesEnumerator nextObject])
    {
        //do something
    }

2. We can ask the container for the first object and keep asking for next
    JSMONode *currentThing = [container first];
    
    while (!IsEmpty(currentThing))
    {
        //do something
        currentThing = [currentThing next];
    }

At this point, I should probably show you the IsEmpty function. This is mostly stolen from Wil Shipley's blog . I just added the test for [NSNull null], which is the "empty" object for insertion into cocoa collections (like a core data graph, for instance).
//Thanks Wil
static inline BOOL IsEmpty(id thing) {
    return thing == nil
    || ([thing isEqual:[NSNull null]]) //JS addition for coredata
    || ([thing respondsToSelector:@selector(length)]
        && [(NSData *)thing length] == 0)
    || ([thing respondsToSelector:@selector(count)]
        && [(NSArray *)thing count] == 0);
}


Inserting an object before or after another object currently in the chain is (for now) a two step process. Let's say you have a node called "node3" in a linked list and you want to add a node called "node7" to the list after node 3. Let's pretend that we have nodes zero through six already in the list. (node0 -> node1 -> node2 -> ... node6) node7 (outside)

1. Put node7 in the container by simply adding it to the end
[container addNodesObject: node7]
2. Then place it after node3
[container moveNode: node7
                   after: node3;]


That's it. It's somewhat suboptimal in that the container will insert at end, then remove, then reinsert the node behind the scenes, but I haven't had to optimize this code yet in practice and it keeps the API simple.

So there you have it. Make a subentity (or several) of the JSMOLLContainer and the of JSMONode objects and enjoy. Take a look at the subversion repository. It includes test code that will give you a good idea of how everything works in practice. Oh. Do what you want with it. It's offered AS IS. No Warranty.

February 01, 2007

Compile dar on OSX 10.4.8

CFLAGS="-DINSTALLPREFIX=/usr/local"
./configure --disable-nls
./make
sudo ./make install

that is all

self, out