Introducing listor — a list/vector

In implementing my never-ending editor project, I came across a fundamental problem. Some operations were O(N2) — that is, they took an exponentially longer amount of time with the amount of data they handled. The “kryptonite” here is vi's “reverse the sequence of lines” command:

g/^/m0

If you're not a vi aficionado, let me break it down for you. The “g” command says “globally for all lines in the file that match the given regular expression, do the following command.” The regular expression is given between slashes, and is the up-caret character (“^”) — it matches the beginning of the line. Well, all lines have a beginning (they might not have any content, but they have a beginning). Ok, so we're matching all lines in the file. The command is given by “m”, which is the “move” command. It moves the line range given on its left hand side to after the address on its right hand side. So, it moves the line matched by each iteration of the “g” command to after line 0. If you think about it for a second, you'll see that it effectively reverses the line ordering in the current file. (Line 1 moves to after line 0, which is line 1, so no change. But line 2 moves to after line 0, so it's now before line 1. Line 3 moves to after line 0, so it's now before line 2. And so on.)

The way that most editors are implemented is that there's an array that holds pointers to the individual lines. For simplicity:

char *lines [MAX_LINES];

Line 1 is stored at lines[0], line 2 is stored at lines[1], and so on. When we execute g/^/m0, we move all of the lines down by one position, and store the selected line at position 0. We repeat this for each and every line in the file. If there are 100 lines in the file, we do this 100 times.

The math is important: we're moving 99 lines, 100 times. The total number of bytes moved is 99 x 100 x sizeof (char *) which is 79k bytes on a 64-bit machine. If we had a 200 line file, you'd think it would be 2 x 79k — but it's not. It's actually 199 x 200 x sizeof (char *) which is 318k bytes (4X the size). For files in the hundreds, thousands, even tens of thousands of lines in length, this is a trivial amount of data to copy (a 100,000 line file would take 99999 x 100000 x sizeof (char *) or 80GB of data transfer — this is where things start to bog down). But a 1 million line file? That's about 8 TB of data. This takes minutes of CPU! This is an example of an O(N2) problem. (And in case you think a 1M line file is big: trust me, it's not, especially when you're dealing with dump files, and there are lots of other cases!)

Ok, so how do we fix it? During development, I had the idea that I could store my lines as a linked list, rather than as a vector. So, in C++ I did:

list <char *> lines;

instead of the C++ equivalent of the above array:

vector <char *> lines;

Every website I visited said that you should “prefer vector<> over list<> unless there's a compelling reason”. I figured that eliminating the N2 runtime characteristic was compelling. So, I was able to make my editor be much faster than the current “production” editors (e.g., vim).

But lists are a real pain in the ass to use, especially when you are used to random access. If I wanted to get to line 343423 in a big file (that was implemented using a list), I'd have to start at the beginning of the list and traverse 343422 entries until I got to entry number 343423. That's a lot of work! [You could optimize the work slightly by deciding to start at the end of the list if that was closer to the target line, but I digress.] Plus, in a list, given two elements, there's no way to tell which one is before the other, unless you traverse a significant number of elements to see which one gets hit first [3 CPUs could speed that up, still I digress.]

I struggled. I really wanted the random access “easy to use, easy to compute, easy going in general” nature of vectors, but I wanted the speed of lists.

Thus was born the listor.

Tell us about the listor already!

So, the simplest implemention of the listor is what I use in my editor today:

class OneListor
{
    uint32_t    next = 0;
    uint32_t    prev = 0;
    char        *content = NULL;
};

class Listor
{
    vector <OneListor>  elements;
    bool                is_list = false;
};

The Listor class is a vector of elements, each of which can take on the list characteristics (via the prev and next members). The first thing to do is to completely ignore these members. If you do that, then the Listor is nothing more than a vector of char *, just like we had above. Easy to randomly access, easy to test for ordering, and so on. But when it comes time for the kryptonite test, we turn on the is_list member (via a special function which we'll get to shortly), and use the prev and next members. When we're done moving lines around, and we want to switch back to vector mode, we convert the list to a vector. There are several member functions of class Listor that are relevant here:

class Listor
{
    // administration / management
    void    listify (void);
    void    vectorify (void);

    // operations
    void move (int target, int source);
    void remove (int target);
    void append (int target, int source);
};

In order to do the kryptonite test, we call listify to “convert” the Listor into a list, and then use the move member functions to do the work. When we're done, we call vectorify to convert the Listor back into a vector and, voila, we're done!

This is the actual code sequence from one of the unit tests for Listor:

Listor lines;

// add five lines to the test case
lines.push_back ("Line 1");
lines.push_back ("Line 2");
lines.push_back ("Line 3");
lines.push_back ("Line 4");
lines.push_back ("Line 5");

// perform kryptonite test
lines.move (1, 0);
lines.move (2, 0);
lines.move (3, 0);
lines.move (4, 0);
lines.move (5, 0);

// restore vectoricity (vectorocity? vectorness? vectorousness?)
lines.vectorify ();

None of these operations are O(N2); let's look at their implementations.

Member function listify

Before we perform “expensive” operations, we “listify” the Listor — that is, we “convert” it to its list form. This is O(N):

void
Listor::listify (void)
{
    if (is_list) return;

    int     nelements = elements.size();

    for (int i = 0; i < nelements; i++) {
        elements [i].prev = i - 1;
        elements [i].next = i + 1;
    }

    // special cases at the beginning and the end
    elements [0].prev = 0;
    elements [nelements - 1].next = 0;
    is_list = true;
}

Easy peasy! We simply set up the prev and next members so that we can use them, and set the is_list flag to indicate that we are in list mode.

Member function vectorify

This one is a little more complicated; it follows the chain of next pointers and reconstructs the same ordering within the vector:

void
Listor::vectorify (void)
{
    if (!is_list) return;                               // we're already a vector, don't worry about it
    is_list = false;                                    // mark us as being a vector

    int nelements = elements.size();
    for (int target = 1; target < nelements; target++) {
        int source = elements [target - 1].next;
        if (target == source) continue;

        if (elements [source].next == target) {
            elements [source].next = source;
            elements [target].prev = target;
        }

        swap (elements [source], elements [target]);

        int tn = elements [target].next;
        int tp = elements [target].prev;
        int sn = elements [source].next;
        int sp = elements [source].prev;

        elements [tn].prev = target;                    // fixup correctly placed location's next's prev
        elements [tp].next = target;                    // fixup correctly placed location's prev's next
        elements [sn].prev = source;                    // fixup temporary placed location's next's prev
        elements [sp].next = source;                    // fixup temporary placed location's prev's next
    }
}

Nice, another O(N) function. An important consideration for the design here was that it not destroy the list structure.

Of course, the individual worker member functions, like move, append, and remove, are all trivial to implement in their list forms.

Member function remove

The remove member function just takes an element and removes it from the linked list — the element remains in place in the vector.

void
Listor::remove (int target)
{
    listify ();
    elements [elements [target].prev].next = elements [target].next;
    elements [elements [target].next].prev = elements [target].prev;
}

Member function append

Append is trivial as well:

void
Listor::append (int target, int source)
{
    listify ();
    elements [target].prev = source;
    elements [target].next = elements [source].next;
    elements [source].next = target;
    elements [elements [target].next].prev = target;
}

Member function move

And finally, kryptonite turns out to be really simple as well:

void
Listor::move (int target, int source)
{
    listify ();

    remove (target);
    append (target, source);
}

Implementation Notes

I chose the uint32_t as the preferred form of “pointer” for several reasons:

Summary

This is the first iteration of the code, there may be bugs :-)

I feel that this is kind of a polar vs rectangular coordinates kind of thing. Each has their benefits, and there's a way to convert between them. A major reason I feel this to be a novel approach is because the distinction between a list and vector is so fundamental, that they are considered as opposites. That is, almost nobody would think to use them together; and those that would, would be concerned about the cost of converting listvectorlist.

In this particular use case, the conversion in representational space is O(N), and given that a whole bunch of operations are grouped / batched via vi's g command, this represents the kind of performance improvement that you'd expect from O(N2) → O(N). The overhead is 2N — one N to convert vectorlist, and one N to convert listvector.

Yes, not all “N”'s are created equal; the time spent using memmove to move the array down one element versus the time spent doing that in a for loop byte-by-byte may not be equivalent, but, as the formula leads you to generalize, these “N”'s are all within the same order of magnitude.

What about the memory overhead? I've done my part for the environment: by choosing a char * instead of a string, I've already saved 32 - 8 = 24 bytes (on my system) per element, so adding 2 x uint32_t (8 bytes) of overhead still leaves me 16 bytes ahead of the game. [string is awesome in general, but it's not appropriate here because the lines are heap-based and abandoned in place for journaling; we never reallocate or change them once they're committed to memory.]

More seriously, though, we're trading off memory for performance. By basically doubling the pointer size, we're now able to have a symbiotic relationship between two radically different access methods. It's a linear vs exponential tradeoff. I've added to each size (an O(N) effect), but I've logarithmically reduced computation (an O(N2) effect).

Future Work

I think that the symbiosis of the two access methodologies can be made even tighter. If we could keep the list representation synchronized with the vector representation, then we could avoid one of the conversions. Synchronization is destroyed during the vectorify function, so it might be possible to code around it (haven't looked). vector-mode desynchronization can occur if lines are added or deleted where it makes sense in a vector context. For example, individual commands, like a visual “delete line” command that just memmove's the vector space around. Summary: keeping the list synchronized to the vector should be possible, keeping the vector synchronized to the list is undesirable and should be an explicit function call.

Currently, the move, remove, and append switch over to list mode and do their work. We could leave the mode decision to the caller, and have these functions check the mode, and perform the work in the given mode. For example, deleting a single line out of the middle of a file when in visual mode in vi is perfectly acceptable to do in vector mode — after all, the lines will all need to be moved up and the screen updated; there are no other operations to “batch up” and hence do in list mode.

As an optimization, we could do range limiting. If we know that the list operations only touched a limited range of the vector, we might be able to shorten the amount of time we spend in vectorify in order to restore the vector view (i.e., it would be limited to repairing just the unsynchronized areas).

Other Applications

The Listor's list access method could come in handy for insertions and deletions as well. If those were performed in list-space, the vectorify function optimally reorders the workspace after the list operations are done.

Templating, STL

It might be a good idea to template out the size of the prev and next members, and to template out the actual content element. It might even be possible / desirable to make this into a proper STL class.