I have a large number of legacy systems that – when all other avenues fail – become my responsibility to sort out. Some of those are very old Zope systems, written by others, which never fail to reduce me to tears. This morning I came across some particularly good design decisions which I thought I’d share. Yes, that ‘good’ is sarcasm.
First of all let’s remember that Zope, by default, uses the ZODB. In a past life, when I used Zope, I used it as a frontend to Postgres (which sounds nuts now but at the time we didn’t have lots of fancy MVC frameworks to spend our hours with). The legacy systems I have currently been gifted with, however, use the ZODB to save their data. You could argue that for content management systems (which is what we’re talking about here) the ZODB is not a bad fit. I’m not going to argue one way or the other (although my own personal point of view is that it’s shit) – however, for the purposes of today’s “issue” the data was real data – i.e. information that would sit quite comfortably in a relational database as god intended.
We’re talking about parent data with child records. This particular issue was that the customer had made a mistake and I needed to remove a bunch of child records from one particular parent record and this had to be done by the sixth of March or the world would end or something similar. Fair enough, so I poke around the code and the database.
Poking around a ZODB database isn’t like poking around a relational database. You can’t look at a database schema and say “these records in this table have these fields”. All you can say is that this record in this big bucket of crap has these fields. And the record next to it could be the same type of record with different fields. Or a completely different type of record. So you poke around and try to work out what is held where. If you fancy a laugh you take a look at the code that writes the records and try and make some sense of it but frankly you’re better off pulling the nerve endings away from your fingers one by one, with a rusty pair of pliers.
So eventually I find the child records and I find a method which is called something like ‘deleteChildRecord’. It turns out this doesn’t just delete child records – it deletes parent records as well because in this brave ZODB world they’re all sitting in the bucket of slop together. Which is OK in the scheme of things because by this time it’s only 05:30 in the morning and we’re beyond caring. By the way – there is a doc string on deleteChildRecord but it doesn’t seem to make much sense … at all. Then I realise the same doc string is used to document virtually every method in the file. Somebody copied and pasted the same doc string twenty times and never thought it useful to change it to something else. But that’s OK. We’re used to that.
So I write some code that works out which child records to delete and run it and get the client to check. They respond saying that when they look at the parent record they can still see ‘stubs’ of the deleted child records. The codes of the deleted records are still showing against the parent but no details of the child records. This doesn’t surprise me that much because in my ZODB world it’s very hard to get rid of data. Like the brown stains around the porcelain after a particularly heavy dump, there’s always some lingering remnant that remains in the crusty crevices of the database even after repeated flushings. Often it’s because the indexing system that Zope and its content management framework (CMF) use hasn’t removed its index entries and metadata even when the real record has been removed. Because you should never query the database directly (unless you fancy wallowing through 50GB of binary data record by record), code will just query the indexes (known as the catalog), which contain the main info you need, and will then pull out the record for each catalog item it finds.
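To make the stub problem concrete, here is a toy model in plain Python. It is not the Zope or CMF API: `records`, `catalog`, `delete_record` and `list_children` are names I have made up, and two dicts stand in for the object store and the catalog.

```python
def delete_record(records, catalog, record_id, uncatalog=True):
    """Delete a record; optionally 'forget' to clean up its catalog entry."""
    del records[record_id]
    if uncatalog:
        catalog.pop(record_id, None)


def list_children(records, catalog):
    """List children the way the views do: ask the catalog first, then fetch."""
    return [(meta["code"], records.get(record_id))
            for record_id, meta in catalog.items()]


# Hypothetical data: two child records and their catalog entries.
records = {"c1": {"details": "first child"}, "c2": {"details": "second child"}}
catalog = {"c1": {"code": "C-001"}, "c2": {"code": "C-002"}}

# Delete c1 but leave its catalog entry behind (the usual failure mode).
delete_record(records, catalog, "c1", uncatalog=False)

# The code for c1 is still listed, but there are no details behind it.
stubs = list_children(records, catalog)
```

Anything that reads through the catalog will keep showing the code for `c1` until the catalog entry is removed as well.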
Am I boring you? By the way, it just occurred to me that the best way to ensure that all traces of a particular record are removed from Zope is to tell it that it’s vitally important that it should be kept. I can guarantee you’ll never see it again. Anyway – back to the story. Oh god. So I think … the catalog has not updated itself. I will rebuild the catalog. This I do. This is a simple thing to do – you just click on a button labelled ‘Rebuild catalog’.
I now have no data.
You see, whatever genius designed this part of the system decided to store the data in the place where you’re supposed to just store the indexes and metadata of a record. Just as you thought you were winning and the baddie had been despatched to the lower reaches of hell you see a shape in the window and it turns out you’re still thirty minutes from the final credits and you’re not halfway through the body count yet.
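My reading of what happened, again as a toy model rather than real catalog code: a rebuild throws the old catalog away and re-indexes whatever real records it can find, so anything that only ever lived in the catalog is gone. `rebuild_catalog` and the data below are invented for illustration.

```python
def rebuild_catalog(records):
    """Discard the old catalog and re-index from the stored records only."""
    return {record_id: {"code": record["code"]}
            for record_id, record in records.items()}


# Hypothetical state: the parent is a real record, the child's data was
# (wrongly) kept only inside the catalog.
records = {"p1": {"code": "P-001"}}
catalog = {
    "p1": {"code": "P-001"},
    "c1": {"code": "C-001", "details": "this data lives only in the catalog"},
}

# After the rebuild the child has vanished, because no real record backs it.
catalog = rebuild_catalog(records)
```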
Fortunately, like the seasoned campaigner I am, I am not doing this on the live system – having transferred the 50GB bag of poo to a staging site before commencing this exercise. Hell, I don’t log onto a bloody legacy Zope site of ours without taking a backup, dumping it to tape and moving it 100 miles offsite.
Right, start again, and let’s start looking at the code.
I can see code that creates parent records. I can see code that creates child records (all with the same doc string). I can actually see quite a lot of code that creates child records. And I can see code that deletes child records (with the same doc string). Unfortunately that code doesn’t tell a parent record that the child record has been removed, so if you do actually use that routine you’ll find the system fails spectacularly the next time you try and view anything. OK – so … the parent record stores the list of related children as a tuple of IDs. So we write some code that converts the tuple (which is immutable) to a list, removes an entry from the list every time we remove a child record, converts the list back to a tuple and writes it back to the parent record when we finish.
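The tuple dance looks roughly like this. It is a sketch in plain Python: `remove_children` is my own name, not anything from the legacy code, and the IDs are made up.

```python
def remove_children(child_ids, ids_to_remove):
    """Return a new tuple of child IDs with the given IDs taken out."""
    doomed = set(ids_to_remove)
    remaining = list(child_ids)      # tuples are immutable, so edit a list copy
    for child_id in child_ids:       # iterate the original, mutate the copy
        if child_id in doomed:
            remaining.remove(child_id)
    return tuple(remaining)          # back to a tuple for writing to the parent


# Hypothetical parent with three children, one of which has to go.
new_ids = remove_children(("c1", "c2", "c3"), ["c2"])
```

The result then has to be assigned back to the parent record, since the original tuple is never changed in place.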
But of course that still doesn’t work because things such as ‘child count’ are not calculated automatically – they’re stored as properties on the parent record as well. So we manually count the records and update the child count property, but then find that even though we have updated that, we have two other counts which are held as two other fucking properties, one for each of the two possible types of child record we can have. The parent record is wetting itself in the only way it knows how, by vomiting Python tracebacks over the screen.
So in short, when you delete a child record you have to manually a) tell the parent record the child record has gone, b) tell the parent record to decrement the specific count for the type of child record you have deleted and c) tell the parent record to decrement the count for the total of child records you now have.
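Put together, the three steps come out something like the sketch below. Everything here is a stand-in: `Parent`, `child_ids`, `child_count`, `type_counts` and `delete_child` are names I have invented, and a plain dataclass and dict replace the real persistent objects.

```python
from dataclasses import dataclass, field


@dataclass
class Parent:
    """Hypothetical parent record with its hand-maintained bookkeeping."""
    child_ids: tuple = ()
    child_count: int = 0
    type_counts: dict = field(default_factory=dict)  # one count per child type


def delete_child(parent, children, child_id):
    """Delete a child and fix up everything the parent stores about it."""
    child = children.pop(child_id)           # remove the child record itself
    ids = list(parent.child_ids)
    ids.remove(child_id)
    parent.child_ids = tuple(ids)            # a) parent forgets the child
    parent.type_counts[child["type"]] -= 1   # b) per-type count goes down
    parent.child_count -= 1                  # c) total count goes down


# Hypothetical data: one parent with one child of each type.
children = {"c1": {"type": "a"}, "c2": {"type": "b"}}
parent = Parent(child_ids=("c1", "c2"), child_count=2,
                type_counts={"a": 1, "b": 1})

delete_child(parent, children, "c1")
```

Miss any one of the three and the parent and its children disagree, which is where the tracebacks come from.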
This I do, and give the results to the client, who says it all looks fine apart from one record whose code they don’t recognise and which has a completely different child type assigned to it from anything else, and would I know why that was?
No, I don’t. I really don’t. I don’t understand this data, I don’t understand the structure (partly because I don’t think there is one). Frankly I hardly know where the floor is at this point and even such concepts as light and dark have gone hazy.