As a programmer, I spend most of my life using a piece of software called Emacs. It’s the best thing. (This MAY be a cue to civilians to stop reading, unless you want to endure an episode of narcolepsy.)
Emacs is nominally a “programmer’s editor”, but it does everything. As such, I keep my notes, daily jottings and such in it. And I’ve BEEN keeping my “at my desk” scribblings in there for more than 20 years. So, as you might imagine, I have a lot of these files kicking around.
For the most part, they’re in an outline format (horribly abusing org-mode) that looks like this:
* 2016-06-04 15:49 - This is a topic.
This is some stuff about the topic
** this is a sub topic
This is some stuff about a sub topic.
** this is yet another sub topic
*** Getting a little crazy with the sub-sub topics
But, as it turns out, I have these things with up to 5 or 6 sub topics. So three isn't really all that outlandish.
* 2016-06-03 12:01 - Yesterday's topic
Yeah. See here's the problem.
I am (as you might also imagine) trying to put all of these files in a simple database of nodes. Now, it’s easy enough to rip through a file and detect a new outline node, its outline depth, title, content, etc. Then put those in the database and run a simple post process to flesh out the parent/child relationships (if I’m even going to bother. I may just not care. I ain’t fer sure yet, not ’til I write the front-end application that manages this stuff, which is some ways away.)
ANYway. So I take a file and I split it into pieces and number the nodes sequentially, top to bottom, inserting them in the database.
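For the curious, here’s roughly what that rip-through might look like in Python. It’s a sketch, not the real importer: `Node`, `parse_outline` and `link_parents` are names I made up for illustration, and it assumes a headline is any line that starts with one or more stars followed by a space. Anything before the first headline gets ignored.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# An org-style headline: one or more stars, a space, then the title.
HEADLINE = re.compile(r"^(\*+) (.*)$")


@dataclass
class Node:
    seq: int                      # position in the file, top to bottom, from 1
    depth: int                    # number of leading stars
    title: str
    body: List[str] = field(default_factory=list)
    parent: Optional[int] = None  # seq of the parent node, filled in later


def parse_outline(text: str) -> List[Node]:
    """Split outline text into nodes, in file order."""
    nodes: List[Node] = []
    for line in text.splitlines():
        match = HEADLINE.match(line)
        if match:
            nodes.append(Node(seq=len(nodes) + 1,
                              depth=len(match.group(1)),
                              title=match.group(2).strip()))
        elif nodes:
            # Non-headline lines belong to the most recent headline.
            nodes[-1].body.append(line)
    return nodes


def link_parents(nodes: List[Node]) -> List[Node]:
    """Post-process: a node's parent is the nearest earlier, shallower node."""
    stack: List[Node] = []
    for node in nodes:
        # Drop anything at the same depth or deeper; it can't be our parent.
        while stack and stack[-1].depth >= node.depth:
            stack.pop()
        node.parent = stack[-1].seq if stack else None
        stack.append(node)
    return nodes
```

The parent pass is kept separate on purpose, so it can be skipped entirely if the relationships turn out not to matter.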
A week goes by and I want to refresh the database from the actual files again. Problem is, new nodes are added at the top, not the bottom. But I’ve numbered everything sequentially, so it becomes an interesting proposition. If yesterday’s top node is node 1, what is today’s top node?
I suppose I could conceivably just wipe the database representation of the file and start again, reimporting the whole thing over the top of where the old one was.
There’s something to that, I suppose. I mean, it’s disgusting, but it would work…mostly.
Yeah, I suppose I’m going to have to do that. I don’t see another way to do it without creating some really peculiar sequencing issues. I DO have the issue where I will sometimes delete nodes from my outline. But that’s generally handled by the fact that I keep daily backups of these files, and the dated backups will be imported as well, and will therefore not be subject to such flux. (Yes, I know this leads to a lot of data duplication. But the upside is that I simply won’t have to worry, unless I add and delete archivally interesting information intra-day, which even I can’t imagine doing to myself.)
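The wipe-and-reimport idea might look something like this with Python’s built-in `sqlite3` module. Everything here is an assumption for the sake of the sketch: the `nodes` table, its columns, the `file_key` string that stands in for “which file is this”, and the function names. The one real point is that the delete and the inserts happen in a single transaction, so a failed import doesn’t leave the file half-wiped.

```python
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) nodes table if it isn't there yet."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS nodes (
               file_key TEXT    NOT NULL,
               seq      INTEGER NOT NULL,
               depth    INTEGER NOT NULL,
               title    TEXT    NOT NULL,
               body     TEXT    NOT NULL,
               PRIMARY KEY (file_key, seq)
           )"""
    )


def reimport_file(conn: sqlite3.Connection, file_key: str, nodes) -> None:
    """Replace every stored node for one file.

    `nodes` is an iterable of (depth, title, body) tuples in file order.
    """
    rows = [(file_key, seq, depth, title, body)
            for seq, (depth, title, body) in enumerate(nodes, start=1)]
    with conn:  # commits on success, rolls back if anything raises
        conn.execute("DELETE FROM nodes WHERE file_key = ?", (file_key,))
        conn.executemany(
            "INSERT INTO nodes (file_key, seq, depth, title, body) "
            "VALUES (?, ?, ?, ?, ?)",
            rows,
        )
```

Since the sequence numbers are scoped to one file, renumbering from 1 on every import only ever disturbs that file’s rows.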
BUT, what THAT means is that I’m going to have to be very careful about identifying what makes a file unique. It’s not just file name, since the file I used on my old computer had the same name as the file I use on this computer. It can’t be size, since the current file changes. It can’t be a hash of the contents for the same reason.
I think I’m stuck with Machine Name + Drive name/letter + File Path + File Name. That way, even if I have old archives, I’ll still be able to uniquely identify every file by its location. Yeah. That’ll do it.
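A minimal sketch of that key in Python, assuming the hostname is a good enough “machine name” (it won’t be if two machines ever shared one). `file_identity` is a made-up helper, and the optional `machine` argument is there so files from an old computer can be imported under its name instead of the current one.

```python
import os
import socket


def file_identity(path, machine=None):
    """Return (machine, drive, directory, file name) for a file location."""
    machine = machine or socket.gethostname()
    # abspath makes relative paths absolute; on Windows it includes the drive.
    drive, rest = os.path.splitdrive(os.path.abspath(path))
    directory, name = os.path.split(rest)
    # drive is an empty string on systems without drive letters.
    return (machine, drive, directory, name)
```

Two copies of a same-named file on different machines then get different keys, while re-importing the same file from the same place always yields the same one.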
Off to the code!
Good talk o7