A Codd Subrelational Analyst

I started developing applications in Informix-SE and Informix SQL, essentially incorporating Bourne Shell Scripts for process control purposes, in the late 80s.

My job title at Murphy Limited in London N17, which I joined in 1987, was “Group Accountant”, but I managed to wangle becoming a developer, although I retained a significant number of other responsibilities.

Early on in my time at Murphys, the Purchase Ledger Supervisor, Brenda, who had previously operated manual / paper Purchase Ledgers, observed to me that there was something daft and unsatisfactory about the way in which the Accounting System, that I had been brought in to help bed in, handled, or failed to coherently handle, changes in Supplier NAMES.

I gave this a lot of attention, and as a result I developed in 1992, for my sons’ school, an Informix-based “Class Lists and Postal Labels” mini-application, since the newly and expensively laid in Bursary Suite could not fulfil those functions.

This was, with hindsight, an important step in the train of thought that Brenda had triggered within me.

The critical aspect was that I had begun to think and talk about Codd “Primary Key” columns in a new way, that was reflected in the fact that I explicitly referred to these columns, whose functional role I had re-interpreted, as “aii” columns, where that stands for

“Attribute Independent Identifier”

The application was a success, in that it was functionally effective and efficient; in the label-production aspect it only ever produced ONE label to suit the defined mailing circumstances.

It didn’t, for example, produce as many parental / family labels as there were children in the notional target child group. Among other things the underlying information was very effectively normalised, which was greatly facilitated by my use of my “Identifiers”. It had one Subrelation, among others, called “Families”, and another called (I draw on hindsight) “Postal Places”. Postal addresses were NOT recorded in the “Families” Subrelation.
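The one-label-per-family behaviour can be sketched roughly as follows. This is only an illustration in Python with SQLite, not the original Informix design: the table names, the column names and the idea that a family row carries a reference to a postal place are all assumptions made for the example.

```python
import sqlite3

# Hypothetical schema: every table and column name here is an assumption.
SCHEMA = """
CREATE TABLE postal_places (aii INTEGER PRIMARY KEY, address TEXT NOT NULL);
CREATE TABLE families (
    aii INTEGER PRIMARY KEY,
    family_name TEXT NOT NULL,
    -- a reference to a postal place, not the address itself
    postal_place_aii INTEGER NOT NULL REFERENCES postal_places(aii)
);
CREATE TABLE children (
    aii INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    class_name TEXT NOT NULL,
    family_aii INTEGER NOT NULL REFERENCES families(aii)
);
"""

def labels_for_class(con, class_name):
    """Return one (family_name, address) label per family with a child in the class."""
    return con.execute(
        """
        SELECT f.family_name, p.address
        FROM families f
        JOIN postal_places p ON p.aii = f.postal_place_aii
        WHERE f.aii IN (SELECT c.family_aii FROM children c WHERE c.class_name = ?)
        ORDER BY f.family_name
        """,
        (class_name,),
    ).fetchall()

con = sqlite3.connect(":memory:")
con.executescript(SCHEMA)
con.execute("INSERT INTO postal_places VALUES (101, '1 High Road')")
con.execute("INSERT INTO postal_places VALUES (102, '2 Low Lane')")
con.execute("INSERT INTO families VALUES (201, 'Smith', 101)")
con.execute("INSERT INTO families VALUES (202, 'Jones', 102)")
con.executemany(
    "INSERT INTO children VALUES (?, ?, ?, ?)",
    [(301, "Ann", "3A", 201), (302, "Bob", "3A", 201), (303, "Cy", "3A", 202)],
)

# Two Smith children in class 3A, but only one Smith label comes out.
labels = labels_for_class(con, "3A")
```

Because the query starts from the families and merely asks whether each one has a child in the target class, the number of labels follows the number of families, never the number of children.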

The school’s IT teacher and technician took a kind interest in what I had done, in transferring the information from Informix into the Micro-SITH “Access” vehicle, and warmed to my new architecture.

I then developed a massively effective “aii” conceptually based application back at Murphys, to administratively support the Company’s Cable Laying Contract with Yorkshire Electricity.

Among other things I pretty much wore a groove in the M1 motorway, on the way to getting the system running in the Company’s offices in Leeds, Sheffield, Hull and Harrogate.

The latter two offices connected to the London Head Office UNIX system via those funny old things called “modems”; the Informix-generated electronic traffic was so light that the performance was not noticeably different from that at the Leeds and Sheffield offices, which were connected to London via ISDN2; all mod cons! This was 1995. The only real hassles were to do with printing, on HP Inkjet printers connected to the Yorkshire office PCs via a parallel cable; when the print failed for some reason or other (paper, ink), it was awkward to re-send.

So I proved a point back then, almost 20 years ago.

It has taken me since then to assemble a useful and conceptually reflective set of words, some of which had to be in effect new as regards how they are used in language.

What a “Subrelation” refers to can be usefully thought of like this.

About two millennia before Christ was around people started reflecting verbal language in written form, and not long after that they started recording information on clay tiles, where the layout is now referred to as “tabular”, as residing in “tables”. The good thing about a table is that you can condense a mighty amount of abbreviated basically propositional information into a small space. I understand that one of the first such tables gave the area of a field by reference to its length and breadth. We now have calculators…..

Some four millennia later Dr Ted Codd made a critical enhancement to the basic and long running (i.e. very successful) concept of a table, by dreaming up his concept of a “Relation”, although, blast it, he so often reverted to calling his “Relations” “tables” in aspects of his written descriptive and explanatory work.

He imposed a uniqueness-of-content requirement on the first column of what had previously been a plain “table”. This was pro-found! It started dragging the way we think and talk out of our minds / heads, and reflecting it in external and shareable Information Systems.

It was mighty in practical effect, but there was a residual snag, that Codd referred to in his “RB-33” conundrum.

In my view that conundrum, and a host of other conundra, are resolved if you (1) define what is an “Identifier” (elsewhere on this site) and then (2) populate Codd’s “Primary Key Columns” ONLY with Identifiers; you split out any Attribute content previously held in a Primary Key cell, and bung it safely into a true and proper “Attribute” column, leaving the previous and previously conflicting “Identifier” aspect in place.
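A minimal sketch of that split, in Python, might look like this. The field names (supplier_name, balance, aii) and the simple counter used to hand out identifiers are assumptions made for illustration only.

```python
from itertools import count

def split_primary_key(old_rows, allocate):
    """Re-key rows that were keyed by a name so they are keyed by an identifier.

    old_rows: dicts in which 'supplier_name' did double duty as key and attribute.
    allocate: callable returning a fresh, meaning-less identifier on each call.
    """
    new_rows = []
    for row in old_rows:
        new_row = {"aii": allocate()}  # the Identifier: carries no meaning at all
        new_row.update(row)            # the name survives, but only as an Attribute
        new_rows.append(new_row)
    return new_rows

_counter = count(1)  # hypothetical allocator; any source of unused values would do
old = [
    {"supplier_name": "Acme Ltd", "balance": 120},
    {"supplier_name": "Bolt & Co", "balance": 75},
]
new = split_primary_key(old, lambda: next(_counter))

# The name is now an ordinary Attribute and can be corrected freely;
# the identifier beside it does not move.
new[0]["supplier_name"] = "Acme Holdings Ltd"
```

After the split, nothing that points at a row needs to know or care what the supplier is currently called.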

An Identifier can be usefully also referred to as a “Reference Point”. A philosopher who taught me in University College, Oxford, the wondrous Gareth Evans, had a seminal work posthumously published, via his friend and editor John McDowell, who also taught me; the book is entitled:

“The Varieties of Reference”.

Wittgenstein was there OK as well. A Cambridge Man……

To dip, perilously, into philosophical parlance, we used to try to treat variable and transient “NAMES” as “Identifiers”, and that was always going to give us problems, notwithstanding that so much of Codd Relational was fabulous, even though only a bit of it (the very concept of a “Relation”) was actually implemented in RDBMSs.

Updateable NAMES in Information Systems were never going to make reliable points of reference, quite simply because they periodically and inevitably need updating, to reflect NAME changes in the environment reflected by the Information System; they should always have been freely updateable. And they are now, in Subrelational.

If you referred to something by its NAME in an Information System, then that reference was always going to fail when the NAME, entirely validly, was changed, EITHER for the correction of error, OR to reflect a change in the outside world.
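The failure is easy to reproduce in miniature. The sketch below, in Python with SQLite, sets up the same supplier and invoice twice: once with the invoice referring to the supplier by NAME, and once referring to it by an identifier. All table names, column names and values are invented for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- Name-keyed design: the invoice refers to the supplier by its NAME.
CREATE TABLE suppliers_by_name (name TEXT PRIMARY KEY);
CREATE TABLE invoices_by_name (invoice_no INTEGER PRIMARY KEY, supplier_name TEXT);

-- Identifier-keyed design: the invoice refers to a meaning-less aii.
CREATE TABLE suppliers (aii INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE invoices (invoice_no INTEGER PRIMARY KEY, supplier_aii INTEGER);

INSERT INTO suppliers_by_name VALUES ('Acme Ltd');
INSERT INTO invoices_by_name VALUES (1, 'Acme Ltd');
INSERT INTO suppliers VALUES (501, 'Acme Ltd');
INSERT INTO invoices VALUES (1, 501);
""")

# The supplier changes its name in the outside world; both designs record it.
con.execute("UPDATE suppliers_by_name SET name = 'Acme Holdings Ltd' WHERE name = 'Acme Ltd'")
con.execute("UPDATE suppliers SET name = 'Acme Holdings Ltd' WHERE aii = 501")

# Name-keyed: the invoice still says 'Acme Ltd', so the join finds nothing.
broken = con.execute(
    "SELECT s.name FROM invoices_by_name i "
    "JOIN suppliers_by_name s ON s.name = i.supplier_name"
).fetchall()

# Identifier-keyed: the invoice still points at 501 and picks up the new name.
intact = con.execute(
    "SELECT s.name FROM invoices i JOIN suppliers s ON s.aii = i.supplier_aii"
).fetchall()
```

In the name-keyed half the invoice is left pointing at a supplier that no longer exists under that name, unless every referring row is found and rewritten at the same time as the rename.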

So a Relation was a table with an extra property imposed, in essence, on one column, the first “Primary Key” column in the array.

A Subrelation is the last push at that; it has a “Primary Key Identifier Column” (PKIC) in the first, primary, column place, where all of its cell contents are immutable, for the life of the Information System, and they are also meaning-less, and they are system-unique, so that they are of mathematical necessity also unique within any single PKIC.
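Those three properties can be approximated in an ordinary SQL database. What follows is a sketch in Python with SQLite, under stated assumptions: a single “identifiers” table acts as the one system-wide allocator, and a trigger on each Subrelation refuses any change to its PKIC. The table, column and trigger names are hypothetical, and this is one possible arrangement rather than a description of how Subrelational is actually built.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- One system-wide allocator: every identifier in every Subrelation comes from here.
-- AUTOINCREMENT means a value is never handed out twice, even after deletions.
CREATE TABLE identifiers (aii INTEGER PRIMARY KEY AUTOINCREMENT);

CREATE TABLE suppliers (aii INTEGER PRIMARY KEY REFERENCES identifiers(aii), name TEXT);
CREATE TABLE families  (aii INTEGER PRIMARY KEY REFERENCES identifiers(aii), family_name TEXT);

-- Immutability: refuse any attempt to change a PKIC cell.
CREATE TRIGGER suppliers_aii_immutable BEFORE UPDATE OF aii ON suppliers
BEGIN SELECT RAISE(ABORT, 'aii is immutable'); END;
CREATE TRIGGER families_aii_immutable BEFORE UPDATE OF aii ON families
BEGIN SELECT RAISE(ABORT, 'aii is immutable'); END;
""")

def new_aii(con):
    """Allocate the next system-unique, meaning-less identifier."""
    return con.execute("INSERT INTO identifiers DEFAULT VALUES").lastrowid

supplier_aii = new_aii(con)
con.execute("INSERT INTO suppliers VALUES (?, 'Acme Ltd')", (supplier_aii,))
family_aii = new_aii(con)
con.execute("INSERT INTO families VALUES (?, 'Smith')", (family_aii,))

# Attributes remain freely updateable...
con.execute("UPDATE suppliers SET name = 'Acme Holdings Ltd' WHERE aii = ?", (supplier_aii,))

# ...but the identifier itself cannot be changed.
try:
    con.execute("UPDATE suppliers SET aii = 999 WHERE aii = ?", (supplier_aii,))
    aii_update_rejected = False
except sqlite3.DatabaseError:
    aii_update_rejected = True
```

Since every identifier is drawn from the same allocator, no value can appear in two different Subrelations, and uniqueness within any one PKIC follows without a separate rule.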

Yours, George, Monday 4th November 2013, 18:16 GMT – bloody dark outside!