SV: [RDF] Version handling
Stefan Andersson
stefan@c64.org
Thu, 28 Sep 2000 21:51:11 +0200
Actually, I've just begun to read the book 'The User Illusion: Cutting
Consciousness Down to Size' by Tor Nørretranders (in Swedish: 'Märk
världen'). I've only gotten about 70 pages into it, but it is really
interesting with regard to WRAF and the kinds of issues we're battling
here. One of the main points in the book is that 'forgetting'
information (and, I suppose, selecting) is more costly than obtaining
it... and he makes the 1-to-1 connection between information and
entropy. He actually touches on formulas as a strategy for battling
'entropy'.
I've just realized that some of my assumptions with regard to
'information' and caching and what-have-you might have been wrong. Or at
least too shallow.
(But one of the things I think is more correct is every viewer's right
to a subjective POV and _level_ of information - or personal degree of
information entropy, so to speak. It is the _viewer_ that is the
anchoring point, not the _model_.)
Stay tuned.
/Stefan
Jonas Liljegren wrote:
>
> Stefan Andersson <Stefan.Andersson@ullmans.com> writes:
>
> > > The basic thing we want to do is to filter out previous versions of a
> > > statement. Let's say that I change the phone number. The old
> > > statement remains but should not normally be shown.
> >
> > Yeah, there are a few 'atomic' filters - 'current valid version (a.k.a.
> > latest)' that deserve an extra optimization boost, but (as you know) I
> > think it's an interesting idea to think of a filter as a 'virtual model', so
> > you have this vast ocean of resources, and when you look through the 'latest'
> > spectacles, you'll only see the model consisting of all the 'latest'
> > resources.
>
> How do you combine this with trust?
>
> Let's say that an untrusted person updates the information. He revokes
> the earlier data and creates a new version.
>
> If you say: $s->latest->get($person) it would get the person in the
> model produced by the latest() filter. If you put trust *AFTER*
> latest(): $s->latest->trust($my_list)->get($person) you could end up
> with nothing at all. The latest() filter will only return the latest
> triples, but some of those triples may be untrusted.
>
> You would have to create a combined filter because you want to get the
> latest trusted information. Maybe like:
>
> $filtered_person = $s->filter( version => 'latest', trust => $my_list )->get($person)
>
> The returned person must be a filtered person. Subsequent calls to
> phone_number must use the same filter. Either that, or you have to
> supply the filter to every operation.
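
To make the ordering problem concrete, here is a minimal sketch in Python
(not the Perl API of RDF::Service). The Statement class, the integer
version field and the latest()/trusted() helpers are all hypothetical
names invented for illustration, and it assumes one value per property,
which sidesteps the two-phone-numbers case discussed further down.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    subject: str
    predicate: str
    value: str
    author: str
    version: int  # assumption: a higher number means a newer statement


def latest(statements):
    """Keep only the newest statement per (subject, predicate)."""
    newest = {}
    for st in statements:
        key = (st.subject, st.predicate)
        if key not in newest or st.version > newest[key].version:
            newest[key] = st
    return list(newest.values())


def trusted(statements, trusted_authors):
    """Keep only statements made by an author on the trust list."""
    return [st for st in statements if st.author in trusted_authors]


store = [
    Statement("person:1", "phone", "111", "alice", 1),
    Statement("person:1", "phone", "999", "mallory", 2),  # untrusted update
]
my_list = {"alice"}

# Trust applied after latest(): the newest triple is untrusted, so nothing is left.
latest_then_trust = trusted(latest(store), my_list)

# Trust applied first: the newest *trusted* triple survives.
trust_then_latest = latest(trusted(store, my_list))
```

In this sketch the 'combined filter' is simply trust applied before
latest; a real implementation might well evaluate both conditions in a
single pass instead.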
>
> I'm considering actually having you submit a context variable with every
> call. But that could be done implicitly with a thin unique filter
> object that encapsulates the real object. The real object would be
> cached and would hold all the data, but the encapsulating thin object
> would be new every time and contain the specific context of the call,
> to know what to filter out. The calling object is not enough context.
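
The thin-object idea might look roughly like the following Python sketch.
Node and FilteredView are hypothetical names, and the only context carried
here is a trust list; a real context would presumably also hold the
version filter.

```python
class Node:
    """The cached 'real' object: holds every statement, unfiltered."""

    def __init__(self, uri):
        self.uri = uri
        self.props = {}  # property name -> list of (value, author) tuples

    def add(self, name, value, author):
        self.props.setdefault(name, []).append((value, author))


class FilteredView:
    """Thin per-call wrapper: a shared Node plus the caller's context."""

    def __init__(self, node, trusted_authors):
        self._node = node  # shared and cached, never copied
        self._trusted = frozenset(trusted_authors)

    def get(self, name):
        # Filtering happens at read time, so the cached data stays complete.
        return [v for v, a in self._node.props.get(name, []) if a in self._trusted]


node = Node("person:1")
node.add("phone", "111", "alice")
node.add("phone", "999", "mallory")

# Two sessions, two views, one underlying cached object.
alice_only = FilteredView(node, {"alice"})
everyone = FilteredView(node, {"alice", "mallory"})
```

Because the wrapper only stores a reference and a small context, creating
a new one for every call stays cheap while the expensive data is loaded
once.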
>
> > > We could base the version handling on statements bound to a point in
> > > time. Every statement belongs to a model and the model has a creation
> > > date (and maybe should have a last-updated date (in the case of "open
> > > models")).
> > >
> > > The date information could be used to see what information is the most
> > > recent. But I could have two phone numbers, so it wouldn't be right to
> > > just exclude one of the numbers if the other is more recent. The old
> > > number must be expired in some way.
> >
> > Actually, that's a rather interesting example. It illuminates the point that
> > it is not enough to have, f.ex., 'home phone'. And I still maintain that you
> > never 'change' a statement. You revoke an earlier statement, and make a new one.
> > It kind of solves your problem, doesn't it?
>
> I think so.
>
> But there could be many agents revoking a statement. The person
> revoking the statement would often not be the same person who made
> the statement. That means that there is no cheap way to integrate
> that status in the system. It would be one full-blown statement for
> each revoked statement. Or maybe you would revoke an entire model. You
> could also collect the revoked statements in a collection. But that
> would really hurt performance.
>
> For the DBI interface: could we catch most cases by assuming that
> there will usually be at most _one_ revocation per statement?
>
> Maybe we could even let the person insert the revocation statement in
> the original model and thereby only allow the model owner to revoke
> the statement?
>
> Think about the case where a statement has been revoked by an
> untrusted (anonymous?) agent. How can we handle that in an efficient
> manner?
>
> > > I'm just asking for an efficient model for the implementation.
> >
> > Well. Based on my work with this kind of model in content management, I'd say
> > that what you need is a 'created' - 'revoked' model, where you'd apply a
> > cached filter on what resources are 'created' but not 'revoked'; these are
> > your current model.
> > (Actually, our CM model was a wee bit more complicated, as you had four
> > states: 'created', 'updated', 'published', 'revoked' to separate the two
> > levels of publishing - the level where only the author saw the changes, and
> > the level where everybody could see the changes.)
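
As a rough Python sketch of the 'created but not revoked' filter:
current_model() and the (agent, statement id) revocation pairs are
hypothetical, invented here for illustration. The optional trust list is
one possible answer to the untrusted-revoker question above, and the
caching itself is left out.

```python
def current_model(created, revocations, trusted_agents=None):
    """Return the ids of statements that are created but not revoked.

    created        -- statement ids, in creation order
    revocations    -- (revoking agent, revoked statement id) pairs
    trusted_agents -- if given, revocations by other agents are ignored
    """
    revoked = {
        target
        for agent, target in revocations
        if trusted_agents is None or agent in trusted_agents
    }
    return [sid for sid in created if sid not in revoked]


created = ["s1", "s2", "s3"]
revocations = [("alice", "s1"), ("anonymous", "s3")]

# Every revocation counts: only s2 is still current.
all_revocations = current_model(created, revocations)

# Only alice may revoke: the anonymous revocation of s3 is ignored.
trusted_only = current_model(created, revocations, trusted_agents={"alice"})
```

The result depends on who is asking, so a cache for this filter would
have to be kept per trust list rather than once for the whole store.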
>
> Nothing is stopping you from adding properties like "published".
>
> Now. Thinking about it: I have planned for many forms of distributed
> properties. They can be distributed over URI prefixes, collections or
> models. One solution could be to distribute the revocation statement
> over the intended target.
>
> There are two things in this discussion:
> 1. How to represent VC (version control) in RDF::Service
> 2. How to represent VC in the DBI interface
>
> A request for a property of a node will trigger the init_props() call
> to the involved interfaces. The type property is handled separately
> by the init_types() call. I think that this is the most efficient way
> to do it.
>
> That means that if you are interested in any property of a node, all
> properties will be retrieved.
>
> This will later be combined with "secondary properties" that will be
> initiated one by one, such as the dynamic properties.
>
> Maybe we should allow the interface to not set up all the properties.
> But then we must have a way to know what has been returned. Let's say
> that the interface only returns the latest version. How will the
> program know that there are other properties if another session wants
> to have all properties? Should we just make an exception for
> versions, or is there a general solution?
>
> Each interface can have its own solution for how to store/handle
> versions. Some may not support versions. But the DBI interface is
> intended to be general purpose. It could implement revoke statements
> by distributing them over the target statements, if there is more than
> one.
>
> > > > Actually - one could say 'version' roughly means 'sufficiently
> > > > equivalent', that is, 'all those instances could fill roughly the same
> > > > function, but with small, maybe critical, differences'.
> > >
> > > Ah. Now I think that I understand.
> >
> > Versioning points to the fact that you should be able to say 'this instance
> > is a sub-instance of this instance' a.k.a. 'version'.
>
> That can't be done if all you do is revoke one statement and
> create another. That doesn't say that the new statement is a version
> of the other one.
>
> > > You can produce two statements about different things, but that
> > > contradict each other in some way. Which of the statements should we
> > > follow?
> >
> > The trick is to isolate the discrepancy - the very thing I was hoping we
> > could implement intelligently in WRAF.
>
> Well. If you have two models, some of the statements in the models
> could be about the same thing. It would be up to the constraints of
> the object to point out that there exists a contradiction. The
> contradicting statements could turn up only after some logic machine
> has inferred new statements from the original.
>
> But this doesn't say how you will resolve the conflict. I think that
> this is the same problem as in DB replication. In one way or
> another you choose the one to follow and the one to ignore. This could
> be done by authority, by date, or by just asking the agent or someone who
> has the authority to make the choice.
>
> But that can wait until a later stage. The question now is how to encode
> that one statement or model should be used instead of another (once
> that choice has been made).
>
> --
> / Jonas - http://jonas.liljegren.org/myself/en/index.html
>
> _______________________________________________
> RDF mailing list
> RDF@uxn.nu
> http://www.uxn.nu/cgi-bin/mailman/listinfo/rdf