Fwd (andre at merzky.net): Re: Fwd (andre at merzky.net): Re: Fwd (andre at merzky.net): Re: [saga-rg] context problem
Andre Merzky
andre at merzky.net
Sun Jul 16 12:46:02 CDT 2006
[damned, majordomo seems really broken - forward to the list
again]
----- Forwarded message from Andre Merzky <andre at merzky.net> -----
> Date: Sun, 16 Jul 2006 19:38:54 +0200
> From: Andre Merzky <andre at merzky.net>
> To: Thilo Kielmann <kielmann at cs.vu.nl>
> Cc: Andre Merzky <andre at merzky.net>
> Subject: Re: Fwd (andre at merzky.net): Re: Fwd (andre at merzky.net): Re: [saga-rg] context problem
>
> Quoting [Thilo Kielmann] (Jul 16 2006):
> >
> > Merging 2 mails from Andre:
> >
> > > very good points, and indeed (1) seems cleanest. However,
> > > it has its own semantic pitfalls:
> > >
> > > saga::file f (url);
> > > saga::task t = f.write <saga::task::Task> ("hello world", ...);
> > >
> > > f.seek (100, saga::file::SeekSet);
> > >
> > > t.run ();
> > > t.wait ();
> > >
> > >
> > > If on task creation the file object gets copied over, the
> > > subsequent seek (sync) and write (async) work on different
> > > object copies. In particular, these copies will have
> > > different state - seek on one copy will have no effect on
> > > where the write will occur.
> >
> > I cannot see a problem here: With object copying, you will simply have the
> > same file open twice. And given the operations you do, this might even be
> > the right thing...
> > This example is very academic: can you show an example where the sharing of
> > state between tasks is useful, actually?
>
> The problem here is that I, as a user, would expect the write
> to happen at byte 100, but it will happen at byte 0: the
> seek happens on a different object than the write.
>
> What might be a more obvious example, which goes wrong along
> the same lines:
>
> f.write ("line 1\n");
> f.write ("line 2\n");
> f.write ("line 3\n");
>
> That will result in a file
>
> line 1
> line 2
> line 3
>
> whereas the code
>
> saga::task t1 = f.write ("line 1\n"); t1.run (); t1.wait ();
> saga::task t2 = f.write ("line 2\n"); t2.run (); t2.wait ();
> saga::task t3 = f.write ("line 3\n"); t3.run (); t3.wait ();
>
> will result in a file
>
> line 3
>
> the last write will start on 0, as the previous write
> operated on a different file pointer. In general, you
> cannot execute any two tasks on a single object, at least
> not if any state is of concern, such as file pointer, pwd,
> replica name, stream server port, job id, ...
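The difference between copying the file object at task creation and sharing it can be sketched with a small in-memory model. This is only an illustration: `Storage`, `FileHandle`, `write_with_copies` and `write_shared` are hypothetical names, not part of the SAGA API, and the tasks are run one after the other as in the example above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical in-memory stand-ins; none of this is the SAGA API.
struct Storage { std::string bytes; };  // the file contents

class FileHandle {
public:
    explicit FileHandle(std::shared_ptr<Storage> s) : storage_(std::move(s)) {
        assert(storage_ && "a handle needs backing storage");
    }
    void seek(std::size_t pos) { pos_ = pos; }
    // Overwrite at the current file pointer, growing the file if needed.
    void write(const std::string& data) {
        std::string& b = storage_->bytes;
        if (b.size() < pos_ + data.size()) b.resize(pos_ + data.size(), '\0');
        b.replace(pos_, data.size(), data);
        pos_ += data.size();
    }
private:
    std::shared_ptr<Storage> storage_;  // shared: same file opened twice
    std::size_t pos_ = 0;               // per-handle file pointer
};

const std::string kLines[] = {"line 1\n", "line 2\n", "line 3\n"};

// Copy semantics: every task gets its own copy of the handle.
std::string write_with_copies() {
    auto storage = std::make_shared<Storage>();
    FileHandle f(storage);
    for (const std::string& line : kLines) {
        FileHandle task_copy = f;  // copied at task creation, pointer at 0
        task_copy.write(line);     // f's own pointer never moves
    }
    return storage->bytes;
}

// Reference semantics: every task operates on the caller's handle.
std::string write_shared() {
    auto storage = std::make_shared<Storage>();
    FileHandle f(storage);
    for (const std::string& line : kLines) {
        FileHandle& task_ref = f;  // no copy, pointer state is shared
        task_ref.write(line);
    }
    return storage->bytes;
}
```

In this model `write_with_copies()` leaves only the last line in the file, while `write_shared()` produces all three lines in order, matching the synchronous calls.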
>
> That is a no-go in my opinion, as it is counter-intuitive,
> and breaks a large number of use cases. And is inconsistent
> with the synchronous method calls.
>
> Yes, you can wreak havoc with state as well:
>
> saga::task t1 = f.write ("line 1\n");
> saga::task t2 = f.write ("line 2\n");
>
> t1.run ();
> t2.run ();
>
> t1.wait ();
> t2.wait ();
>
> will likely result in
>
> linline 2
> e 1
>
> or such - the user does need to think when doing multiple
> async ops at once. I don't see a way around that (and don't
> see a need for it either: we want to make the Grid stuff
> easy, but not revolutionize programming styles).
>
>
> > > I should have added that I'd prefer 3:
> > >
> > > > > 3. when creating a task, all parameter objects are passed "by reference"
> > > > > + no enforced copying overhead
> > > > > - all objects are shared, lots of potential error conditions
> > >
> > > The error conditions I could think of are:
> > >
> > > - change state of object while a task is running, hence
> > > having the task doing something differently than
> > > intended
>
> > Change of state,
>
> That is intentional - see above.
>
>
> > like destruction of objects
>
> Well, that is what we discuss :-) 3 would delay destruction
> until it's safe (state is not needed anymore).
>
>
> > or change of objects.
>
> What do you mean here?
>
>
> > Not to speak of synchronization conditions: suppose you
> > have non-atomic write operations (which is everything that
> > writes more than a single word to memory): do you thus
> > also enforce object locking by doing this?
> > If not, you can have inconsistent object state that can be
> > seen by one task, just because another task is halfway
> > through writing the object... (all classical problems of
> > shared-memory communication apply)
>
> See above. You are right, but I don't see a way around
> that, without causing more harm than good (child and bathtub
> come to my mind for some reason...).
>
> BTW: the bulk optimization we have now assumes that tasks
> which run at the same time are, by their very definition,
> independent from each other, do not depend on any specific
> order of execution, and do not depend on each other with
> respect to object state. Those are the very points we are
> talking about here - I think it's a very sensible assumption. I have
> the same behaviour on a unix shell BTW:
>
> touch file
> date >> file &
> date >> file &
>
> I would not be able to make assumptions about the file
> contents... (well, here I could make a safe bet, but you
> know what I mean).
>
>
> > > - limited control over resource deallocation
> >
> > this is the same thing as above
> >
> > The problem really is that there is no "object lifecycle"
> > defined. There is no way to define which task or thread
> > might be responsible or even allowed to destroy objects or
> > change objects. Is it???
>
> Yes, that is what I mean by limited control.
>
> We had a discussion on this list and in Tokyo about the
> semantics of cancel(), which touches the same problem:
> should task.cancel() block until resources are freed? As we
> might talk about remote resources, and Grids are unreliable,
> we might block forever. That does not make sense, at least
> not always.
>
> The resolution we came up with is that cancel() is advisory,
> so non-blocking, but can also use a timeout parameter (with
> -1 meaning forever) to block until resources are freed.
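A minimal sketch of such a cancel() could look as follows. Several things here are assumptions made for illustration: the `Task` class and its `resources_freed` callback are hypothetical, the timeout is taken to be in seconds, a default of 0 is taken to mean "advisory, return immediately", and polling stands in for whatever notification a real backend would use.

```cpp
#include <chrono>
#include <functional>
#include <thread>
#include <utility>

// Hypothetical task model; not the SAGA API.
class Task {
public:
    // resources_freed is polled to learn whether the (possibly remote)
    // resources held by the task have been released.
    explicit Task(std::function<bool()> resources_freed)
        : resources_freed_(std::move(resources_freed)) {}

    // timeout <  0 : block until resources are freed (possibly forever)
    // timeout == 0 : advisory only, check once and return
    // timeout >  0 : wait at most `timeout` seconds
    // Returns true if the resources were observed to be freed.
    bool cancel(double timeout = 0.0) {
        using clock = std::chrono::steady_clock;
        cancel_requested_ = true;  // the advisory part always happens
        const auto budget = std::chrono::duration_cast<clock::duration>(
            std::chrono::duration<double>(timeout < 0.0 ? 0.0 : timeout));
        const auto deadline = clock::now() + budget;
        for (;;) {
            if (resources_freed_()) return true;
            if (timeout >= 0.0 && clock::now() >= deadline) return false;
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }

    bool cancel_requested() const { return cancel_requested_; }

private:
    std::function<bool()> resources_freed_;
    bool cancel_requested_ = false;
};
```

With this shape, `t.cancel()` never blocks, `t.cancel(5.0)` gives up after five seconds, and `t.cancel(-1.0)` only returns once the resources are actually gone.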
>
> Timeouts do not make sense on destructors I believe, but
> 'advisory destruction' does, IMHO.
>
>
> > > The advantages I see:
> > >
> > > - no copy overhead (but, as you say, that is of no
> > > concern really)
> >
> > ok, but minor point.
>
> right. Let's forget that from now on.
>
>
> > > - simple, clearly defined semantics
> > no, it is the most dangerous of the three versions
>
> Well, see above - I think it's the most sensible semantics
> :-)
>
>
> > > - tasks keep objects they operate on alive -
> > > objects keep sessions they live in alive -
> > > sessions keep contexts they use alive
> >
> > what is the meaning of "alive" here??? Now that you have
> > ruled out memory management...
>
> see above: resources get freed if not needed anymore.
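One way to read "alive" here is as shared ownership: each layer holds a counted reference to the layer it depends on, and a resource is released when its last user goes away. The sketch below uses `std::shared_ptr` for that; it is an assumption about how this could be implemented, and `Context`, `Session`, `File`, `Task` and `run_lifetime_demo` are hypothetical stand-ins, not SAGA classes.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins, not SAGA classes. Each destructor logs its name.
using Log = std::vector<std::string>;

struct Context {
    explicit Context(Log& log) : log_(log) {}
    ~Context() { log_.push_back("context"); }
    Log& log_;
};

struct Session {  // a session keeps its context alive
    Session(std::shared_ptr<Context> c, Log& log)
        : context(std::move(c)), log_(log) {}
    ~Session() { log_.push_back("session"); }
    std::shared_ptr<Context> context;
    Log& log_;
};

struct File {  // an object keeps its session alive
    File(std::shared_ptr<Session> s, Log& log)
        : session(std::move(s)), log_(log) {}
    ~File() { log_.push_back("file"); }
    std::shared_ptr<Session> session;
    Log& log_;
};

struct Task {  // a task keeps the object it operates on alive
    explicit Task(std::shared_ptr<File> f) : file(std::move(f)) {}
    std::shared_ptr<File> file;
};

// Returns the order in which the resources were released.
Log run_lifetime_demo() {
    Log log;
    {
        auto context = std::make_shared<Context>(log);
        auto session = std::make_shared<Session>(context, log);
        auto file = std::make_shared<File>(session, log);
        Task task(file);
        // The user drops every handle while the task still exists.
        context.reset();
        session.reset();
        file.reset();
        assert(log.empty());  // nothing is freed: the task needs the file
    }  // the task ends here and the whole chain is released
    return log;
}
```

In this model nothing is destroyed when the user lets go of the handles; the file, session and context are released, in that order, only once the task itself is gone.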
>
> >
> > > - sync and async operations operate on the same
> > > object instance.
> >
> > Let's forget about "sync" here: it is the task that is
> > running in the current thread, so multiple tasks share
> > object instances.
>
> Well, it would be nice to have same semantics for sync and
> async, don't you think? :-)
>
>
> > > Either way (1, 2 or 3), we have to have the user of the
> > > API thinking while using it - none of them is idiot
> > > proof.
> >
> > Well, we should strive to limit the mental load on the
> > programmer as much as possible...
> >
> > > I think (2) is most problematic, if I understand your
> > > 'hand-over' correctly: that would mean you can't use the
> > > object again until the task was finished?
> >
> > No, it means you will never ever again be allowed to use
> > these objects. (hand over includes the hand over of the
> > responsibility to clean up...)
>
> Right. So you can never do an async read, and then a sync
> seek, and then an async read again. At least not with
> sensible results.
>
> Also, I need to create 100 file instances to do 100 reads?
> Remember that opening a file is a remote op in itself,
> potentially. Then we don't need the task model anymore.
>
> That is broken IMHO.
>
> Cheers, Andre.
>
>
> > Thilo
>
> -- "So much time, so little to do..." -- Garfield
----- End forwarded message -----
--
"So much time, so little to do..." -- Garfield