Tile delaying ...

Hi Chris,

  I had a quick poke at the slurp delaying logic with some printfs:

--- a/browser/src/app/Socket.ts
+++ b/browser/src/app/Socket.ts
@@ -473,6 +473,7 @@ class Socket {
                 // scale delay, to a max of 50ms, according to the number of
                 // tiles predicted to arrive.
                 const delayMS = Math.max(Math.min(predictedTiles, 50), 1);
+ window.console.log('Predicted tiles ' + predictedTiles + ' so delay ' + delayMS);

                 if (!this._slurpQueue) this._slurpQueue = [];
                 this._slurpQueue.push(e);

  And got mostly a tiles 0 so delay 1 message left and right (which I guess is tolerable) - and when I did a big page-down (on my small # of tiles screen on the train) - I got:

13:13:23.705 Socket.ts:476 Predicted tiles 20 so delay 20
...
13:13:23.723 Socket.ts:476 Predicted tiles 20 so delay 20
13:13:23.783 Socket.ts:476 Predicted tiles 0 so delay 1

  Which seemed to indeed wait longer to accumulate tiles from the network.

  Looks like we rarely wait anything like 50ms- although it does look like we're not updating the time to wait or noticing that they are all there - so perhaps there is some time accounting issue whereby we should subtract the slurped tile queue from what is read. Some sort of:

  TileManager.predictTiles*Left*ToSlurp();

  which might cut some time out - particularly for the local case for larger screen refreshes (?)

  HTH,

    Michael.

I think it’s worth clarifying what this delay is trying to achieve, because I’m uncertain at this point. The comment above it makes it sound like it was a performance-saving measure for when tile updates come in lots of tiny batches because redrawing is slow.

If that’s the case, it shouldn’t be necessary. Redrawing isn’t slow anymore and if we’re doing things correctly and not showing partial updates, we shouldn’t be triggering redraws in the case of single tiles that are parts of larger batches. If we are erroneously triggering redraws (and I think we are in some common cases), that should be fixed, probably before removing the delay - not because the delay helps (in testing, I don’t see any visual difference at all with or without), but for consistency of experience.

If it’s not because drawing is slow, 1- it’d be great to update the misleading comment with the real reason :slight_smile: and 2- what is the real reason? In this situation we’re receiving from the server and we don’t expect to send a response, what does delaying our handling of the receipt do to help us that isn’t related to drawing?

My strong suspicion is that we’re using a hammer (semi-arbitrary message-handling delay) to hammer in a screw (visual coherency issues/performance issues due to poor handling of tile updates).

Chris

Hi Chris,

I think it's worth clarifying what this delay is trying to achieve, because I'm uncertain at this point. The comment above it makes it sound like it was a performance-saving measure for when tile updates come in lots of tiny batches because redrawing is slow.

  Yes - that's certainly one element of it - it used to take 50-100ms to re-render the scene IIRC - presumably now we don't do it 5x over it's faster :wink: but even so ...

If that's the case, it shouldn't be necessary. Redrawing isn't slow anymore and if we're doing things correctly and not showing partial updates, we shouldn't be triggering redraws in the case of single tiles that are parts of larger batches.

  What code stops partial updates there ?

If we are erroneously triggering redraws (and I think we are in some common cases), that should be fixed, probably before removing the delay - not because the delay helps (in testing, I don't see any visual difference at all with or without), but for consistency of experience.

  Sure.

If it's not because drawing is slow, 1- it'd be great to update the misleading comment with the real reason :slight_smile: and 2- what is the real reason? In this situation we're receiving from the server and we don't expect to send a response, what does delaying our handling of the receipt do to help us that isn't related to drawing?

  The idea is to reduce flicker, which gives a significant feeling of perceived slow drawing even if it is fast. ie. my contention is that software that doesn't flicker (ie. partially update the screen, and then full update it) -feels- faster than faster software that can render two frames, one broken, one fixed.

My strong suspicion is that we're using a hammer (semi-arbitrary message-handling delay) to hammer in a screw (visual coherency issues/ performance issues due to poor handling of tile updates).

  Quite possibly; as I mentioned undo / re-doing a large visual change even a simple one like re-coloring a spreadsheet orange and back should (ideally) detect a large invalidation, and allow a reasonable delay for the tiles to arrive to get a non-flickering result - so you don't perceive motion down the screen as it changes.

  At least that's the idea. Probably we need some videos to show it working (or not); in general - for a large display update we might want to wait longer - 100ms or so I expect to avoid perceived flicker.

  At least, in most cases, most of the time the extra time appears not to trigger in my quick debugging.

  Do we see long gaps in the profiles though ?

  ATB,

    Michael.

Answering in-line, excuse the bad formatting, this e-mail client has failed and I’ve done my best…

If that’s the case, it shouldn’t be necessary. Redrawing isn’t slow
anymore and if we’re doing things correctly and not showing partial
updates, we shouldn’t be triggering redraws in the case of single tiles
that are parts of larger batches.

What code stops partial updates there ?

The flow of events ought to be along the lines of:

  • Server: Change triggers redrawing of an area

  • Server: Invalidation rectangles are sent to client

  • *Client: Received invalidation rectangles and marks related tiles as invalid

  • *Client: Sends tile request for newly invalid area

  • *Server: Receives tile request

  • Server: Sends related tiles

  • Client: Receives tiles, possibly over some length of time - drawing is paused while we wait for visible, requested tiles

  • Client: Receives final visible tile, unpauses drawing and schedules a redraw

  • Ideally these steps would be skipped

The code that deals with this is scattered about in TilesMiddleware.ts, check out pausedForCoherency and deferDrawing. I don’t think it currently handles invalidations entirely as we/I expect and we need to think about timeouts too (though when expected messages go missing, you can pretty much say goodbye to coherency… I’d be inclined to ask the user to refresh the tab in this situation, rather than trying to continue with random missing messages)

If it’s not because drawing is slow, 1- it’d be great to update the
misleading comment with the real reason :slight_smile: and 2- what is the real
reason? In this situation we’re receiving from the server and we don’t
expect to send a response, what does delaying our handling of the
receipt do to help us that isn’t related to drawing?

The idea is to reduce flicker, which gives a significant feeling of
perceived slow drawing even if it is fast. ie. my contention is that
software that doesn’t flicker (ie. partially update the screen, and then
full update it) -feels- faster than faster software that can render two
frames, one broken, one fixed.

Yes, preaching to the choir here :slight_smile: I count ‘flickering’ as a subset of ‘visual incoherency’ - we should never be displaying partially rendered frames, in-between frames, etc. Sometimes this results in flickering, sometimes it doesn’t, but it always results in bad UX. I don’t think delaying message handling is the right way to deal with potential flickering - given that tile updates happen entirely asynchronously, that’s still no guarantee that they’d happen in the same frame.

My strong suspicion is that we’re using a hammer (semi-arbitrary
message-handling delay) to hammer in a screw (visual coherency issues/
performance issues due to poor handling of tile updates).

Quite possibly; as I mentioned undo / re-doing a large visual change
even a simple one like re-coloring a spreadsheet orange and back should
(ideally) detect a large invalidation, and allow a reasonable delay for
the tiles to arrive to get a non-flickering result - so you don’t
perceive motion down the screen as it changes.

We shouldn’t be delaying to do this - we should deal with messages as soon as we get them, we should just not be partially updating the document tiles when we have the metadata necessary to make sure all visual updates are coherent (i.e. recolouring a spreadsheet will show the whole thing recoloured in a single frame). There’s no need to delay, the server can send us what we need as quickly or as slowly as it likes, we should have the information to do the sensible thing (which is to continue processing everything and delay tile bitmap updates or redraws until all visual updates have been received).

Hopefully that all makes sense?

Chris

Hi Chris,

    Answering in-line, excuse the bad formatting, this e-mail client has
    failed and I've done my best...

=) I'm afraid the modern world of forums does crazy things to E-mail for some reasons - particularly indented blocks which I rather like, let me try to train myself out of that.

    The flow of events ought to be along the lines of:

    - Server: Change triggers redrawing of an area
    - Server: Invalidation rectangles are sent to client
    - *Client: Received invalidation rectangles and marks related tiles
    as invalid
    - *Client: Sends tile request for newly invalid area
    - *Server: Receives tile request
    - Server: Sends related tiles

  Ok - so the '*' pieces will get short-circuited on the server side for all sensible cases to avoid latency; so you can assume you get:

       - Server: Invalidation rectangles are sent to client
       - short delay depending on invalidated size -
       - Server: Sends related tiles

  With no need for another round-trip - so the '*'d bits are already not there.

    - Client: Receives tiles, possibly over some length of time -
    drawing is paused while we wait for visible, requested tiles

  Great.

    The code that deals with this is scattered about in
    TilesMiddleware.ts, check out `pausedForCoherency` and
    `deferDrawing`. I don't think it currently handles invalidations
    entirely as we/I expect and we need to think about timeouts too

  Fair cop. So - I'm well up for doing the delaying at a different level - clearly the delaying is not very polished where it is anyway - it is buggy in that it doesn't adapt to the count of already received tiles which is poor, it probably adds extra latency we don't need as well.

Yes, preaching to the choir here :slight_smile: I count 'flickering' as a subset of 'visual incoherency' - we should never be displaying partially rendered frames, in-between frames, etc.

  Great - as long as we're on the same page that's fine =) I know we work hard to avoid zstd-in-RAM -> texture -> render incoherency which is cool, the network

Sometimes this results in flickering, sometimes it doesn't, but it always results in bad UX. I don't think delaying message handling is the right way to deal with potential flickering - given that tile updates happen entirely asynchronously, that's still no guarantee that they'd happen in the same frame.

  Sure; it's some attempt to balance the stall-forever vs. flicker quandry :slight_smile:

From my limited debugging, I think we have issues around invalidations at the moment - If I update the background colour of a spreadsheet on a large enough screen, I see partial updates currently (which I don't see in LibreOffice). It doesn't really matter about the delay, there's always going to be a situation or a screen big enough to break the heuristics - I believe we have the metadata to just do the right thing without any heuristics.

  Fair enough - if you have a better, cleaner and more consistent solution for the problem and we're both on the same page go for it =) Would just like to see the new solution/fix to replace the older one in one shot if possible :slight_smile:

  ATB,

    Michael.